path: root/src/video_core/renderer_vulkan/vk_shader_decompiler.cpp
Date | Commit message | Author | Files | Lines (-removed/+added)
2021-02-15Review 1Kelebek11-2/+2
2021-02-15Implement texture offset support for TexelFetch and TextureGather and add offsets for TldsKelebek11-7/+24
Formatting
2021-02-13video_core: Fix clang build issuesReinUsesLisp1-1/+5
2021-02-13video_core: Reimplement the buffer cacheReinUsesLisp1-0/+3
Reimplement the buffer cache using cached bindings and page-level granularity for modification tracking. This also drops the usage of shared pointers and virtual functions from the cache.
- Bindings are cached, allowing work to be skipped when the game changes only a few bits between draws.
- OpenGL Assembly shaders no longer copy when a region has been modified from the GPU to emulate constant buffers; instead, GL_EXT_memory_object is used to alias sub-buffers within the same allocation.
- OpenGL Assembly shaders stream constant buffer data using glProgramBufferParametersIuivNV, from NV_parameter_buffer_object. In theory this should save one hash table resolve inside the driver compared to glBufferSubData.
- A new OpenGL stream buffer is implemented based on fences for drivers other than Nvidia's proprietary one, due to their low performance on partial glBufferSubData calls synchronized with 3D rendering (which some games use a lot).
- Most optimizations are now shared between APIs, allowing Vulkan to cache more bindings than before and skip unnecessary work.
This commit adds the necessary infrastructure to use Vulkan objects from OpenGL. Overall, it improves performance and fixes some bugs present in the old cache. There are still some edge cases hit by some games that harm performance on some vendors; these are planned to be fixed in later commits.
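The page-level modification tracking mentioned in this commit can be pictured with a small sketch. This is not the cache's actual code: the class name, the 4 KiB page size, and the use of std::vector<bool> are all assumptions made for illustration, and ranges are assumed to be non-empty and inside the buffer.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical page-granular modification tracker (illustrative only).
class PageTracker {
public:
    static constexpr std::size_t kPageBits = 12; // assumed 4 KiB pages
    static constexpr std::size_t kPageBytes = std::size_t{1} << kPageBits;

    explicit PageTracker(std::size_t size_bytes)
        : modified((size_bytes + kPageBytes - 1) >> kPageBits, false) {}

    // Marks every page touched by [offset, offset + size) as modified.
    void MarkModified(std::size_t offset, std::size_t size) {
        const std::size_t last = (offset + size - 1) >> kPageBits;
        for (std::size_t page = offset >> kPageBits; page <= last; ++page) {
            modified[page] = true;
        }
    }

    // Returns true if any page touched by [offset, offset + size) is modified.
    bool IsModified(std::size_t offset, std::size_t size) const {
        const std::size_t last = (offset + size - 1) >> kPageBits;
        for (std::size_t page = offset >> kPageBits; page <= last; ++page) {
            if (modified[page]) {
                return true;
            }
        }
        return false;
    }

private:
    std::vector<bool> modified; // one flag per page
};
```

A write that straddles a page boundary dirties both pages, while untouched pages stay clean, so a later upload could skip them.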
2021-01-16vk_shader_decompiler: Show comments as OpUndef with a typeReinUsesLisp1-1/+4
Silence the new validation layer error about SPIR-V not allowing OpUndef on an OpTypeVoid, even though the SPIR-V spec doesn't say anything against it. Comments will be inserted as an undefined int to avoid SPIRV-Cross and validation errors, but only when a debugging tool is attached.
2021-01-04renderer_vulkan: Move device abstraction to vulkan_commonReinUsesLisp1-1/+1
2021-01-03renderer_vulkan: Rename VKDevice to DeviceReinUsesLisp1-3/+3
The "VK" prefix predates the "Vulkan" namespace. It was carried around the codebase for consistency. "VKDevice" is easily confused with "VkDevice" (they differ only in the case of one character). Rename all instances of it.
2020-12-31vulkan_instance: Allow different Vulkan versions and enforce 1.1ReinUsesLisp1-9/+2
For listing the available physical devices we can use Vulkan 1.0. Now that MoltenVK supports 1.1 we can require it for running games. Add missing documentation.
2020-12-30video_core: Rewrite the texture cacheReinUsesLisp1-3/+3
The current texture cache has several points that hurt maintainability and performance. It's easy to break unrelated parts of the cache when doing minor changes. The cache can easily forget valuable information about the cached textures by CPU writes or simply by its normal usage. This commit aims to address those issues.
2020-12-25vk_shader_decompiler: Silence warning when compiling without assertsReinUsesLisp1-0/+1
2020-12-07video_core: Make use of ordered container contains() where applicableLioncash1-2/+1
With C++20, we can use the more concise contains() member function instead of comparing the result of the find() call with the end iterator.
2020-12-07video_core: Remove unnecessary enum class casting in logging messagesLioncash1-4/+4
fmt now automatically prints the numeric value of an enum class member by default, so we don't need to use casts any more. Reduces the line noise a bit.
2020-12-05video_core: Resolve more variable shadowing scenarios pt.2Lioncash1-14/+14
Migrates the video core code closer to enabling variable shadowing warnings as errors. This primarily sorts out shadowing occurrences within the Vulkan code.
2020-11-26vk_shader_decompiler: Implement force early fragment testsReinUsesLisp1-3/+3
Force early fragment tests when the 3D method is enabled. The established pipeline cache takes care of recompiling if needed. This is implemented only on Vulkan to avoid invalidating the shader cache on OpenGL.
2020-11-25cleanup unneeded comments and newlinesameerj1-6/+0
2020-11-25Refactor MaxwellToSpirvComparison. Use Common::BitCastameerj1-27/+29
Co-Authored-By: Rodrigo Locatti <reinuseslisp@airmail.cc>
2020-11-25Address PR feedback from Reinameerj1-28/+22
2020-11-25vulkan_renderer: Alpha Test Culling Implementationameerj1-2/+52
Used by various textures in many titles, e.g. SSBU menu.
2020-08-20vk_device: Use Vulkan 1.0 properlyReinUsesLisp1-2/+10
Enable the required capabilities to use Vulkan 1.0 without validation errors and disable those that are not compatible with it.
2020-07-21video_core: Remove unused variablesLioncash1-3/+2
Silences several compiler warnings about unused variables.
2020-07-16renderer_{opengl,vulkan}: Clamp shared memory to host's limitReinUsesLisp1-3/+9
This stops shaders from failing to build when they exceed the host's shared memory size limit. An error is logged.
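The clamping described here amounts to taking the minimum of the requested size and the host limit while logging the mismatch. A minimal sketch follows; the function name, the parameters, and the use of stderr for logging are assumptions for illustration, not the renderer's actual code.

```cpp
#include <algorithm>
#include <cstdint>
#include <cstdio>

// Hypothetical helper: returns the shared memory size to declare in the host shader.
// requested_bytes is what the guest shader asks for, host_limit_bytes what the host allows.
inline std::uint32_t ClampSharedMemory(std::uint32_t requested_bytes,
                                       std::uint32_t host_limit_bytes) {
    if (requested_bytes > host_limit_bytes) {
        // Log the problem instead of letting the shader fail to build.
        std::fprintf(stderr, "Shared memory size %u exceeds host limit %u, clamping\n",
                     static_cast<unsigned>(requested_bytes),
                     static_cast<unsigned>(host_limit_bytes));
    }
    return std::min(requested_bytes, host_limit_bytes);
}
```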
2020-06-02vk_shader_decompiler: Implement atomic image operationsReinUsesLisp1-40/+24
Implement atomic operations on images. In GLSL these are the imageAtomic* functions (e.g. imageAtomicAdd).
2020-06-02vk_rasterizer: Implement storage texelsReinUsesLisp1-26/+49
This is the equivalent of an image buffer on OpenGL. - Used by Octopath Traveler
2020-05-27shader/other: Implement MEMBAR.CTSReinUsesLisp1-3/+4
This silences an assertion we were hitting and uses workgroup memory barriers when the game requests it.
2020-05-22shader/other: Implement BAR.SYNC 0x0ReinUsesLisp1-0/+17
Trivially implement this particular case of BAR. Unless games use OpenCL or CUDA barriers, we shouldn't hit any other case here.
2020-05-22shader/other: Implement thread comparisons (NV_shader_thread_group)ReinUsesLisp1-0/+23
Hardware S2R special registers match gl_Thread*MaskNV. We can trivially implement these using Nvidia's extension on OpenGL or naively stubbing them with the ARB instructions to match. This might cause issues if the host device warp size doesn't match Nvidia's. That said, this is unlikely on proper shaders. Refer to the attached url for more documentation about these flags. https://www.khronos.org/registry/OpenGL/extensions/NV/NV_shader_thread_group.txt
2020-05-22shader_decompiler: Visit source nodes even when they assign to RZReinUsesLisp1-1/+3
Some operations like atomicMin were ignored because their results were being stored to RZ. These operations have side effects, which were being ignored.
2020-05-22vk_shader_decompiler: Don't assert for void returnsReinUsesLisp1-2/+1
Atomic instructions can be used without returning anything and this is valid code. Remove the assert.
2020-05-13vk_rasterizer: Implement constant attributesReinUsesLisp1-10/+20
Constant attributes (known in OpenGL as disabled attributes) are not supported on Vulkan, even with extensions. To emulate this behavior we return zero on reads from disabled vertex attributes in shader code. This has no caching cost because attribute formats are not dynamic state on Vulkan and we have to store them in the pipeline cache anyway. - Fixes Animal Crossing: New Horizons terrain borders
2020-05-09shader_ir: Separate float-point comparisons in ordered and unorderedReinUsesLisp1-1/+26
This allows us to use native SPIR-V instructions without having to manually check for NAN.
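The difference between the two comparison families can be sketched in plain C++. SPIR-V exposes both flavors as native instructions (for example OpFOrdLessThan and OpFUnordLessThan); the helper names below are hypothetical and only model the semantics, with the NaN checks written out by hand.

```cpp
#include <cmath>

// Ordered "less than": false whenever either operand is NaN.
inline bool OrderedLessThan(float a, float b) {
    return !std::isnan(a) && !std::isnan(b) && a < b;
}

// Unordered "less than": true whenever either operand is NaN.
inline bool UnorderedLessThan(float a, float b) {
    return std::isnan(a) || std::isnan(b) || a < b;
}
```

With the distinction carried in the IR, a backend can pick the matching native instruction rather than emitting these explicit checks.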
2020-04-26shader/arithmetic_integer: Implement CC for IADDReinUsesLisp1-0/+11
2020-04-23shader_ir: Turn classes into data structuresReinUsesLisp1-36/+36
2020-04-11renderer_vulkan: Drop Vulkan-HppReinUsesLisp1-1/+1
2020-04-06shader/memory: Implement RED.E.ADDReinUsesLisp1-25/+40
Implements a reduction operation. It's an atomic operation that doesn't return a value. This commit introduces another primitive because some shading languages might have a primitive for reduction operations.
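As a rough analogy, the difference between an atomic operation and a reduction is only whether the previous value is returned. The sketch below uses std::atomic as a stand-in for global memory; the function names are hypothetical and this is not the decompiler's code.

```cpp
#include <atomic>
#include <cstdint>

// Atomic add that hands back the value held before the addition.
inline std::uint32_t AtomicAdd(std::atomic<std::uint32_t>& memory, std::uint32_t value) {
    return memory.fetch_add(value);
}

// Reduction: same effect on memory, but nothing is returned to the caller.
inline void ReduceAdd(std::atomic<std::uint32_t>& memory, std::uint32_t value) {
    memory.fetch_add(value);
}
```

Keeping the two as separate primitives lets a backend map the reduction to a cheaper instruction where the target language has one.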
2020-04-02shader_decompiler: Remove FragCoord.w hack and change IPA implementationReinUsesLisp1-10/+7
Credits go to gdkchan and Ryujinx. The pull request used for this can be found here: https://github.com/Ryujinx/Ryujinx/pull/1082 yuzu was already using the header for interpolation, but it was missing the FragCoord.w multiplication described in the linked pull request. This commit finally removes the FragCoord.w == 1.0f hack from the shader decompiler. While we are at it, this commit renames some enumerations to match Nvidia's documentation (linked below) and fixes component declaration order in the shader program header (z and w were swapped). https://github.com/NVIDIA/open-gpu-doc/blob/master/Shader-Program-Header/Shader-Program-Header.html
2020-03-30vk_decompiler: add atomic op and handler function.Nguyen Dac Nam1-6/+25
2020-03-15vk_shader_decompiler: fix linux buildmakigumo1-1/+1
2020-03-13vk/gl_shader_decompiler: Silence assertion on computeReinUsesLisp1-3/+6
2020-03-13vk_shader_decompiler: Fix default varying regressionReinUsesLisp1-2/+6
2020-03-13vk_shader_decompiler: Fix implicit type conversionRodrigo Locatti1-1/+1
Co-Authored-By: Mat M. <mathew1800@gmail.com>
2020-03-13vk_shader_decompiler: Add XFB decorations to generic varyingsReinUsesLisp1-16/+89
2020-03-13vk_device: Shrink formatless capability name sizeReinUsesLisp1-2/+2
2020-03-13vk_shader_decompiler: Use registry for specializationReinUsesLisp1-12/+22
2020-02-24vk_shader_decompiler: Implement indexed texturesReinUsesLisp1-8/+18
Implement accessing textures through an index. It uses the same interface as OpenGL; the main difference is that Vulkan bindings are forced to be arrayed (the binding index doesn't change for stacked textures in SPIR-V).
2020-02-24video_core: Implement more scaled attribute formatsReinUsesLisp1-4/+2
While changing this, fix assert in vk_shader_decompiler. We now know scaled formats are expected to be float in shader attributes.
2020-02-20clang-formatNguyen Dac Nam1-1/+1
2020-02-20shader_decompiler: only add StorageImageReadWithoutFormat when availableNguyen Dac Nam1-1/+4
2020-02-19shader_decompiler: add check in case of device not support ShaderStorageImageReadWithoutFormatNguyen Dac Nam1-0/+4
2020-02-19vk_shader: add Capability StorageImageReadWithoutFormatNguyen Dac Nam1-0/+1
2020-02-19vk_shader: Implement function ImageLoad (Used by Kirby Star Allies)Nguyen Dac Nam1-2/+6
2020-02-16vk_shader_decompiler: Implement Layer output attributeReinUsesLisp1-6/+30
SPIR-V's Layer is GLSL's gl_Layer. It lets the application choose from a shader stage (vertex, tessellation or geometry) which framebuffer layer the output fragments are written to.
2020-02-14vk_shader_decompiler: Fix vertex id and instance idReinUsesLisp1-4/+13
Vulkan's VertexIndex and InstanceIndex don't match the hardware's values. This is because Nvidia implements gl_VertexID and gl_InstanceID. The math that relates these is:
gl_VertexIndex = gl_BaseVertex + gl_VertexID
gl_InstanceIndex = gl_BaseInstance + gl_InstanceID
To emulate it using what Vulkan's SPIR-V offers (the *Index variants), this commit subtracts gl_Base* from gl_*Index to obtain the OpenGL and hardware equivalent.
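The subtraction can be written out as a short sketch. The struct and function names are hypothetical; the fields simply mirror the GLSL built-ins, and the sketch assumes the base for instances is gl_BaseInstance.

```cpp
#include <cstdint>

// Values Vulkan provides to the vertex shader (field names mirror the GLSL built-ins).
struct VulkanVertexInputs {
    std::int32_t vertex_index;   // gl_VertexIndex
    std::int32_t base_vertex;    // gl_BaseVertex
    std::int32_t instance_index; // gl_InstanceIndex
    std::int32_t base_instance;  // gl_BaseInstance
};

// Recovers the OpenGL-style gl_VertexID expected by the guest shader.
inline std::int32_t HardwareVertexId(const VulkanVertexInputs& in) {
    return in.vertex_index - in.base_vertex;
}

// Recovers the OpenGL-style gl_InstanceID expected by the guest shader.
inline std::int32_t HardwareInstanceId(const VulkanVertexInputs& in) {
    return in.instance_index - in.base_instance;
}
```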
2020-01-26shader/memory: Implement ATOM.ADDReinUsesLisp1-33/+33
ATOM operates atomically on global memory. For now only add ATOM.ADD since that's what was found in commercial games. This asserts for ATOM.ADD.S32 (handling the others as unimplemented), although ATOM.ADD.U32 shouldn't be any different. This change forces us to change the default type on SPIR-V storage buffers from float to uint. We could also alias the buffers, but it's simpler for now to just use uint. While we are at it, abstract the code to avoid repetition.
2020-01-25Shader_IR: Address feedback.Fernando Sahmkow1-2/+3
2020-01-24Shader_IR: Correct Custom Variable assignment.Fernando Sahmkow1-0/+2
2020-01-24Shader_IR: Implement Injectable Custom Variables to the IR.Fernando Sahmkow1-0/+16
2020-01-24vk_shader_decompiler: Disable default values on unwritten render targetsReinUsesLisp1-13/+16
Some games like The Legend of Zelda: Breath of the Wild assign render targets without writing them from the fragment shader. This generates Vulkan validation errors. To silence these, I previously introduced a commit that set "vec4(0, 0, 0, 1)" for these attachments. The problem is that this is not what games expect. This commit reverts that change.
2020-01-19vk_shader_decompiler: Implement UAtomicAdd (ATOMS) on SPIR-VReinUsesLisp1-3/+11
Also updates sirit to include atomic instructions.
2020-01-16shader/memory: Implement ATOMS.ADD.U32ReinUsesLisp1-0/+7
2020-01-04Shader_IR: Address FeedbackFernando Sahmkow1-14/+4
2019-12-30Shader_IR: add the ability to amend code in the shader ir.Fernando Sahmkow1-0/+18
This commit introduces a mechanism by which shader IR code can be amended and extended. This is useful for tracking algorithms where certain information can be derived before the track, such as indices into array samplers.
2019-12-21vk_shader_decompiler: Use Visit instead of reimplementing itReinUsesLisp1-23/+1
The ExprCondCode visitor reimplemented the generic Visit; call the generic one instead. As an intended side effect, this fixes unwritten memory usages in cases where a negation of a condition code is used.
2019-12-19vk_shader_decompiler: Fix full decompilationReinUsesLisp1-3/+5
When full decompilation was enabled, labels were not being inserted and instructions were misused. Fix these bugs.
2019-12-19vk_shader_decompiler: Skip NDC correction when it is nativeReinUsesLisp1-1/+1
Avoid changing gl_Position when the NDC used by the game is [0, 1] (Vulkan's native).
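A sketch of the conditional correction, under the assumption that the remap from a [-1, 1] depth range to [0, 1] is done in clip space as z' = (z + w) / 2. The type, the function name, and the flag are hypothetical and not taken from the decompiler.

```cpp
// Minimal clip-space position; stands in for gl_Position.
struct Vec4 {
    float x, y, z, w;
};

// Remaps depth only when the game uses the [-1, 1] convention.
// When the game already uses [0, 1] (Vulkan's native range), the position is left untouched.
inline Vec4 CorrectNdcDepth(Vec4 position, bool ndc_minus_one_to_one) {
    if (ndc_minus_one_to_one) {
        position.z = (position.z + position.w) * 0.5f;
    }
    return position;
}
```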
2019-12-19vk_shader_decompiler: Normalize output fragment attachmentsReinUsesLisp1-12/+9
Some games write from fragment shaders to a nonexistent framebuffer attachment, or they don't write to one when it exists in the framebuffer. Fix this by skipping writes or adding zeroes.
2019-12-19vk_shader_decompiler: Update sirit and implement Texture AOFFIReinUsesLisp1-22/+30
2019-12-10shader: Implement MEMBAR.GLReinUsesLisp1-0/+14
Implement using memoryBarrier in GLSL and OpMemoryBarrier on SPIR-V.
2019-12-10vk_shader_decompiler: Fix build issues on old gcc versionsReinUsesLisp1-2/+3
2019-12-10vk_shader_decompiler: Reduce YNegate's severityReinUsesLisp1-1/+1
2019-12-10shader_ir/other: Implement S2R InvocationIdReinUsesLisp1-0/+1
2019-12-10vk_shader_decompiler: Misc changesReinUsesLisp1-677/+1594
Update Sirit and its usage in vk_shader_decompiler. Highlights:
- Implement tessellation shaders
- Implement geometry shaders
- Implement some missing features
- Use native half float instructions when available.
2019-11-23video_core: Unify ProgramType and ShaderStage into ShaderTypeReinUsesLisp1-14/+15
2019-11-14Shader_IR: Implement TXD instruction.Fernando Sahmkow1-0/+6
2019-11-14Shader_IR: Implement FLO instruction.Fernando Sahmkow1-0/+2
2019-11-08shader_ir/warp: Implement FSWZADDReinUsesLisp1-0/+6
2019-11-08gl_shader_decompiler: Reimplement shuffles with platform agnostic intrinsicsReinUsesLisp1-40/+3
2019-10-25Shader_IR: Implement Fast BRX and allow multi-branches in the CFG.Fernando Sahmkow1-0/+7
2019-10-18vk_shader_decompiler: Mark operator() function parameters as const referencesLioncash1-21/+23
These parameters aren't actually modified in any way, so they can be made const references.
2019-10-16vk_shader_decompiler: Resolve fallthrough within ExprDecompiler's ExprCondCode operator()Lioncash1-0/+3
This would previously result in NeverExecute and UnusedIndex being treated as regular predicates.
2019-10-05Shader_Ir: Address Feedback and clang format.Fernando Sahmkow1-25/+18
2019-10-05vk_shader_decompiler: Correct Branches inside conditionals.Fernando Sahmkow1-1/+11
2019-10-05vk_shader_decompiler: Clean code and be const correct.Fernando Sahmkow1-7/+5
2019-10-05vk_shader_compiler: Don't enclose branches with if(true) to avoid crashing AMDFernando Sahmkow1-16/+33
2019-10-05vk_shader_compiler: Correct SPIR-V AST DecompilingFernando Sahmkow1-4/+11
2019-10-05Shader_IR: allow else derivation to be optional.Fernando Sahmkow1-2/+4
2019-10-05vk_shader_compiler: Implement the decompiler in SPIR-VFernando Sahmkow1-22/+276
2019-09-21gl_shader_decompiler: Use uint for images and fix SUATOMReinUsesLisp1-12/+0
In the process, remove the implementation of SUATOM.MIN and SUATOM.MAX, as these require a distinction between U32 and S32. These have to be implemented with an imageAtomicCompSwap loop.
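The compare-and-swap loop mentioned here is a general technique for building an atomic operation the target does not provide. The sketch below shows a signed minimum, using std::atomic as a stand-in for image memory; it is an illustration of the pattern, not the shader code that would eventually be emitted, and the function name is hypothetical.

```cpp
#include <atomic>
#include <cstdint>

// Emulates an atomic signed minimum with compare-and-swap.
// Returns the value observed before the operation took effect.
inline std::int32_t AtomicMinS32(std::atomic<std::int32_t>& memory, std::int32_t value) {
    std::int32_t observed = memory.load();
    // On failure compare_exchange_weak reloads `observed`, so the loop retries
    // with fresh data until the store succeeds or no update is needed.
    while (value < observed && !memory.compare_exchange_weak(observed, value)) {
    }
    return observed;
}
```

An unsigned variant would differ only in the comparison, which is why the signedness has to be known before such a loop can be generated.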
2019-09-21shader/image: Implement SULD and remove irrelevant codeReinUsesLisp1-0/+7
* Implement SULD as float.
* Remove conditional declaration of GL_ARB_shader_viewport_layer_array.
2019-09-17shader_ir/warp: Implement SHFLReinUsesLisp1-0/+50
2019-09-13vk_device: Add miscellaneous features and minor style changesReinUsesLisp1-3/+3
* Increase minimum Vulkan requirements
* Require VK_EXT_vertex_attribute_divisor
* Require depthClamp, samplerAnisotropy and largePoints features
* Search and expose VK_KHR_uniform_buffer_standard_layout
* Search and expose VK_EXT_index_type_uint8
* Search and expose native float16 arithmetics
* Track current driver with VK_KHR_driver_properties
* Query and expose SSBO alignment
* Query more image formats
* Improve logging overall
* Minor style changes
* Minor rephrasing of comments
2019-09-11shader/image: Implement SUATOM and fix SUSTReinUsesLisp1-0/+42
2019-08-21shader_ir: Implement VOTEReinUsesLisp1-0/+25
Implement VOTE using Nvidia's intrinsics. Documentation about these can be found at https://developer.nvidia.com/reading-between-threads-shader-intrinsics
Instead of using portable ARB instructions I opted to use Nvidia intrinsics because these are the closest we have to how Tegra X1 hardware renders. To stub VOTE on non-Nvidia drivers (including nouveau) this commit simulates a GPU with a warp size of one, returning what is meaningful for the instruction being emulated:
* anyThreadNV(value) -> value
* allThreadsNV(value) -> value
* allThreadsEqualNV(value) -> true
ballotARB, also known as "uint64_t(activeThreadsNV())", emits VOTE.ANY Rd, PT, PT; on nouveau's compiler. This doesn't exactly match Nvidia's code, VOTE.ALL Rd, PT, PT;, which is emulated with activeThreadsNV() by this commit. In theory this shouldn't really matter since .ANY, .ALL and .EQ affect the predicates (set to PT in those cases) and not the registers.
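The warp-size-one stub follows directly from the mapping listed in this commit: with a single thread, "any" and "all" both reduce to the thread's own value, and every thread trivially agrees with itself. A sketch with hypothetical C++ names standing in for the intrinsics:

```cpp
// Stubs for a simulated GPU whose warp holds exactly one thread.

// anyThreadNV(value) -> value
inline bool AnyThread(bool value) {
    return value;
}

// allThreadsNV(value) -> value
inline bool AllThreads(bool value) {
    return value;
}

// allThreadsEqualNV(value) -> true, since one thread always equals itself.
inline bool AllThreadsEqual(bool /*value*/) {
    return true;
}
```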
2019-07-20Shader_Ir: Implement F16 Variants of F2F, F2I, I2F.Fernando Sahmkow1-0/+18
This commit takes care of implementing the F16 Variants of the conversion instructions and makes sure conversions are done.
2019-07-20shader/half_set_predicate: Fix HSETP2 implementationReinUsesLisp1-13/+4
2019-07-09shader_ir: Implement BRX & BRA.CCFernando Sahmkow1-0/+9
2019-07-08gl_shader_decompiler: Implement gl_ViewportIndex and gl_Layer in vertex shadersReinUsesLisp1-8/+6
This commit implements gl_ViewportIndex and gl_Layer in vertex and geometry shaders. In the case it's used in a vertex shader, it requires ARB_shader_viewport_layer_array. This extension is available on AMD and Nvidia devices (mesa and proprietary drivers), but not available on Intel on any platform. At the moment of writing this description I don't know if this is a hardware limitation or a driver limitation. In the case that ARB_shader_viewport_layer_array is not available, writes to these registers on a vertex shader are ignored, with the appropriate logging.
2019-06-21shader: Decode SUST and implement backing image functionalityReinUsesLisp1-0/+7
2019-06-07shader: Split SSY and PBK stackReinUsesLisp1-12/+37
Hardware testing revealed that SSY and PBK push to a different stack, allowing code like this: SSY label1; PBK label2; SYNC; label1: PBK; label2: EXIT;
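The effect of keeping two stacks can be modeled in a few lines. This is only a sketch: the class and method names are hypothetical, branch targets are plain integers, and it assumes SYNC resumes at the most recent SSY target while a break (BRK) resumes at the most recent PBK target.

```cpp
#include <cstdint>
#include <stack>

// Hypothetical model of the two independent flow stacks.
class FlowStacks {
public:
    void Ssy(std::uint32_t target) { ssy_stack.push(target); }
    void Pbk(std::uint32_t target) { pbk_stack.push(target); }

    // SYNC pops the SSY stack and returns the address to resume at.
    std::uint32_t Sync() { return Pop(ssy_stack); }

    // BRK pops the PBK stack and returns the address to resume at.
    std::uint32_t Brk() { return Pop(pbk_stack); }

private:
    static std::uint32_t Pop(std::stack<std::uint32_t>& stack) {
        const std::uint32_t target = stack.top();
        stack.pop();
        return target;
    }

    std::stack<std::uint32_t> ssy_stack;
    std::stack<std::uint32_t> pbk_stack;
};
```

With a single shared stack, the SYNC in the sequence above would pop the PBK target (label2) instead of label1, which is the mismatch that splitting the stacks avoids.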
2019-06-06shader: Use shared_ptr to store nodes and move initialization to fileReinUsesLisp1-25/+25
Instead of storing a vector of unique_ptr and returning raw pointers into it, use shared_ptr. While changing initialization code, move it to a separate file when possible. This is a first step to allow code analysis and node generation beyond the ShaderIR class.
2019-05-26vk_shader_decompiler: Misc fixesReinUsesLisp1-43/+61
Fix missing OpSelectionMerge instruction. This caused device losses on most hardware; Intel didn't care.
Fix [-1;1] -> [0;1] depth conversions.
Conditionally use VK_EXT_scalar_block_layout. This allows us to use non-std140 layouts on UBOs.
Update external Vulkan headers.
2019-05-20shader: Implement S2R Tid{XYZ} and CtaId{XYZ}ReinUsesLisp1-0/+18
2019-05-10renderer_vulkan/vk_shader_decompiler: Remove unused variable from DeclareInternalFlags()Lioncash1-1/+0
2019-05-03shader: Remove unused AbufNode Ipa modeReinUsesLisp1-4/+3
2019-04-16vk_shader_decompiler: Add missing operationsReinUsesLisp1-0/+7
2019-04-16shader_ir/decode: Fix half float pre-operations and remove MetaHalfArithmeticReinUsesLisp1-5/+7
Operations done before the main half float operation (like HAdd) were managing a packed value instead of the unpacked one. Adding an unpacked operation allows us to drop the per-operand MetaHalfArithmetic entry, simplifying the code overall.
2019-04-16shader_ir/decode: Implement half float saturationReinUsesLisp1-0/+6
2019-04-14shader_ir: Implement STG, keep track of global memory usage and flushReinUsesLisp1-6/+8
2019-04-10vk_shader_decompiler: Implement flow primitivesReinUsesLisp1-5/+82
2019-04-10vk_shader_decompiler: Implement most common texture primitivesReinUsesLisp1-8/+65
2019-04-10vk_shader_decompiler: Implement texture decompilation helper functionsReinUsesLisp1-0/+32
2019-04-10vk_shader_decompiler: Implement Assign and LogicalAssignReinUsesLisp1-2/+64
2019-04-10vk_shader_decompiler: Implement non-OperationCode visitsReinUsesLisp1-7/+129
2019-04-10vk_shader_decompiler: Implement OperationCode decompilation interfaceReinUsesLisp1-1/+411
2019-04-10vk_shader_decompiler: Implement VisitReinUsesLisp1-1/+50
2019-04-10vk_shader_decompiler: Implement labels tree and flowReinUsesLisp1-0/+71
2019-04-10vk_shader_decompiler: Implement declarationsReinUsesLisp1-3/+457
2019-04-10vk_shader_decompiler: Declare and stub interface for a SPIR-V decompilerReinUsesLisp1-0/+45