path: root/src/video_core/renderer_opengl/gl_shader_decompiler.cpp
Date | Commit message | Author | Files | Lines
2019-10-05 | Shader_ir: Address feedback | Fernando Sahmkow | 1 | -4/+8
2019-10-05 | vk_shader_decompiler: Clean code and be const correct. | Fernando Sahmkow | 1 | -1/+1
2019-10-05 | gl_shader_decompiler: Refactor and address feedback. | Fernando Sahmkow | 1 | -17/+18
2019-10-05 | Shader_Ir: Refactor Decompilation process and allow multiple decompilation modes. | Fernando Sahmkow | 1 | -3/+5
2019-10-05 | gl_shader_decompiler: Implement AST decompiling | Fernando Sahmkow | 1 | -29/+242
2019-09-24 | gl_shader_decompiler: Add tailing return for HUnpack2 | ReinUsesLisp | 1 | -0/+2
2019-09-24 | gl_shader_decompiler: Fix clang build issues | ReinUsesLisp | 1 | -26/+23
2019-09-21 | gl_shader_decompiler: Use uint for images and fix SUATOM | ReinUsesLisp | 1 | -97/+35
In the process, remove the implementations of SUATOM.MIN and SUATOM.MAX, as these require a distinction between U32 and S32. They have to be implemented with an imageAtomicCompSwap loop.
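The compare-and-swap loop mentioned above can be sketched in C++, with `std::atomic<std::uint32_t>` standing in for an image texel declared as uint. This is only an illustration of the technique, not the code from the commit; `AsSigned`, `AsUnsigned` and `AtomicMinS32` are hypothetical names.

```cpp
#include <atomic>
#include <cstdint>
#include <cstring>

// Hypothetical helpers: reinterpret the 32-bit pattern without changing it.
inline std::int32_t AsSigned(std::uint32_t bits) {
    std::int32_t value;
    std::memcpy(&value, &bits, sizeof(value));
    return value;
}

inline std::uint32_t AsUnsigned(std::int32_t value) {
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof(bits));
    return bits;
}

// Signed minimum on storage declared as unsigned, built from compare-and-swap only.
// Returns the previous contents, like an atomic fetch operation.
inline std::uint32_t AtomicMinS32(std::atomic<std::uint32_t>& texel, std::int32_t operand) {
    std::uint32_t expected = texel.load();
    while (operand < AsSigned(expected)) {
        // On failure, compare_exchange_weak refreshes `expected` with the
        // current contents, and the loop condition is evaluated again.
        if (texel.compare_exchange_weak(expected, AsUnsigned(operand))) {
            break;
        }
    }
    return expected;
}
```

With a texel holding 5, `AtomicMinS32(texel, -1)` stores the bit pattern 0xFFFFFFFF. A plain unsigned minimum would have kept 5, which is why the signed and unsigned variants cannot share one implementation.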
2019-09-21 | shader/image: Implement SULD and remove irrelevant code | ReinUsesLisp | 1 | -2/+15
* Implement SULD as float.
* Remove conditional declaration of GL_ARB_shader_viewport_layer_array.
2019-09-19 | VideoCore: Corrections to the MME Inliner and removal of hacky instance management. | Fernando Sahmkow | 1 | -1/+9
2019-09-19 | Video Core: initial Implementation of InstanceDraw Packaging | Fernando Sahmkow | 1 | -1/+1
2019-09-17 | shader_ir/warp: Implement SHFL | ReinUsesLisp | 1 | -8/+55
2019-09-11 | shader/image: Implement SUATOM and fix SUST | ReinUsesLisp | 1 | -26/+119
2019-09-06 | gl_shader_decompiler: Avoid writing output attribute when unimplemented | ReinUsesLisp | 1 | -10/+14
2019-09-06 | gl_shader_decompiler: Keep track of written images and mark them as modified | ReinUsesLisp | 1 | -6/+12
2019-09-05 | gl_shader_decompiler: Implement shared memory | ReinUsesLisp | 1 | -0/+23
2019-09-04 | gl_shader_decompiler: Fixup slow path | ReinUsesLisp | 1 | -1/+1
2019-09-04 | gl_device: Disable precise in fragment shaders on bugged drivers | ReinUsesLisp | 1 | -1/+8
2019-09-04 | gl_shader_decompiler: Fixup AMD's slow path type | ReinUsesLisp | 1 | -1/+1
2019-09-04 | gl_shader_decompiler: Rework GLSL decompiler type system | ReinUsesLisp | 1 | -416/+505
The GLSL decompiler type system was broken. We converted all return values to float, except in some cases where we couldn't and implicitly broke the rule of returning floats (e.g. for bools or bool pairs). Instead of doing this, introduce a class Expression that knows what type a return value has. When a consumer wants to use the string, it asks for it with a required type, emitting a runtime error if the types are incompatible. This has the disadvantage that there's more C++ code, but we can emit better GLSL code that's easier to read.
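A minimal sketch of the idea, assuming a much smaller interface than the real class: a typed expression hands out its GLSL text only through a typed accessor. The `Type` enum, the `As` method and the use of `std::runtime_error` are assumptions made for illustration; the GLSL bit-cast functions named in the strings (`floatBitsToInt`, `uintBitsToFloat` and so on) are standard GLSL built-ins.

```cpp
#include <stdexcept>
#include <string>
#include <utility>

enum class Type { Bool, Float, Int, Uint };

// Hypothetical, simplified stand-in for the decompiler's Expression class.
class Expression {
public:
    Expression(std::string code, Type type) : code_{std::move(code)}, type_{type} {}

    // Returns GLSL text that evaluates to `wanted`, reinterpreting bits
    // between 32-bit numeric types where that is meaningful.
    std::string As(Type wanted) const {
        if (wanted == type_) {
            return code_;
        }
        if (type_ == Type::Float && wanted == Type::Int) return "floatBitsToInt(" + code_ + ")";
        if (type_ == Type::Float && wanted == Type::Uint) return "floatBitsToUint(" + code_ + ")";
        if (type_ == Type::Int && wanted == Type::Float) return "intBitsToFloat(" + code_ + ")";
        if (type_ == Type::Uint && wanted == Type::Float) return "uintBitsToFloat(" + code_ + ")";
        if (type_ == Type::Int && wanted == Type::Uint) return "uint(" + code_ + ")";
        if (type_ == Type::Uint && wanted == Type::Int) return "int(" + code_ + ")";
        // Bools have no bit-level counterpart here, so mixing them is an error.
        throw std::runtime_error("incompatible expression types");
    }

private:
    std::string code_;
    Type type_;
};
```

A consumer that needs an unsigned operand would call `Expression{"gpr0", Type::Float}.As(Type::Uint)` and get `floatBitsToUint(gpr0)`, while asking the same expression for a `Bool` throws instead of silently emitting ill-typed GLSL.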
2019-08-21 | shader_ir: Implement VOTE | ReinUsesLisp | 1 | -0/+47
Implement VOTE using Nvidia's intrinsics. Documentation about these can be found here: https://developer.nvidia.com/reading-between-threads-shader-intrinsics

Instead of using portable ARB instructions I opted to use Nvidia intrinsics because these are the closest we have to how Tegra X1 hardware renders.

To stub VOTE on non-Nvidia drivers (including nouveau) this commit simulates a GPU with a warp size of one, returning what is meaningful for the instruction being emulated:
* anyThreadNV(value) -> value
* allThreadsNV(value) -> value
* allThreadsEqualNV(value) -> true

ballotARB, also known as "uint64_t(activeThreadsNV())", emits VOTE.ANY Rd, PT, PT; on nouveau's compiler. This doesn't exactly match Nvidia's code, VOTE.ALL Rd, PT, PT;, which is emulated with activeThreadsNV() by this commit. In theory this shouldn't really matter since .ANY, .ALL and .EQ affect the predicates (set to PT in those cases) and not the registers.
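The warp-size-one stub follows directly from the general meaning of the three vote operations. The C++ sketch below models a warp as one boolean per active thread; `VoteAny`, `VoteAll` and `VoteAllEqual` are hypothetical names used only for this illustration, not functions from the codebase.

```cpp
#include <algorithm>
#include <vector>

// True if at least one thread in the warp passed `true`.
inline bool VoteAny(const std::vector<bool>& warp) {
    return std::any_of(warp.begin(), warp.end(), [](bool v) { return v; });
}

// True if every thread in the warp passed `true`.
inline bool VoteAll(const std::vector<bool>& warp) {
    return std::all_of(warp.begin(), warp.end(), [](bool v) { return v; });
}

// True if every thread passed the same value, whatever that value is.
inline bool VoteAllEqual(const std::vector<bool>& warp) {
    return VoteAll(warp) || !VoteAny(warp);
}
```

For a warp holding a single thread, "any" and "all" both collapse to that thread's value and "all equal" is always true, which matches the three stubbed results listed in the commit message.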
2019-07-20 | Shader_Ir: Implement F16 Variants of F2F, F2I, I2F. | Fernando Sahmkow | 1 | -0/+18
This commit takes care of implementing the F16 Variants of the conversion instructions and makes sure conversions are done.
2019-07-20 | shader/half_set_predicate: Fix HSETP2 implementation | ReinUsesLisp | 1 | -12/+4
2019-07-18 | gl_shader_decompiler: Rename bufferImage to imageBuffer | ReinUsesLisp | 1 | -1/+1
The online OpenGL documentation is wrong. The type definition is imageBuffer.
2019-07-15 | gl_shader_decompiler: Stub local memory size | ReinUsesLisp | 1 | -8/+14
2019-07-15 | gl_rasterizer: Implement compute shaders | ReinUsesLisp | 1 | -26/+34
2019-07-11 | gl_shader_decompiler: Fix gl_PointSize redeclaration | ReinUsesLisp | 1 | -1/+1
2019-07-11 | gl_shader_decompiler: Fix conditional usage of GL_ARB_shader_viewport_layer_array | ReinUsesLisp | 1 | -2/+3
2019-07-09 | shader_ir: Unify blocks in decompiled shaders. | Fernando Sahmkow | 1 | -4/+6
2019-07-09 | shader_ir: Implement BRX & BRA.CC | Fernando Sahmkow | 1 | -0/+9
2019-07-08 | gl_shader_decompiler: Implement gl_ViewportIndex and gl_Layer in vertex shaders | ReinUsesLisp | 1 | -29/+75
This commit implements gl_ViewportIndex and gl_Layer in vertex and geometry shaders. In the case it's used in a vertex shader, it requires ARB_shader_viewport_layer_array. This extension is available on AMD and Nvidia devices (mesa and proprietary drivers), but not available on Intel on any platform. At the moment of writing this description I don't know if this is a hardware limitation or a driver limitation. In the case that ARB_shader_viewport_layer_array is not available, writes to these registers on a vertex shader are ignored, with the appropriate logging.
2019-07-06 | gl_rasterizer: Minor style changes | ReinUsesLisp | 1 | -1/+1
2019-06-24 | gl_shader_decompiler: Address feedback | ReinUsesLisp | 1 | -11/+12
2019-06-21 | gl_shader_decompiler: Implement image binding settings | ReinUsesLisp | 1 | -0/+3
2019-06-21 | shader: Decode SUST and implement backing image functionality | ReinUsesLisp | 1 | -0/+70
2019-06-21 | gl_shader_decompiler: Allow 1D textures to be texture buffers | ReinUsesLisp | 1 | -4/+38
2019-06-07 | shader: Split SSY and PBK stack | ReinUsesLisp | 1 | -4/+27
Hardware testing revealed that SSY and PBK push to a different stack, allowing code like this:
    SSY label1;
    PBK label2;
    SYNC;
    label1: PBK;
    label2: EXIT;
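The effect of the split can be sketched with two independent stacks. This is a toy model, assuming that SYNC consumes the most recent SSY target and BRK the most recent PBK target; `FlowStacks` and its members are hypothetical names, and branch targets are reduced to plain integers.

```cpp
#include <stack>
#include <stdexcept>

// Toy model of the two flow stacks; targets are label numbers.
struct FlowStacks {
    void Ssy(int target) { ssy.push(target); }
    void Pbk(int target) { pbk.push(target); }

    // SYNC jumps to the most recent SSY target.
    int Sync() { return Pop(ssy); }
    // BRK jumps to the most recent PBK target.
    int Brk() { return Pop(pbk); }

private:
    static int Pop(std::stack<int>& stack) {
        if (stack.empty()) {
            throw std::logic_error("pop from an empty flow stack");
        }
        const int target = stack.top();
        stack.pop();
        return target;
    }

    std::stack<int> ssy;
    std::stack<int> pbk;
};
```

After `Ssy(1)` followed by `Pbk(2)`, `Sync()` returns 1 even though 2 was pushed last. With a single shared stack, the same SYNC would have popped label 2 instead.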
2019-06-06 | shader: Use shared_ptr to store nodes and move initialization to file | ReinUsesLisp | 1 | -31/+31
Instead of keeping a vector of unique_ptr and returning raw pointers to its elements, use shared_ptr. While changing initialization code, move it to a separate file when possible. This is a first step to allow code analysis and node generation beyond the ShaderIR class.
2019-06-03 | gl_shader_decompiler: Remove guest "position" varying | ReinUsesLisp | 1 | -19/+19
"position" was being written but not read anywhere besides geometry shaders, where it had the same value as gl_Position. This commit replaces "position" with gl_Position, reducing the complexity of our code and the emitted GLSL code.
2019-05-30 | gl_rasterizer: Move alpha testing to the OpenGL pipeline | ReinUsesLisp | 1 | -19/+1
Removes the alpha testing code from each fragment shader invocation.
2019-05-24 | gl_shader_decompiler: Use an if based cbuf indexing for broken drivers | ReinUsesLisp | 1 | -3/+20
The following code is broken on AMD's proprietary GLSL compiler:
```glsl
uint idx = ...;
vec4 values = ...;
float some_value = values[idx & 3];
```
It indexes the wrong components. To fix this, the following pessimized code is emitted when that bug is present:
```glsl
uint idx = ...;
vec4 values = ...;
float some_value;
if ((idx & 3) == 0) some_value = values.x;
if ((idx & 3) == 1) some_value = values.y;
if ((idx & 3) == 2) some_value = values.z;
if ((idx & 3) == 3) some_value = values.w;
```
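On the C++ side, choosing between the two forms could look roughly like the sketch below. `EmitComponentRead` and its `has_broken_indexing` flag are hypothetical names for illustration; how the real decompiler detects the driver bug and builds its strings is not shown in the commit message.

```cpp
#include <string>

// Emits GLSL that reads one component of `vec`, selected by `index & 3`,
// into `dest`. The slow path avoids dynamic vector indexing entirely.
inline std::string EmitComponentRead(const std::string& dest, const std::string& vec,
                                     const std::string& index, bool has_broken_indexing) {
    if (!has_broken_indexing) {
        return dest + " = " + vec + "[" + index + " & 3];\n";
    }
    const char swizzle[] = "xyzw";
    std::string code;
    for (int i = 0; i < 4; ++i) {
        code += "if ((" + index + " & 3) == " + std::to_string(i) + ") " + dest + " = " +
                vec + "." + swizzle[i] + ";\n";
    }
    return code;
}
```

Calling `EmitComponentRead("some_value", "values", "idx", true)` produces the four `if` lines from the pessimized listing, while passing `false` produces the single indexed read.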
2019-05-21 | renderer_opengl/gl_shader_decompiler: Remove redundant name specification in format string | Lioncash | 1 | -1/+1
This accidentally slipped through a rebase.
2019-05-20 | shader: Implement S2R Tid{XYZ} and CtaId{XYZ} | ReinUsesLisp | 1 | -0/+16
2019-05-20 | gl_shader_decompiler: Make GetSwizzle constexpr | ReinUsesLisp | 1 | -7/+7
2019-05-20 | gl_shader_decompiler: Tidy up minor remaining cases of unnecessary std::string concatenation | Lioncash | 1 | -21/+20
2019-05-20 | gl_shader_decompiler: Replace individual overloads with the fmt-based one | Lioncash | 1 | -28/+16
Gets rid of the need to special-case brace handling depending on the overload used, and makes it consistent across the board with how fmt handles them. Strings whose contents are deducible at compile time are forwarded directly to std::string's constructor, so we don't need to worry about a performance difference here, as it'll be identical.
2019-05-20 | gl_shader_decompiler: Utilize fmt overload of AddLine() where applicable | Lioncash | 1 | -136/+152
2019-05-19 | gl_shader_decompiler: Add AddLine() overload that forwards to fmt | Lioncash | 1 | -0/+11
In a lot of places throughout the decompiler, string concatenation via operator+ is used quite heavily. This is usually fine, when not heavily used, but when used extensively, can be a problem. operator+ creates an entirely new heap allocated temporary string and given we perform expressions like:
    std::string thing = a + b + c + d;
this ends up with a lot of unnecessary temporary strings being created and discarded, which kind of thrashes the heap more than we need to.

Given we utilize fmt in some AddLine calls, we can make this a part of the ShaderWriter's API. We can make an overload that simply acts as a passthrough to fmt. This way, whenever things need to be appended to a string, the operation can be done via a single string formatting operation instead of discarding numerous temporary strings. This also has the benefit of making the strings themselves look nicer and makes it easier to spot errors in them.
2019-05-10 | video_core/renderer_opengl/gl_shader_decompiler: Remove unused Composite() function | Lioncash | 1 | -11/+0
This isn't used at all, so it can be removed.
2019-05-03 | gl_shader_decompiler: Skip physical unused attributes | ReinUsesLisp | 1 | -18/+27
2019-05-03 | shader: Add physical attributes commentaries | ReinUsesLisp | 1 | -0/+2
2019-05-03 | gl_shader_decompiler: Implement GLSL physical attributes | ReinUsesLisp | 1 | -65/+100
2019-05-03 | gl_shader_decompiler: Abstract generic attribute operations | ReinUsesLisp | 1 | -29/+26
2019-05-03 | gl_shader_decompiler: Declare all possible varyings on physical attribute usage | ReinUsesLisp | 1 | -26/+65
2019-05-03 | shader: Remove unused AbufNode Ipa mode | ReinUsesLisp | 1 | -2/+1
2019-04-16 | shader_ir/decode: Fix half float pre-operations and remove MetaHalfArithmetic | ReinUsesLisp | 1 | -28/+23
Operations done before the main half float operation (like HAdd) were managing a packed value instead of the unpacked one. Adding an unpacked operation allows us to drop the per-operand MetaHalfArithmetic entry, simplifying the code overall.
2019-04-16 | gl_shader_decompiler: Fix MrgH0 decompilation | ReinUsesLisp | 1 | -2/+2
GLSL decompilation for HMergeH0 was wrong. This addresses that issue.
2019-04-16 | shader_ir/decode: Implement half float saturation | ReinUsesLisp | 1 | -4/+11
2019-04-16 | renderer_opengl: Implement half float NaN comparisons | ReinUsesLisp | 1 | -18/+42
2019-04-14 | gl_shader_decompiler: Use variable AOFFI on supported hardware | ReinUsesLisp | 1 | -5/+13
2019-04-14 | shader_ir: Implement STG, keep track of global memory usage and flush | ReinUsesLisp | 1 | -11/+25
2019-04-10 | Remove bounding in LD_C | Fernando Sahmkow | 1 | -2/+1
2019-04-05 | gl_shader_decompiler: Rename GenerateTemporal() to GenerateTemporary() | Lioncash | 1 | -12/+12
Temporal generally indicates a relation to time, but this is just creating a temporary, so this isn't really an accurate name for what the function is actually doing.
2019-04-05 | gl_shader_decompiler: Fix TXQ types | ReinUsesLisp | 1 | -2/+3
TXQ returns integer types. Shaders usually do:
    R0 = TXQ(); // => int
    R0 = static_cast<float>(R0);
If we don't treat the result as an integer, it will treat the raw binary value as a float and cast that to float, resulting in a corrupted number.
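The difference between converting an integer and reinterpreting its bits can be shown in a few lines of C++. This is a generic illustration of why the type matters, not code from the decompiler; `BitsAsFloat` and `ConvertToFloat` are hypothetical names, and a 32-bit IEEE 754 `float` is assumed.

```cpp
#include <cstdint>
#include <cstring>

// Reinterprets the 32-bit pattern of `bits` as a float (no numeric conversion).
inline float BitsAsFloat(std::int32_t bits) {
    float value;
    std::memcpy(&value, &bits, sizeof(value));
    return value;
}

// Numeric conversion, which is what the shader's int-to-float step expects.
inline float ConvertToFloat(std::int32_t value) {
    return static_cast<float>(value);
}
```

For a texture dimension of 256, `ConvertToFloat(256)` gives 256.0f, while `BitsAsFloat(256)` reads the same bits as a tiny denormal value. Any later arithmetic on the latter is meaningless, which is the kind of corruption the commit avoids by tracking the result as an integer.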
2019-04-03 | gl_shader_decompiler: Return early when an operation is invalid | ReinUsesLisp | 1 | -1/+6
2019-03-31 | gl_shader_decompiler: Hide local definitions inside an anonymous namespace | ReinUsesLisp | 1 | -6/+8
2019-03-30 | gl_shader_decompiler: Add AOFFI backing implementation | ReinUsesLisp | 1 | -38/+85
2019-02-26 | shader/decode: Remove extras from MetaTexture | ReinUsesLisp | 1 | -21/+35
2019-02-14 | shader_decompiler: Improve Accuracy of Attribute Interpolation. | Fernando Sahmkow | 1 | -27/+17
2019-02-12 | gl_shader_decompiler: Re-implement TLDS lod | ReinUsesLisp | 1 | -21/+34
2019-02-07 | shader_ir: Remove F4 prefix to texture operations | ReinUsesLisp | 1 | -12/+12
This was originally included because texture operations returned a vec4. These operations now return a single float and the F4 prefix doesn't mean anything.
2019-02-07 | shader_ir: Clean texture management code | ReinUsesLisp | 1 | -32/+41
Previous code relied on GLSL parameter order (something that's always ill-formed on an IR design). This approach passes spatial coordinates through operation nodes and array and depth compare values in the texture metadata. It still contains an "extra" vector containing generic nodes for bias and component index (for example), which is still a bit ill-formed, but it should be better than the previous approach.
2019-02-07 | gl_shader_disk_cache: Save GLSL and entries into the precompiled file | ReinUsesLisp | 1 | -3/+4
2019-02-07 | gl_shader_decompiler: Remove name entries | ReinUsesLisp | 1 | -5/+3
2019-02-03 | shader_ir: Rename BasicBlock to NodeBlock | ReinUsesLisp | 1 | -3/+3
It's not always used as a basic block. Rename it for consistency.
2019-01-30 | shader_ir: Unify constant buffer offset values | ReinUsesLisp | 1 | -2/+3
Constant buffer values on the shader IR were using different offsets depending on whether the access was direct or indirect. cbuf34 has a non-multiplied offset while cbuf36 has a multiplied one. On shader decoding, this commit multiplies the offset by four on cbuf34 queries.
2019-01-30 | gl_shader_cache: Use explicit bindings | ReinUsesLisp | 1 | -3/+8
2019-01-30 | shader_decode: Implement LDG and basic cbuf tracking | ReinUsesLisp | 1 | -6/+38
2019-01-15 | gl_shader_decompiler: replace std::get<> with std::get_if<> for macOS compatibility | ReinUsesLisp | 1 | -44/+58
2019-01-15 | gl_shader_decompiler: Inline textureGather component | ReinUsesLisp | 1 | -15/+16
2019-01-15 | shader_ir: Remove composite primitives and use temporals instead | ReinUsesLisp | 1 | -66/+37
2019-01-15 | gl_shader_decompiler: Fixup AssignCompositeHalf | ReinUsesLisp | 1 | -1/+1
2019-01-15 | shader_decode: Use proper primitive names | ReinUsesLisp | 1 | -10/+8
2019-01-15 | shader_decode: Use BitfieldExtract instead of shift + and | ReinUsesLisp | 1 | -0/+7
2019-01-15 | shader_ir: Remove Ipa primitive | ReinUsesLisp | 1 | -8/+0
2019-01-15 | gl_shader_decompiler: Use rasterizer's UBO size limit | ReinUsesLisp | 1 | -1/+3
2019-01-15 | gl_shader_gen: Fixup code formatting | ReinUsesLisp | 1 | -1/+1
2019-01-15 | video_core: Rename glsl_decompiler to gl_shader_decompiler | ReinUsesLisp | 1 | -1/+1
2019-01-15 | shader_ir: Remove RZ and use Register::ZeroIndex instead | ReinUsesLisp | 1 | -4/+5
2019-01-15 | shader_decode: Implement TEXS.F16 | ReinUsesLisp | 1 | -0/+26
2019-01-15 | glsl_decompiler: Fixup TLDS | ReinUsesLisp | 1 | -1/+0
2019-01-15 | glsl_decompiler: Fixup geometry shaders | ReinUsesLisp | 1 | -10/+16
2019-01-15 | glsl_decompiler: Fixup permissive member function declarations | ReinUsesLisp | 1 | -133/+133
2019-01-15 | video_core: Implement IR based geometry shaders | ReinUsesLisp | 1 | -2/+68
2019-01-15 | shader_decode: Implement HSET2 | ReinUsesLisp | 1 | -0/+6
2019-01-15 | shader_decode: Rework HSETP2 | ReinUsesLisp | 1 | -26/+33
2019-01-15 | shader_decode: Implement HFMA2 | ReinUsesLisp | 1 | -4/+5
2019-01-15 | glsl_decompiler: Remove HNegate inlining | ReinUsesLisp | 1 | -10/+0
2019-01-15 | shader_decode: Implement POPC | ReinUsesLisp | 1 | -0/+7
2019-01-15 | shader_decode: Implement TLDS (untested) | ReinUsesLisp | 1 | -2/+27
2019-01-15 | shader_ir: Fixup TEX and TEXS and partially fix TLD4 decompiling | ReinUsesLisp | 1 | -9/+20
2019-01-15 | video_core: Return safe values after an assert hits | ReinUsesLisp | 1 | -0/+5
2019-01-15 | video_core: Address feedback | ReinUsesLisp | 1 | -1/+1
2019-01-15 | glsl_decompiler: Implementation | ReinUsesLisp | 1 | -0/+1393