path: root/src/video_core/engines/shader_bytecode.h
Commit message | Author | Age | Files | Lines
* shader: Primitive Vulkan integrationReinUsesLisp2021-07-231-2298/+0
|
* video_core: Remove unnecessary enum class casting in logging messagesLioncash2020-12-071-4/+2
| | | | | | | fmt now automatically prints the numeric value of an enum class member by default, so we don't need to use casts any more. Reduces the line noise a bit.
* shader_bytecode: Make use of [[nodiscard]] where applicableLioncash2020-11-201-73/+79
| | | | Ensures that all queried values are made use of.
* shader_bytecode: Eliminate variable shadowingLioncash2020-11-201-15/+17
|
* shader/arithmetic: Implement FCMP immediate + register variantReinUsesLisp2020-10-281-0/+2
| | | | Trivially add the encoding for this.
* shader/half_set: Implement HSET2_IMMReinUsesLisp2020-06-231-0/+8
| | | | | | Add HSET2_IMM. Due to the complexity of the encoding avoid using BitField unions and read the relevant bits from the code itself. This is less error prone.
* shader_ir: Separate float-point comparisons in ordered and unorderedReinUsesLisp2020-05-091-12/+16
| | | | | This allows us to use native SPIR-V instructions without having to manually check for NaN.
* shader/arithmetic_integer: Implement IADD.XReinUsesLisp2020-04-261-0/+4
| | | | | IADD.X takes the carry flag and adds it to the result. This is generally used to emulate 64-bit operations with 32-bit registers.
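As a rough illustration of why a carry-consuming add is useful, the sketch below builds a 64-bit addition out of two 32-bit additions. All names are hypothetical; the second helper only plays the role the commit describes for IADD.X and is not the emulator's implementation.

```cpp
#include <cstdint>

struct AddResult {
    std::uint32_t value;
    bool carry;
};

// Plain 32-bit add that also reports the carry out of bit 31.
constexpr AddResult AddWithCarryOut(std::uint32_t a, std::uint32_t b) {
    const std::uint32_t sum = a + b; // unsigned arithmetic wraps modulo 2^32
    return {sum, sum < a};           // a wrapped sum is smaller than either input
}

// 32-bit add that consumes a carry flag (the IADD.X-like step).
constexpr std::uint32_t AddWithCarryIn(std::uint32_t a, std::uint32_t b, bool carry) {
    return a + b + (carry ? 1u : 0u);
}

// 64-bit add expressed with the two 32-bit helpers above.
constexpr std::uint64_t Add64(std::uint32_t lo_a, std::uint32_t hi_a,
                              std::uint32_t lo_b, std::uint32_t hi_b) {
    const AddResult low = AddWithCarryOut(lo_a, lo_b);
    const std::uint32_t high = AddWithCarryIn(hi_a, hi_b, low.carry);
    return (std::uint64_t{high} << 32) | low.value;
}
```

For example, adding 1 to the pair (low = 0xFFFFFFFF, high = 0) wraps the low word to 0 and propagates the carry into the high word.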
* Merge pull request #3734 from ReinUsesLisp/half-float-modsbunnei2020-04-251-2/+0
|\ | | | | decode/arithmetic_half: Fix HADD2 and HMUL2 absolute and negation bits
| * decode/arithmetic_half: Fix HADD2 and HMUL2 absolute and negation bitsReinUsesLisp2020-04-231-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | The encoding for negation and absolute value was wrong. Extracting is now done manually. Similar instructions having different encodings is the rule, not the exception. To keep sanity and readability I preferred to extract the desired bit manually. This is implemented against nxas: https://github.com/ReinUsesLisp/nxas/blob/8dbc38995711cc12206aa370145a3a02665fd989/table.h#L68 That is itself tested against nvdisasm (Nvidia's official disassembler).
* | Fix -Wdeprecated-copy warning.Markus Wick2020-04-241-0/+1
|/
* CMakeLists: Specify -Wextra on linux buildsLioncash2020-04-161-1/+1
| | | | | | | | | | | Allows reporting more cases where logic errors may exist, such as implicit fallthrough cases, etc. We currently ignore unused parameters, since there are many cases where this is intentional (virtual interfaces). While we're at it, we can also tidy up any existing code that causes warnings. This also uncovered a few bugs.
* Merge pull request #3612 from ReinUsesLisp/redFernando Sahmkow2020-04-151-0/+8
|\ | | | | shader/memory: Implement RED.E.ADD and minor changes to ATOM
| * shader/memory: Implement RED.E.ADDReinUsesLisp2020-04-061-0/+8
| | | | | | | | | | | | | | | | Implements a reduction operation. It's an atomic operation that doesn't return a value. This commit introduces another primitive because some shading languages might have a primitive for reduction operations.
* | shader/arithmetic: Add FCMP_CR variantReinUsesLisp2020-04-151-2/+4
| | | | | | | | Adds another variant of FCMP.
* | Merge pull request #3578 from ReinUsesLisp/vmnmxFernando Sahmkow2020-04-121-1/+56
|\ \ | | | | | | shader/video: Partially implement VMNMX
| * | shader/video: Partially implement VMNMXReinUsesLisp2020-04-121-0/+55
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Implements the common usages for VMNMX. Inputs with a size other than 32 bits are not supported, and sign mismatches aren't supported either. VMNMX works as follows: it grabs Ra and Rb and applies a maximum/minimum on them (this is defined by .MX), taking the input sign into account. This result can then be saturated. After the intermediate result is calculated, it applies another operation on it using Rc. These operations are merges, accumulations or another min/max pass. This instruction makes it possible to implement, for instance, GCN's min3 and max3 instructions in a more flexible way.
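The two-stage behaviour described above can be sketched as follows. This is a simplified model under stated assumptions: it covers only signed 32-bit inputs and a min/max second pass, it omits saturation, merges and accumulations, and every name in it is hypothetical.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical selector for the second pass; merges and accumulations are omitted.
enum class SecondPass { Min, Max };

// First stage: min or max of ra and rb (max when mx is set).
// Second stage: another min/max pass between that result and rc.
constexpr std::int32_t VideoMinMax(std::int32_t ra, std::int32_t rb, std::int32_t rc,
                                   bool mx, SecondPass second) {
    const std::int32_t first = mx ? std::max(ra, rb) : std::min(ra, rb);
    return second == SecondPass::Max ? std::max(first, rc) : std::min(first, rc);
}

// min3/max3 fall out as special cases of the two-stage operation.
constexpr std::int32_t Min3(std::int32_t a, std::int32_t b, std::int32_t c) {
    return VideoMinMax(a, b, c, false, SecondPass::Min);
}

constexpr std::int32_t Max3(std::int32_t a, std::int32_t b, std::int32_t c) {
    return VideoMinMax(a, b, c, true, SecondPass::Max);
}
```

Mixing the stages, for instance a max followed by a min, gives a clamp-like operation that a fixed min3/max3 cannot express.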
| * | shader_bytecode: Fix I2I_IMM encodingReinUsesLisp2020-03-281-1/+1
| | |
* | | shader_bytecode: Rename MOV_SYS to S2RReinUsesLisp2020-04-041-2/+2
| | |
* | | shader_bytecode: Add encoding for BARReinUsesLisp2020-04-041-0/+2
| | |
* | | shader_bytecode: Add encoding for VOTE.VTGReinUsesLisp2020-04-041-0/+2
| |/ |/|
* | shader_decode: merge GlobalAtomicOp to AtomicOpnamkazy2020-03-301-13/+1
|/
* Merge pull request #3520 from ReinUsesLisp/legacy-varyingsbunnei2020-03-261-0/+6
|\ | | | | gl_shader_decompiler: Implement legacy varyings
| * shader/shader_ir: Track usage in input attribute and of legacy varyingsReinUsesLisp2020-03-161-0/+6
| |
* | shader_bytecode: update BFE instructions struct.Nguyen Dac Nam2020-03-131-8/+3
|/
* Merge pull request #3379 from ReinUsesLisp/cbuf-offsetbunnei2020-02-141-2/+2
|\ | | | | shader/decode: Fix constant buffer offsets
| * shader/decode: Fix constant buffer offsetsReinUsesLisp2020-02-051-2/+2
| | | | | | | | | | | | Some instances were using cbuf34.offset instead of cbuf34.GetOffset(). This returned an invalid offset. Address those instances and rename offset to "shifted_offset" to avoid future bugs.
* | Merge pull request #3369 from ReinUsesLisp/shfbunnei2020-02-081-0/+20
|\ \ | |/ |/| shader/shift: Implement SHF
| * shader/shift: Implement SHF_LEFT_{IMM,R}ReinUsesLisp2020-02-021-0/+20
| | | | | | | | Shifts a pair of registers to the left and returns the high register.
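A minimal sketch of the operation described above, treating the two registers as one 64-bit value. The function name is hypothetical, and the sketch assumes a shift amount below 64; any wrapping or clamping of larger shift amounts is not modelled here.

```cpp
#include <cstdint>

// Concatenate high:low into a 64-bit value, shift it left and keep the
// upper 32 bits, i.e. the "high register" of the shifted pair.
// Assumption: shift is in the range [0, 63].
constexpr std::uint32_t FunnelShiftLeftHigh(std::uint32_t low, std::uint32_t high,
                                            std::uint32_t shift) {
    const std::uint64_t pair = (std::uint64_t{high} << 32) | low;
    return static_cast<std::uint32_t>((pair << shift) >> 32);
}
```

With a shift of zero the result is simply the original high register; with a shift of one, the top bit of the low register moves into bit 0 of the result.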
* | Merge pull request #3357 from ReinUsesLisp/bfi-rcbunnei2020-02-041-0/+2
|\ \ | | | | | | shader/bfi: Implement register-constant buffer variant
| * | shader/bfi: Implement register-constant buffer variantReinUsesLisp2020-01-271-0/+2
| | | | | | | | | | | | | | | It's the same as the variant that was implemented, but it takes the operands from another source.
* | | Merge pull request #3356 from ReinUsesLisp/fcmpbunnei2020-02-041-0/+7
|\ \ \ | |_|/ |/| | shader/arithmetic: Implement FCMP
| * | shader/arithmetic: Implement FCMPReinUsesLisp2020-01-271-0/+7
| |/ | | | | | | | | Compares the third operand with zero, then selects between the first and second.
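The select-on-compare behaviour described above can be sketched like this. The condition enum is a hypothetical subset, and the choice that a true comparison selects the first operand is an assumption of this sketch rather than something the commit message states.

```cpp
// Hypothetical subset of comparison conditions.
enum class Condition { LessThan, Equal, GreaterThan };

// Compare a value against zero using the given condition.
constexpr bool CompareWithZero(Condition cond, float value) {
    switch (cond) {
    case Condition::LessThan:
        return value < 0.0f;
    case Condition::Equal:
        return value == 0.0f;
    case Condition::GreaterThan:
        return value > 0.0f;
    }
    return false;
}

// FCMP-like select: the third operand is compared with zero and the result
// picks between the first and second operands.
// Assumption: a true comparison selects the first operand.
constexpr float Fcmp(Condition cond, float a, float b, float c) {
    return CompareWithZero(cond, c) ? a : b;
}
```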
* / shader/memory: Implement ATOM.ADDReinUsesLisp2020-01-261-0/+30
|/ | | | | | | | | | | | | ATOM operates atomically on global memory. For now only add ATOM.ADD since that's what was found in commercial games. This asserts for ATOM.ADD.S32 (handling the others as unimplemented), although ATOM.ADD.U32 shouldn't be any different. This change forces us to change the default type on SPIR-V storage buffers from float to uint. We could also alias the buffers, but it's simpler for now to just use uint. While we are at it, abstract the code to avoid repetition.
* shader/memory: Implement ATOMS.ADD.U32ReinUsesLisp2020-01-161-3/+34
|
* Merge pull request #3239 from ReinUsesLisp/p2rbunnei2020-01-011-1/+3
|\ | | | | shader/p2r: Implement P2R Pr
| * shader/r2p: Refactor R2P to support P2RReinUsesLisp2019-12-201-1/+3
| |
* | Merge pull request #3228 from ReinUsesLisp/ptpbunnei2019-12-271-6/+6
|\ \ | |/ |/| shader/texture: Implement AOFFI and PTP for TLD4 and TLD4S
| * shader/texture: Implement TLD4.PTPReinUsesLisp2019-12-161-6/+6
| |
* | shader_bytecode: Fix TLD4S encodingReinUsesLisp2019-12-181-1/+1
|/
* Shader_Ir: Correct TLD4S encoding and implement f16 flag.Fernando Sahmkow2019-12-121-1/+2
|
* shader: Implement MEMBAR.GLReinUsesLisp2019-12-101-1/+17
| | | | Implement using memoryBarrier in GLSL and OpMemoryBarrier on SPIR-V.
* shader_ir/memory: Implement patch storesReinUsesLisp2019-12-101-1/+2
|
* shader_bytecode: Remove corrupted characterReinUsesLisp2019-12-071-1/+1
|
* Shader_IR: Implement TXD instruction.Fernando Sahmkow2019-11-141-0/+20
|
* Shader_IR: Implement FLO instruction.Fernando Sahmkow2019-11-141-0/+6
|
* Shader_Bytecode: Add encodings for FLO, SHF and TXDFernando Sahmkow2019-11-141-0/+18
|
* Merge pull request #3081 from ReinUsesLisp/fswzadd-shufflesFernando Sahmkow2019-11-141-0/+10
|\ | | | | shader: Implement FSWZADD and reimplement SHFL
| * shader_ir/warp: Implement FSWZADDReinUsesLisp2019-11-081-0/+10
| |
* | video_core: Silence implicit conversion warningsReinUsesLisp2019-11-081-5/+7
|/
* Shader_IR: Fix TLD4 and add Bindless Variant.Fernando Sahmkow2019-10-301-1/+29
| | | | | | This commit fixes an issue where not all 4 results of tld4 were being written, the color component was defaulted to red, among other things. It also implements the bindless variant.
* shader_bytecode: Make Matcher constexpr capableLioncash2019-10-241-13/+13
| | | | | | | | Greatly shrinks the amount of generated code for GetDecodeTable(). Collapses an assembly output of 9000+ lines down to ~3621 with Clang, and 6513 down to ~2616 with GCC, given it's now allowed to construct all the entries as a sequence of constant data.
* Merge pull request #2869 from ReinUsesLisp/suldbunnei2019-09-241-3/+5
|\ | | | | shader/image: Implement SULD and fix SUATOM
| * gl_shader_decompiler: Use uint for images and fix SUATOMReinUsesLisp2019-09-211-2/+2
| | | | | | | | | | | | In the process remove implementation of SUATOM.MIN and SUATOM.MAX as these require a distinction between U32 and S32. These have to be implemented with imageCompSwap loop.
| * shader/image: Implement SULD and remove irrelevant codeReinUsesLisp2019-09-211-1/+1
| | | | | | | | | | * Implement SULD as float. * Remove conditional declaration of GL_ARB_shader_viewport_layer_array.
| * shader_bytecode: Add SULD encodingReinUsesLisp2019-09-211-0/+2
| |
* | Merge pull request #2878 from FernandoS27/icmpRodrigo Locatti2019-09-211-0/+13
|\ \ | |/ |/| shader_ir: Implement ICMP
| * Shader_IR: ICMP corrections and fixesFernando Sahmkow2019-09-211-0/+2
| |
| * Shader_IR: Implement ICMP.Fernando Sahmkow2019-09-201-0/+11
| |
* | shader_ir/warp: Implement SHFLReinUsesLisp2019-09-171-0/+18
|/
* shader/image: Implement SUATOM and fix SUSTReinUsesLisp2019-09-111-0/+32
|
* shader/shift: Implement SHR wrapped and clamped variantsReinUsesLisp2019-09-041-0/+4
| | | | | | Nvidia defaults to wrapped shifts, but this is undefined behaviour on OpenGL's spec. Explicitly mask/clamp according to what the guest shader requires.
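To make the difference between the two variants concrete, here is a small sketch for unsigned 32-bit values. The function names are hypothetical, and clamping the shift amount to 31 is an assumption of this sketch; the commit only states that the amount is explicitly masked or clamped.

```cpp
#include <algorithm>
#include <cstdint>

// Wrapped variant: only the low five bits of the shift amount are used,
// so shifting by 33 behaves like shifting by 1.
constexpr std::uint32_t ShiftRightWrapped(std::uint32_t value, std::uint32_t shift) {
    return value >> (shift & 31u);
}

// Clamped variant: oversized shift amounts saturate instead of wrapping.
// Assumption: the upper bound is 31.
constexpr std::uint32_t ShiftRightClamped(std::uint32_t value, std::uint32_t shift) {
    return value >> std::min(shift, 31u);
}
```

Both helpers keep the effective shift amount below the operand width, which is what avoids the undefined behaviour the commit message refers to.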
* Merge pull request #2812 from ReinUsesLisp/f2i-selectorbunnei2019-09-041-1/+7
|\ | | | | shader_ir/conversion: Implement F2I and F2F F16 selector
| * shader_ir/conversion: Split int and float selector and implement F2F H1ReinUsesLisp2019-08-281-1/+8
| |
| * shader_ir/conversion: Implement F2I F16 Ra.H1ReinUsesLisp2019-08-281-2/+1
| |
* | Merge pull request #2811 from ReinUsesLisp/fsetp-fixbunnei2019-09-041-0/+1
|\ \ | | | | | | float_set_predicate: Add missing negation bit for the second operand
| * | float_set_predicate: Add missing negation bit for the second operandReinUsesLisp2019-08-281-0/+1
| |/
* / shader_ir: Implement VOTEReinUsesLisp2019-08-211-0/+16
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Implement VOTE using Nvidia's intrinsics. Documentation about these can be found here: https://developer.nvidia.com/reading-between-threads-shader-intrinsics Instead of using portable ARB instructions I opted to use Nvidia intrinsics because these are the closest we have to how Tegra X1 hardware renders. To stub VOTE on non-Nvidia drivers (including nouveau) this commit simulates a GPU with a warp size of one, returning what is meaningful for the instruction being emulated: anyThreadNV(value) -> value; allThreadsNV(value) -> value; allThreadsEqualNV(value) -> true. ballotARB, also known as "uint64_t(activeThreadsNV())", emits VOTE.ANY Rd, PT, PT; on nouveau's compiler. This doesn't exactly match Nvidia's code, VOTE.ALL Rd, PT, PT; which is emulated with activeThreadsNV() by this commit. In theory this shouldn't really matter since .ANY, .ALL and .EQ affect the predicates (set to PT in those cases) and not the registers.
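The warp-size-one stub described above reduces each vote to a trivial function of the single thread's value. The sketch below restates that mapping in C++; the function names are hypothetical stand-ins for the intrinsics named in the commit message.

```cpp
// With a single thread in the warp, "any thread" and "all threads"
// both reduce to the thread's own value.
constexpr bool StubAnyThread(bool value) {
    return value;
}

constexpr bool StubAllThreads(bool value) {
    return value;
}

// A single thread always agrees with itself, so the result is always true.
constexpr bool StubAllThreadsEqual(bool /*value*/) {
    return true;
}
```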
* Merge pull request #2753 from FernandoS27/float-convertbunnei2019-08-211-2/+0
|\ | | | | Shader_Ir: Implement F16 Variants of F2F, F2I, I2F.
| * Shader_Ir: Implement F16 Variants of F2F, F2I, I2F.Fernando Sahmkow2019-07-201-2/+0
| | | | | | | | | | This commit takes care of implementing the F16 Variants of the conversion instructions and makes sure conversions are done.
* | shader_ir: Implement NOPReinUsesLisp2019-08-041-0/+7
|/
* shader/half_set_predicate: Implement missing HSETP2 variantsReinUsesLisp2019-07-201-6/+20
|
* Merge pull request #2695 from ReinUsesLisp/layer-viewportFernando Sahmkow2019-07-151-1/+1
|\ | | | | gl_shader_decompiler: Implement gl_ViewportIndex and gl_Layer in vertex shaders
| * gl_shader_decompiler: Implement gl_ViewportIndex and gl_Layer in vertex shadersReinUsesLisp2019-07-081-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | This commit implements gl_ViewportIndex and gl_Layer in vertex and geometry shaders. In the case it's used in a vertex shader, it requires ARB_shader_viewport_layer_array. This extension is available on AMD and Nvidia devices (mesa and proprietary drivers), but not available on Intel on any platform. At the moment of writing this description I don't know if this is a hardware limitation or a driver limitation. In the case that ARB_shader_viewport_layer_array is not available, writes to these registers on a vertex shader are ignored, with the appropriate logging.
* | Merge pull request #2692 from ReinUsesLisp/tlds-f16Fernando Sahmkow2019-07-141-1/+2
|\ \ | | | | | | shader/texture: Add F16 support for TLDS
| * | shader/texture: Add F16 support for TLDSReinUsesLisp2019-07-071-1/+2
| |/
* / shader_ir: Implement BRX & BRA.CCFernando Sahmkow2019-07-091-0/+16
|/
* shader_bytecode: Include missing <array>ReinUsesLisp2019-06-241-0/+1
|
* shader: Decode SUST and implement backing image functionalityReinUsesLisp2019-06-211-2/+64
|
* shader: Implement texture buffersReinUsesLisp2019-06-211-0/+16
|
* shader_bytecode: Mark EXIT as flow instructionFernando Sahmkow2019-06-041-1/+1
|
* shader/memory: Implement ST (generic memory)ReinUsesLisp2019-05-211-0/+1
|
* shader/memory: Implement LD (generic memory)ReinUsesLisp2019-05-211-4/+15
|
* shader_ir/other: Implement IPA.IDXReinUsesLisp2019-05-031-0/+1
|
* shader_ir/memory: Implement physical input attributesReinUsesLisp2019-05-031-0/+4
|
* shader_bytecode: Add AL2P decodingReinUsesLisp2019-05-031-2/+15
|
* Merge pull request #2407 from FernandoS27/f2fbunnei2019-04-201-7/+20
|\ | | | | Do some corrections in conversion shader instructions.
| * Do some corrections in conversion shader instructions.Fernando Sahmkow2019-04-161-7/+20
| | | | | | | | | | | | Corrects encodings for I2F, F2F, I2I and F2I. Implements immediate variants of all four conversion types. Adds assertions for unimplemented features.
* | Merge pull request #2348 from FernandoS27/guest-bindlessbunnei2019-04-181-0/+36
|\ \ | | | | | | Implement Bindless Textures on Shader Decompiler and GL backend
| * | Move ConstBufferAccessor to Maxwell3d, correct mistakes and clang format.Fernando Sahmkow2019-04-081-1/+1
| | |
| * | Implement TXQ_BFernando Sahmkow2019-04-081-0/+2
| | |
| * | Corrections to TEX_BFernando Sahmkow2019-04-081-0/+32
| | |
| * | Implement Bindless Samplers and TEX_B in the IR.Fernando Sahmkow2019-04-081-0/+2
| | |
* | | Merge pull request #2315 from ReinUsesLisp/severity-decompilerbunnei2019-04-171-1/+15
|\ \ \ | | | | | | | | shader_ir/decode: Reduce the severity of common assertions
| * | | shader_ir/memory: Reduce severity of LD_L cache management and log itReinUsesLisp2019-04-031-0/+7
| | | |
| * | | shader_ir/memory: Reduce severity of ST_L cache management and log itReinUsesLisp2019-04-031-1/+8
| |/ /
* | / shader_ir: Implement STG, keep track of global memory usage and flushReinUsesLisp2019-04-141-0/+6
| |/ |/|
* | Merge pull request #2366 from FernandoS27/xmad-fixbunnei2019-04-101-0/+3
|\ \ | | | | | | Correct XMAD mode, psl and high_b on different encodings.
| * | Correct XMAD mode, psl and high_b on different encodings.Fernando Sahmkow2019-04-081-0/+3
| |/
* / Correct LOP_IMM encodingFernando Sahmkow2019-04-081-1/+1
|/
* Merge pull request #2147 from ReinUsesLisp/texture-cleanbunnei2019-03-101-12/+13
|\ | | | | shader_ir: Remove "extras" from the MetaTexture
| * shader/decode: Remove extras from MetaTextureReinUsesLisp2019-02-261-4/+4
| |
| * shader/decode: Split memory and texture instructions decodingReinUsesLisp2019-02-261-8/+9
| |
* | video_core/engines: Remove unnecessary includesLioncash2019-03-061-1/+0
|/ | | | | | | | | Removes a few unnecessary dependencies on core-related machinery, such as the core.h and memory.h, which reduces the amount of rebuilding necessary if those files change. This also uncovered some indirect dependencies within other source files. This also fixes those.
* shader_decompiler: Improve Accuracy of Attribute Interpolation.Fernando Sahmkow2019-02-141-3/+3
|
* Corrected F2I None mode to RoundEven.Fernando Sahmkow2019-02-111-1/+1
|
* Merge pull request #2081 from ReinUsesLisp/lmem-64bunnei2019-02-051-3/+3
|\ | | | | shader_ir/memory: Add LD_L 64 bits loads
| * shader_bytecode: Rename BytesN enums to BitsNReinUsesLisp2019-02-031-3/+3
| |
* | Merge pull request #2082 from FernandoS27/txq-stlbunnei2019-02-051-0/+4
|\ \ | |/ |/| Fix TXQ not using the component mask.
| * Update src/video_core/engines/shader_bytecode.hMat M2019-02-041-1/+1
| | | | | | Co-Authored-By: FernandoS27 <fsahmkow27@gmail.com>
| * Fix TXQ not using the component mask.Fernando Sahmkow2019-02-031-0/+4
| |
* | shader_ir: Unify constant buffer offset valuesReinUsesLisp2019-01-301-0/+8
|/ | | | | | | Constant buffer values on the shader IR were using different offsets depending on whether the access was direct or indirect. cbuf34 has a non-multiplied offset while cbuf36 has a multiplied one. On shader decoding, this commit multiplies the offset by four on cbuf34 queries.
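The unification described above amounts to converting one encoding's offset before the two are compared or shared. In the sketch below the struct and member names are hypothetical, and it assumes the multiplied form is a byte offset while the non-multiplied form counts 32-bit words.

```cpp
#include <cstdint>

// Hypothetical stand-in for the cbuf34 encoding: the raw field is not multiplied.
struct DirectCbufAccess {
    std::uint32_t raw_offset;

    // Multiply by four so the result is comparable with the other encoding.
    constexpr std::uint32_t UnifiedOffset() const {
        return raw_offset * 4u;
    }
};

// Hypothetical stand-in for the cbuf36 encoding: the field is already multiplied.
struct IndirectCbufAccess {
    std::uint32_t offset;

    constexpr std::uint32_t UnifiedOffset() const {
        return offset;
    }
};
```

Routing every query through one accessor, instead of reading the raw field directly, keeps callers from mixing the two units.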
* shader_decode: Implement LDG and basic cbuf trackingReinUsesLisp2019-01-301-0/+8
|
* shader_decode: Implement VMAD and VSETPReinUsesLisp2019-01-151-2/+3
|
* shader_decode: Implement HFMA2ReinUsesLisp2019-01-151-0/+1
|
* shader_decode: Fixup clang-formatReinUsesLisp2019-01-151-1/+1
|
* shader_ir: Initial implementationReinUsesLisp2019-01-151-0/+4
|
* shader_bytecode: Fixup encodingReinUsesLisp2019-01-151-1/+1
|
* shader_bytecode: Fixup TEXS.F16 encodingReinUsesLisp2018-12-261-1/+1
|
* Fixed uninitialized memory due to missing returns in canaryDavid Marcec2018-12-191-0/+2
| | | | Functions which are supposed to crash on non-canary builds usually don't return anything, which leads to uninitialized memory being used.
* shader_bytecode: Fixup half float's operator B encodingReinUsesLisp2018-12-181-1/+1
|
* Implement postfactor multiplication/division for fmul instructionsheapo2018-12-171-1/+1
|
* gl_shader_decompiler: Implement TEXS.F16ReinUsesLisp2018-12-051-1/+2
|
* Merge pull request #1763 from ReinUsesLisp/bfibunnei2018-11-261-0/+3
|\ | | | | gl_shader_decompiler: Implement BFI_IMM_R
| * gl_shader_decompiler: Implement BFI_IMM_RReinUsesLisp2018-11-211-0/+3
| |
* | Merge pull request #1760 from ReinUsesLisp/r2pbunnei2018-11-261-0/+14
|\ \ | | | | | | gl_shader_decompiler: Implement R2P_IMM
| * | gl_shader_decompiler: Implement R2P_IMMReinUsesLisp2018-11-211-0/+14
| |/
* | Merge pull request #1783 from ReinUsesLisp/clip-distancesbunnei2018-11-261-0/+2
|\ \ | | | | | | gl_shader_decompiler: Implement clip distances
| * | gl_shader_decompiler: Implement clip distancesReinUsesLisp2018-11-231-0/+2
| |/
* | Merge pull request #1769 from ReinUsesLisp/ccbunnei2018-11-241-4/+3
|\ \ | | | | | | gl_shader_decompiler: Rename cc to condition code and name internal flags
| * | gl_shader_decompiler: Rename control codes to condition codesReinUsesLisp2018-11-221-4/+3
| |/
* / Added predicate comparison LessEqualWithNan (#1736)Hexagon122018-11-231-0/+1
|/ | | | | | | | * Added predicate comparison LessEqualWithNan * oops * Clang fix
* Merge pull request #1527 from FernandoS27/assert-flowbunnei2018-11-011-0/+1
|\ | | | | Assert Control Flow Instructions using Control Codes
| * Assert Control Flow Instructions using Control CodesFernandoS272018-10-291-1/+2
| |
* | Merge pull request #1528 from FernandoS27/assert-control-codesbunnei2018-11-011-1/+5
|\ \ | | | | | | Assert Control Codes Generation on Shader Instructions
| * | Assert Control Codes GenerationFernandoS272018-10-301-1/+5
| |/
* / global: Use std::optional instead of boost::optional (#1578)Frederic L2018-10-301-4/+4
|/ | | | | | | | | | | | | | | | * get rid of boost::optional * Remove optional references * Use std::reference_wrapper for optional references * Fix clang format * Fix clang format part 2 * Addressed feedback * Fix clang format and MacOS build
* Implemented LD_L and ST_LFernandoS272018-10-241-0/+31
|
* Implement PointSizeFernandoS272018-10-231-0/+1
|
* Merge pull request #1519 from ReinUsesLisp/vsetpbunnei2018-10-231-3/+15
|\ | | | | gl_shader_decompiler: Implement VSETP
| * gl_shader_decompiler: Implement VSETPReinUsesLisp2018-10-231-0/+2
| |
| * gl_shader_decompiler: Abstract VMAD into a video subsetReinUsesLisp2018-10-231-3/+13
| |
* | Merge pull request #1512 from ReinUsesLisp/brkbunnei2018-10-231-3/+7
|\ \ | |/ |/| gl_shader_decompiler: Implement PBK and BRK
| * gl_shader_decompiler: Implement PBK and BRKReinUsesLisp2018-10-181-3/+7
| |
* | Added Saturation to FMUL32IFernandoS272018-10-231-0/+4
| |
* | Fixed FSETP and FSETFernandoS272018-10-221-2/+0
| |
* | Merge pull request #1501 from ReinUsesLisp/half-floatbunnei2018-10-201-0/+145
|\ \ | |/ |/| gl_shader_decompiler: Implement H* instructions
| * gl_shader_decompiler: Implement HSET2_RReinUsesLisp2018-10-151-0/+18
| |
| * gl_shader_decompiler: Implement HSETP2_RReinUsesLisp2018-10-151-0/+20
| |
| * gl_shader_decompiler: Implement HFMA2 instructionsReinUsesLisp2018-10-151-0/+32
| |
| * gl_shader_decompiler: Implement HADD2_IMM and HMUL2_IMMReinUsesLisp2018-10-151-0/+30
| |
| * gl_shader_decompiler: Implement non-immediate HADD2 and HMUL2 instructionsReinUsesLisp2018-10-151-0/+25
| |
| * gl_shader_decompiler: Setup base for half float unpacking and settingReinUsesLisp2018-10-151-0/+20
| |
* | shader_bytecode: Add Control Code enum 0xfReinUsesLisp2018-10-151-1/+1
|/ | | | | | Control Code 0xf means to unconditionally execute the instruction. This value is passed to most BRA, EXIT and SYNC instructions (among others) but this may not always be the case.
* gl_shader_decompiler: Implement VMADReinUsesLisp2018-10-111-0/+36
|
* gl_shader_decompiler: Implement geometry shadersReinUsesLisp2018-10-071-0/+112
|
* shader_bytecode: Lay out the Ipa-related enums betterLioncash2018-09-211-2/+12
| | | | This is more consistent with the surrounding enums.
* shader_bytecode: Make operator== and operator!= of IpaMode const qualifiedLioncash2018-09-211-6/+7
| | | | | These don't affect the state of the struct and can be const member functions.
* Merge pull request #1279 from FernandoS27/csetpbunnei2018-09-191-0/+47
|\ | | | | shader_decompiler: Implemented (Partially) Control Codes and CSETP
| * Implemented I2I.CC on the NEU control code, used by SMOFernandoS272018-09-171-1/+1
| |
| * Implemented CSETPFernandoS272018-09-171-0/+11
| |
| * Implemented Control CodesFernandoS272018-09-171-0/+36
| |
* | Added texture misc modes to texture instructionsFernandoS272018-09-171-1/+147
|/
* Merge pull request #1326 from FearlessTobi/port-4182bunnei2018-09-171-9/+9
|\ | | | | Port #4182 from Citra: "Prefix all size_t with std::"
| * Port #4182 from Citra: "Prefix all size_t with std::"fearlessTobi2018-09-151-9/+9
| |
* | Shaders: Implemented multiple-word loads and stores to and from attribute memory.Subv2018-09-151-1/+9
|/ | | | This seems to be an optimization performed by nouveau.
* Merge pull request #1263 from FernandoS27/tex-modebunnei2018-09-121-0/+10
|\ | | | | shader_decompiler: Implemented (Partially) Texture Processing Modes
| * Implemented Texture Processing ModesFernandoS272018-09-121-0/+10
| |
* | Implemented encodings for LEA and PSETFernandoS272018-09-111-0/+64
|/
* Implemented TMMLFernandoS272018-09-101-5/+19
|
* Implemented TXQ dimension query type, used by SMO.FernandoS272018-09-091-1/+16
|
* Change name of TEXQ to TXQ, in order to match NVIDIA's namingFernandoS272018-09-091-2/+2
|
* Implemented IPA ProperlyFernandoS272018-09-061-0/+12
|
* Merge pull request #1215 from ogniK5377/texs-nodep-assertbunnei2018-09-021-0/+1
|\ | | | | Added assert for TEXS nodep
| * Added assert for TEXS nodepDavid Marcec2018-09-011-0/+1
| |
* | Merge pull request #1214 from ogniK5377/ipa-assertbunnei2018-09-021-2/+5
|\ \ | | | | | | Added better asserts to IPA, Renamed IPA modes to match mesa
| * | Added better asserts to IPA, Renamed IPA modes to match mesaDavid Marcec2018-09-011-2/+5
| |/ | | | | | | | | | | | | | | | | | | IpaMode is changed to IpaInterpMode. IpaMode is supposed to be 2 bits, not 3. Added IpaSampleMode. Added Saturate. Renamed modes based on https://github.com/mesa3d/mesa/blob/d27c7918916cdc8092959124955f887592e37d72/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp#L2530
* | Merge pull request #1216 from ogniK5377/ffma-assertbunnei2018-09-021-0/+3
|\ \ | | | | | | Added FFMA asserts and missing fields
| * | Removed saturate assertDavid Marcec2018-09-011-1/+0
| | | | | | | | | | | | Saturate already implemented
| * | Added FFMA assertsDavid Marcec2018-09-011-0/+4
| |/
* | Removed saturate assertDavid Marcec2018-09-011-1/+0
| | | | | | | | Unneeded as we already implement it
* | Added FMUL assertsDavid Marcec2018-09-011-0/+5
|/
* Added predicate comparison GreaterEqualWithNanHexagon122018-08-311-0/+1
|
* gl_shader_decompiler: Implement POPC (#1203)Laku2018-08-311-0/+10
| | | | | | * Implement POPC * implement invert
* Merge pull request #1200 from bunnei/improve-ipabunnei2018-08-301-0/+6
|\ | | | | gl_shader_decompiler: Improve IPA for Pass mode with Position attribute.
| * gl_shader_decompiler: Improve IPA for Pass mode with Position attribute.bunnei2018-08-291-0/+6
| |
* | Shaders: Implemented IADD3tech4me2018-08-291-1/+23
|/
* fix SEL_IMM bitstringLaku2018-08-241-1/+1
|
* Shaders: Added decodings for IADD3 instructionstech4me2018-08-231-0/+6
|
* implement lop3Laku2018-08-221-0/+19
|
* shader_bytecode: Parenthesize conditional expression within GetTextureType()Lioncash2018-08-211-1/+1
| | | | Resolves a -Wlogical-op-parentheses warning.
* shader_bytecode: Replace some UNIMPLEMENTED logs.bunnei2018-08-211-2/+6
|
* Merge pull request #1112 from Subv/sampler_typesbunnei2018-08-201-4/+72
|\ | | | | Shaders: Use the correct shader type when sampling textures.
| * Shader: Added bitfields for the texture type of the various sampling instructions.Subv2018-08-191-1/+65
| |
| * Shaders: Added decodings for TLD4 and TLD4SSubv2018-08-191-3/+7
| |
* | Merge pull request #1089 from Subv/neg_bitsbunnei2018-08-191-0/+4
|\ \ | | | | | | Shaders: Corrected the 'abs' and 'neg' bit usage in the float arithmetic instructions.
| * | Shaders: Corrected the 'abs' and 'neg' bit usage in the float arithmetic instructions.Subv2018-08-181-0/+4
| | | | | | | | | | | | We should definitely audit our shader generator for more errors like this.
* | | Shaders/TEXS: Fixed the component mask in the TEXS instruction.Subv2018-08-191-6/+11
| |/ |/| | | | | Previously we could end up with a TEXS that didn't write any outputs; this was wrong.
* | Merge pull request #1109 from Subv/ldg_decodebunnei2018-08-191-0/+4
|\ \ | | | | | | Shaders: Added decodings for the LDG and STG instructions.
| * | Shaders: Added decodings for the LDG and STG instructions.Subv2018-08-191-0/+4
| | |
* | | Merge pull request #1108 from Subv/front_facingbunnei2018-08-191-0/+3
|\ \ \ | | | | | | | | Shaders: Implemented the gl_FrontFacing input attribute (attr 63).
| * | | Shaders: Implemented the gl_FrontFacing input attribute (attr 63).Subv2018-08-191-0/+3
| |/ /
* / / Shader: Implemented the predicate and mode arguments of LOP.Subv2018-08-181-1/+6
|/ / | | | | | | | | | | The mode can be used to set the predicate to true depending on the result of the logic operation. In some cases, this means discarding the result (writing it to register 0xFF (Zero)). This is used by Super Mario Odyssey.
* / Added PredCondition GreaterThanWithNanDavid Marcec2018-08-181-0/+1
|/
* gl_shader_decompiler: Implement XMAD instruction.bunnei2018-08-131-4/+25
|
* Merge pull request #1010 from bunnei/unk-vert-attrib-shaderbunnei2018-08-121-2/+1
|\ | | | | gl_shader_decompiler: Improve handling of unknown input/output attributes.
| * gl_shader_decompiler: Improve handling of unknown input/output attributes.bunnei2018-08-121-2/+1
| |
* | Merge pull request #1018 from Subv/ssy_syncbunnei2018-08-121-0/+7
|\ \ | |/ |/| GPU/Shader: Implemented SSY and SYNC as a set_target/jump pair.
| * GPU/Shader: Don't predicate instructions that don't have a predicate field (SSY).Subv2018-08-111-0/+7
| |
* | video_core: Use variable template variants of type_traits interfaces where applicableLioncash2018-08-101-2/+1
|/
* Merge pull request #982 from bunnei/stub-unk-63bunnei2018-08-091-0/+2
|\ | | | | gl_shader_decompiler: Stub input attribute Unknown_63.
| * gl_shader_decompiler: Stub input attribute Unknown_63.bunnei2018-08-081-0/+2
| |
* | gl_shader_decompiler: Let OpenGL interpret floats.bunnei2018-08-081-9/+4
|/ | | | | - Accuracy is lost in translation to string, e.g. with NaN. - Needed for Super Mario Odyssey.
* shader_bytecode: Implement other TEXS masks.bunnei2018-07-221-5/+9
|
* gl_shader_decompiler: Implement SEL instruction.bunnei2018-07-221-0/+11
|
* video_core: Use nested namespaces where applicableLioncash2018-07-211-8/+4
| | | | Compresses a few namespace specifiers to be more compact.
* Merge pull request #655 from bunnei/pred-lt-nanbunnei2018-07-131-0/+1
|\ | | | | gl_shader_decompiler: Implement PredCondition::LessThanWithNan.
| * gl_shader_decompiler: Implement PredCondition::LessThanWithNan.bunnei2018-07-131-0/+1
| |
* | gl_shader_decompiler: Use FlowCondition field in EXIT instruction.bunnei2018-07-131-0/+9
|/
* Merge pull request #652 from Subv/fadd32iSebastian Valle2018-07-131-0/+9
|\ | | | | GPU: Implement the FADD32I shader instruction.
| * GPU: Implement the FADD32I shader instruction.Subv2018-07-121-0/+9
| |
* | Merge pull request #651 from Subv/ffma_decodebunnei2018-07-121-1/+1
|\ \ | | | | | | GPU: Corrected the decoding of FFMA for immediate operands.
| * | GPU: Corrected the decoding of FFMA for immediate operands.Subv2018-07-121-1/+1
| |/
* | Merge pull request #625 from Subv/imnmxbunnei2018-07-081-3/+17
|\ \ | |/ |/| GPU: Implemented the IMNMX shader instruction.
| * GPU: Implemented the IMNMX shader instruction.Subv2018-07-041-3/+17
| | | | | | | | It's similar to the FMNMX instruction but it works on integers.
* | Merge pull request #626 from Subv/shader_syncbunnei2018-07-051-0/+5
|\ \ | | | | | | GPU: Stub the shader SYNC and DEPBAR instructions.
| * | GPU: Stub the shader SYNC and DEPBAR instructions.Subv2018-07-041-0/+5
| |/ | | | | | | It is unknown at this moment if we actually need to do something with these instructions or if the GLSL compiler takes care of that for us.
* | Merge pull request #622 from Subv/unused_texbunnei2018-07-051-1/+1
|\ \ | | | | | | GPU: Ignore unused textures and corrected the TEX shader instruction decoding.
| * | GPU: Corrected the decoding for the TEX shader instruction.Subv2018-07-041-1/+1
| |/
* / GPU: Implemented the PSETP shader instruction.Subv2018-07-041-0/+13
|/ | | | It's similar to the isetp and fsetp instructions but it works on predicates instead.
* GPU: Implemented MUFU suboperation 8, sqrt.Subv2018-07-031-0/+1
|
* Merge pull request #602 from Subv/mufu_subopbunnei2018-07-011-2/+1
|\ | | | | GPU: Corrected the size of the MUFU subop field, and removed incorrect "min" operation.
| * GPU: Corrected the size of the MUFU subop field, and removed incorrect "min" operation.Subv2018-06-301-2/+1
| |
* | gl_shader_decompiler: Implement predicate NotEqualWithNan.bunnei2018-06-301-0/+1
|/
* Build: Fixed some MSVC warnings in various parts of the code.Subv2018-06-201-2/+2
|
* GPU: Don't mark uniform buffers and registers as used for instructions which don't have them.Subv2018-06-191-2/+3
| | | | | This applies to instructions such as MOV32I and FMUL32I, and fixes a potential crash when using them.
* gl_shader_decompiler: Implement LOP instructions.bunnei2018-06-171-0/+14
|
* gl_shader_decompiler: Refactor LOP32I instruction a bit in support of LOP.bunnei2018-06-171-3/+2
|
* gl_shader_decompiler: Implement integer size conversions for I2I/I2F/F2I.bunnei2018-06-161-1/+2
|
* Merge pull request #558 from Subv/iadd32ibunnei2018-06-121-2/+10
|\ | | | | GPU: Implemented the iadd32i shader instruction.
| * GPU: Implemented the iadd32i shader instruction.Subv2018-06-121-2/+10
| |
* | gl_shader_decompiler: Implement saturate for float instructions.bunnei2018-06-121-2/+1
|/
* GPU: Implement the iset family of shader instructions.Subv2018-06-091-0/+9
|
* GPU: Added decodings for the ISET family of instructions.Subv2018-06-091-0/+7
|
* Merge pull request #550 from Subv/ssybunnei2018-06-091-0/+2
|\ | | | | GPU: Stub the SSY shader instruction.
| * GPU: Stub the SSY shader instruction.Subv2018-06-091-0/+2
| | | | | | | | This instruction tells the GPU where the flow reconverges in a non-uniform control flow scenario; we can ignore it when generating GLSL code.
* | Merge pull request #551 from bunnei/shrbunnei2018-06-091-0/+4
|\ \ | | | | | | gl_shader_decompiler: Implement SHR instruction.
| * | gl_shader_decompiler: Implement SHR instruction.bunnei2018-06-091-0/+4
| |/
* | gl_shader_decompiler: Implement IADD instruction.bunnei2018-06-091-5/+11
| |
* | gl_shader_decompiler: Add missing asserts for saturate_a instructions.bunnei2018-06-091-1/+1
|/
* gl_shader_decompiler: Implement BFE_IMM instruction.bunnei2018-06-071-3/+15
|
* gl_shader_decompiler: F2F: Implement rounding modes.bunnei2018-06-071-3/+12
|
* shader_bytecode: Add instruction decodings for BFE, IMNMX, and XMAD.bunnei2018-06-071-0/+20
|
* gl_shader_decompiler: Implement LD_C instruction.bunnei2018-06-071-0/+16
|
* gl_shader_decompiler: Refactor uniform handling to allow different decodings.bunnei2018-06-061-6/+10
|
* Merge pull request #516 from Subv/f2i_rbunnei2018-06-061-4/+20
|\ | | | | GPU: Implemented the F2I_R shader instruction.
| * GPU: Implemented the F2I_R shader instruction.Subv2018-06-051-4/+20
| |
* | Merge pull request #521 from Subv/brabunnei2018-06-051-4/+5
|\ \ | | | | | | GPU: Corrected the branch targets for the shader bra instruction.
| * | GPU: Corrected the branch targets for the shader bra instruction.Subv2018-06-051-4/+5
| | |
* | | gl_shader_decompiler: Implement SHL instruction.bunnei2018-06-051-13/+17
|/ /
* | GPU: Implement the ISCADD shader instructions.Subv2018-06-051-0/+16
| |
* | GPU: Added decodings for the ISCADD instructions.Subv2018-06-051-0/+7
|/
* Merge pull request #514 from Subv/lop32ibunnei2018-06-051-1/+15
|\ | | | | GPU: Implemented the LOP32I instruction.
| * GPU: Implemented the LOP32I instruction.Subv2018-06-041-1/+15
| |
* | Merge pull request #510 from Subv/isetpbunnei2018-06-051-0/+10
|\ \ | | | | | | GPU: Implemented the ISETP_R and ISETP_C instructions
| * | GPU: Implemented the ISETP_R and ISETP_C shader instructions.Subv2018-06-041-0/+10
| |/
* | Merge pull request #512 from Subv/fsetbunnei2018-06-051-1/+1
|\ \ | | | | | | GPU: Corrected the FSET and I2F instructions.
| * | GPU: Use the bf bit in FSET to determine whether to write 0xFFFFFFFF or 1.0f.Subv2018-06-041-1/+1
| |/
* | GPU: Partially implemented the shader BRA instruction.Subv2018-06-041-0/+13
| |
* | GPU: Added decoding for the BRA instruction.Subv2018-06-041-0/+2
|/
* gl_shader_decompiler: Implement TEXS component mask.bunnei2018-06-031-2/+16
|
* Merge pull request #494 from bunnei/shader-texbunnei2018-06-031-0/+15
|\ | | | | gl_shader_decompiler: Implement TEX, fixes for TEXS.
| * gl_shader_decompiler: Implement TEX instruction.bunnei2018-06-011-0/+10
| |
| * gl_shader_decompiler: Support multi-destination for TEXS.bunnei2018-06-011-0/+5
| |
* | gl_shader_decompiler: Implement RRO as a register move.bunnei2018-06-031-3/+7
|/
* Merge pull request #489 from Subv/vertexidbunnei2018-05-301-0/+4
|\ | | | | Shaders: Implemented reading the gl_InstanceID and gl_VertexID variables in the vertex shader.
| * Shaders: Implemented reading the gl_InstanceID and gl_VertexID variables in the vertex shader.Subv2018-05-301-0/+4
| |
* | gl_shader_decompiler: Partially implement F2F_R instruction.bunnei2018-05-301-3/+3
|/
* shader_bytecode: Implement other variants of FMNMX.bunnei2018-05-261-3/+7
|
* Merge pull request #458 from Subv/fmnmxbunnei2018-05-211-0/+5
|\ | | | | Shaders: Implemented the FMNMX shader instruction.
| * Shaders: Implemented the FMNMX shader instruction.Subv2018-05-211-0/+5
| |
* | ShadersDecompiler: Added decoding for the PSETP instruction.Subv2018-05-191-0/+3
|/
* shader_bytecode: Add decoding for FMNMX instruction.bunnei2018-04-291-0/+2
|
* gl_shader_decompiler: Partially implement I2I_R, and I2F_R.bunnei2018-04-291-8/+8
|
* shader_bytecode: Add decodings for i2i instructions.bunnei2018-04-291-3/+20
|
* gl_shader_decompiler: Implement MOV32_IMM instruction.bunnei2018-04-291-2/+2
|
* gl_shader_decompiler: Boilerplate for handling integer instructions.bunnei2018-04-261-1/+9
|
* Shaders: Added bit decodings for the I2I instruction.Subv2018-04-251-0/+6
|
* Shaders: Added decodings for the FSET instructions.Subv2018-04-251-8/+29
|
* shader_bytecode: Add several more instruction decodings.bunnei2018-04-211-5/+52
|
* shader_bytecode: Decode instructions based on bit strings.bunnei2018-04-211-185/+172
|
* ShaderGen: Implemented predicated instruction execution.Subv2018-04-211-1/+5
| | | | Each predicated instruction will be wrapped in an `if (predicate) { instruction_body; }` in the GLSL, where `predicate` is one of the predicate boolean variables previously set by fsetp.
* ShaderGen: Implemented the fsetp instruction.Subv2018-04-211-3/+40
| | | | | | | | | | Predicate variables are now added to the generated shader code in the form of 'pX' where X is the predicate id. These predicate variables are initialized to false on shader startup and are set via the fsetp instructions. TODO: * Not all the comparison types are implemented. * Only the single-predicate version is implemented.
* ShaderGen: Register id 255 is special and is hardcoded to return 0 (SR_ZERO).Subv2018-04-201-0/+3
|
* ShaderGen: Implemented the fmul32i shader instruction.Subv2018-04-191-3/+14
|
* shader_bytecode: Make ctor's constexpr and explicit.bunnei2018-04-181-7/+7
|
* gl_shader_decompiler: Implement FMUL/FADD/FFMA immediate instructions.bunnei2018-04-171-0/+14
|
* gl_shader_decompiler: Add support for TEXS instruction.bunnei2018-04-171-5/+14
|
* shaders: Add NumTextureSamplers const, remove unused #pragma.bunnei2018-04-151-2/+0
|
* shaders: Address PR review feedback.bunnei2018-04-141-1/+1
|
* shaders: Fix GCC and clang build issues.bunnei2018-04-141-3/+3
|
* gl_shader_decompiler: Implement negate, abs, etc. and lots of cleanup.bunnei2018-04-141-20/+39
|
* shader_bytecode: Add FSETP and KIL to GetInfo.bunnei2018-04-141-0/+3
|
* shader_bytecode: Add SubOp decoding.bunnei2018-04-141-0/+10
|
* shader_bytecode: Add initial module for shader decoding.bunnei2018-04-141-0/+297