path: root/src/video_core/shader
Commit message (Author, Age, Files, Lines)
* shader_ir: Add internal flag getters (ReinUsesLisp, 2019-01-15, 2 files, -0/+10)
|
* shader_ir: Add attribute getters (ReinUsesLisp, 2019-01-15, 2 files, -0/+26)
|
* shader_ir: Add constant buffer getters (ReinUsesLisp, 2019-01-15, 2 files, -0/+25)
|
* shader_ir: Add register getter (ReinUsesLisp, 2019-01-15, 2 files, -0/+9)
|
* shader_ir: Add immediate node constructors (ReinUsesLisp, 2019-01-15, 2 files, -1/+34)
|
* shader_ir: Initial implementation (ReinUsesLisp, 2019-01-15, 28 files, -0/+1542)
|
* Remove references to PICA and rasterizers in video_core (James Rowe, 2018-01-13, 9 files, -2453/+0)
|
* Improved performance of FromAttributeBuffer (Huw Pascoe, 2017-09-17, 1 file, -1/+2)
|     Ternary operator is optimized by the compiler whereas std::min() is meant to return a value. I've noticed a 5%-10% emulation speed increase.
* pica/shader/jit: implement SETEMIT and EMIT (wwylele, 2017-08-19, 2 files, -2/+49)
|
* correct constness (wwylele, 2017-08-19, 2 files, -2/+4)
|
* pica/shader/interpreter: implement SETEMIT and EMIT (wwylele, 2017-08-19, 1 file, -0/+16)
|
* pica/shader: extend UnitState for GS (wwylele, 2017-08-19, 2 files, -0/+84)
|     Among the four shader units in PICA, a special unit can be configured to run both VS and GS programs. GSUnitState represents this unit, which extends UnitState (which represents the other three normal units) with extra state for primitive emitting. It uses lots of raw pointers to represent its internal structure in order to keep it a standard-layout type for the JIT to access. This unit doesn't handle triangle winding (inverting) itself; instead, it calls a WindingSetter handler. This will be explained in the following commits.
* pica/shader_interpreter: fix off-by-one in LOOP (wwylele, 2017-07-27, 1 file, -1/+1)
|
* Stop using reserved operator names (and/or/xor) with Xbyak (Yuri Kunde Schlesner, 2017-06-17, 1 file, -13/+13)
|     Also has the Dynarmic upgrade with the same change
* Pica: Set program code / swizzle data limit to 4096 (Jannik Vogel, 2017-05-11, 5 files, -13/+16)
|     One of the later commits will enable writing to GS regs. It turns out that on startup, most games will write 4096 GS program words. The current limit of 1024 would hence result in 3072 (4096 - 1024) error messages:
|
|     ```
|     HW.GPU <Error> video_core/shader/shader.cpp:WriteProgramCode:229: Invalid GS program offset 1024
|     ```
|
|     New constants have been introduced to represent these limits. The swizzle data size has also been raised. This matches the given field sizes of [GPUREG_SH_OPDESCS_INDEX](https://3dbrew.org/wiki/GPU/Internal_Registers#GPUREG_SH_OPDESCS_INDEX) and [GPUREG_SH_CODETRANSFER_INDEX](https://www.3dbrew.org/wiki/GPU/Internal_Registers#GPUREG_SH_CODETRANSFER_INDEX) (12 bit = [0; 4095]).
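The commit message above ties the new limit to the 12-bit index fields, which can express offsets 0 through 4095. A minimal sketch of such a bounds-checked write follows; the names `kMaxProgramCodeLength`, `ProgramBuffer` and `WriteProgramWord` are hypothetical and are not taken from the repository.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical constant: a 12-bit index field covers [0, 4095], i.e. 4096 entries.
constexpr std::size_t kMaxProgramCodeLength = std::size_t{1} << 12;

// Hypothetical container for the program words of one shader unit.
struct ProgramBuffer {
    std::array<std::uint32_t, kMaxProgramCodeLength> words{};
};

// Writes one program word. Returns false, without writing, when the offset
// lies outside the buffer.
bool WriteProgramWord(ProgramBuffer& buffer, std::size_t offset, std::uint32_t value) {
    if (offset >= buffer.words.size()) {
        return false;
    }
    buffer.words[offset] = value;
    return true;
}
```

With a limit of 4096, an offset of 1024 (the one in the quoted error message) is accepted, and only offsets that the 12-bit field cannot express are rejected.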
* Doxygen: Amend minor issues (#2593) (Mat M, 2017-02-27, 2 files, -2/+4)
|     Corrects a few issues with regards to Doxygen documentation, for example:
|
|     - Incorrect parameter referencing.
|     - Missing @param tags.
|     - Typos in @param tags.
|
|     and a few minor other issues.
* video_core/shader: Document sanitized MUL operation (Yuri Kunde Schlesner, 2017-02-12, 1 file, -0/+8)
|
* Merge pull request #2550 from yuriks/pica-refactor2 (Yuri Kunde Schlesner, 2017-02-12, 2 files, -2/+4)
|\
| |     Small VideoCore cleanups
| * VideoCore: Split regs.h inclusions (Yuri Kunde Schlesner, 2017-02-09, 2 files, -2/+4)
| |
* | video_core: Fix benign out-of-bounds indexing of array (#2553) (Yuri Kunde Schlesner, 2017-02-11, 1 file, -2/+1)
|/
|     The resulting pointer wasn't written to unless the index was verified as valid, but that's still UB and triggered debug checks in MSVC.
|
|     Reported by garrettboast on IRC
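The point made above is that indexing an array out of bounds is undefined behavior even when the resulting pointer is never written through. A minimal sketch of the safe ordering, validating the index before forming any element address, could look like this; `RegisterFile` and `TryGetRegister` are hypothetical names and the array size is illustrative.

```cpp
#include <array>
#include <cstddef>

// Hypothetical register file; the size is illustrative only.
using RegisterFile = std::array<float, 16>;

// Returns a pointer to the requested element, or nullptr for an invalid index.
// The index is checked first, so no out-of-bounds element address is ever formed.
float* TryGetRegister(RegisterFile& regs, std::size_t index) {
    if (index >= regs.size()) {
        return nullptr;
    }
    return &regs[index];
}
```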
* VideoCore: Move Regs to its own file (Yuri Kunde Schlesner, 2017-02-04, 2 files, -2/+2)
|
* VideoCore: Split shader regs from Regs struct (Yuri Kunde Schlesner, 2017-02-04, 4 files, -6/+6)
|
* VideoCore: Split rasterizer regs from Regs struct (Yuri Kunde Schlesner, 2017-02-04, 2 files, -13/+13)
|
* Merge pull request #2476 from yuriks/shader-refactor3 (Yuri Kunde Schlesner, 2017-02-04, 4 files, -78/+58)
|\
| |     Oh No! More shader changes!
| * VideoCore: Extract swrast-specific data from OutputVertex (Yuri Kunde Schlesner, 2017-01-30, 2 files, -37/+14)
| |
| * VideoCore/Shader: Clean up OutputVertex::FromAttributeBuffer (Yuri Kunde Schlesner, 2017-01-30, 1 file, -9/+14)
| |     This also fixes a long-standing but nevertheless harmless memory corruption bug, where the padding of the OutputVertex struct would get corrupted by unused attributes.
| * VideoCore: Split shader output writing from semantic loading (Yuri Kunde Schlesner, 2017-01-30, 2 files, -18/+16)
| |
| * VideoCore: Consistently use shader configuration to load attributes (Yuri Kunde Schlesner, 2017-01-30, 4 files, -12/+12)
| |
| * VideoCore: Rename some types to more accurate names (Yuri Kunde Schlesner, 2017-01-30, 4 files, -6/+6)
| |
* | ShaderJIT: add 16 dummy bytes at the bottom of the stack (wwylele, 2017-02-03, 1 file, -2/+5)
| |
* | Common/x64: remove legacy emitter and abi (#2504) (Weiyi Wang, 2017-01-31, 1 file, -1/+0)
| |     These are not used any more since we moved shader JIT to xbyak.
* | shader_jit_x64_compiler: esi and edi should be persistent (#2500) (Merry, 2017-01-31, 1 file, -0/+2)
|/
* VideoCore/Shader: Move entry_point to SetupBatch (Yuri Kunde Schlesner, 2017-01-26, 5 files, -22/+23)
|
* VideoCore/Shader: Move per-batch ShaderEngine state into ShaderSetup (Yuri Kunde Schlesner, 2017-01-26, 5 files, -40/+36)
|
* Shader: Remove OutputRegisters struct (Yuri Kunde Schlesner, 2017-01-26, 3 files, -19/+13)
|
* Shader: Initialize conditional_code in interpreter (Yuri Kunde Schlesner, 2017-01-26, 2 files, -3/+3)
|     This doesn't belong in LoadInputVertex because it also happens for non-VS invocations. Since it's not used by the JIT it seems adequate to initialize it in the interpreter, which is the only thing that cares about it.
* Shader: Don't read ShaderSetup from global state (Yuri Kunde Schlesner, 2017-01-26, 1 file, -3/+3)
|
* shader_jit_x64: Don't read program from global state (Yuri Kunde Schlesner, 2017-01-26, 3 files, -22/+22)
|
* VideoCore/Shader: Move ProduceDebugInfo to InterpreterEngine (Yuri Kunde Schlesner, 2017-01-26, 4 files, -19/+10)
|
* VideoCore/Shader: Split interpreter and JIT into separate ShaderEngines (Yuri Kunde Schlesner, 2017-01-26, 6 files, -96/+150)
|
* VideoCore/Shader: Rename shader_jit_x64{ => _compiler}.{cpp,h} (Yuri Kunde Schlesner, 2017-01-26, 3 files, -2/+2)
|
* VideoCore/Shader: Split shader uniform state and shader engine (Yuri Kunde Schlesner, 2017-01-26, 3 files, -16/+46)
|     Currently there's only a single dummy implementation, which will be split in a following commit.
* VideoCore/Shader: Add constness to methods (Yuri Kunde Schlesner, 2017-01-26, 2 files, -4/+4)
|
* VideoCore/Shader: Use only entry_point as ShaderSetup param (Yuri Kunde Schlesner, 2017-01-26, 2 files, -9/+11)
|     This removes all implicit dependency of ShaderState on global PICA state.
* VideoCore/Shader: Use self instead of g_state.vs in ShaderSetup (Yuri Kunde Schlesner, 2017-01-26, 2 files, -11/+8)
|
* VideoCore/Shader: Extract input vertex loading code into function (Yuri Kunde Schlesner, 2017-01-26, 2 files, -20/+22)
|
* video_core: fix shader.cpp signed / unsigned warning (Kloen, 2017-01-23, 1 file, -2/+2)
|
* Fix some warnings (#2399) (Jonathan Hao, 2017-01-04, 1 file, -2/+0)
|
* VideoCore/Shader: Extract DebugData out from UnitState (Yuri Kunde Schlesner, 2016-12-16, 7 files, -101/+97)
|
* Remove unnecessary cast (Yuri Kunde Schlesner, 2016-12-16, 1 file, -3/+1)
|
* VideoCore/Shader: Extract evaluate_condition lambda to function scope (Yuri Kunde Schlesner, 2016-12-16, 1 file, -26/+24)
|
* VideoCore/Shader: Extract call lambda up a scope and remove unused param (Yuri Kunde Schlesner, 2016-12-16, 1 file, -21/+17)
|
* VideoCore/Shader: Remove dynamic control flow in (Get)UniformOffset (Yuri Kunde Schlesner, 2016-12-16, 2 files, -18/+11)
|
* VideoCore/Shader: Move DebugData to a separate file (Yuri Kunde Schlesner, 2016-12-16, 3 files, -172/+188)
|
* shader_jit_x64: Use LOOPCOUNT_REG as a 64-bit reg when indexing (Yuri Kunde Schlesner, 2016-12-15, 1 file, -1/+1)
|
* VideoCore: Eliminate an unnecessary copy in the drawcall loop (Yuri Kunde Schlesner, 2016-12-15, 2 files, -2/+2)
|
* shader_jit_x64: Use Reg32 for LOOP* registers, eliminating casts (Yuri Kunde Schlesner, 2016-12-15, 1 file, -16/+16)
|
* VideoCore: Convert x64 shader JIT to use Xbyak for assembly (Yuri Kunde Schlesner, 2016-12-15, 2 files, -223/+225)
|
* shader_jit: Fix non-SSE4.1 path where FLR would not truncate (Jannik Vogel, 2016-12-04, 1 file, -1/+1)
|
* shader_jit: Load LOOPCOUNT_REG and LOOPINC 4 bit left-shifted (Jannik Vogel, 2016-12-02, 1 file, -6/+9)
|
* VideoCore: Shader interpreter cleanups (Yuri Kunde Schlesner, 2016-09-30, 1 file, -32/+42)
|
* VideoCore: Fix out-of-bounds read in ShaderSetup::ProduceDebugInfo (Yuri Kunde Schlesner, 2016-09-30, 1 file, -3/+1)
|     As far as I can tell, memset was replaced by a fill without correcting the parameter type, causing an out-of-bounds array read in the Vec4 constructor.
* Remove special rules for Windows.h and library includes (Yuri Kunde Schlesner, 2016-09-21, 1 file, -1/+1)
|
* Use negative priorities to avoid special-casing the self-include (Yuri Kunde Schlesner, 2016-09-21, 3 files, -3/+3)
|
* Remove empty newlines in #include blocks. (Emmanuel Gil Peyrot, 2016-09-21, 5 files, -22/+3)
|     This makes clang-format useful on those.
|
|     Also add a bunch of forgotten transitive includes, which otherwise prevented compilation.
* Manually tweak source formatting and then re-run clang-format (Yuri Kunde Schlesner, 2016-09-19, 4 files, -9/+6)
|
* Sources: Run clang-format on everything. (Emmanuel Gil Peyrot, 2016-09-18, 6 files, -311/+335)
|
* VideoCore: Fix dangling lambda context in shader interpreter (Yuri Kunde Schlesner, 2016-09-16, 1 file, -1/+1)
|     The static meant that after the first execution, these lambda contexts would be pointing to a random location on the stack. Fixes a random crash when using the interpreter.
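The hazard described above arises when a lambda that captures locals by reference is declared `static`: it is initialized once, so later calls reuse references into a stack frame that no longer exists. A minimal sketch of the safe form is shown here; `SumValues` is a hypothetical function and not code from the interpreter.

```cpp
#include <vector>

// Sums the values with a helper lambda that captures a local by reference.
int SumValues(const std::vector<int>& values) {
    int total = 0;
    // Deliberately not `static`: a static lambda would be initialized only on
    // the first call and would keep a reference to that call's `total`, which
    // no longer exists once the first call has returned.
    const auto add = [&total](int value) { total += value; };
    for (int value : values) {
        add(value);
    }
    return total;
}
```

Because the lambda is rebuilt on every call, each invocation captures its own `total`, so repeated calls give independent results.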
* Retrieve shader result from new OutputRegisters-type (Jannik Vogel, 2016-05-16, 3 files, -56/+68)
|
* Use new shader-jit signature for interpreter (Jannik Vogel, 2016-05-13, 3 files, -8/+8)
|
* Refactor access to state in shader-jit (Jannik Vogel, 2016-05-13, 4 files, -24/+42)
|
* Move program_counter and call_stack from UnitState to interpreter (Jannik Vogel, 2016-05-12, 3 files, -45/+42)
|
* Move default_attributes into Pica state (Jannik Vogel, 2016-05-12, 1 file, -2/+0)
|
* Merge pull request #1690 from JayFoxRox/tex-type-3 (bunnei, 2016-05-12, 1 file, -1/+2)
|\
| |     Pica: Implement texture type 3 (Projection2D)
| * Pica: Add tc0.w to OutputVertex (Jannik Vogel, 2016-05-11, 1 file, -1/+2)
| |
* | Turn ShaderSetup into struct (Jannik Vogel, 2016-05-11, 2 files, -52/+53)
|/
* Pica: Replace logic in shader.cpp with loop (Jannik Vogel, 2016-05-03, 1 file, -34/+4)
|
* VideoCore: Run include-what-you-use and fix most includes. (Emmanuel Gil Peyrot, 2016-04-30, 6 files, -14/+43)
|
* Merge pull request #1730 from hrydgard/vertex-loader (bunnei, 2016-04-29, 1 file, -1/+1)
|\
| |     * Remove late accesses to attribute_config
| |     * Refactor: Extract VertexLoader from command_processor.cpp. Preparation for a similar concept to Dolphin or PPSSPP. These can be JIT-ed and cached.
| |     * Move "&" to their proper place, add missing includes and make some properly relative.
| |     * Don't keep base_address in the loader, it doesn't belong there (with it, the loader can't be cached).
| |     * Optimize the vertex loader, nearly doubling its speed.
| |     * Debugger fix
| |     * Move and rename the MemoryAccesses class to MemoryAccessTracker.
| * Refactor: Extract VertexLoader from command_processor.cpp. (Henrik Rydgard, 2016-04-28, 1 file, -1/+1)
| |     Preparation for a similar concept to Dolphin or PPSSPP. These can be JIT-ed and cached.
* | Common: Remove section measurement from profiler (#1731) (Yuri Kunde Schlesner, 2016-04-29, 1 file, -3/+0)
| |     This has been entirely superseded by MicroProfile. The rest of the code can go when a simpler frametime/FPS meter is added to the GUI.
* | shader: Shader size is long uint, not uint. (Sam Spilsbury, 2016-04-24, 1 file, -1/+1)
| |
* | shader: Handle non-CALL opcodes with a break (Sam Spilsbury, 2016-04-24, 1 file, -0/+2)
| |
* | shader: Format string must be provided inline and not as a variable (Sam Spilsbury, 2016-04-24, 1 file, -1/+1)
|/
* shader_jit_x64: Rename RuntimeAssert to Compile_Assert. (bunnei, 2016-04-14, 2 files, -5/+5)
|
* shader_jit_x64.cpp: Rename JitCompiler to JitShader. (bunnei, 2016-04-14, 3 files, -92/+92)
|
* shader_jit_x64: Free memory that's no longer needed after compilation. (bunnei, 2016-04-14, 1 file, -0/+6)
|
* shader_jit_x64: Use a sorted vector instead of a set for keeping track of return addresses. (bunnei, 2016-04-14, 2 files, -5/+8)
|
* shader_jit_x64: Use CALL/RET instead of JMP for subroutines. (bunnei, 2016-04-14, 1 file, -17/+7)
|
* shader_jit_x64: Separate initialization and code generation for readability. (bunnei, 2016-04-14, 1 file, -9/+8)
|
* shader_jit_x64: Get rid of unnecessary last_program_counter variable. (bunnei, 2016-04-14, 2 files, -6/+2)
|
* shader_jit_x64: Execute certain asserts at runtime. (bunnei, 2016-04-14, 2 files, -5/+19)
|     - This is because we compile the full shader code space, and therefore it's common to compile malformed instructions.
* shader: Remove unused 'state' argument from 'Setup' function. (bunnei, 2016-04-14, 2 files, -3/+2)
|
* shader_jit_x64: Specify shader main offset at runtime. (bunnei, 2016-04-14, 3 files, -10/+6)
|
* shader_jit_x64: Allocate each program independently and persist for emu session. (bunnei, 2016-04-14, 3 files, -38/+28)
|
* shader_jit_x64: Rewrite flow control to support arbitrary CALL and JMP instructions. (bunnei, 2016-04-14, 2 files, -35/+119)
|
* shader_jit_x64: Fix strict memory aliasing issues. (bunnei, 2016-04-14, 1 file, -1/+3)
|
* Merge pull request #1643 from MerryMage/make_unique (Mathew Maidment, 2016-04-06, 1 file, -1/+0)
|\
| |     Common: Remove Common::make_unique, use std::make_unique
| * Common: Remove Common::make_unique, use std::make_unique (MerryMage, 2016-04-05, 1 file, -1/+0)
| |
* | Merge pull request #1508 from JayFoxRox/vs-output-map (bunnei, 2016-03-22, 1 file, -4/+14)
|\ \
| |/
|/|
| |     Respect vs output map
| * Respect vs output map (Jannik Vogel, 2016-03-14, 1 file, -4/+14)
| |
* | Merge pull request #1538 from lioncash/dot (bunnei, 2016-03-20, 1 file, -5/+3)
|\ \
| | |     shader_interpreter: use std::inner_product for the dot product
| * | shader_interpreter: use std::inner_product for the dot product (Lioncash, 2016-03-17, 1 file, -5/+3)
| | |     Same thing, less code.
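As a rough illustration of the change named above, a four-component dot product can be written with `std::inner_product` from `<numeric>` instead of a hand-rolled loop. The function name `Dot4` and the use of `std::array` are assumptions made for this sketch, not the interpreter's actual types.

```cpp
#include <array>
#include <numeric>

// Four-component dot product via std::inner_product.
// The float initial value 0.0f keeps the accumulation in float.
float Dot4(const std::array<float, 4>& a, const std::array<float, 4>& b) {
    return std::inner_product(a.begin(), a.end(), b.begin(), 0.0f);
}
```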
* | | video_core: Don't cast away const (Lioncash, 2016-03-17, 1 file, -1/+1)
|/ /
* | Merge pull request #1503 from bunnei/clear-jit-cache (bunnei, 2016-03-16, 3 files, -7/+27)
|\ \
| | |     Clear JIT cache
| * | shader_jit_x64: Clear cache after code space fills up. (bunnei, 2016-03-12, 3 files, -2/+19)
| | |
| * | shader_jit_x64: Make assert outputs more useful & cleanup formatting. (bunnei, 2016-03-12, 1 file, -4/+7)
| | |
| * | shader: Update log message to use proper log class. (bunnei, 2016-03-12, 1 file, -1/+1)
| |/
* / PICA: Fix MAD/MADI encoding (Jannik Vogel, 2016-03-15, 2 files, -29/+33)
|/
* Common: Get rid of alignment macros (Lioncash, 2016-03-09, 1 file, -4/+4)
|     The gl rasterizer already uses alignas, so we may as well move everything over.
* Add immediate mode vertex submission (Dwayne Slater, 2016-03-03, 4 files, -2/+22)
|
* pica: Implement decoding of basic fragment lighting components. (bunnei, 2016-02-05, 2 files, -5/+9)
|     - Diffuse
|     - Distance attenuation
|     - float16/float20 types
|     - Vertex Shader 'view' output
* Merge pull request #1367 from yuriks/jit-jmp (bunnei, 2016-01-27, 2 files, -6/+6)
|\
| |     Shader JIT: Fix off-by-one error when compiling JMPs
| * Shader JIT: Fix off-by-one error when compiling JMPs (Yuri Kunde Schlesner, 2016-01-24, 2 files, -6/+6)
| |     There was a mistake in the JMP code which meant that one instruction at the destination would be skipped when the jump was taken.
| |
| |     This commit also changes the meaning of the culprit parameter to make it less confusing and avoid similar mistakes in the future.
* | Shader: Implement "invert condition" feature of IFU instruction (Yuri Kunde Schlesner, 2016-01-25, 2 files, -2/+5)
|/
|     If bit 0 of the JMPU instruction is set, then the jump condition will be inverted. That is, a jump will happen when the boolean is false instead of when it is true.
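The inversion rule described above reduces to a single comparison. The sketch below assumes the uniform boolean and the invert bit have already been decoded; `ShouldJump` and its parameter names are hypothetical.

```cpp
// Decides whether the jump is taken.
// `invert` stands for the instruction's "invert condition" bit (hypothetical
// parameter name): when set, the jump happens on a false boolean instead.
bool ShouldJump(bool uniform_value, bool invert) {
    return uniform_value != invert;
}
```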
* video_core: Reorganize headers (Lioncash, 2015-09-11, 3 files, -6/+4)
|
* video_core: Remove unnecessary includes from headers (Lioncash, 2015-09-11, 1 file, -2/+0)
|
* video_core: Remove unused variables (Lioncash, 2015-09-10, 2 files, -2/+0)
|
* Shader JIT: Use SCALE constant from emitter (aroulin, 2015-09-07, 1 file, -4/+4)
|
* Shader: Fix size_t to int casts of register offsets (aroulin, 2015-09-07, 2 files, -15/+21)
|
* Merge pull request #1088 from aroulin/x64-emitter-abi-call (bunnei, 2015-09-02, 2 files, -28/+18)
|\
| |     x64: Proper stack alignment in shader JIT function calls
| * x64: Proper stack alignment in shader JIT function calls (aroulin, 2015-09-01, 2 files, -28/+18)
| |     Import Dolphin stack handling and register saving routines
| |     Also removes the x86 parts from abi files
* | video_core: Fix format specifiers warnings (aroulin, 2015-09-02, 1 file, -1/+2)
|/
* Shader JIT: Fix SGE/SGEI NaN behavior (aroulin, 2015-08-31, 1 file, -3/+3)
|     SGE was incorrectly emulated w.r.t. NaN behavior as the CMPSS SSE instruction was used with NLT
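The difference between "greater or equal" and "not less than" only shows up with NaN operands, because every ordered comparison involving NaN is false. The scalar sketch below illustrates that difference; the function names are hypothetical, and it assumes the intended SGE result for a NaN operand is 0.0.

```cpp
#include <limits>

// Convenience constant for trying the two functions with a NaN operand.
constexpr float kNaN = std::numeric_limits<float>::quiet_NaN();

// Ordered "greater or equal": any NaN operand makes the comparison false,
// so the result is 0.0f.
float SetGreaterEqual(float a, float b) {
    return (a >= b) ? 1.0f : 0.0f;
}

// "Not less than": `a < b` is false for a NaN operand, so its negation is
// true and the result is 1.0f.
float SetNotLessThan(float a, float b) {
    return !(a < b) ? 1.0f : 0.0f;
}
```

For ordinary operands the two functions agree; they differ only when at least one operand is NaN.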
* Merge pull request #1065 from yuriks/shader-fp (Yuri Kunde Schlesner, 2015-08-28, 3 files, -56/+87)
|\
| |     Shader FP compliance fixes
| * Shader JIT: Tiny micro-optimization in DPH (Yuri Kunde Schlesner, 2015-08-24, 1 file, -4/+4)
| |
| * Shaders: Fix multiplications between 0.0 and inf (Yuri Kunde Schlesner, 2015-08-24, 2 files, -39/+45)
| |     The PICA200 semantics for multiplication are such that when multiplying inf by exactly 0.0, the result is 0.0, instead of NaN, as defined by IEEE. This is relied upon by games.
| |
| |     Fixes #1024 (missing OoT interface items)
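The multiplication rule stated above can be sketched in scalar form as follows. `PicaMul` is a hypothetical name, and the choice to let a NaN that is already present in an input propagate is an assumption of this sketch rather than something the commit message specifies.

```cpp
#include <cmath>

// Multiplication where 0.0 * inf yields 0.0 instead of the IEEE 754 NaN.
float PicaMul(float a, float b) {
    const float product = a * b;
    // With two non-NaN inputs, the only way to get a NaN product is 0 * inf,
    // so that case alone is mapped to 0.0f. NaN inputs are left to propagate
    // (an assumption of this sketch).
    if (std::isnan(product) && !std::isnan(a) && !std::isnan(b)) {
        return 0.0f;
    }
    return product;
}
```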
| * Shaders: Explicitly conform to PICA semantics in MAX/MIN (Yuri Kunde Schlesner, 2015-08-24, 2 files, -2/+10)
| |
| * Shader JIT: Add name to second scratch register (XMM4) (Yuri Kunde Schlesner, 2015-08-24, 1 file, -3/+5)
| |
| * Shader JIT: Fix CMP NaN behavior to match hardware (Yuri Kunde Schlesner, 2015-08-24, 1 file, -8/+23)
| |
* | Shader JIT: Fix float to integer rounding in MOVA (aroulin, 2015-08-27, 1 file, -2/+2)
| |     MOVA converts new address register values from floats to integers using truncation
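Truncation rounds toward zero, which differs from round-to-nearest for values with a fractional part of one half or more. A minimal scalar sketch of the two conversions, with hypothetical function names:

```cpp
#include <cmath>

// Float to integer by truncation (rounding toward zero), which is what a
// C++ static_cast performs for in-range values.
int TruncateToInt(float value) {
    return static_cast<int>(value);
}

// Float to integer by rounding to the nearest integer, shown for contrast.
int RoundToInt(float value) {
    return static_cast<int>(std::lround(value));
}
```

For example, 1.7 truncates to 1 but rounds to 2, and -1.7 truncates to -1 but rounds to -2.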
* | Shader JIT: ifdef out reference to ifdef'd out shader_map (archshift, 2015-08-27, 1 file, -0/+2)
| |     shader_map was only defined on x86 architectures, but was cleared on shutdown with no ifdef protection. Ifdef this out so non-x86 architectures can be built.
* | Integrate the MicroProfile profiling library (Yuri Kunde Schlesner, 2015-08-25, 1 file, -0/+3)
| |     This brings goodies such as a configurable user interface and multi-threaded timeline view.
* | shader_jit: Replace two MDisp usages with MatR (Lioncash, 2015-08-24, 1 file, -2/+2)
|/
* Merge pull request #1062 from aroulin/shader-rcp-rsq (bunnei, 2015-08-23, 2 files, -10/+10)
|\
| |     Shader: RCP and RSQ computes only the 1st component
| * Shader: Use std::sqrt for float instead of sqrt (aroulin, 2015-08-23, 1 file, -1/+1)
| |
| * Shader: RCP and RSQ computes only the 1st component (aroulin, 2015-08-23, 2 files, -10/+10)
| |
* | Shader: implement DPH/DPHI in JIT (aroulin, 2015-08-22, 2 files, -2/+36)
| |
* | Shader: implement DPH/DPHI in interpreter (aroulin, 2015-08-22, 1 file, -1/+8)
|/
|     Tests revealed that the component with w=1 is SRC1 and not SRC2; it is now fixed on 3dbrew.
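Following the note above, a homogeneous dot product treats the w component of SRC1 as 1, so the result is the three-component dot product plus SRC2's w. A minimal scalar sketch, with `Vec4` and `Dph` as hypothetical names:

```cpp
#include <array>

using Vec4 = std::array<float, 4>;

// Homogeneous dot product: src1's w component is ignored and treated as 1.0,
// so the last term is just src2's w.
float Dph(const Vec4& src1, const Vec4& src2) {
    return src1[0] * src2[0] + src1[1] * src2[1] + src1[2] * src2[2] + src2[3];
}
```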
* Shader: implement SGE, SGEI and SLT in JIT (aroulin, 2015-08-19, 2 files, -15/+36)
|
* Shader: implement SGE, SGEI in interpreter (aroulin, 2015-08-19, 1 file, -0/+14)
|
* Shader: Save caller-saved registers in JIT before a CALL (aroulin, 2015-08-19, 2 files, -0/+33)
|
* Shader: implement EX2 and LG2 in JIT (aroulin, 2015-08-17, 2 files, -2/+22)
|
* Shader: implement EX2 and LG2 in interpreter (aroulin, 2015-08-16, 1 file, -0/+36)
|
* Build fix for Debug configurations. (Tony Wasserka, 2015-08-16, 1 file, -1/+1)
|
* Introduce a shader tracer to allow inspection of input/output values for each processed instruction. (Tony Wasserka, 2015-08-16, 5 files, -37/+322)
|
* citra-qt: Improve shader debugger. (Tony Wasserka, 2015-08-16, 1 file, -6/+0)
|     Now supports dumping the current shader and recognizes a larger number of output semantics.
* Shader: Use a POD struct for registers. (bunnei, 2015-08-16, 5 files, -40/+43)
|
* Rename ARCHITECTURE_X64 definition to ARCHITECTURE_x86_64. (bunnei, 2015-08-16, 1 file, -6/+5)
|
* Common: Cleanup CPU capability detection code. (bunnei, 2015-08-16, 1 file, -5/+5)
|
* Common: Move cpu_detect to x64 directory. (bunnei, 2015-08-16, 1 file, -2/+1)
|
* x64: Refactor to remove fake interfaces and general cleanups. (bunnei, 2015-08-16, 5 files, -144/+22)
|
* JIT: Support negative address offsets. (bunnei, 2015-08-16, 1 file, -26/+25)
|
* Shader: Initial implementation of x86_x64 JIT compiler for Pica vertex shaders. (bunnei, 2015-08-16, 6 files, -2/+924)
|     - Config: Add an option for selecting to use shader JIT or interpreter.
|     - Qt: Add a menu option for enabling/disabling the shader JIT.
* Common: Added MurmurHash3 hash function for general-purpose use. (bunnei, 2015-08-15, 1 file, -1/+1)
|
* Shader: Define a common interface for running vertex shader programs. (bunnei, 2015-08-15, 4 files, -184/+278)
|
* Shader: Move shader code to its own subdirectory, "shader". (bunnei, 2015-08-15, 2 files, -0/+701)