path: root/src/video_core
Commit message (Collapse)AuthorFilesLines
2023-03-09OpenGL: Prefer glClientWaitSync for OGLSync objectsameerj5-10/+16
At least on Nvidia, glClientWaitSync with a timeout of 0 (non-blocking) is faster than glGetSynciv of GL_SYNC_STATUS.
2023-03-08core: Promote CPU/GPU threads to time criticalMorph1-1/+1
And also demote Audren and CoreTiming to High thread priority.
2023-03-08general: fix type inconsistenciesLiam1-2/+2
2023-03-07gl_rasterizer: Implement AccelerateDMA DmaBufferImageCopyameerj2-9/+52
2023-03-07Refactor AccelerateDMA codeameerj8-251/+156
2023-03-05Engines: Implement Accelerate DMA Texture.Fernando Sahmkow15-97/+658
2023-03-05core_timing: Use higher precision sleeps on WindowsMorph1-1/+1
The precision of sleep_for and wait_for is limited to 1-1.5ms on Windows. Using SleepForOneTick() allows us to sleep for exactly one interval of the current timer resolution. This allows us to take advantage of systems that have a timer resolution of 0.5ms to reduce CPU overhead in the event loop.
2023-03-04Check all swizzle components for red, not just [0], pass float border color rather than intKelebek13-10/+16
2023-03-02vulkan_common: disable vertexInputDynamicState on unsupported driverLiam1-0/+1
2023-03-01nvnflinger: fix nameLiam1-2/+2
2023-02-27Partially apply LTO to only core and video_core projects.Matías Locatti1-0/+4
2023-02-25buffer_cache: Add logic for non-NVN storage buffer trackingameerj1-4/+20
2023-02-23configuration: Add async ASTC decode settingameerj4-8/+21
2023-02-22texture_cache: Add async texture decodingameerj3-0/+88
2023-02-21svc: Fix type consistency (exposed on macOS)Merry1-1/+1
2023-02-14Reimplement the invalidate_texture_data_cache registerKelebek11-0/+4
2023-02-14Allow >1 cpu threads on video decoding, disable multi-frame decodingKelebek11-0/+2
2023-02-14remove static from pointer sized or smaller types for aesthetics, change constexpr static to static constexpr for consistencyarades7926-66/+60
Signed-off-by: arades79 <scravers@protonmail.com>
2023-02-14add static lifetime to constexpr values to force compile time evaluation where possiblearades7924-58/+64
Signed-off-by: arades79 <scravers@protonmail.com>
2023-02-11texture_cache: OpenGL: Implement MSAA uploads and copiesameerj12-14/+136
2023-02-11video_core: Speed up video frame data copyFengChen1-9/+5
2023-02-09buffer_base: Partially revert changes from #9559ameerj1-6/+8
This fixes a regression where Yoshi's Crafted World (and potentially other titles) would enter an infinite loop when GPU Accuracy was set to "Normal"
2023-02-08Remove OnCommandListEndCommandBehunin3-14/+2
Call rasterizer->ReleaseFences() directly
2023-02-05Remove fake vertex bindings when dynamic state is enabledKelebek11-25/+1
2023-01-30gl_compute_pipeline: Force context flush when loading shader cacheameerj4-7/+37
2023-01-30gl_graphics_pipeline: Force context flush when loading shader cacheameerj4-9/+12
2023-01-30Move to Clang Format 15Levi Behunin2-4/+2
Depends on https://github.com/yuzu-emu/build-environments/pull/69 clang-15 primary run
2023-01-28texture_cache: Adjust image view sizes by MSAA samplesameerj2-0/+48
2023-01-28video_core: Implement vulkan clear specified channelFengChen6-20/+152
2023-01-26video_core/opengl: Add FSR upscaling filter to the OpenGL rendererWollnashorn12-162/+546
2023-01-25Revert "MemoryManager: use fastmem directly."Merry2-33/+10
This reverts commit af5ecb0b15d4449f58434e70eed835cf71fc5527.
2023-01-21nsight_aftermath_tracker: update for latest Aftermath SDKLiam1-4/+4
2023-01-18Demote maxwell3d Firmware4 call log to debugKelebek11-1/+1
2023-01-16Address feedbackFeng Chen5-14/+62
2023-01-10vulkan_common: fix indirect draw with countLiam3-8/+15
2023-01-10MoltenVK: restrict number of vertex attributes/bindings to 16TellowKrinkle1-10/+25
2023-01-09vulkan_device: refactor feature testingLiam3-1173/+664
2023-01-08VideoCore: Fix OGL cache invalidation.Fernando Sahmkow2-0/+6
2023-01-07Revert "Vulkan, OpenGL: Hook up storage buffer alignment code"Liam6-22/+3
This reverts commit 9e2997c4b6456031622602002924617690e32a13.
2023-01-07renderer_vulkan: pause turbo submissions on inactive queueLiam5-0/+40
2023-01-07vulkan_device: avoid attempt to access empty optionalLiam1-2/+6
2023-01-07renderer_vulkan: disable clock boost on unvalidated devicesLiam3-1/+15
2023-01-06opengl: Sanitize antialiasing configNarr the Reg1-1/+7
2023-01-06video_core/vulkan: Fixed loading of Vulkan driver pipeline cacheWollnashorn1-1/+2
The header size of the Vulkan driver pipeline cache files was incorrect in PipelineCache::LoadVulkanPipelineCache, so the pipeline cache wasn't read correctly and was invalidated on each load.
2023-01-06MacroHLE: eliminate 2 rushed macros.Fernando Sahmkow1-42/+0
2023-01-05Run clang-formatBilly Laws1-1/+2
2023-01-05Vulkan, OpenGL: Hook up geometry shader passthrough emulationBilly Laws2-0/+2
2023-01-05Vulkan, OpenGL: Hook up storage buffer alignment codeBilly Laws6-3/+21
2023-01-05Vulkan: Add a workaround for input_position on Adreno driversBilly Laws1-0/+1
Adreno drivers will crash compiling geometry shaders if the input position is not wrapped in a gl_in struct.
2023-01-05video_core/vulkan: Vulkan driver pipelines now contain cache versionWollnashorn2-16/+28
So that old cache can get deleted when the cache version changes and does not grow infinitely
2023-01-05video_core/vulkan: Added check if Vulkan pipeline path has been setWollnashorn1-1/+1
2023-01-05video_core/vulkan: Added `VkPipelineCache` to store Vulkan pipelinesWollnashorn8-67/+226
As an optional feature which can be enabled in the advanced graphics configuration, all pipelines that get built at the initial shader loading are stored in a VkPipelineCache object and are dumped to the disk. These vendor specific pipeline cache files are located at `/shader/GAME_ID/vulkan_pipelines.bin`. This feature was mainly added because of an issue with the AMD driver (see yuzu-emu#8507) causing invalidation of the cache files the driver builds automatically.
2023-01-05BufferBase: Don't ignore GPU pages.Fernando Sahmkow7-22/+21
2023-01-05Fermi2D: sync cache flushesFernando Sahmkow2-2/+5
2023-01-05MemoryManager: use fastmem directly.Fernando Sahmkow2-10/+33
2023-01-05video_core: Cache GPU internal writes.Fernando Sahmkow10-30/+185
2023-01-05Vulkan: Fix drivers that don't support dynamic_state_2 upFernando Sahmkow2-8/+11
2023-01-05video_core: Implement opengl/vulkan draw_textureFeng Chen19-138/+291
2023-01-05video_core: Implement maxwell3d draw texture methodFeng Chen7-1/+177
2023-01-05common: add setting for renderer clock workaroundLiam1-1/+3
2023-01-05vulkan: implement 'turbo mode' clock boosterLiam8-2/+272
2023-01-05renderer_vulkan: implement fallback path for null descriptorsLiam3-0/+19
2023-01-04yuzu-ui: Add setting for disabling macro HLEFernando Sahmkow1-4/+5
2023-01-04Video_core: Address feedbackFernando Sahmkow10-167/+304
2023-01-04Texture Cache: Implement async texture downloads.Fernando Sahmkow5-35/+91
2023-01-04Vulkan: Update blacklisting to latest driver versions.Fernando Sahmkow1-5/+12
2023-01-03ShaderCompiler: Inline driver specific constants.Fernando Sahmkow3-2/+5
2023-01-03Vulkan: rework stencil tracking.Fernando Sahmkow4-36/+169
2023-01-01vulkan_common: blacklist radv from extended_dynamic_state2 on drivers before 22.3.1Liam2-2/+14
2023-01-01video_core: fix buildLiam4-3/+38
2023-01-01MacroHLE: Final cleanup and fixes.Fernando Sahmkow12-122/+88
2023-01-01Rasterizer: Setup skeleton for Host Conditional renderingFernando Sahmkow6-10/+53
2023-01-01RasterizerMemory: Add filtering for flushing/invalidation operations.Fernando Sahmkow14-93/+186
2023-01-01Vulkan: Allow stagging buffer deferrals.Fernando Sahmkow2-21/+56
2023-01-01MacroHLE: Add OpenGL SupportFernando Sahmkow4-38/+94
2023-01-01Vulkan: Add other additional pipeline specsFernando Sahmkow1-1/+17
2023-01-01Vulkan: Implement Dynamic State 3Fernando Sahmkow13-105/+313
2023-01-01Vulkan Implement Dynamic State 2 LogicOp and PatchVerticesFernando Sahmkow12-27/+75
2023-01-01Vulkan: Implement Dynamic States 2Fernando Sahmkow13-66/+315
2023-01-01DMAPusher: Improve collection of non executing methodsFernando Sahmkow13-2/+181
2023-01-01Revert Buffer cache changes and setup additional macros.Fernando Sahmkow7-128/+179
2023-01-01MacroHLE: Reduce massive calculations on sizing estimation.Fernando Sahmkow5-95/+27
2023-01-01MacroHLE: Add HLE replacement for base vertex and base instance.Fernando Sahmkow10-64/+174
2023-01-01MacroHLE: Add Index Buffer size estimation.Fernando Sahmkow5-10/+74
2023-01-01MacroHLE: Refactor MacroHLE system.Fernando Sahmkow11-121/+420
2023-01-01MacroHLE: Implement DrawIndexedIndirect & DrawArraysIndirect.Fernando Sahmkow16-72/+252
2023-01-01MacroHLE: Add MultidrawIndirect HLE Macro.Fernando Sahmkow13-47/+169
2023-01-01vulkan_common: unify VK_EXT_debug_utils and selection of validation layerLiam3-11/+10
2022-12-26video_core: Implement other missing vulkan topologyFengChen1-3/+16
2022-12-26video_core: Implement vulkan QuadStrip topologyFengChen8-122/+229
2022-12-25texture_cache: Use Common::ScratchBuffer for swizzle buffersameerj4-10/+12
2022-12-25texture_cache: Use pre-allocated buffer for texture downloadsameerj3-9/+14
2022-12-25texture_cache: Use pre-allocated buffer for texture uploadsameerj4-13/+28
2022-12-20scratch_buffer: Explicitly defing resize and resize_destructive functionsameerj5-16/+16
resize keeps previous data intact when the buffer grows.
resize_destructive destroys the previous data when the buffer grows.
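The difference between the two functions can be sketched as follows. This is a minimal illustration only: `ScratchBufferSketch` is a hypothetical, simplified class written for this note, not the actual Common::ScratchBuffer, and its growth policy is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>

// Hypothetical, simplified stand-in for a scratch buffer; not yuzu's class.
template <typename T>
class ScratchBufferSketch {
public:
    explicit ScratchBufferSketch(std::size_t initial_size)
        : size_{initial_size}, capacity_{initial_size},
          data_{std::make_unique<T[]>(initial_size)} {}

    // Grows if needed and copies the old contents into the new allocation.
    void resize(std::size_t new_size) {
        if (new_size > capacity_) {
            auto grown = std::make_unique<T[]>(new_size);
            std::copy(data_.get(), data_.get() + size_, grown.get());
            data_ = std::move(grown);
            capacity_ = new_size;
        }
        size_ = new_size;
    }

    // Grows if needed without copying; old contents are dropped on growth.
    void resize_destructive(std::size_t new_size) {
        if (new_size > capacity_) {
            data_ = std::make_unique<T[]>(new_size);
            capacity_ = new_size;
        }
        size_ = new_size;
    }

    T& operator[](std::size_t index) {
        assert(index < size_);
        return data_[index];
    }

    std::size_t size() const { return size_; }

private:
    std::size_t size_;
    std::size_t capacity_;
    std::unique_ptr<T[]> data_;
};
```

The destructive variant skips the copy, which is the cheaper choice when the caller is about to overwrite the whole buffer anyway.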
2022-12-20dma_pusher: Rework command_headers usageameerj2-9/+16
Uses ScratchBuffer and avoids overwriting the command_headers buffer with the prefetch_command_list
2022-12-20buffer_cache: Use Common::ScratchBuffer for ImmediateBuffer usageameerj1-7/+4
2022-12-20video_core: Add usages of ScratchBufferameerj4-33/+21
2022-12-19externals: update Vulkan-Headers to v1.3.238Jan Beich1-0/+12
2022-12-16Remove unimplemented transform feedback geometry spam, it should be implementedKelebek11-2/+1
2022-12-14Vulkan: Add support for VK_EXT_depth_clip_control.FernandoS275-4/+47
2022-12-14vulkan_common: declare storageBuffer8BitAccessLiam1-1/+2
2022-12-13gl_device: Use a more robust way to use strict context modeAlexander Orzechowski3-8/+7
Instead of checking an environment variable which may not actually exist or is just wrong, ask Qt if it's running on the Wayland platform.
2022-12-13video_core/vulkan: Explicity check swapchain size when deciding to recreateAlexander Orzechowski3-15/+28
Vulkan for whatever reason does not return VK_ERROR_OUT_OF_DATE_KHR when the swapchain is the wrong size. Explicitly make sure the size is indeed up to date to work around this.
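A rough sketch of such a check, assuming the renderer tracks both the swapchain extent and the current window extent. The names `ExtentSketch` and `NeedsRecreateSketch` are hypothetical and do not come from the commit.

```cpp
#include <cstdint>

// Hypothetical extent type; the real code would use the Vulkan extent struct.
struct ExtentSketch {
    std::uint32_t width;
    std::uint32_t height;
};

// Recreate when the driver reported the swapchain as out of date, or when
// the sizes differ even though no out-of-date result was returned.
constexpr bool NeedsRecreateSketch(bool reported_out_of_date, ExtentSketch swapchain,
                                   ExtentSketch window) {
    return reported_out_of_date || swapchain.width != window.width ||
           swapchain.height != window.height;
}
```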
2022-12-13renderer_opengl: refactor context acquireLiam5-36/+61
2022-12-13Fix validation errors on less compatible Intel GPUyzct123455-2/+34
2022-12-11video_core: fix off by one in anisotropic filtering amountLiam1-1/+2
2022-12-09Fix compilation errorSalvage1-1/+1
2022-12-08video_core: Integrate SMAALiam21-27/+13878
Co-authored-by: goldenx86 <goldenx86@users.noreply.github.com>
Co-authored-by: BreadFish64 <breadfish64@users.noreply.github.com>
2022-12-08video_core: Add vertex_array_instance_* sbubbed called warningFengChen1-0/+5
2022-12-08video_core: The draw manager manages whether Clear is required.FengChen3-10/+9
2022-12-08video_core: Adjust topology update logicFengChen2-23/+23
2022-12-08video_core: Implement maxwell3d draw manager and split draw logicFeng Chen12-267/+341
2022-12-06vulkan_common: further initialization tweaksLiam2-1/+9
2022-12-05Vulkan: Implement Alpha coverageFernando Sahmkow3-2/+6
2022-12-04cmake: prefer system librariesAlexandre Bouvier1-4/+3
2022-12-04vulkan_common: add feature test for shaderDrawParametersLiam1-1/+13
2022-12-04vulkan_common: clean up extension usageLiam12-102/+105
2022-12-04vulkan_common: correct usage of timeline semaphore fallbacksLiam1-2/+1
2022-12-04vulkan_common: ensure all mandatory features are tested in feature reportLiam1-1/+24
2022-12-04vulkan_common: unsuffix 16-bit storage feature test structureLiam1-2/+2
2022-12-04vulkan_common: unsuffix timeline semaphore feature test structureLiam1-2/+2
2022-12-04vulkan_common: add logicOp to feature reportLiam1-1/+2
2022-12-04vulkan_common: promote host query reset usage to coreLiam4-11/+12
2022-12-04vulkan_common: promote descriptor update template usage to coreLiam8-37/+36
2022-12-04vulkan_common: promote timeline semaphore usage to coreLiam3-9/+15
2022-12-04externals: update dynarmic, SDL2Liam2-13/+19
2022-12-01shader_recompiler: add gl_Layer translation GS for older hardwareLiam2-5/+65
2022-12-01video_core: Fine tuning the index drawing judgment logicFeng Chen2-27/+22
2022-12-01vulkan_common: quiet some validation errorsLiam2-1/+3
2022-12-01CMake: Consolidate common PCH headersameerj1-7/+1
2022-11-30Respect render mode overrideKelebek11-29/+39
2022-11-30CMake: Use precompiled headersameerj2-0/+17
2022-11-29host1x/syncpoint_manager: Eliminate unnecessary std::function constructionLioncash1-4/+2
We can just pass the function object through, and if it's a valid function, then it will automatically be converted.
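The idea can be illustrated with a small sketch. `ActionListSketch` and its member names are hypothetical and only stand in for the syncpoint manager's registration path; the point is that a `std::function` parameter already performs the conversion from any compatible callable.

```cpp
#include <functional>
#include <utility>
#include <vector>

// Hypothetical registry used only to illustrate the conversion.
class ActionListSketch {
public:
    // The parameter type converts any compatible callable on its own.
    void Register(std::function<void()> action) {
        actions_.push_back(std::move(action));
    }

    template <typename Func>
    void RegisterForwarded(Func&& func) {
        // Before: Register(std::function<void()>{std::forward<Func>(func)});
        // After: pass the callable through and let the parameter convert it.
        Register(std::forward<Func>(func));
    }

    void RunAll() const {
        for (const auto& action : actions_) {
            action();
        }
    }

private:
    std::vector<std::function<void()>> actions_;
};
```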
2022-11-29host1x/syncpoint_manager: Pass DeregisterAction() handle as const-refLioncash2-6/+6
The handle is only compared against and not modified in any way, so we can pass it by const reference. This also allows us to mark the respective parameters for DeregisterGuestAction() and DeregisterHostAction() as const references as well.
2022-11-29maxwell_3d: Mark shifted value as unsignedLioncash1-3/+3
Otherwise this is technically creating a signed int result that gets converted. Just a consistency change. While we're in the area, we can mark Samples() as const.
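As an illustration of the type difference, the sketch below uses a hypothetical `MsaaStateSketch` struct whose field name is an assumption. The type of a shift expression is the promoted type of its left operand, so `1 << n` is an `int` while `1U << n` is an `unsigned int`.

```cpp
#include <cstdint>
#include <type_traits>

// The result type of a shift follows the (promoted) left operand.
static_assert(std::is_same<decltype(1 << 2), int>::value,
              "signed literal yields int");
static_assert(std::is_same<decltype(1U << 2), unsigned int>::value,
              "unsigned literal yields unsigned int");

// Hypothetical register wrapper; the field name is illustrative.
struct MsaaStateSketch {
    std::uint32_t log2_samples;

    // const because it only reads state; unsigned shift avoids the
    // signed-to-unsigned conversion of the result.
    constexpr std::uint32_t Samples() const {
        return 1U << log2_samples;
    }
};
```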
2022-11-29engines: Remove unnecessary castsLioncash10-85/+57
In a few cases we have some casts that can be trivially removed.
2022-11-29video_core/surface: Eliminate casts in GetFormatType()Lioncash1-11/+4
We can just compare directly and get rid of verbose casting.
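A sketch of what comparing directly can look like, assuming the format enum lists color formats first and depth formats second with sentinel enumerators between the groups. The enumerators below are a hypothetical, shortened list, not the real PixelFormat table.

```cpp
#include <cstdint>

// Hypothetical, shortened format list with group sentinels.
enum class PixelFormatSketch : std::uint32_t {
    A8B8G8R8_UNORM,
    R8_UNORM,
    MaxColorFormat,
    D32_FLOAT = MaxColorFormat,
    D16_UNORM,
    MaxDepthFormat,
};

enum class SurfaceTypeSketch { ColorTexture, Depth, Invalid };

// Scoped enums of the same type support relational operators, so no
// static_cast to the underlying integer type is required.
constexpr SurfaceTypeSketch GetFormatTypeSketch(PixelFormatSketch format) {
    if (format < PixelFormatSketch::MaxColorFormat) {
        return SurfaceTypeSketch::ColorTexture;
    }
    if (format < PixelFormatSketch::MaxDepthFormat) {
        return SurfaceTypeSketch::Depth;
    }
    return SurfaceTypeSketch::Invalid;
}
```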
2022-11-29video_core: add null backendLiam6-0/+236
2022-11-27Vulkan: update initializationLiam5-65/+140
Co-authored-by: bylaws <bylaws@users.noreply.github.com>
2022-11-24Fermi2D: Cleanup and address feedback.Fernando Sahmkow3-8/+150
2022-11-24GPU: Fix buffer cache issue, engine upload not inlining memory in multiline and pessismistic invalidation.Fernando Sahmkow4-15/+9
2022-11-24GPU: Implement additional render target formats.Fernando Sahmkow7-12/+126
2022-11-24MaxwellDMA: Implement BlockLinear to BlockLinear copies.Fernando Sahmkow2-1/+69
2022-11-24Fermi2D: Implement Bilinear software filtering and address feedback.Fernando Sahmkow7-116/+180
2022-11-24Fermi2D: Rework blit engine and add a software blitter.Fernando Sahmkow12-18/+1431
2022-11-24FSR Sharpening Slider part 1 - only a global sliderMatías Locatti1-1/+5
2022-11-24maxwell_to_vk: Add R16_SINTMorph1-1/+1
This was somehow missed when the format was added to GL
2022-11-24maxwell_to_vk: Fix format usage bitsMorph1-2/+2
- VK_FORMAT_B8G8R8A8_UNORM supports the STORAGE_IMAGE_BIT
- VK_FORMAT_R4G4B4A4_UNORM_PACK16 does not support the COLOR_ATTACHMENT_BIT
2022-11-23general: fix compile for Apple ClangLiam28-12/+26
2022-11-22video_core: Optimize maxwell drawing trigger mechanismFengChen2-61/+63
2022-11-17maxwell3d: full HLE for multi-layer clearsLiam8-24/+17
2022-11-17maxwell3d: HLE multi-layer clear macroLiam2-1/+22
2022-11-16Update renderer_vulkan.cppMatías Locatti1-0/+4
2022-11-15video_core: Reimplement inline index buffer bindingFeng Chen5-33/+31
2022-11-14Add break for default casesKyle Kienapfel9-0/+16
Visual Studio has an option to search all files in a solution, so I did a search in there for "default:" looking for any missing break statements. I've left out default statements that return something, and that throw something, even if via ThrowInvalidType.
UNREACHABLE leads towards a throw.
The R_THROW macro leads towards a return.
2022-11-11Fix regs regression with OpenGL two-sided stencil, and re-add data invalidation regKelebek16-5/+32
2022-11-11ir/texture_pass: Use host_info instead of querying Settings::values (#9176)Morph8-8/+10
2022-11-10video_core: Fix dma copy 1D random crashFengChen1-17/+20
2022-11-09Initial ARM64 supportLiam2-3/+15
2022-11-07video_core: Fix few issues in Tess stageFengChen2-1/+3
2022-11-06video_core:Fix vmm kinds size errorFengChen1-1/+1
2022-11-05video_core: Fix scaling graphical regressions for multiple gamesFengChen1-4/+4
2022-11-04Update shader cache version. (#9175)gidoly1-1/+1
2022-11-04video_core: Fix SNORM texture buffer emulating error (#9001)Feng Chen8-36/+109
2022-10-31video_core: Fix drawing trigger mechanism regressionFengChen1-32/+25
2022-10-30Vulkan: Fix regression caused by limiting render area to width/height of rendef targets.Fernando Sahmkow1-6/+6
2022-10-30vk_blit_screen: recreate swapchain images on guest format changeLiam2-1/+10
2022-10-28vk_scheduler: Remove recorded_countsRobin Kertels1-3/+1
2022-10-27video_core: Fix drawing trigger mechanism regressionFengChen2-61/+70
2022-10-25video_core: Catch vulkan clear op not all channel need clearFengChen1-8/+13
2022-10-22general: Resolve -Wunused-but-set-variableMorph1-2/+2
2022-10-22general: Resolve -Wunused-lambda-capture and C5233Morph2-6/+6
2022-10-22decoders: Use 2's complement instead of unary -Morph1-1/+1
Resolves C4146 on MSVC
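For reference, negating an unsigned value with two's complement arithmetic looks like the sketch below. `NegateSketch` is a hypothetical helper name; the expression `~value + 1` gives the same result as `-value` modulo 2^32 without applying unary minus to an unsigned operand, which is what the MSVC warning is about.

```cpp
#include <cstdint>

// Two's complement negation: invert all bits, then add one.
// Same result as -value modulo 2^32, without unary minus on an unsigned type.
constexpr std::uint32_t NegateSketch(std::uint32_t value) {
    return ~value + 1U;
}

static_assert(NegateSketch(1U) == 0xFFFFFFFFU, "negating 1 wraps to all ones");
static_assert(NegateSketch(0U) == 0U, "negating 0 wraps back to 0");
```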
2022-10-22CMakeLists: Remove all redundant warningsMorph1-7/+1
These are already explicitly or implicitly set in src/CMakeLists.txt
2022-10-22video_core: Implement maxwell inline_index methodFengChen6-74/+130
2022-10-21video_coare: Reimplementing the maxwell drawing trigger mechanismFengChen10-224/+139
2022-10-21format_lookup_table: Implement R32_B24G8 with D32_FLOAT_S8_UINTMorph1-0/+2
This format is similar to Z32_FLOAT_X24S8_UINT, which is implemented with D32_FLOAT_S8_UINT. Used in Persona 5 Royal
2022-10-20video_core: don't build ASTC decoder shader unless requestedLiam4-14/+19
2022-10-19Maxwell3D/Puller: Fix regressions and syncing issues.Fernando Sahmkow2-13/+9
2022-10-19video_core: renderer_vulkan: vk_query_cache: Avoid shutdown crash in QueryPool::Reserve.bunnei1-3/+4
2022-10-17video_core: implement 1D copies based on VMM 'kind'FengChen2-56/+73
2022-10-17video_core: Implement memory manager page kindFengChen4-13/+334
2022-10-16video_core: Fix spelling of "synchronize"Morph2-5/+5
2022-10-13renderer_(opengl/vulkan): Fix tessellation clockwise parameterMorph3-6/+6
This should be assigned CW only on Triangles_CW rather than not Triangles_CCW, making CCW the default winding order rather than CW.
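The change in default can be sketched with two predicates. The enum below is a hypothetical, shortened list of tessellation output primitives; only the two triangle enumerators are named in the commit message.

```cpp
// Hypothetical, shortened list of tessellation output primitives.
enum class OutputPrimitiveSketch { Points, Lines, Triangles_CW, Triangles_CCW };

// Old behavior: everything except Triangles_CCW was treated as clockwise.
constexpr bool IsClockwiseOld(OutputPrimitiveSketch primitive) {
    return primitive != OutputPrimitiveSketch::Triangles_CCW;
}

// New behavior: only Triangles_CW is clockwise, so CCW becomes the default.
constexpr bool IsClockwiseNew(OutputPrimitiveSketch primitive) {
    return primitive == OutputPrimitiveSketch::Triangles_CW;
}
```

The two predicates agree for both triangle modes and differ only for the remaining primitives, which is where the default winding order matters.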
2022-10-11syncpoint_manager: ensure handle is removable before removingLiam1-1/+11
2022-10-10Fix stencil func registers, make clip control equivalent to how it was before, but surely wrong.Kelebek18-44/+51
2022-10-07video_core: don't block rendering on screenshotsLiam1-1/+7
2022-10-07Update 3D regsKelebek129-2043/+3974
2022-10-07Revert "vulkan: automatically use larger staging buffer sizes when possible"liamwhite2-60/+27
2022-10-06vulkan_blitter: Fix pool allocation double free.Byte3-25/+10
2022-10-06maxwell_dma: remove warnings from implemented functionalityLiam1-2/+0
2022-10-06General: address feedbackFernando Sahmkow11-33/+32
2022-10-06state_tracker: workaround channel setup for homebrewLiam5-4/+9
2022-10-06general: Format licenses as per SPDX guidelinesMorph18-51/+38
2022-10-06Address Feedback from bylaws.Fernando Sahmkow1-1/+1
2022-10-06General: Fix clang format.Fernando Sahmkow6-16/+12
2022-10-06Vulkan Swapchain: Overall improvements.Fernando Sahmkow2-4/+13
2022-10-06Vulkan Texture Cache: Limit render area to the max width/height of the targets.Fernando Sahmkow4-9/+29
2022-10-06ImageBase: Basic fixes.Fernando Sahmkow1-8/+5
2022-10-06General: Fix compilation for GCCLiam White1-1/+2
2022-10-06VideoCore: Implement formats needed for N64 emulation.Fernando Sahmkow6-10/+10
2022-10-06Buffer Cache: Deduce vertex array limit from memory layout when limit is the highest possible.Fernando Sahmkow3-4/+12
2022-10-06VideoCore: Add option to dump the macros.Fernando Sahmkow1-0/+1
2022-10-06NVDRV: Further improvements.Fernando Sahmkow3-32/+22
2022-10-06Buffer Cache: Basic fixes.Fernando Sahmkow1-15/+22
2022-10-06Decoders: Improve overall speed.Fernando Sahmkow1-4/+11
2022-10-06DMA & InlineToMemory Engines Rework.bunnei20-242/+315
2022-10-06Maxwell3D: Add small_index_2Fernando Sahmkow1-0/+2
2022-10-06Memory Manager: ensure safety of GPU to CPU address.Fernando Sahmkow1-0/+3
2022-10-06MemoryManager: Fix errors popping out.Fernando Sahmkow1-4/+8
2022-10-06Shader Decompiler: Check for shift when deriving composite samplers.Fernando Sahmkow4-8/+11
2022-10-06MemoryManager: Finish up the initial implementation.Fernando Sahmkow2-50/+138
2022-10-06OpenGL: Fix TickWorkFernando Sahmkow1-0/+4
2022-10-06VideoCore: Refactor fencing system.Fernando Sahmkow17-152/+146
2022-10-06MemoryManager: initial multi paging system implementation.Fernando Sahmkow2-189/+304
2022-10-06Vulkan: Fix Scissor on ClearsFernando Sahmkow1-1/+8
2022-10-06NVDRV: Further refactors and eliminate old code.Fernando Sahmkow6-94/+4
2022-10-06NVDRV: Refactor Host1xFernando Sahmkow24-108/+138
2022-10-06VideoCore: Refactor syncing.Fernando Sahmkow37-240/+595
2022-10-06Texture Cache: Fix GC and GPU Modified on Joins.Fernando Sahmkow1-3/+5
2022-10-06Texture cache: Fix the remaining issues with memory mnagement and unmapping.Fernando Sahmkow11-16/+60
2022-10-06Texture cache: Fix dangling references on multichannel.Fernando Sahmkow3-27/+36
2022-10-06Refactor VideoCore to use AS sepparate from Channel.Fernando Sahmkow9-152/+164
2022-10-06General: Rebase fixes.Fernando Sahmkow1-7/+6
2022-10-06VideoCore: Extra Fixes.Fernando Sahmkow2-2/+2
2022-10-06NVDRV: Remake ASGPUFernando Sahmkow2-4/+9
2022-10-06MemoryManager: Temporary Fix for NVDEC.Fernando Sahmkow1-1/+1
2022-10-06VideoCore: Update MemoryManagerFernando Sahmkow2-163/+82
2022-10-06VideoCore: Fix channels with disk pipeline/shader cache.Fernando Sahmkow11-71/+87
2022-10-06OpenGl: Implement Channels.Fernando Sahmkow9-118/+186
2022-10-06NVHOST_CTRl: Implement missing method and fix some stuffs.Fernando Sahmkow1-0/+5
2022-10-06VideoCore: implement channels on gpu caches.Fernando Sahmkow44-779/+1396
2022-10-06NvHost: Remake Ctrl Implementation.Fernando Sahmkow1-1/+1
2022-10-06Texture Cache: Add ASTC 10x5 Format.Fernando Sahmkow6-0/+23
2022-10-04vk_scheduler: wait for command processing to completeLiam1-2/+4
2022-10-04common: remove "yuzu:" prefix from thread namesLiam5-6/+6
2022-10-02MacroHLE: Add MultidrawIndirect HLE Macro.Fernando Sahmkow1-1/+62
2022-10-02macro_jit_x64: fix miscompilation of bit extraction operationsLiam1-37/+9
2022-10-01macro_jit_x64: cancel exit for taken branchLiam1-11/+5
2022-09-25vulkan: automatically use larger staging buffer sizes when possibleLiam2-27/+60
2022-09-20video_core: Fix legacy to generic location unpairedFengChen2-0/+2
2022-09-20video_core: Generate mipmap texture by drawingFengChen9-7/+96
2022-09-16astc: Enable parallel CPU astc decodingMorph1-21/+35
Given the issues with GPU accelerated ASTC decoding with NVIDIA's latest drivers, parallelize astc decoding on the CPU. Uses half the available threads in the system for astc decoding.
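One way to derive such a worker count is sketched below. The function names are hypothetical, and the clamp to at least one worker is an assumption added here because `std::thread::hardware_concurrency()` may return 0 when the value cannot be determined.

```cpp
#include <algorithm>
#include <thread>

// Half of the reported hardware threads, but never fewer than one worker.
// The lower bound is an assumption of this sketch, not taken from the commit.
inline unsigned WorkerCountForSketch(unsigned hardware_threads) {
    return std::max(hardware_threads / 2U, 1U);
}

// Convenience wrapper that queries the host.
inline unsigned AstcWorkerCountSketch() {
    return WorkerCountForSketch(std::thread::hardware_concurrency());
}
```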
2022-09-15video_core: Modify astc texture decode error fill valueFengChen2-2/+2
2022-09-10Align index buffe size when vertex_buffer_unified_memory enableFengChen1-1/+1
2022-08-31style: General style changes to match with the rest of the codebaseMorph1-5/+2
2022-08-31(shader/pipeline)_cache: Raise shader/pipeline cache versionMorph2-2/+2
Since the following commit: https://github.com/yuzu-emu/yuzu/commit/a83a5d2e4c8932df864dd4cea2b04d87a12c8760 , many games will refuse to boot unless the shader/pipeline cache has been invalidated.
2022-08-25video_core: add option for pessimistic flushingLiam1-1/+4
2022-08-25video_code: support rectangle textureFengChen4-13/+18
2022-08-24video_core: vulkan: rasterizer: Workaround on viewport swizzle on AMDNarr the Reg1-1/+8
2022-08-20video_core: support framebuffer crop rect top not zerovonchenplus2-12/+25
2022-08-20code: dodge PAGE_SIZE #defineKyle Kienapfel9-60/+62
Some header files, specifically for OSX and Musl libc, define PAGE_SIZE to be a number. This is great, except in yuzu we're using PAGE_SIZE as a variable.
Specific example: `static constexpr u64 PAGE_SIZE = u64(1) << PAGE_BITS;`
PAGE_SIZE, PAGE_BITS and PAGE_MASK are all similar variables. Simply deleted the underscores, and then added the YUZU_ prefix.
Might be worth noting that there are multiple uses in different classes/namespaces. This list may not be exhaustive:
- Core::Memory: 12 bits (4096)
- QueryCacheBase: 12 bits
- ShaderCache: 14 bits (16384)
- TextureCache: 20 bits (1048576, or 1MB)
Fixes #8779
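The collision and the rename can be reproduced in a few lines. In this sketch the `#define` merely simulates a libc header that exposes PAGE_SIZE as a macro, and the 12-bit value is the Core::Memory case from the list in the commit message.

```cpp
#include <cstdint>

using u64 = std::uint64_t;

// Stand-in for a libc header that defines PAGE_SIZE as an object-like macro.
#ifndef PAGE_SIZE
#define PAGE_SIZE 4096
#endif

// With that macro visible, `static constexpr u64 PAGE_SIZE = ...;` would be
// preprocessed into `static constexpr u64 4096 = ...;` and fail to compile.
// The prefixed names cannot collide with the macro.
constexpr u64 YUZU_PAGEBITS = 12;
constexpr u64 YUZU_PAGESIZE = u64(1) << YUZU_PAGEBITS;
constexpr u64 YUZU_PAGEMASK = YUZU_PAGESIZE - 1;

static_assert(YUZU_PAGESIZE == 4096, "12 bits correspond to 4096 bytes");
```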
2022-08-19video_core: implement R16G16B16X16 texture formatLiam1-1/+1
2022-08-09video_core/textures/decoders: Avoid SWIZZLE_TABLEMerry2-15/+48
2022-08-08Make vsync setting work for VulkanDJRobX1-2/+3
2022-08-03renderer_vulkan: add format fallbacks for R16G16B16_SFLOAT, R16G16B16_SSCALED, R8G8B8_SSCALEDLiam5-273/+337
2022-08-02vk_texture_cache: return VK_NULL_HANDLE for views of null imagesLiam1-0/+12
2022-07-30renderer_opengl: delete shader source after linkingLiam1-0/+1
2022-07-30video_core: stop waiting for shader compilation on user cancelLiam2-2/+2
2022-07-28video_core: differentiate between tiled and untiled framebuffer sizes for unaccelerated copiesLiam1-9/+7
2022-07-27chore: make yuzu REUSE compliantAndrea Pappacoda20-42/+46
[REUSE] is a specification that aims at making file copyright information consistent, so that it can be both human and machine readable. It basically requires that all files have a header containing copyright and licensing information. When this isn't possible, like when dealing with binary assets, generated files or embedded third-party dependencies, it is permitted to insert copyright information in the `.reuse/dep5` file.
Oh, and it also requires that all the licenses used in the project are present in the `LICENSES` folder, that's why the diff is so huge. This can be done automatically with `reuse download --all`.
The `reuse` tool also contains a handy subcommand that analyzes the project and tells whether or not the project is (still) compliant, `reuse lint`.
Following REUSE has a few advantages over the current approach:
- Copyright information is easy to access for users / downstream
- Files like `dist/license.md` do not need to exist anymore, as `.reuse/dep5` is used instead
- `reuse lint` makes it easy to ensure that copyright information of files like binary assets / images is always accurate and up to date
To add copyright information of files that didn't have it I looked up who committed what and when, for each file. As yuzu contributors do not have to sign a CLA or similar I couldn't assume that copyright ownership was of the "yuzu Emulator Project", so I used the name and/or email of the commit author instead.
[REUSE]: https://reuse.software
Follow-up to 01cf05bc75b1e47beb08937439f3ed9339e7b254
2022-07-19video_core: use correct byte size for framebufferLiam1-5/+8
2022-07-17yuzu: settings: Remove framerate cap and merge unlocked framerate setting.bunnei1-3/+3
- These were all somewhat redundant.
2022-07-06gpu_thread: Use the previous MPSCQueue implementationMorph2-4/+3
The bounded MPSCQueue implementation causes crashes in Fire Emblem Three Houses, so use the previous implementation for now.
2022-07-06renderer_(gl/vk): Implement ASTC_10x6_UNORMMorph7-1/+16
- Used by Monster Hunter Rise Update 10.0.2
2022-06-29Revert "vulkan_device: Block AMDVLK's VK_KHR_push_descriptor"lat9nq1-11/+0
2022-06-27video_core: Replace VKUpdateDescriptorQueue with UpdateDescriptorQueuegerman7714-33/+33
2022-06-27video_core: Replace VKSwapchain with Swapchaingerman775-25/+23
2022-06-27video_core: Replace VKQueryCache with QueryCachegerman776-28/+27
2022-06-27video_core: Replace VKScheduler with Schedulergerman7735-111/+110
2022-06-27video_core: Replace VKBlitScreen with BlitScreengerman773-51/+51
2022-06-27video_core: Replace VKFenceManager with FenceManagergerman773-15/+14
2022-06-15bounded_threadsafe_queue: Use constexpr capacity and maskMorph1-1/+1
While this is the primary change, we also:
- Remove the mpsc namespace and rename Queue to MPSCQueue
- Make Slot a private struct within MPSCQueue
- Remove the AlignedAllocator template argument, as we use std::allocator
- Replace instances of mask + 1 with capacity, and mask + 2 with capacity + 1
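The relationship between the two constants can be sketched as below. `RingLayoutSketch` is a hypothetical helper and not the MPSCQueue from the commit; it assumes a power-of-two capacity, in which case `mask` is `capacity - 1` and `mask + 1` can simply be written as `capacity`.

```cpp
#include <cstddef>

// Hypothetical layout helper for a bounded ring with power-of-two capacity.
template <std::size_t Capacity>
struct RingLayoutSketch {
    static_assert(Capacity != 0 && (Capacity & (Capacity - 1)) == 0,
                  "capacity must be a power of two");

    static constexpr std::size_t capacity = Capacity;
    static constexpr std::size_t mask = capacity - 1;

    // Wraps a monotonically increasing position into a slot index.
    static constexpr std::size_t SlotIndex(std::size_t position) {
        return position & mask;
    }
};
```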
2022-06-15vk_compute_pass: Explicitly cast to VkAccessFlagsMorph1-25/+26
According to the standard, a narrowing conversion is an implicit conversion from an integer or unscoped enumeration type to an integer type that cannot represent all the values of the original type, except when the value is a literal or constant expression. MSVC, unlike GCC or Clang, determines this to be a narrowing conversion despite the enumeration exclusively containing values that fit within the range of a 32 bit integer, emitting a warning since designated initializers prohibit narrowing conversions. To solve this, explicitly cast to the type we are initializing.
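A reduced illustration of the fix follows. All names are hypothetical stand-ins for the Vulkan flag types, and the sketch uses plain brace initialization rather than designated initializers so that it compiles under older language standards; the explicit cast is the part that mirrors the commit.

```cpp
#include <cstdint>

// Hypothetical stand-ins for a Vulkan flags typedef and its bit enum.
using AccessFlagsSketch = std::uint32_t;
enum AccessFlagBitsSketch {
    ACCESS_NONE_SKETCH = 0,
    ACCESS_READ_SKETCH = 0x1,
    ACCESS_WRITE_SKETCH = 0x2,
};

struct MemoryBarrierSketch {
    AccessFlagsSketch src_access;
    AccessFlagsSketch dst_access;
};

inline MemoryBarrierSketch MakeBarrierSketch(AccessFlagBitsSketch src,
                                             AccessFlagBitsSketch dst) {
    // The casts make the enum-to-integer conversion explicit, so no compiler
    // can diagnose it as an implicit narrowing conversion inside the braces.
    return MemoryBarrierSketch{static_cast<AccessFlagsSketch>(src),
                               static_cast<AccessFlagsSketch>(dst)};
}
```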
2022-06-14vk_compute_pass: Use VK_ACCESS_NONEMorph1-1/+1
This enumeration was introduced in Vulkan 1.3, prefer using this instead of defaulting the enum. Also resolves a narrowing conversion warning on MSVC.
2022-06-14vk_compute_pass: Silence Wextra warningMorph1-1/+1
Silences a warning about using enumerated and non-enumerated types in a conditional expression.
2022-06-14general: fix compilation on MinGW GCC 12Liam1-1/+1
2022-06-14common: Change semantics of UNREACHABLE to unconditionally crashLiam29-103/+104
2022-06-14CMakeLists: Make variable shadowing a compile-time errorMorph1-5/+0
Now that the entire project is free of variable shadowing, we can enforce this as a compile time error to prevent any further introduction of this logic bug.
2022-06-03gpu_thread: Move to bounded queueLevi Behunin2-4/+5
2022-06-02Maxwell3D: Fix 3D semaphore counter type 0 handlingBilly Laws2-3/+3
Counter type 0 actually releases the semaphore payload rather than a constant zero as was previously thought. This is required by Skyrim.
2022-06-01core/debugger: Improved stepping mechanism and misc fixesLiam1-0/+4
2022-05-30vulkan_library: Add debug logginglat9nq1-0/+4
2022-05-25vulkan_device: Workaround extension buglat9nq1-1/+6
A bug occurs in yuzu when VK_KHR_workgroup_memory_explicit_layout is available but 16-bit integers are not supported in the host driver. Disable usage of the extension when this case arises.
2022-05-25vulkan_device: Block AMDVLK's VK_KHR_push_descriptorlat9nq1-0/+11
Recent AMD Vulkan drivers (22.5.2 or 2.0.226 for specifically Vulkan) have a broken VK_KHR_push_descriptor implementation that causes a crash in yuzu. Disable it for the time being.
2022-05-17video_core: Support new VkResultAlexandre Bouvier1-0/+2
2022-05-13video_core/surface: Use u8 for PixelFormat block tablesMorph1-3/+3
Using this smaller type saves 33280 bytes in the compiled executable.
2022-05-13codecs/vp9: Use u8 for norm and map lutsMorph1-4/+4
Using this smaller type saves 1536 bytes in the compiled executable.
2022-05-11maxwell_dma: use fallback if remapping is enabledLiam1-3/+6
2022-05-10video_core/macro: clear code on upload address assignmentLiam3-0/+10
2022-05-09VideoCore: Add option to dump the macros.Fernando Sahmkow1-0/+27
Co-Authored-By: liamwhite <liamwhite@users.noreply.github.com>
2022-05-08video_core/macro_jit_x64: warn on invalid parameter accessLiam1-3/+21
2022-05-07OpenGL: implement face flips according to NDCLiam1-4/+3
2022-05-07maxwell_dma: fix bytes per pixelLiam1-3/+3
2022-05-06vk_rasterizer: fix stencil test when two faces are disabledLody1-2/+2
2022-04-28GCC 12 fixesLiam1-2/+2
2022-04-28chore: add missing SPDX tagsAndrea Pappacoda8-151/+17
Follow-up to 99ceb03a1cfcf35968cab589ea188a8c406cda52
2022-04-26renderer_vulkan: Update screen info if the framebuffer size has changedMorph1-0/+5
2022-04-23general: Convert source file copyright comments over to SPDXMorph225-675/+450
This formats all copyright comments according to SPDX formatting guidelines. Additionally, this resolves the remaining GPLv2 only licensed files by relicensing them to GPLv2.0-or-later.
2022-04-18bootmanager: Don't create another screenshot request if previous one is not done yetgerman772-0/+7
2022-04-14video_core: implement formats for N64 emulationFernando Sahmkow8-7/+102
2022-04-14buffer_cache: cap vertex buffer sizesLiam1-1/+14
2022-04-14maxwell3d: add small_index_2 registerLiam2-1/+11
2022-04-07video_core: Replace lock_guard with scoped_lockMerry11-18/+18
2022-04-07OpenGL: fix S8D24 to ABGR8 conversionsLiam6-4/+58
2022-04-05Revert "texture_cache/util: Remove unneeded ReadBlockUnsafe"bunnei1-0/+1
2022-04-04texture_cache/util: Remove unneeded ReadBlockUnsafeameerj1-1/+0
This call was reading GPU memory into the dst buffer, which is then overwritten by the SwizzleTexture call.
2022-04-04OpenGL: fix croppingLiam3-1/+10
2022-04-04Vulkan: crop to screen dimensions if crop not explicitly requestedLiam1-2/+3
2022-04-04OpenGL: propagate face flip conditionLiam1-4/+10
2022-04-04OpenGL: flip front faces if Z scale is invertedLiam1-2/+3
2022-04-02fix: typosAndrea Pappacoda1-1/+1
2022-04-01GPU Garbage Collection: Fix regressions.Fernando Sahmkow2-3/+1
2022-03-29gl_rasterizer: Avoid scenario locking already owned mutexameerj1-3/+3
gpu.TickWork() may lock the texture_cache and buffer_cache mutexes, which are owned by the thread prior to invoking TickWork(). Defer invoking gpu.TickWork() until the scope ends, where the owned mutexes are released.
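The deferral described above can be illustrated with a minimal sketch. The `Gpu` type, `FlushAndTick`, and the single `cache_mutex` are hypothetical stand-ins, not the actual yuzu classes; the only point shown is that the call which re-locks the mutex runs after the owning scope has released it.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-in for the GPU object; TickWork() takes the cache mutex.
class Gpu {
public:
    explicit Gpu(std::mutex& cache_mutex_) : cache_mutex{cache_mutex_} {}

    void TickWork() {
        // Would self-deadlock if the calling thread still owned cache_mutex.
        std::lock_guard<std::mutex> lock{cache_mutex};
        ++ticks;
    }

    int ticks = 0;

private:
    std::mutex& cache_mutex;
};

// Hypothetical caller: does its work under the lock, then ticks afterwards.
int FlushAndTick(Gpu& gpu, std::mutex& cache_mutex) {
    bool should_tick = false;
    {
        std::lock_guard<std::mutex> lock{cache_mutex};
        // ... work that needs the cache mutex ...
        should_tick = true; // remember the request instead of ticking here
    } // cache_mutex is released at the end of this scope
    if (should_tick) {
        gpu.TickWork(); // safe: the mutex is no longer owned by this thread
    }
    return gpu.ticks;
}
```

A scope-exit helper would achieve the same ordering; the explicit flag is used here only to keep the sketch free of extra utilities.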
2022-03-26Revert "Memory GPU <-> CPU: reduce infighting in the texture cache by adding CPU Cached memory."bunnei5-64/+3
2022-03-25Texture Cache: Add Cached CPU system.Fernando Sahmkow5-3/+64
2022-03-25GC: Address Feedback.Fernando Sahmkow7-29/+37
2022-03-25hle: nvflinger: Migrate android namespace -> Service::android.bunnei6-18/+18
2022-03-25hle: vi: Integrate new NVFlinger and HosBinderDriverServer service.bunnei1-0/+1
2022-03-25hle: nvflinger: Move BufferTransformFlags to its own header.bunnei1-17/+2
2022-03-25hle: nvflinger: Move PixelFormat to its own header.bunnei6-23/+19
2022-03-25Garbage Collection: Final tuning.Fernando Sahmkow6-24/+36
2022-03-25Buffer Cache: Tune to the levels of the new GC.Fernando Sahmkow6-6/+78
2022-03-25Garbage Collection: Redesign the algorithm to do a better use of memory.Fernando Sahmkow13-32/+156
2022-03-24Vulkan: Use 3D helpers for MSAA scaling on NV drivers 510+ameerj3-7/+8
Nvidia Vulkan drivers 510+ crash when blitting MSAA images. Fall back to 3D scale helpers for MSAA image scaling.
2022-03-24buffer_cache: reset cached write bits after flushing invalidationsLiam1-1/+2
2022-03-22codec: Plug GPU decoder memory leakameerj1-0/+2
2022-03-22codec: Disable HW_FRAMES method check on Windowsameerj1-14/+19
It was reported that this method causes crashes on certain Linux decoding backends, hence the check to avoid it. However, the check also caused Windows GPU decoders to never be selected, so decoding always fell back to the CPU. Disable the check on Windows for now.
2022-03-20BufferCache: Find direction of the stream buffer increase.Fernando Sahmkow1-6/+14
2022-03-20general: Fix clang/gcc build errorsameerj2-0/+2
2022-03-19common: Reduce unused includesameerj1-0/+1
2022-03-19video_core: Reduce unused includesameerj75-139/+12
2022-03-18general: Reduce core.h includesameerj4-5/+0
2022-03-18vk_texture_cache: Do not reinterpret DepthStencil source imagesameerj1-5/+0
Fixes star pointer interactions in Super Mario Galaxy on some drivers, notably Nvidia. Co-Authored-By: Fernando S. <1731197+fernandos27@users.noreply.github.com>
2022-03-16Address review commentsLiam2-2/+2
2022-03-16Vulkan: convert S8D24 <-> ABGR8Liam5-2/+41
2022-03-15maxwell_3d: Implement a safer CB data uploadameerj2-70/+12
This makes constant buffer uploads safer and more accurate by updating the GPU memory as soon as the CB Data method is invoked. The previous implementation was deferring the updates until a different maxwell 3d method was detected, then writing all CB data at once.
2022-03-14Maxwell3D: Link to override constant definition in nouveaubyte[]1-0/+2
2022-03-14Maxwell3D: restore original topology when topology overrides are disabledbyte[]1-0/+2
2022-03-14Maxwell3D: Use override constants from nouveauLiam2-2/+37
This fixes some incorrect rendering in Sunshine
2022-03-12emit_spirv, vk_compute_pass: Resolve VS2022 compiler errorsameerj1-1/+2
2022-03-12Maxwell3D: Restrict topology override effect to after the register is setLiam2-1/+5
2022-03-11Maxwell3D: mark index buffers as dirty after updating countsLiam1-0/+2
2022-03-11TextureCacheRuntime: allow converting D24S8 to ABGR8Liam1-1/+2
I can't see how this would be useful, but Galaxy uses it.
2022-03-11Maxwell3D: read small-index draw and primitive topology override registersLiam2-2/+30
This allows Galaxy and Sunshine to render for the first time.
2022-03-08video_core: Cancel Scoped's exit call on GPU failurelat9nq1-0/+1
When CreateRenderer fails, the GraphicsContext that was std::move'd into it is destroyed before the Scoped that was created to manage its currency. In that case, the GraphicsContext::Scoped will still call its destructor at the end of the function. Because the context is already destroyed, the Scoped will cause a crash as it attempts to call the destroyed object's DoneCurrent function. Since we know when the call would be invalid, call the Scoped's Cancel method. This prevents it from calling a method on a destroyed object.
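A minimal sketch of the cancellable scope guard pattern this message describes. `Context`, `Scoped`, and `CreateGpu` below are simplified, hypothetical stand-ins written for illustration; they are not the real yuzu types or signatures.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical graphics context that counts DoneCurrent() calls.
struct Context {
    explicit Context(int& counter) : done_calls{&counter} {}
    void MakeCurrent() {}
    void DoneCurrent() { ++*done_calls; }
    int* done_calls;
};

// Guard that releases the context on destruction unless cancelled.
class Scoped {
public:
    explicit Scoped(Context& ctx) : context{&ctx} { context->MakeCurrent(); }
    ~Scoped() {
        if (active) {
            context->DoneCurrent();
        }
    }
    void Cancel() { active = false; }

private:
    Context* context;
    bool active = true;
};

// Hypothetical creation routine: ownership of the context moves into the
// "renderer"; on failure the context dies before the guard does.
std::unique_ptr<Context> CreateGpu(int& done_calls, bool renderer_fails) {
    auto context = std::make_unique<Context>(done_calls);
    Scoped scope{*context};
    std::unique_ptr<Context> owned = std::move(context);
    if (renderer_fails) {
        owned.reset();  // the context is destroyed here
        scope.Cancel(); // so the guard must not touch it later
        return nullptr;
    }
    return owned; // guard runs DoneCurrent() on the still-living context
}
```

Without the `Cancel()` call, the failure path would invoke `DoneCurrent()` through a dangling pointer when `scope` goes out of scope.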
2022-03-07MaxwellDMA: Implement semaphore operationsLody2-1/+21
2022-03-06gl_graphics_pipeline: Improve shader builder synchronization using fences (#7969)Ameer J2-21/+32
* gl_graphics_pipeline: Improve shader builder synchronization
  Make use of GLsync objects to ensure better synchronization between shader builder threads and the main context
* gl_graphics_pipeline: Make built_fence access threadsafe
* gl_graphics_pipeline: Use GLsync objects only when building in parallel
* gl_graphics_pipeline: Replace GetSync calls with non-blocking waits
  The spec states that a ClientWait on a Fence object ensures the changes propagate to the calling context
2022-02-27gl_fence_manager: Minor optimization to signal queryingameerj1-2/+1
Per the spec, bufSize is the number of integers that will be written, in this case, 1. Also, the length argument is optional if the information of the number of elements written is not needed.
2022-02-26vulkan_device: Blacklist RADV on RDNA2 from VK_EXT_vertex_input_dynamic_stateAmeer J1-4/+21
RDNA2 devices running under the RADV driver were crashing when VK_EXT_vertex_input_dynamic_state was enabled. Blacklisting these devices until a proper fix is established.
2022-02-25maxwell_to_(gl/vk): Add 11_11_10 float vertex formatMorph2-0/+4
- Used by パワプロクンポケットR
2022-02-24vk_blit_screen: Add missing format bgra8Lody1-0/+2
2022-02-21vulkan_device: fix missing format in ANVvoidanix3-2/+21
Currently Mesa's ANV driver does not support VK_FORMAT_B5G6R5_UNORM_PACK16, implement an alternative for it.
2022-02-02texture_cache: Ensure has_blacklisted is always initializedLioncash1-1/+1
Resolves a -Wmaybe-uninitialized warning
2022-02-02texture_cache: Remove dead code within SynchronizeAliasesLioncash1-13/+1
Since these were being copied by value, none of the changes applied in the loop would be reflected. However, from the looks of it, this would already be applied within CopyImage() anyways, so this can be removed.
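The by-value pitfall mentioned above is easy to reproduce in isolation. The sketch below uses a hypothetical `AliasedImage` struct, not the texture cache's actual types: a range-for that binds each element by value modifies only a temporary copy.

```cpp
#include <cassert>
#include <vector>

// Hypothetical element type used only for this illustration.
struct AliasedImage {
    int pending_copies;
};

// Each iteration works on a copy, so the container is left unchanged.
void ClearByValue(std::vector<AliasedImage>& images) {
    for (AliasedImage image : images) {
        image.pending_copies = 0;
    }
}

// Binding by reference makes the write visible in the container.
void ClearByReference(std::vector<AliasedImage>& images) {
    for (AliasedImage& image : images) {
        image.pending_copies = 0;
    }
}
```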
2022-02-02texture_cache: Amend unintended bitwise OR in SynchronizeAliasesLioncash1-1/+1
2022-02-02general: Replace NonCopyable struct with equivalentsLioncash2-15/+43
2022-02-01video_core/shader_cache: Remove unused algorithm includeLioncash1-1/+0
2022-02-01video_core/shader_cache: Take std::span in RemoveShadersFromStorage()Lioncash2-3/+3
Same behavior, but without the need to move into the function to avoid an allocation.
2022-02-01Rasterizer: Refactor inlineToMemory.Fernando Sahmkow9-15/+16
2022-01-31Vulkan: Fix Scheduler Chunks when their FuncType is 0.Fernando Sahmkow2-4/+6
2022-01-29GPU: Improve syncing.Fernando Sahmkow1-3/+10
2022-01-29Rasterizer: Implement Inline2Memory Acceleration.Fernando Sahmkow14-6/+122
2022-01-29Inline2Memory: Flush before writting buffer.Fernando Sahmkow2-2/+3
2022-01-27buffer_cache: Reduce stream buffer allocations when expanding from the leftameerj1-0/+2
The existing stream buffer optimization accounts for size increases at the end of the allocated buffer. This adds the same optimization, increasing the size from the beginning of the buffer as well to reduce buffer allocations when expanding the same buffer from the left.
2022-01-26common/xbyak_api: Make BuildRegSet() constexprLioncash1-1/+1
This allows us to eliminate any static constructors that would have been emitted due to the function not being constexpr.
2022-01-25video_core/macro: Add missing <cstring> headerLioncash1-2/+3
Necessary since memcpy is used.
2022-01-25video_core/macro_interpreter: Move impl class to the cpp fileLioncash2-84/+86
Keeps the implementation hidden from the intended API and lessens the header dependencies on the interpreter's header.
2022-01-25video_core/macro_hle: Return unique_ptr directly from GetHLEProgram()Lioncash3-7/+7
Same behavior, but less code and header dependencies.
2022-01-25video_core/macro: Remove unused parameter from Execute()Lioncash3-4/+3
Simplifies the function interface.
2022-01-25video_core/macro_jit_x64: Remove unused impl class memberLioncash1-1/+0
Reduces the size of the impl class a tiny bit.
2022-01-25video_core/macro_jit_x64: Decouple PersistentCallerSavedRegs() from implLioncash1-5/+4
This doesn't depend on class state and can just be a regular function.
2022-01-25video_core/macro_jit_x64: Move impl class into cpp fileLioncash2-87/+86
Keeps the implementation internalized and also reduces API-facing header dependencies. Notably, this fully internalizes all of the xbyak externals.
2022-01-25video_core/macro_hle: Move impl class into cpp fileLioncash2-27/+19
Given it's intended to be an internal implementation class, we can move it into the cpp file to ensure that. This also lets us move some header dependencies into the cpp file as well.
2022-01-25gpu: Tidy up forward declarationsLioncash1-10/+0
Over time a few forward declarations became unnecessary, so we can remove these to tidy up the header a little bit.
2022-01-25gpu: Remove obsoleted CDMAPusher() accessorsLioncash1-6/+0
These were obsoleted in 2c47f8aa1886522898b5b3a73185b5662be3e9f3 but were accidentally overlooked.
2022-01-25vk_fsr: Replace comma operator with semicolonLioncash1-1/+1
Generally, we should be ending statements with a semicolon, not a comma. Resolves a clang diagnostic.
2022-01-20video_core: constify AVCodec for ffmpeg >= 5.0Jan Beich1-1/+1
src/video_core/command_classes/codecs/codec.cpp:177:16: error: assigning to 'AVCodec *' from 'const AVCodec *' discards qualifiers
    av_codec = avcodec_find_decoder(codec);
               ^~~~~~~~~~~~~~~~~~~~~~~~~~~
2022-01-19vulkan_device: Fix sType for VkPhysicalDeviceShaderAtomicInt64FeaturesGeorg Lehmann1-1/+1
2022-01-16astc_decoder: Combine FastReplicate functions to work around new NV driver bugameerj1-34/+46
The new Nvidia drivers have a bug where the FastReplicateTo6 function produces a lookup into the REPLICATE_TO_8 table rather than the REPLICATE_TO_6 table. This seems to be an optimization gone wrong. Combining the logic of the FastReplicate functions seems to address the bug.
2022-01-05video_core: Remove unnecesary maybe_unused flagNarr the Reg1-1/+1
2022-01-04gpu: Add shut down method to synchronize threads before destructionameerj2-0/+13
2022-01-04ShaderDecompiler: Add a debug option to dump the game's shaders.Fernando Sahmkow4-1/+79
2022-01-04Revert "Merge pull request #7668 from ameerj/fence-stop-token"ameerj2-8/+14
This reverts commit e7733544779f2706d108682dd027d44e7fa5ff4b, reversing changes made to abbbdc2bc027ed7af236625ae8427a46df63f7e7.
2022-01-03gpu: Use std::stop_token in WaitFence for VSync threadameerj2-14/+8
Fixes a hang that may occur when stopping emulation and the VSync thread is blocked on the syncpoint condition variable.
2022-01-01texture_cache/util: Fix s32 overflow when resolving overlapsameerj1-5/+5
2021-12-31video_core/memory_manager: Fixes for sparse memory managementameerj2-14/+12
2021-12-31video_core/memory_manager: Deduplicate Read/WriteBlockameerj2-47/+32
2021-12-30glsl: Add boolean reference workaroundameerj3-0/+7
2021-12-30glsl_context_get_set: Add alternative cbuf type for broken driversameerj3-7/+8
Some drivers have a bug when bitwise-converting floating-point cbuf values to uint variables. This adds a workaround for these drivers that makes all cbufs uint and converts to floating point as needed.
2021-12-28Remove invalid header includeFeng Chen1-1/+0
2021-12-24vk_texture_cache: Use 3D scale helpers for MSAA texture scaling on Intel Windows driversameerj4-20/+35
Fixes a crash when scaling MSAA textures in titles such as Sonic Colors Ultimate.
2021-12-24blit_image: Remove unused functionameerj2-50/+0
2021-12-24vk_texture_cache: Fix invalidated pointer accessameerj5-8/+21
The vulkan ImageView held a reference to its source image for rescale status checking. This pointer is sometimes invalidated when the texture cache slot_images container is resized. To avoid an invalid pointer dereference, the ImageView now holds a reference to the container itself.
2021-12-18video_core/codecs: re-enable VAAPI/VDPAU on BSDs after 72aa418b0b41Jan Beich1-1/+1
2021-12-18Address format clangvonchenplus2-2/+2
2021-12-18Vulkan: Fix the checks for primitive restart extension.Fernando Sahmkow3-21/+28
2021-12-18Vulkan: implement Logical Operations.Fernando Sahmkow2-3/+3
2021-12-18Vulkan: Implement VK_EXT_primitive_topology_list_restartFernando Sahmkow3-2/+40
2021-12-16video_core/codecs: (re-spin) refactor ffmpeg searching and handlingliushuyu1-0/+6
2021-12-15Revert "video_core/codecs: refactor ffmpeg searching and handling in cmake"bunnei1-6/+0
2021-12-14CI: fix CI on Linuxliushuyu1-3/+0
2021-12-14video_core/codecs: skip decoders that use hw frames ...liushuyu1-0/+9
... this would resolve some edge-cases where multiple devices are present and ffmpeg is unable to auto-supply the hw surfaces
2021-12-11maxwell_to_vk: Add ASTC_2D_5X4_UNORMMorph1-1/+1
2021-12-10Fix blit image/view not compatibleFeng Chen1-1/+6
2021-12-09maxwell_to_vk: Add ASTC_2D_8X5_UNORMMorph1-1/+1
- Used by Lego City Undercover
2021-12-08renderer_vulkan: Add R16G16_UINTMorph2-1/+2
- Used by Immortals Fenyx Rising
2021-12-05vk_texture_cache: Add ABGR src format check for D24S8 conversionsameerj1-1/+5
2021-12-05renderer_opengl: Minor refactoring of filter selectionameerj1-30/+20
2021-12-05texture_cache: Fix image convert dimensions assertionameerj1-1/+12
2021-12-05blit_image: Refactor upscale factors usageameerj6-62/+53
The image view itself can be queried to see if it is being rescaled or not, removing the need to pass the upscale/down shift factors from the texture cache.
2021-12-05vk_texture_cache: Add a function to ImageView to check if src image is rescaledameerj2-4/+22
2021-12-05blit_image: Refactor ConvertPipeline functionsameerj2-29/+15
2021-12-05blit_image: Refactor ConvertPipelineEx functionsameerj2-33/+18
Reduces much of the duplication between the color/depth variants.
2021-12-05vk_blit_screen: Minor refactor of filter pipeline selectionameerj1-21/+16
2021-12-05Revert "Merge pull request #7395 from Morph1984/resolve-comments"ameerj3-16/+31
This reverts commit d20f91da11fe7c5d5f1bd4f63cc3b4d221be67a4, reversing changes made to 5082712b4e44ebfe48bd587ea2fa38767b7339cb.
2021-12-04Address feedbackFeng Chen1-4/+5
2021-12-04Texture Cache: Fix crashes on NVIDIA.Fernando Sahmkow1-3/+6
2021-12-03video_core/cmake: link against libva explicitly ...liushuyu1-0/+1
... to fix build on Flatpak (and self-builds)
2021-12-03video_core/codecs: more fixes for VAAPI detection ...liushuyu1-63/+25
* skip impersonated VAAPI implementations ("imposter detection")
* place VAAPI priority below CUDA/NVDEC/CUVID
2021-12-03video_core/codec: address commentsliushuyu1-8/+12
2021-12-03video_core/codecs: more robust ffmpeg hwdecoder selection logicliushuyu1-10/+27
2021-12-02general: Replace high_resolution_clock with steady_clockMorph2-2/+2
On some OSes, high_resolution_clock is an alias to system_clock and is not monotonic in nature. Replace this with steady_clock.
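A short sketch of why the monotonic clock matters for interval timing. `MeasureElapsed` is a hypothetical helper written for this illustration; the standard guarantees that `std::chrono::steady_clock::is_steady` is true, whereas `high_resolution_clock` may alias `system_clock`, which can be adjusted while a measurement is in progress.

```cpp
#include <cassert>
#include <chrono>

// steady_clock is required to be monotonic, so durations are never negative.
static_assert(std::chrono::steady_clock::is_steady,
              "steady_clock must be monotonic");

// Hypothetical helper: times a callable using the monotonic clock.
template <typename Func>
std::chrono::nanoseconds MeasureElapsed(Func&& func) {
    const auto start = std::chrono::steady_clock::now();
    func();
    const auto end = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(end - start);
}
```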
2021-12-02Support multiple videos playingFeng Chen2-32/+15
2021-11-29Add missing pixel format mappingFeng Chen1-0/+2
2021-11-28Texture Cache: Secure insertions against deletions.Fernando Sahmkow1-3/+13
2021-11-27Texture Cache: Redesigning the blitting system (again).Fernando Sahmkow3-23/+64
2021-11-26Texture Cache: Further fix regressions.Fernando Sahmkow1-11/+15
2021-11-25video_core/codec: address commentsliushuyu1-17/+11
2021-11-25video_core/codecs: fix multiple decoding issues on Linux ...liushuyu1-2/+47
* When someone has installed Intel video drivers on an AMD system, the decoder selects the Intel VA-API decoding driver and yuzu crashes due to the incorrect driver selection; the fix checks whether the driver about to be used is loaded in the kernel.
* When using the NVIDIA driver on Linux with an ffmpeg build that does not have CUDA capability enabled, the decoder crashes; the fix simply makes the decoder prefer the VDPAU driver over CUDA on Linux.
2021-11-22Texture Cache: Fix issue with blitting 3D textures.Fernando Sahmkow1-2/+4
2021-11-22Texture Cache: Correct conversion shaders.Fernando Sahmkow2-2/+2
2021-11-22Texture Cache: Always copy on NVIDIA.Fernando Sahmkow1-0/+5
2021-11-22TextureCache: Simplify blitting of D24S8 formats and fix bugs.Fernando Sahmkow10-195/+73
2021-11-21VulkanTexturECache: Use reinterpret on D32_S8 formats.Fernando Sahmkow1-2/+7
2021-11-21HostShaders: Fix D24S8 convertion shaders.Fernando Sahmkow6-23/+47
2021-11-21TextureCache: Eliminate format deduction as full depth conversion has been supported.Fernando Sahmkow2-29/+5
2021-11-21vk_texture_cache: Mark VkBufferUsageFlags as static constexprMorph1-3/+3
2021-11-21vk_blit_image: Consolidate CreatePipelineTargetEx functionsMorph2-28/+13
2021-11-20Fix screenshot dimensions when at 1x scaleameerj2-8/+0
This was regressed by ART. Prior to ART, the screenshots were saved at the title's framebuffer resolution. A misunderstanding of the existing logic led to screenshot dimensions becoming dependent on the host render window size. This changes the behavior to match how it was prior to ART at 1x, with screenshots now always being the title's framebuffer dimensions scaled by the resolution scaling factor.
2021-11-20TextureCache: Refactor and fix linux compiling.Fernando Sahmkow2-9/+4
2021-11-20TextureCache: Assure full conversions on depth/stencil write shaders.Fernando Sahmkow3-6/+6
2021-11-20TextureCache: Implement buffer copies on Vulkan.Fernando Sahmkow6-9/+193
2021-11-20TextureCache: Add R16G16 to D24S8 converter.Fernando Sahmkow5-0/+38
2021-11-19TextureCache: Add B10G11R11 to D24S8 converter.Fernando Sahmkow5-13/+84
2021-11-19TextureCache: Further fixes on resolve algorithm.Fernando Sahmkow2-16/+17
2021-11-19Implement convert legacy to genericFeng Chen2-0/+5
2021-11-19TextureCache: Implement additional D24S8 convertions.Fernando Sahmkow6-0/+86
2021-11-19TextureCache: force same image format when resolving an image.Fernando Sahmkow2-2/+9
2021-11-19TextureCache: Fix regression caused by ART and improve blit detection algorithm to be smarter.Fernando Sahmkow2-10/+27
2021-11-19Vulkan: implement D24S8 <-> RGBA8 convertions.Fernando Sahmkow6-0/+166
2021-11-18renderer_vulkan: Implement S8_UINT stencil formatMorph3-0/+18
It should be noted that on Windows, only NVIDIA GPUs support this format natively as of this commit.
2021-11-18gl_texture_cache: Round format conversion PBO to next power of 2ameerj1-1/+5
2021-11-17renderer_opengl: Implement S8_UINT stencil formatMorph3-6/+25
2021-11-17video_core: Add S8_UINT stencil formatMorph4-3/+21
2021-11-17Fix image update/download error when width too smallFeng Chen2-10/+18
2021-11-17texture_cache: Use pixel format conversion when supported by the runtimeameerj5-0/+15
2021-11-17gl_texture_cache: Make FormatConversionPass more genericameerj1-7/+12
This allows the usage of the FormatConversionPass to be applied to more than the previously used BGR conversion scenarios.
2021-11-17gl_texture_cache: Rename BGRCopyPass to FormatConversionPassameerj2-21/+18
2021-11-17TextureCache: Fix Automatic Anisotropic.Fernando Sahmkow1-6/+5
2021-11-17TextureCache: OGL query device memory if possible.FernandoS272-2/+14
2021-11-17TextureCache: Fix OGL cleaningFernando Sahmkow5-0/+43
2021-11-16TextureCache: Add automatic anisotropic filtering and refactor code.Fernando Sahmkow3-15/+16
2021-11-16TextureCache: Make a better Anisotropic setter.Fernando Sahmkow3-20/+17
2021-11-16Texture Cache: revert Image changes.Fernando Sahmkow1-0/+4
2021-11-16HostShader: fix Gaussian filter.FernandoS271-2/+2
2021-11-16Texture Cahe/Shader decompiler: Resize PointSize on rescaling, refactor and make reaper more agressive on 4Gb GPUs.FernandoS274-22/+8
2021-11-16texture_cache: Refactor Render Target scaling functionameerj2-14/+24
2021-11-16gl_resource_manager: Ensure non EXT_framebuffer objects are createdameerj2-13/+8
2021-11-16Texture Cache: Fix memory usage on ScaleDown.FernandoS271-4/+0
2021-11-16OpenGL: Fix viewport/Scissor scaling on downscaling.FernandoS271-6/+28
2021-11-16Vulkan: fix regression.FernandoS271-14/+17
2021-11-16host_shaders: Misc copyright/style changesameerj4-10/+12
2021-11-16FSR: Fix GCC build errorsameerj3-43/+50
2021-11-16Vulkan: Reimplement FSR constant generation functions to avoid GCC warningsMarshall Mohror2-9/+145
2021-11-16vk_blit_screen: Fix AA destruction orderameerj1-9/+10
2021-11-16Presentation: Only use FP16 in scaling shaders on supported devices in VulkanMarshall Mohror14-116/+197
2021-11-16renderer_vulkan/blit_image: Use generic color state on Depth to Color blitsameerj1-1/+1
Fixes Bayonetta 2 on AMD
2021-11-16vk_texture_cache: Refactor 3D scaling helpersameerj2-113/+74
2021-11-16gl_rasterizer: Fix ScissorTest and Clear when scalingameerj1-10/+6
2021-11-16gl_texture_cache: Simplify scaling proceduresameerj2-57/+28
2021-11-16OpenGlTextureCache: Fix state invalidation on rescaling.Fernando Sahmkow3-2/+17
2021-11-16VulkanBufferCache: Avoid adding barriers between multiple copies.Fernando Sahmkow3-5/+43
2021-11-16HostShader: Fix gaussian and add attribution.Fernando Sahmkow1-23/+19
2021-11-16Vulkan: Fix FXAA in AMD.Fernando Sahmkow1-2/+40
2021-11-16Texture Cache: Fix blitting.Fernando Sahmkow1-2/+2
2021-11-16Vulkan: Implement FXAAFernandoS273-22/+387
2021-11-16OpenGL: fix FXAA with scalingMarshall Mohror2-9/+31
2021-11-16OpenGL: Implement FXAAMarshall Mohror6-35/+194
2021-11-16QtGUI: Add buttton to toggle the filter.FernandoS271-0/+1
2021-11-16VideoCore: Add gaussian filtering.FernandoS276-0/+132
2021-11-16TextureCache: Improve Reaper.FernandoS272-14/+26
2021-11-16Vulkan: fix waiting on semaphore.FernandoS271-1/+3
2021-11-16Update scaleforce to use FP16Marshall Mohror1-88/+55
2021-11-16TextureCache: fix rescaling in aliases and overlap joins.FernandoS274-23/+48
2021-11-16Presentation: Fix turning FSR on and off in settingsMarshall Mohror1-0/+11
2021-11-16Video Core: fix building for GCC.Fernando Sahmkow4-22/+40
2021-11-16Vulkan Rasterizer: Fix clears on integer textures.FernandoS273-1/+84
2021-11-16Texture cache: fix Intel with rescaler.FernandoS271-2/+2
2021-11-16TextureCache: Fix blitting filter in Vulkan and correct viewport/scissor calculation when downscaling.FernandoS272-20/+44
2021-11-16Texture Cache: fix memory managment and optimize scaled downloads, uploads.Fernando Sahmkow7-28/+57
2021-11-16Texture Cache: ease the requirements of textures being blacklisted.Fernando Sahmkow2-22/+7
2021-11-16Vulkan: Fix Blit Depth StencilFernando Sahmkow2-14/+20
2021-11-16Texture Cache: Fix downscaling and correct memory comsumption.Fernando Sahmkow8-36/+147
2021-11-16Presentation: add Nearest Neighbor filter.Fernando Sahmkow4-9/+56
2021-11-16vulkan: Implement FidelityFX Super ResolutionMarshall Mohror9-17/+637
2021-11-16Texture Cache: Rescale conversions between depth and colorFernandoS276-25/+37
2021-11-16Texture cache: Fix memory consumption and ignore rating when a depth texture is rendered.Fernando Sahmkow3-7/+19
2021-11-16vulkan: Fix rescaling push constant usageameerj4-35/+42
2021-11-16Texture Cahe: Fix downscaling on SMO.Fernando Sahmkow3-0/+8
2021-11-16texture_cache_base: Remove unused function declarationsameerj1-8/+0
2021-11-16vk_texture_cache: Use 3D to scale images when blit is unsupportedameerj4-29/+87
2021-11-16texture_cache: Fix infinitely recursive ImageCanRescale checkameerj3-10/+13
2021-11-16vk_texture_cache: Fix BlitScale of non-2D imagesameerj1-10/+9
2021-11-16video_core: Refactor resolution scale functionameerj3-46/+20
2021-11-16texture_cache: Fix image resolves when src/dst are not both scaledameerj1-5/+8
2021-11-16video_core,yuzu: Move UpdateRescalingInfo call to video_corelat9nq1-0/+2
This only needs to happen once per game boot, so we can just call it during CreateGPU and be done with it, avoiding the need to call it in the frontends.
2021-11-16gl_texture_cache: Disable scissor test when scaling texturesameerj1-0/+8
Fixes a bug on BOTW where some objects were no longer being rendered after blitting
2021-11-16vk_texture_cache: Fix unsupported blit format error checkingameerj2-9/+9
2021-11-16vk_texture_cache: Fix early returns on unsupported scalesameerj2-19/+11
2021-11-16video_core: Misc resolution scaling related refactoringameerj7-46/+50
2021-11-16texture_cache: Refactor scaled image size calculationameerj2-12/+13
2021-11-16Texture Cache: Fix calculations when scaling.Fernando Sahmkow1-0/+12
2021-11-16gl_texture_cache: Fix BGR pbo size for scaled texturesameerj1-11/+10
2021-11-16Texture Cache: Fix Rescaling on MultisampleFernando Sahmkow3-8/+21
2021-11-16TextureCache: Base fixes on rescaling.Fernando Sahmkow2-4/+6
2021-11-16vk_texture_cache: Simplify scaled image managementameerj2-107/+34
2021-11-16gl_texture_cache: Fix scaling backup logicameerj2-20/+16
2021-11-16vk_rasterizer: Fix scaling on Y_NEGATEameerj1-3/+9
2021-11-16vk_texture_cache: Use nearest neighbor scaling when availableameerj4-29/+36
2021-11-16gl_texture_cache: Fix depth and integer format scaling blitsameerj2-16/+61
2021-11-16gl_texture_cache/rescaling_pass: minor cleanupameerj2-4/+2
2021-11-16vk_texture_cache: Minor cleanupameerj2-11/+8
2021-11-16image_info: Mark MSAA textures as non-rescalableameerj1-2/+2
Blitting or resolving multisampled images requires the dimensions of the src and dst to be equal for valid usage, making them difficult for resolution scaling using the current implementation.
2021-11-16gl_texture_cache: Simplify scalingameerj2-31/+39
We don't need to reconstruct new textures every time we ScaleUp/ScaleDown. We can scale up once, and revert to the original texture whenever scaling down. Fixes memory leaks due to glDeleteTextures being deferred for later handling on some drivers.
2021-11-16Renderers: Unify post processing filter shadersameerj7-211/+36
2021-11-16gl_texture_cache: fix scaling on uploadameerj1-0/+7
2021-11-16Renderer: Implement Bicubic and ScaleForce filters.Fernando Sahmkow9-13/+537
2021-11-16Texture Cache: fix scaling on upload and stop scaling on base resolution.Fernando Sahmkow1-14/+32
2021-11-16shader, video_core: Fix GCC build errorsameerj2-10/+3
2021-11-16emit_spirv: Fix RescalingLayout alignmentameerj2-4/+7
2021-11-16TextureCache: Fix Buffer Views Scaling.Fernando Sahmkow2-5/+9
2021-11-16Texture Cache: Correctly fix Blits Rescaling.Fernando Sahmkow1-9/+12
2021-11-16texture_cache: Disable dst_image scaling in BlitImageameerj1-5/+7
Fixes scaling in Super Mario Party
2021-11-16emit_spirv: Fix RescalingLayout alignmentameerj1-1/+1
2021-11-16shader: Properly scale image reads and add GL SPIR-V supportReinUsesLisp5-26/+57
Thanks for everything!
2021-11-16shader: Properly blacklist and scale image loadsReinUsesLisp4-8/+12
2021-11-16texture_cache: Add getter to query if image view is rescaledReinUsesLisp5-22/+12
2021-11-16vk_rasterizer: Minor style changeReinUsesLisp1-2/+2
2021-11-16gl_texture_cache: Fix scaling blitsReinUsesLisp1-20/+12
2021-11-16glsl/glasm: Pass and use scaling parameters in shadersReinUsesLisp3-21/+40
2021-11-16gl_rasterizer: Properly scale viewports and scissorsReinUsesLisp1-23/+24
2021-11-16gl_texture_cache: Fix multi layered texture Scaleameerj1-11/+15
2021-11-16gl_compute_pipeline: Add downscale factor to shader uniformsameerj1-0/+9
2021-11-16gl_rasterizer: Fix rescale dirty state checkingameerj1-4/+9
2021-11-16gl_graphics_pipeline: Add downscale factor to shader uniformsameerj1-1/+14
2021-11-16texture_cache: Fix blacklists on computeReinUsesLisp1-1/+1
2021-11-16texture_cache: Simplify image view queries and blacklistingReinUsesLisp16-192/+192
2021-11-16Vulkan: Fix downscaling Blit.Fernando Sahmkow1-14/+18
2021-11-16Texture Cache: Implement Rating System.Fernando Sahmkow5-15/+47
2021-11-16OpenGL: set linear mag filter when blitting a downscaled image.Fernando Sahmkow1-0/+1
2021-11-16Vulkan: Fix AA when rescaling.Fernando Sahmkow1-1/+1
2021-11-16Texture Cache: Implement Blacklisting.Fernando Sahmkow5-4/+90
2021-11-16vulkan: Implement rescaling shader patchingReinUsesLisp8-27/+103
2021-11-16vk_texture_cache: Properly scale blit source imagesReinUsesLisp1-2/+2
2021-11-16vk_graphics_pipeline: Use Shader::NumDescriptors when possibleReinUsesLisp1-18/+6
2021-11-16opengl: Use Shader::NumDescriptors when possibleReinUsesLisp3-46/+20
2021-11-16texture_cache: Add image gettersReinUsesLisp2-0/+16
2021-11-16gl_texture_cache: Simplify rescalingameerj2-19/+15
2021-11-16texture_cache: Fix typo in aliased image rescalingameerj1-1/+1
2021-11-16vk_texture_cache: Simplify and optimize scaling blitsReinUsesLisp1-106/+62
2021-11-16vk_texture_cache: Fix scaling blit validation errorsReinUsesLisp1-81/+78
2021-11-16gl_texture_cache: Implement ScaleDownameerj2-26/+36
2021-11-16gl_texture_cache: Rescale fixes for multi-layered texturesameerj2-16/+32
2021-11-16Texture Cache: Implement Rescaling on Aliases and Blits.Fernando Sahmkow1-5/+53
2021-11-16Fix blits with mipsReinUsesLisp1-12/+16
2021-11-16Fix blitsReinUsesLisp1-10/+10
2021-11-16renderer_gl: Resolution scaling fixesameerj3-61/+107
2021-11-16TextureCache: Fix rescaling of ImageCopiesFernando Sahmkow3-18/+67
2021-11-16TextureCache: Modify Viewports/Scissors according to Rescale.Fernando Sahmkow5-35/+89
2021-11-16Settings: eliminate rescaling_factor.Fernando Sahmkow2-6/+5
2021-11-16Texture Cache: More rescaling fixes.Fernando Sahmkow4-84/+96
2021-11-16gl_texture_cache: WIP texture rescaleameerj2-3/+69
2021-11-16Texture Cache: Implement Vulkan UpScaling & DownScalingFernando Sahmkow6-42/+327
2021-11-16VideoCore: Initial Setup for the Resolution Scaler.Fernando Sahmkow9-18/+236
2021-11-13codes: Rename ComposeFrameHeader to ComposeFrameameerj7-14/+14
These functions were composing the entire frame, not just the headers. Rename to more accurately describe them.
2021-11-13vp8: Implement header compositionameerj4-6/+90
Enables frame decoding with FFmpeg
2021-11-13codecs: Add VP8 codec classameerj9-20/+90
2021-11-05vulkan_device: Add missing vulkan image format R5G6B5 in GetFormatPropertiesFeng Chen1-0/+1
- Used by Dragon Quest Builders
2021-11-01gl_rasterizer: Remove unused includesMorph1-4/+2
This removes unused includes, especially the core includes which were causing this file to be recompiled every time files included by those headers are modified.
2021-10-29gl_device: Force GLASM on NVIDIA drivers 495-496lat9nq1-0/+15
GLSL shaders currently do not render correctly on the recent NVIDIA drivers. This adds a check that forces assembly shaders for these drivers since they seem unaffected and adds a warning informing of the decision. Developers can disable the check by enabling graphics debugging.
2021-10-23Vulran Rasterizer: address feedback.Fernando Sahmkow1-3/+5
2021-10-22Fix vulkan viewport issueFeng Chen1-0/+1
2021-10-17settings: Remove std::chrono usageameerj1-0/+1
Alleviates the dependency on chrono for all files that include settings.h
2021-10-11vic: Use the minimum of surface/frame dimensions when writing the final frame to the GPUameerj1-16/+15
Addresses possible buffer overflow behavior.
2021-10-10h264: Use max allowed max_num_ref_frames when using CPU decodingFeng Chen1-1/+6
2021-10-09vic: Allow surface to be higher than frameValeri1-2/+3
Touhou Genso Wanderer Lotus Labyrinth R decodes 1920x1080 videos into a 1920x1088 surface. Only allow a mismatch in height, since a larger width would result in increasingly offset rows and somewhat defeat the entire purpose of this check.
2021-10-08vic: Avoid memory corruption when multiple streams with different dimensions are decodedameerj1-0/+9
This is a workaround to avoid buffer overflow errors until multi-channel/multi-stream decoding is supported.
2021-10-07vic: Refactor frame writing methodsameerj2-138/+146
2021-10-07vic: Implement RGBX frame formatameerj2-3/+15
2021-10-04Vulkan: Fix failing barrier on refresh.Fernando Sahmkow1-1/+2
2021-10-04RasterizerInterface: Correct size of CPU addresses to cache.FernandoS271-1/+1
2021-10-04Vulkan: Fix the master SemaphoreFernandoS271-4/+12
2021-10-03nvhost_ctrl: Refactor usage of gpu.LockSync()ameerj2-20/+1
This seems to only be used to protect a later gpu function call. So we can move the lock into that call instead.
2021-10-03gpu: Migrate implementation to the cpp fileameerj17-627/+862
2021-10-02common/logging: Move Log::Entry declaration to a separate headerameerj1-0/+1
This removes the need to include std::chrono in all files which include log.h
2021-09-28vk_graphics_pipeline: Force patch list topology when tessellation is usedameerj1-1/+10
Fixes a crash on some drivers when tessellation is used but the IA topology is not patch list.
2021-09-24general: Update style to clang-format-12ameerj3-13/+9
2021-09-24video_core: Fix jthread related hangs when stopping emulationameerj1-1/+1
jthread on some compilers is more picky when it comes to the order in which objects are destroyed.
2021-09-24vk_texture_cache: Disable cube compatibility flag on non-mesa AMD GCN4 and earlierameerj3-11/+22
Fixes rainbow textures on BOTW.
2021-09-24Vulkan Query Cache: make sure to wait for the query result.Fernando Sahmkow1-1/+2
2021-09-24QueryCache: Flush queries in order of running.Fernando Sahmkow1-4/+4
2021-09-23Vulkan Rasterizer: Correct DepthBias/PolygonOffset on Vulkan.Fernando Sahmkow6-3/+29
2021-09-20maxwell_dma: Minor refactoringameerj2-33/+33
2021-09-20buffer_cache: Minor fixesameerj2-6/+4
Loop through the tmp_intervals by reference, rather than by copy, and fix gl clear buffer size calculation.
2021-09-17host_shaders: Remove opengl_copy_bgra.compameerj4-19/+0
2021-09-17gl_texture_cache: Migrate BGRCopyPass from util_shadersameerj4-42/+48
The BGR copies no longer use shaders.
2021-09-16vulkan_device: Reorder Float16Int8 declarationameerj1-1/+2
This variable was going out of scope before its usage in the vulkan device creation, causing a crash on very specific drivers.
2021-09-16Revert "Merge pull request #7006 from FernandoS27/a-motherfucking-driver"ameerj1-13/+1
This reverts commit 62e88d0e7455e37840db7e2a8e199bc6ca176966, reversing changes made to edf3da346f4ec0ca492b427f4f693d56e84abc52.
2021-09-16util_shaders: Unify BGRA copy passesameerj5-82/+36
2021-09-16vk_scheduler: Use std::jthreadameerj2-17/+9
2021-09-16gpu: Use std::jthread for async gpu threadameerj4-64/+17
2021-09-14renderers: Log total pipeline countMorph2-0/+4
2021-09-14vulkan_debug_callback: Ignore InvalidCommandBuffer-VkDescriptorSet errorsameerj1-0/+1
This validation error is spammed on some titles, asserting that VkDescriptorSet 0x0[] was destroyed. This is likely a validation layer bug when using VK_KHR_push_descriptor, which can avoid using traditional VkDescriptorSet. It should be safe to ignore for now.
2021-09-13Vulkan: Disable VK_EXT_SAMPLER_FILTER_MINMAX in GCN AMD since it's broken.Fernando Sahmkow1-6/+20
2021-09-13Vulkan: Blacklist Int8Float16 Extension on AMD on driver 21.9.1Fernando Sahmkow1-1/+13
2021-09-13Vulkan/Descriptors: Increase sets per pool on AMD proprietary driver.Fernando Sahmkow3-3/+14
2021-09-13vk_swapchain: Use immediate present mode when mailbox is unavailable and FPS is unlockedameerj3-4/+38
Allows drivers that do not support VK_PRESENT_MODE_MAILBOX_KHR the ability to present at a framerate higher than the monitor's refresh rate when the FPS is unlocked.
2021-09-12vk_rasterizer: Fix dynamic StencilOp updating when two faces are enabledameerj1-6/+8
This function was incorrectly using the stencil_two_side_enable register when dynamically updating the StencilOp.
2021-09-12vk_state_tracker: Remove unused functionameerj1-4/+0
2021-09-11shader_environment: Add missing <algorithm> includeMorph1-0/+1
2021-09-11vk_descriptor_pool: Add missing <algorithm> includeMorph1-0/+1
2021-09-11slot_vector: Add missing <algorithm> includeMorph1-0/+1
2021-09-11video_core/memory_manager: Add missing <algorithm> includeMorph1-0/+2
2021-09-11codec: Add missing <string_view> includeMorph1-0/+1
2021-09-07Fix blend equation enum errorFeng Chen1-4/+4
2021-09-02renderer_vulkan: Wait on present semaphore at queue submitameerj5-26/+33
The present semaphore is being signalled by the call to acquire the swapchain image. This semaphore is meant to be waited on when rendering to the swapchain image. Currently it is waited on when presenting, but moving its usage to be waited on in the command buffer submission allows for proper usage of this semaphore. Fixes the device lost when launching titles on the Intel Linux Mesa driver.
2021-08-30structured_control_flow: Conditionally invoke demote reorder passameerj3-0/+7
This is only needed on select drivers when a fragment shader discards/demotes.
2021-08-29Garbage Collection: Make it more aggressive on high priority mode.Fernando Sahmkow3-5/+5
2021-08-29Garbage Collection: Address Feedback.Fernando Sahmkow3-5/+12
2021-08-29vulkan_device: Enable VK_KHR_swapchain_mutable_format if availableameerj3-0/+27
Silences validation errors when creating sRGB image views of linear swapchain images
2021-08-29vk_swapchain: Prefer linear swapchain format when presenting sRGB imagesameerj3-11/+10
Fixes broken sRGB when presenting from a secondary GPU.
2021-08-28Garbage Collection: enable as default, eliminate option.Fernando Sahmkow2-3/+2
2021-08-28VideoCore: Rework Garbage Collection.Fernando Sahmkow5-101/+72
2021-08-26vp9_types: Minor refactor of VP9 info structs.ameerj1-32/+29
2021-08-26vp9_types: Remove unused Vp9PictureInfo membersameerj2-24/+1
2021-08-25vulkan_device: Add a check for int8 supportameerj3-9/+19
Silences validation errors when shaders use int8 without specifying its support to the API
2021-08-21vk_rasterizer: Only clear depth and stencil buffers when set in attachment aspect maskameerj3-6/+24
Silences validation errors for clearing the depth/stencil buffers of framebuffer attachments that were not specified to have depth/stencil usage.
2021-08-19GPU_MemoryManager: Fix GetSubmappedRange.Fernando Sahmkow1-0/+1
2021-08-19video_core: eliminate constant ternaryValeri1-1/+1
`via_header_index` is already checked above, so it would never be true in this branch
2021-08-16h264: Lower max_num_ref_framesameerj1-1/+2
GPU decoding seems to be more picky when it comes to the maximum number of reference frames.
2021-08-16configure_graphics: Add GPU nvdec decoding as an optionameerj2-2/+7
Some system configurations may see visual regressions or lower performance using GPU decoding compared to CPU decoding. This setting provides the option for users to specify their decoding preference. Co-Authored-By: yzct12345 <87620833+yzct12345@users.noreply.github.com>
2021-08-16codec: Improve libav memory alloc and cleanupameerj2-14/+19
2021-08-16codec: Fallback to CPU decoding if no compatible GPU format is foundameerj2-22/+32
2021-08-16cmake: Add VDPAU and NVDEC support to FFmpeglat9nq1-0/+1
Adds {h264_,vp9_}{nvdec,vdpau} hwaccels.
2021-08-16vk_blit_screen: Fix non-accelerated texture size calculationameerj2-9/+3
Addresses the potential OOB access in UnswizzleTexture.
2021-08-15xbyak: Update include pathMerry1-1/+1
2021-08-12codec: Replace deprecated av_init_packet usageameerj1-9/+13
2021-08-12nvdec: Implement GPU accelerated decoding for all platformsameerj2-70/+92
Supplements the VA-API Intel GPU decoder by implementing the D3D11VA decoder for Windows, and CUVID/VDPAU for Nvidia and AMD drivers on Linux, respectively.
2021-08-12decoders: Templates allow memcpy optimizationsyzct123451-57/+116
2021-08-10vic: Specify sws_scale height stride.ameerj1-3/+2
Silences a sws_scale runtime warning about unaligned strides.
2021-08-08vp9: Ensure the first frame is completeameerj2-3/+3
Silences a runtime error due to the first frame missing the frame data, and being set to hidden despite being a key-frame.
2021-08-08texture_cache: Address ameerj's reviewyzct123453-7/+4
2021-08-07vulkan_memory_allocator: Respect bufferImageGranularityRobin Kertels2-2/+8
2021-08-07nvdec: Better logging for unimplemented codecsameerj1-1/+1
2021-08-07texture_cache: Address ameerj's reviewyzct123454-10/+5
2021-08-07vp9: Cleanup unused variablesameerj3-58/+17
With reference frames refreshes fix, we no longer need to buffer two frames in advance. We can also remove other unused or otherwise unneeded variables.
2021-08-07vp9: Fix reference frame refreshesameerj2-48/+31
This resolves the artifacting when decoding VP9 streams.
2021-08-05texture_cache: Don't change copyright yearyzct123454-4/+4
2021-08-05texture_cache: Address ameerj's reviewyzct1234512-1821/+1821
2021-08-05texture_cache: Split templates outyzct123457-1532/+1533
2021-08-04nvdec: Implement VA-API hardware video acceleration (#6713)yzct123455-72/+175
* nvdec: VA-API
* Verify formatting
* Forgot a semicolon for Windows
* Clarify comment about AV_PIX_FMT_NV12
* Fix assert log spam from missing negation
* vic: Remove forgotten debug code
* Address lioncash's review
* Mention VA-API is Intel/AMD
* Address v1993's review
* Hopefully fix CMakeLists style this time
* vic: Improve cache locality
* vic: Fix off-by-one error
* codec: Async
* codec: Forgot the GetValue()
* nvdec: Address ameerj's review
* codec: Fallback to CPU without VA-API support
* cmake: Address lat9nq's review
* cmake: Make VA-API optional
* vaapi: Multiple GPU
* Apply suggestions from code review
Co-authored-by: Ameer J <52414509+ameerj@users.noreply.github.com>
* nvdec: Address ameerj's review
* codec: Use anonymous instead of static
* nvdec: Remove enum and fix memory leak
* nvdec: Address ameerj's review
* codec: Remove preparation for threading
Co-authored-by: Ameer J <52414509+ameerj@users.noreply.github.com>
2021-08-02decoders: Optimize swizzle copy performance (#6790)yzct123451-9/+43
This makes UnswizzleTexture up to two times faster. It is the main bottleneck in NVDEC video decoding.
2021-08-01astc_decoder: Reduce workgroup sizeameerj3-5/+5
This reduces the amount of over-dispatching when there are odd dimensions (e.g. ASTC 8x5), which rarely evenly divide into 32x32.
2021-08-01astc_decoder: Compute offset swizzles in-shaderameerj4-109/+25
Alleviates the dependency on the swizzle table and a uniform which is constant for all ASTC texture sizes.
2021-08-01astc_decoder: Make use of uvec4 for payload dataameerj1-79/+43
2021-08-01astc_decoder: Simplify Select2DPartitionameerj1-38/+19
2021-08-01astc_decoder: Optimize the use of EncodingDataameerj6-138/+108
This buffer was a list of EncodingData structures sorted by their bit length, with some duplication from the cpu decoder implementation. We can take advantage of its sorted property to optimize its usage in the shader. Thanks to wwylele for the optimization idea.
2021-08-01astc.h: Move data to cpp implementationameerj2-64/+63
Moves leftover values that are no longer used by the gpu decoder back to the cpp implementation.
2021-07-29vk_rasterizer: Flip viewport on Y_NEGATEReinUsesLisp1-2/+7
Matches OpenGL's behavior. I don't believe this register flips geometry, but we have to try to match behavior on both backends.
2021-07-29renderers: Add explicit invert_y bool to screenshot callbackameerj4-5/+5
OpenGL and Vulkan images render in different coordinate systems. This allows us to specify the coordinate system of the screenshot within each renderer
2021-07-29renderer_vulkan: Implement screenshotsameerj2-0/+152
2021-07-29vk_blit_screen: Add public CreateFramebuffer methodameerj2-14/+18
2021-07-29vk_blit_screen: Make Draw method more genericameerj3-55/+71
Allows specifying the framebuffer and render area dimensions, rather than being hard coded for the render window.
2021-07-28renderer_vulkan: Add setting to log pipeline statisticsReinUsesLisp13-19/+278
Use VK_KHR_pipeline_executable_properties when enabled and available to log statistics about the pipeline cache in a game. For example, this is on Turing GPUs when generating a pipeline cache from Super Smash Bros. Ultimate:
Average pipeline statistics
==========================================
Code size: 6433.167
Register count: 32.939
More advanced results could be presented, at the moment it's just an average of all 3D and compute pipelines.
2021-07-27render_target: Add missing initializer for size extentLioncash1-3/+3
Everything else has a default constructor that does the straightforward thing of initializing most members to a default value, except for the size. We explicitly initialize the size (and others, for consistency), to prevent potential uninitialized reads from occurring. Particularly given the largeish surface area that this struct is used in.
2021-07-27video_core/engine: Consistently initialize rasterizer pointersLioncash2-2/+2
Ensures all of the engines have consistent and deterministic initialization of the rasterizer pointers.
2021-07-27vulkan_wrapper: Fix SetObjectName() always indicating objects as imagesLioncash1-1/+1
We should be using the passed in object type instead.
2021-07-27buffer_cache: Remove unused small_vector in CommitAsyncFlushesHigh()Lioncash1-1/+0
Given this is non-trivial, the constructor is required to execute, so this removes a bit of redundant codegen.
2021-07-27gl_shader_cache: Remove unused variableLioncash1-1/+0
2021-07-27vk_compute_pass: Remove unused capturesLioncash1-3/+2
Resolves two compiler warnings.
2021-07-26vk_staging_buffer_pool: Fall back to host memory when allocation failsRobin Kertels1-8/+21
2021-07-26vk_stream_buffer: Remove unused stream bufferReinUsesLisp2-244/+0
Remove unused file.
2021-07-26vk_compute_pass: Fix pipeline barrier for indexed quadsReinUsesLisp1-1/+1
Use an index buffer barrier instead of a vertex input read barrier.
2021-07-26vk_buffer_cache: Add transform feedback usage to null bufferReinUsesLisp1-3/+7
Fixes bad API usages on Vulkan.
2021-07-24renderer_base: Removed redundant settingsameerj3-12/+4
use_framelimiter was not being used internally by the renderers. set_background_color was always set to true as there is no toggle for the renderer background color, instead users directly choose the color of their choice.
2021-07-24general: Rename "Frame Limit" references to "Speed Limit"ameerj1-1/+1
This setting is best referred to as a speed limit, as it involves the limits of all timing based aspects of the emulator, not only framerate. This allows us to differentiate it from the fps unlocker setting.
2021-07-23vulkan/blit_image: Commit descriptor sets within worker threadReinUsesLisp1-9/+7
Fixes a race condition. The descriptor pool is not thread safe, so we have to commit descriptor sets within the same thread.
2021-07-23vulkan_device: Blacklist Volta and older from VK_KHR_push_descriptorReinUsesLisp1-4/+39
Causes crashes on Link's Awakening intro. It's hard to debug if it's our fault due to bugs in validation layers.
2021-07-23Revert "renderers: Disable async shader compilation"ReinUsesLisp2-5/+3
This reverts commit 4a152767286717fa69bfc94846a124a366f70065.
2021-07-23opengl: Fix asynchronous shadersReinUsesLisp2-4/+33
Wait for shader to build before configuring it, and wait for the shader to build before sharing it with other contexts.
2021-07-23shader_environment: Receive cache version from outsideReinUsesLisp4-16/+23
This allows us to invalidate OpenGL and Vulkan separately in the future.
2021-07-23shader: Fix disabled attribute default valuesameerj1-1/+1
2021-07-23gl_device: Simplify GLASM setting logicameerj1-15/+8
2021-07-23renderer_opengl: Use ARB_separate_shader_objectsReinUsesLisp9-116/+154
Ensures that states set for a particular stage are not attached to other stages which may not need them.
2021-07-23glsl: Clamp shared mem size to GL_MAX_COMPUTE_SHARED_MEMORY_SIZEameerj1-0/+1
2021-07-23gl_shader_cache: Properly implement asynchronous shadersReinUsesLisp1-1/+1
2021-07-23shader_recompiler, video_core: Resolve clang errorslat9nq1-3/+1
Silences the following warnings-turned-errors: -Wsign-conversion, -Wunused-private-field, -Wbraced-scalar-init, -Wunused-variable, and some other errors
2021-07-23renderers: Fix clang formattingameerj4-9/+13
2021-07-23renderers: Disable async shader compilationameerj2-3/+5
The current implementation is prone to causing graphical issues. Disable until a better solution is implemented.
2021-07-23maxwell_to_vk: Add R16_SNORMReinUsesLisp2-1/+2
2021-07-23shader: Ignore global memory ops on devices lacking int64 supportameerj2-0/+2
2021-07-23vulkan_device: Add missing include algorithmlat9nq1-0/+1
2021-07-23vulkan_device: Blacklist ampere devices from float16 mathameerj2-12/+23
2021-07-23gl_shader_cache: Fixes for async shadersameerj2-2/+25
2021-07-23vulkan_device: Enable VK_EXT_extended_dynamic_state on RADV 21.2 onwardReinUsesLisp1-4/+7
2021-07-23emit_spirv: Workaround VK_KHR_shader_float_controls on fp16 NvidiaReinUsesLisp2-0/+2
Fix regression on Fire Emblem: Three Houses when using native fp16.
2021-07-23vk_rasterizer: Workaround bug in VK_EXT_vertex_input_dynamic_stateReinUsesLisp4-19/+20
Work around a potential bug in Nvidia's driver where only updating high attributes leaves low attributes outdated.
2021-07-23shader: Fix disabled and unwritten attributes and varyingsReinUsesLisp1-15/+20
2021-07-23vk_graphics_pipeline: Implement smooth linesReinUsesLisp5-5/+65
2021-07-23vk_graphics_pipeline: Implement line widthReinUsesLisp8-8/+36
2021-07-23video_core: Enable GL SPIR-V shaderslat9nq7-38/+105
2021-07-23general: Add setting shader_backendlat9nq1-4/+6
GLASM is getting good enough that we can move it out of advanced graphics settings. This removes the setting `use_assembly_shaders`, opting for an enum class `shader_backend`. This comes with the benefits that it is extensible for additional shader backends besides GLSL and GLASM, and this will work better with a QComboBox. Qt removes the related assembly shader setting from the Advanced Graphics section and places it as a new QComboBox in the API Settings group. This will replace the Vulkan device selector when OpenGL is selected. Additionally, mark all of the custom anisotropic filtering settings as "WILL BREAK THINGS", as that is the case with a select few games.
2021-07-23glasm: Add passthrough geometry shader supportReinUsesLisp3-1/+7
2021-07-23shader: Rework varyings and implement passthrough geometry shadersReinUsesLisp7-15/+43
Put all varyings into a single std::bitset with helpers to access it. Implement passthrough geometry shaders using host's.
2021-07-23vk_graphics_pipeline: Implement conservative renderingReinUsesLisp6-10/+44
2021-07-23shader: Unify shader stage typesReinUsesLisp14-53/+28
2021-07-23shader: Emulate 64-bit integers when not supportedReinUsesLisp5-2/+13
Useful for mobile and Intel Xe devices.
2021-07-23gl_graphics_pipeline: Fix assembly shaders check for transform feedbacksReinUsesLisp1-1/+1
2021-07-23gl_graphics_pipeline: Inline hash and operator== key functionsReinUsesLisp2-12/+8
2021-07-23gl_shader_cache: Check previous pipeline before checking hash mapReinUsesLisp5-29/+41
Port optimization from Vulkan.
2021-07-23gl_graphics_pipeline: Port optimizations from Vulkan pipelinesReinUsesLisp2-57/+141
2021-07-23buffer_cache: Fix debugging leftoverReinUsesLisp1-1/+1
2021-07-23buffer_cache: Fix size reductions not having in mind bind sizesReinUsesLisp1-7/+23
A buffer binding can change between shaders without changing the shaders. This led to outdated bindings on OpenGL.
2021-07-23shaders: Allow shader notify when async shaders is disabledameerj2-11/+9
2021-07-23vk_graphics_pipeline: Use VK_KHR_push_descriptor when availableReinUsesLisp8-36/+88
~51% faster on Nvidia compared to previous method.
2021-07-23shader: Properly manage attributes not written from previous stagesReinUsesLisp2-5/+22
2021-07-23shader: Split profile and runtime info headersReinUsesLisp2-1/+2
2021-07-23shader: Add support for native 16-bit floatsReinUsesLisp5-10/+24
2021-07-23shader: Rename maxwell/program.h to translate_program.hReinUsesLisp2-2/+2
2021-07-23vulkan_device: Blacklist VK_EXT_vertex_input_dynamic_state on IntelReinUsesLisp1-0/+4
2021-07-23glsl: Address rest of feedbackameerj4-17/+22
2021-07-23glsl: Conditionally use fine/coarse derivatives based on device supportameerj1-0/+1
2021-07-23glsl: Cleanup/Address feedbackameerj1-0/+2
2021-07-23gl_shader_cache: Implement async shadersameerj7-107/+154
2021-07-23glsl: Add stubs for sparse queries and variable aoffi when not supportedameerj3-0/+8
2021-07-23gl_shader_cache: Remove const from pipeline source argumentsameerj4-6/+6
2021-07-23gl_shader_cache: Move OGL shader compilation to the respective Pipeline constructorameerj5-76/+79
2021-07-23glsl: Address more feedback. Implement indexed texture readsameerj1-3/+3
2021-07-23gl_rasterizer: Add texture fetch barrier for fragmentsameerj1-1/+1
Fixes flicker seen in XC2
2021-07-23glsl: Implement fswzaddameerj1-0/+1
and wip nv thread shuffle impl
2021-07-23glsl: Rebase fixesameerj2-3/+5
2021-07-23glsl: Use textureGrad fallback when EXT_texture_shadow_lod is unsupportedameerj1-0/+1
2021-07-23glsl: skip gl_ViewportIndex write if device does not support itameerj1-0/+1
2021-07-23glsl: Implement transform feedbackameerj1-5/+13
2021-07-23glsl: Implement VOTE for subgroup size potentially largerameerj3-1/+7
2021-07-23glsl: Implement some attribute getters and settersameerj1-1/+0
2021-07-23glsl: Query GL Device for FP16 extension supportameerj3-0/+14
2021-07-23glsl: Fixup build issuesReinUsesLisp1-1/+1
2021-07-23glsl: Initial backendameerj1-2/+5
2021-07-23vk_rasterizer: Exit render passes on fragment barriersReinUsesLisp1-0/+1
2021-07-23vk_graphics_pipeline: Fix path with no VK_EXT_extended_dynamic_stateRodrigo Locatti1-1/+1
2021-07-23buffer_cache: Invalidate fast buffers on computeReinUsesLisp1-0/+1
2021-07-23shader: Add shader loop safety check settingslat9nq1-2/+2
Also add a setting to enable Nsight Aftermath.
2021-07-23vulkan_device: Enable VK_EXT_vertex_input_dynamic_stateReinUsesLisp1-0/+28
2021-07-23vk_pipeline_cache: Skip cached pipelines with different dynamic stateReinUsesLisp1-0/+6
2021-07-23vulkan: Add VK_EXT_vertex_input_dynamic_state supportReinUsesLisp11-116/+291
Reduces the number of total pipelines generated on Vulkan. Tested on Super Smash Bros. Ultimate.
2021-07-23shader: Reorder shader cache directoriesReinUsesLisp2-18/+12
2021-07-23vk_rasterizer: Implement first indexReinUsesLisp1-2/+5
2021-07-23vulkan: Use VK_EXT_provoking_vertex when availableReinUsesLisp6-4/+52
2021-07-23gl_buffer_cache: Use unorm internal formats for snorm texture buffer viewsameerj1-1/+24
Fixes black textures in UE4 games
2021-07-23shader_environment: Fix local memory size calculationsReinUsesLisp1-3/+3
2021-07-23buffer_cache: Fix copy based uniform bindings trackingReinUsesLisp2-9/+22
2021-07-23shader_environment: Add shader_local_memory_crs_size to local memory sizeameerj1-1/+1
Fixes DOOM 2016 missing local memory
2021-07-23gl_texture_cache: Create image storage viewsReinUsesLisp4-38/+126
Fixes SULD.D tests.
2021-07-23gl_shader_util: Move shader utility code to a separate fileReinUsesLisp7-245/+106
2021-07-23gl_shader_cache: Store workers in shader cache objectReinUsesLisp2-58/+78
2021-07-23vk_pipeline_cache,shader_notify: Add shader notificationsReinUsesLisp9-90/+121
2021-07-23vk_pipeline_cache: Add asynchronous shadersReinUsesLisp3-3/+33
2021-07-23vk_rasterizer: Flush work on clear and dispatchesReinUsesLisp1-0/+3
2021-07-23DMA: Restrict optimised path for BlockToLinear further.FernandoS271-1/+2
2021-07-23vk_swapchain: Handle outdated swapchainsReinUsesLisp3-17/+34
Fixes pixelated presentation on Intel devices.
2021-07-23shader: Fix VertexA Shaders.FernandoS271-5/+21
2021-07-23vk_buffer_cache: Handle null texture buffersReinUsesLisp1-0/+4
Fixes a crash on Age of Calamity cutscenes.
2021-07-23nsight_aftermath_tracker: Fix SPIR-V module writesReinUsesLisp1-1/+1
2021-07-23vk_pipeline_cache: Set support_derivative_control to trueReinUsesLisp1-0/+1
2021-07-23glasm: Use ARB_derivative_control conditionallyReinUsesLisp3-0/+7
2021-07-23buffer_cache: Reduce uniform buffer size from shader usageReinUsesLisp9-35/+61
Increases performance significantly on certain titles.
2021-07-23transform_feedback: Read buffer stride from index instead of layoutReinUsesLisp1-1/+2
2021-07-23fixed_pipeline_state: Use regular for loop instead of ranges for perfReinUsesLisp1-2/+3
MSVC generates better code for it.
2021-07-23vk_swapchain: Avoid recreating the swapchain on each frameReinUsesLisp2-15/+9
Recreate only when requested (or sRGB is changed) instead of tracking the frontend's size. That size is still used as a hint.
2021-07-23vulkan: Conditionally use shaderInt16ReinUsesLisp3-2/+9
Add support for Polaris AMD devices.
2021-07-23vulkan: Enable depth bounds and use it conditionallyReinUsesLisp4-2/+17
Intel devices pre-Xe don't support this.
2021-07-23vk_buffer_cache: Add transform feedback usage to buffersReinUsesLisp1-15/+22
2021-07-23opengl: Declare fragment outputs even if they are not usedReinUsesLisp2-0/+9
Fixes Ori and the Blind Forest's menu on GLASM. For some reason (probably high level optimizations) it is not sanitized on SPIR-V for OpenGL. Vulkan is unaffected by this change.
2021-07-23buffer_cache: Mark uniform buffers as dirty if any enable bit changesReinUsesLisp5-7/+17
2021-07-23vulkan_device: Enable float64 and int64 conditionallyReinUsesLisp2-2/+6
Add Intel Xe support.
2021-07-23texture_cache: Reduce invalid image/sampler error severityReinUsesLisp1-7/+7
2021-07-23shader: Handle host exceptionsReinUsesLisp4-32/+55
2021-07-23glasm: Prepare XFB from state instead of global registersReinUsesLisp1-4/+2
2021-07-23glasm: Use storage buffers instead of global memory when possibleReinUsesLisp11-67/+120
2021-07-23gl_shader_cache: Add disk shader cacheReinUsesLisp3-11/+116
2021-07-23video_core,shader: Clang-format fixesReinUsesLisp2-5/+10
2021-07-23gl_shader_cache: Rename Program abstractions into PipelineReinUsesLisp10-104/+104
2021-07-23gl_shader_cache: Do not flip tessellation on OpenGLReinUsesLisp1-2/+1
2021-07-23gl_graphics_program: Fix texture buffer bindingsReinUsesLisp1-24/+35
2021-07-23gl_shader_cache: Conditionally use viewport maskReinUsesLisp1-1/+1
2021-07-23gl_shader_cache,glasm: Conditionally use typeless image reads extensionReinUsesLisp2-37/+39
2021-07-23gl_shader_cache: Improve GLASM error print logicReinUsesLisp1-7/+10
2021-07-23glasm: Implement forced early ZReinUsesLisp1-2/+2
2021-07-23glasm: Set transform feedback stateReinUsesLisp5-113/+132
2021-07-23video_core: Abstract transform feedback translation utilityReinUsesLisp6-111/+145
2021-07-23gl_shader_cache: Pass shader runtime informationReinUsesLisp1-2/+74
2021-07-23shader: Split profile and runtime information in separate structsReinUsesLisp3-237/+212
2021-07-23gl_shader_manager: Zero initialize current assembly programsReinUsesLisp1-1/+1
2021-07-23gl_shader_manager: Remove unintentionally committed #pragmaReinUsesLisp1-2/+0
2021-07-23renderer_opengl: State track compute assembly programsReinUsesLisp3-4/+21
2021-07-23renderer_opengl: State track assembly programsReinUsesLisp3-23/+56
2021-07-23HACK: Bind stages before and after bindingsReinUsesLisp1-0/+11
Works around a bug where program parameters are only applied to the current stage, and this one wasn't bound at the moment. Affects all SSBO usages on GLASM.
2021-07-23glasm: Support textures used in more than one stageReinUsesLisp1-1/+1
2021-07-23opengl: Initial (broken) support to GLASM shadersReinUsesLisp3-14/+53
2021-07-23vk_update_descriptor: Properly initialize payload on the update descriptor queueReinUsesLisp1-1/+3
2021-07-23vk_pipeline_cache: Enable int8 and int16 types on VulkanReinUsesLisp1-0/+2
2021-07-23gl_rasterizer: Flush L2 caches before glFlush on GLASMReinUsesLisp1-0/+8
2021-07-23glasm: Initial GLASM compute implementation for testingReinUsesLisp3-14/+47
2021-07-23vk_scheduler: Use locks instead of an SPSC queueReinUsesLisp2-32/+42
This tries to fix a data race where we'd wait forever for the GPU.
2021-07-23vk_query_cache: Wait before reading queriesReinUsesLisp1-9/+2
2021-07-23vk_master_semaphore: Use fetch_add to increase master semaphore tickReinUsesLisp2-6/+4
2021-07-23gl_shader_cache: Remove code unintentionally committedReinUsesLisp1-3/+0
2021-07-23Move SPIR-V emission functions to their own headerReinUsesLisp2-7/+6
2021-07-23shader: Initial OpenGL implementationReinUsesLisp35-705/+1415
2021-07-23spirv: Support OpenGL uniform buffers and change bindingsReinUsesLisp1-2/+5
2021-07-23shader: Address feedbackFernandoS271-9/+9
2021-07-23shader: Implement VertexA stageFernandoS271-3/+14
2021-07-23vk_graphics_pipeline: Fix texture buffer descriptorsReinUsesLisp1-7/+8
2021-07-23vk_scheduler: Allow command submission on worker threadReinUsesLisp8-182/+200
This changes how Scheduler::Flush works. It queues the current command buffer to be sent to the GPU but does not do it immediately. The Vulkan worker thread takes care of that. Users will have to use Scheduler::Flush + Scheduler::WaitWorker to get the previous behavior. Scheduler::Finish is unchanged. To avoid waiting on work never queued, Scheduler::Wait sends the current command buffer if that's what the caller wants to wait for.
2021-07-23vk_compute_pass: Fix -Wshadow warningReinUsesLisp1-3/+3
2021-07-23shader: Move pipeline cache logic to separate filesReinUsesLisp12-824/+1095
Move code to separate files to be able to reuse it from OpenGL. This greatly simplifies the pipeline cache logic on Vulkan. Transform feedback state is not yet abstracted and it's still intrusively stored inside vk_pipeline_cache. It will be moved when needed on OpenGL.
2021-07-23vulkan: Defer descriptor set work to the Vulkan threadReinUsesLisp8-79/+69
Move descriptor lookup and update code to a separate thread. Delaying this removes work from the main GPU thread and allows creating descriptor layouts on another thread. This slightly reduces the workload of the main thread when new pipelines are encountered.
2021-07-23vulkan: Rework descriptor allocation algorithmReinUsesLisp15-197/+314
Create multiple descriptor pools on demand. There are some degrees of freedom in what is considered a compatible pool to avoid wasting large pools on small descriptors.
2021-07-23vk_graphics_pipeline: Generate specialized pipeline config functions and improve codeReinUsesLisp2-31/+230
2021-07-23shader: Accelerate pipeline transitions and use dirty flags for shadersReinUsesLisp9-64/+114
2021-07-23vk_compute_pipeline: Fix index comparison oversight on compute texture buffersReinUsesLisp1-1/+1
2021-07-23vulkan_device: Require shaderClipDistance and shaderCullDistance featuresReinUsesLisp1-2/+4
2021-07-23vk_graphics_pipeline: Guard against non-tessellation pipelines using patchesReinUsesLisp1-2/+8
2021-07-23shader: Fix bugs and build issues on GCCRodrigo Locatti3-4/+4
2021-07-23shader: Fix render targets with null attachmentsReinUsesLisp2-26/+34
2021-07-23shader: Require dual source blendingReinUsesLisp1-1/+2
2021-07-23shader: Implement indexed texturesReinUsesLisp3-64/+95
2021-07-23shader: Move microinstruction header to the value headerReinUsesLisp1-1/+1
2021-07-23shader: Implement D3D samplersReinUsesLisp3-37/+51
2021-07-23shader: Implement SR_Y_DIRECTIONFernandoS273-0/+4
2021-07-23shader: Implement PIXLD.MY_INDEXReinUsesLisp1-1/+2
2021-07-23spirv: Implement ViewportMask with NV_viewport_array2ReinUsesLisp3-0/+12
2021-07-23shader: Implement tessellation shaders, polygon mode and invocation idReinUsesLisp6-3/+50
2021-07-23vk_pipeline_cache: Silence GCC warningslat9nq1-0/+2
Silences `-Werror=missing-field-initializers` due to missing initializers.
2021-07-23spirv: Implement image buffersReinUsesLisp4-26/+56
2021-07-23spirv: Implement alpha testameerj1-0/+36
2021-07-23shader: Implement transform feedbacks and define file formatReinUsesLisp3-7/+156
2021-07-23shader: Implement early Z testsReinUsesLisp1-0/+1
2021-07-23spirv: Rework storage buffers and shader memoryReinUsesLisp1-1/+28
2021-07-23shader: Implement geometry shadersReinUsesLisp2-7/+56
2021-07-23pipeline_helper: Simplify descriptor objects initializationReinUsesLisp1-33/+25
2021-07-23shader: Implement ATOM/S and REDameerj3-0/+21
2021-07-23nsight_aftermath_tracker: Report used shaders to Nsight AftermathReinUsesLisp6-16/+20
2021-07-23spirv: Guard against typeless image reads on unsupported devicesReinUsesLisp1-0/+1
2021-07-23vk_rasterizer: Request outside render pass execution context for computeReinUsesLisp1-0/+1
2021-07-23pipeline_helper: Add missing [[maybe_unused]]ReinUsesLisp1-1/+1
2021-07-23shader: Implement SULD and SUSTReinUsesLisp8-65/+135
2021-07-23shader: Address feedback + clang formatlat9nq1-2/+2
2021-07-23shader_recompiler,video_core: Cleanup some GCC and Clang errorslat9nq5-16/+16
Mostly fixing unused *, implicit conversion, braced scalar init, fpermissive, and some others.
Some Clang errors likely remain in video_core, and std::ranges is still a pertinent issue in shader_recompiler.
shader_recompiler: cmake: Force bracket depth to 1024 on Clang. Increases the maximum fold expression depth.
thread_worker: Include condition_variable.
Don't use list initializers in control flow.
Co-authored-by: ReinUsesLisp <reinuseslisp@airmail.cc>
2021-07-23shader: Interact texture buffers with buffer cacheReinUsesLisp14-119/+304
2021-07-23shader: Implement texture buffersReinUsesLisp4-12/+29
2021-07-23vk_pipeline_cache: Fix num of pipeline workers on weird platformsReinUsesLisp1-1/+1
2021-07-23shader: Fix ShadowCube declaration type, set number of pipeline threads based on hardwareFernandoS271-1/+3
2021-07-23vk_compute_pass: Fix compute passesReinUsesLisp3-23/+19
2021-07-23shader: Remove atomic flags and use mutex + cond variable for pipelinesReinUsesLisp4-11/+32
2021-07-23vk_pipeline_cache: Remove unnecesary scope in pipeline cache lockingReinUsesLisp1-15/+12
2021-07-23vk_pipeline_cache: Small fixes to the pipeline cacheFernandoS271-10/+14
2021-07-23shader: Mark SSBOs as written when they areFernandoS272-2/+2
2021-07-23shader: Implement ViewportIndexFernandoS271-0/+1
2021-07-23vulkan: Serialize pipelines on a separate threadReinUsesLisp2-67/+64
2021-07-23vulkan: Create pipeline layouts in separate threadsReinUsesLisp7-63/+65
2021-07-23vulkan: Build pipelines in parallel at runtimeReinUsesLisp9-165/+197
Wait from the worker thread for a pipeline to build before binding it to the command buffer. This allows queueing pipelines to multiple threads.
2021-07-23vk_pipeline_cache: Name SPIR-V modulesReinUsesLisp1-1/+11
2021-07-23shader: Address feedbackFernandoS271-1/+1
2021-07-23shader: Implement TLDFernandoS271-2/+1
2021-07-23spirv: Add fixed pipeline point sizeReinUsesLisp1-0/+3
2021-07-23shader: Implement BRXFernandoS271-1/+49
2021-07-23vk_pipeline_cache: Fix size hashing of shadersReinUsesLisp1-8/+7
2021-07-23shader: Implement LDS, STS, LDL, and STS and use SPIR-V 1.4 when availableReinUsesLisp3-19/+104
2021-07-23shader: Better interpolation and disabled attributes supportReinUsesLisp2-2/+5
2021-07-23spirv: Remove dependencies on Environment when generating SPIR-VReinUsesLisp1-7/+3
2021-07-23vk_pipeline_cache: Fix pipeline and shader cachesReinUsesLisp2-6/+21
2021-07-23shader: Fix rasterizer integration order issuesReinUsesLisp3-7/+6
2021-07-23shader: Implement TXQ and fix FragDepthReinUsesLisp1-0/+92
2021-07-23shader: Implement NDC [-1, 1], attribute types and default varying initializationReinUsesLisp3-3/+37
2021-07-23shader: Implement VOTEameerj4-1/+15
2021-07-23vk_pipeline_cache: Fix ReleaseContents orderReinUsesLisp1-2/+2
2021-07-23vk_pipeline_cache: Add pipeline cacheReinUsesLisp2-0/+7
2021-07-23vk_pipeline_cache: Add pipeline cacheReinUsesLisp4-98/+332
2021-07-23shader: Implement DMNMX, DSET, DSETPameerj1-0/+2
2021-07-23spirv: Implement VertexId and InstanceId, refactor codeReinUsesLisp1-0/+1
2021-07-23shader: Implement I2FReinUsesLisp1-1/+2
2021-07-23shader: Add partial rasterizer integrationReinUsesLisp20-410/+1298
2021-07-23spirv: Add SignedZeroInfNanPreserve logicameerj1-0/+4
2021-07-23shader: Initial support for textures and TEXReinUsesLisp4-1/+111
2021-07-23spirv: Fixes and Intel specific workaroundsReinUsesLisp1-0/+1
2021-07-23shader: Rename, implement FADD.SAT and P2R (imm)ReinUsesLisp1-2/+2
2021-07-23shader: Add denorm flush supportReinUsesLisp5-33/+50
2021-07-23spirv: Add lower fp16 to fp32 passReinUsesLisp4-9/+14
2021-07-23shader: Primitive Vulkan integrationReinUsesLisp15-2538/+430
2021-07-23shader: Remove old shader managementReinUsesLisp79-19513/+53
2021-07-23spirv: Initial SPIR-V supportReinUsesLisp2-3265/+0
2021-07-20gl_buffer_cache: Use glClearNamedBufferSubData:GL_RED instead of GL_RGBAReinUsesLisp1-1/+1
Avoids reading out of bounds from the stack.
2021-07-20buffer_cache: Simplify clear logicReinUsesLisp1-6/+2
Use existing helper functions and avoid looping when only one buffer has to be active.
2021-07-20vk_texture_cache: Use VK_IMAGE_LAYOUT_TRANSFER_SRC_OPTIMAL when possibleReinUsesLisp1-21/+35
Silences performance warnings generated from validation layers on each frame.
2021-07-20gl_texture_cache: Workaround slow PBO downloads on radeonsiReinUsesLisp1-1/+1
There's an optimization bug on non-git mesa versions where not specifying GL_CLIENT_STORAGE_BIT causes very slow reads on the CPU side. Add this bit for all vendors.
2021-07-20vk_buffer_cache: Fix quad index array with 0 vertices (#6627)Fernando S1-0/+7
2021-07-19Update src/video_core/renderer_vulkan/vk_texture_cache.cppyzct123451-1/+1
Co-authored-by: Vitor K <vitor-kiguchi@hotmail.com>
2021-07-19Update src/video_core/renderer_vulkan/vk_texture_cache.cppyzct123451-1/+1
Co-authored-by: Vitor K <vitor-kiguchi@hotmail.com>
2021-07-18Ignore wrong blit formatyzct123451-1/+4
2021-07-18vk_texture_cache: Finalize renderpass when downloading imagesReinUsesLisp1-0/+1
2021-07-18vk_compute_pass: Fix pipeline barriers on non-initialized ASTC imagesReinUsesLisp1-2/+3
2021-07-18vk_compute_pass: Fix ASTC buffer setup synchronizationReinUsesLisp1-14/+14
2021-07-18texture_cache/util: Fix size calculations of multisampled imagesReinUsesLisp1-53/+33
On the texture cache we handle multisampled images by keeping their real size in samples (e.g. 1920x1080 with 4 samples is 3840x2160). This works nicely with size matches and other comparisons, but the calculation for guest sizes did not take this into account, and the size was being multiplied (again) by the number of samples per dimension. For example, a 3840x2160 texture cache image had its width and height multiplied by 2, resulting in a much larger texture. Fix this issue. - Fixes performance regression on cooking related titles when an unrelated bug was fixed.
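A minimal sketch of the size convention described in this commit message. `Extent2D`, `SamplesPerDim`, `ToSampleSize` and `ToGuestSize` are hypothetical names, not the texture cache's real API, and the per-dimension layout for each sample count is an assumption. The point it illustrates is that the scaling between guest size and size in samples is applied exactly once in each direction:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical types for illustration; not the real texture cache API.
struct Extent2D {
    std::uint32_t width;
    std::uint32_t height;
};

// Samples per dimension for a given sample count (assumed layout).
constexpr Extent2D SamplesPerDim(std::uint32_t num_samples) {
    switch (num_samples) {
    case 2:
        return {2, 1};
    case 4:
        return {2, 2};
    case 8:
        return {4, 2};
    case 16:
        return {4, 4};
    default:
        return {1, 1};
    }
}

// Guest size -> size in samples, the form kept by the texture cache.
constexpr Extent2D ToSampleSize(Extent2D guest, std::uint32_t num_samples) {
    const Extent2D s = SamplesPerDim(num_samples);
    return {guest.width * s.width, guest.height * s.height};
}

// Size in samples -> guest size. This divides; multiplying here again
// would be the double scaling that the commit message describes.
constexpr Extent2D ToGuestSize(Extent2D in_samples, std::uint32_t num_samples) {
    const Extent2D s = SamplesPerDim(num_samples);
    return {in_samples.width / s.width, in_samples.height / s.height};
}
```

With these helpers, a 1920x1080 image with 4 samples is stored as 3840x2160 and converts back to 1920x1080.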
2021-07-18texture_cache: Always prepare image views on render targetsReinUsesLisp1-0/+6
Images used as render targets were not being "prepared", causing desynchronizations on the texture cache. Needs #6669 to avoid performance regressions on certain cooking titles. - Fixes black shadows on Age of Calamity.
2021-07-15vic: Fix dimension compuation of YUV framesameerj1-11/+10
Fixes out of bound memory crashes in Mario Golf
2021-07-15Buffer cache: Fixes, Clang and Feedback.Fernando Sahmkow3-11/+10
2021-07-14GPUMemoryManager: Force inmediate invalidation when writting block.Fernando Sahmkow1-1/+1
2021-07-14Buffer Cache: Fixes to DMA Copy.Fernando Sahmkow1-6/+7
2021-07-14DMAEngine: Revert flushing from Pitch to BlpockLinear.Fernando Sahmkow1-2/+7
2021-07-14BufferCache: fix clearing on forced download.Fernando Sahmkow1-10/+20
2021-07-13vk_rasterizer: Only clear valid color attachmentsameerj1-2/+4
2021-07-13DMAEngine: Accelerate BufferClearFernando Sahmkow11-6/+115
2021-07-12accelerateDMA: Fixes and feedback.Fernando Sahmkow3-88/+62
2021-07-11accelerateDMA: Accelerate Buffer Copies.Fernando Sahmkow9-13/+176
2021-07-10Buffer Cache: Address Feedback.Fernando Sahmkow2-4/+9
2021-07-09Buffer Cache: Fix GCC copmpile errorFernando Sahmkow1-1/+0
2021-07-09Fence Manager: remove reference fencing.Fernando Sahmkow3-31/+6
2021-07-09BufferCache: Additional download fixes.Fernando Sahmkow2-23/+107
2021-07-09Buffer Cache: Revert unnecessary range reduction.Fernando Sahmkow1-29/+13
2021-07-09Fence Manager: Force ordering on WFI.Fernando Sahmkow4-38/+71
2021-07-09Buffer Cache: Eliminate the AC Hack as the base game is fixed in Hades.Fernando Sahmkow1-14/+4
2021-07-09Fence Manager: Add fences on Reference Count.Fernando Sahmkow8-6/+57
2021-07-09Videocore: Address Feedback & CLANG Format.Fernando Sahmkow2-78/+75
2021-07-09Buffer Cache: Fix High Downloads and don't predownload on Extreme.Fernando Sahmkow3-91/+122
2021-07-09vk_buffer_cache: Use emulated null buffers for transform feedbackReinUsesLisp2-11/+19
Vulkan does not support null buffers on transform feedback bindings. Emulate these using the same null buffer we were using for index buffers.
2021-07-09configure_graphics: Use u8 for bg_color valuesameerj2-6/+7
2021-07-08Out of bound blit (#6531)Feng Chen2-58/+35
* Fix out of bound blit error
* Fix code read
* Fix ci error
Co-authored-by: Feng Chen <chen.feng@gloritysolutions.com>
2021-07-07util_shaders: Fix BindImageTexturelat9nq1-2/+2
According to https://gitlab.freedesktop.org/mesa/mesa/-/issues/3820#note_753371 we need to set these to true for use with 3D textures. Fixes BOTW teleporting on RadeonSI and iris.
2021-07-04Texture Cache: Fix collision with multiple overlaps of the same sparse texture.Fernando Sahmkow1-1/+6
2021-07-04Texture Cache: Fix GCC & Clang.Fernando Sahmkow2-11/+11
2021-07-04Texture Cache: Address feedback.Fernando Sahmkow5-18/+37
2021-07-04Texture Cache: Improve accuracy of sparse texture detection.Fernando Sahmkow6-131/+342
2021-07-04Texture Cache: Initial Implementation of Sparse Textures.Fernando Sahmkow12-23/+310
2021-07-03TextureCacheOGL: Implement Image Copies for 1D and 1D Array.Fernando Sahmkow1-0/+26
2021-07-03TextureCache: Fix 1D to 2D overlapps.Fernando Sahmkow1-3/+0
2021-07-01Slightly refactor NVDEC and codecs for readability and safetyKelebek110-356/+522
2021-06-29yuzu qt: Make most UISettings a BasicSettinglat9nq1-4/+9
For simple primitive settings, moves their defaults and labels to definition time. Also fixes typo and clang-format.
yuzu qt: config: Fix rng_seed
2021-06-28general: Make most settings a BasicSettinglat9nq1-10/+5
Creates a new BasicSetting class in common/settings, and forces setting a default and label for each setting that uses it in common/settings. Moves defaults and labels from both frontends into common settings. Creates a helper function in each frontend to facilitate reading the settings now with the new default and label properties. Settings::Setting is also now a subclass of Settings::BasicSetting. Also adds documentation for both Setting and BasicSetting.
2021-06-28video_core: Remove #pragma warning directives for external headersMorph2-15/+0
2021-06-28video_core: Enforce C4242Morph1-3/+2
2021-06-28video_core: Silence signed/unsigned mismatch warningsMorph4-5/+6
2021-06-26buffer_cache: Only flush downloaded sizeReinUsesLisp1-2/+3
Fixes a regression unintentionally introduced by the garbage collector. This makes regular memory downloads only flush the requested sizes. This negatively affected Koei Tecmo games.
2021-06-26video_core: Enforce C4244ReinUsesLisp1-0/+1
Enforce implicit integer casts to a smaller type as errors.
2021-06-26codec,vic: Disable warnings in ffmpeg headersReinUsesLisp2-4/+29
2021-06-26vk_buffer_cache: Silence implicit cast warningsReinUsesLisp1-2/+3
2021-06-26buffer_cache/texture_cache: Make GC functions privateReinUsesLisp2-5/+5
2021-06-26buffer_cache: Silence implicit cast warningReinUsesLisp1-1/+1
2021-06-25vulkan_device: Make device memory match the rest of the fileReinUsesLisp2-19/+18
Match the style in the file.
2021-06-24common: Replace common_sizes into user-literalsWunkolo5-14/+27
Removes common_sizes.h in favor of having `_KiB`, `_MiB`, `_GiB`, etc user-literals within literals.h. To keep the global namespace clean, users will have to use:
```
using namespace Common::Literals;
```
to access these literals.
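A minimal sketch of how such user-defined literals could be declared and used. The namespace and the `_KiB`/`_MiB`/`_GiB` suffixes come from the commit message; the operator bodies and the `staging_buffer_size` constant are assumptions for illustration and may not match literals.h:

```cpp
#include <cassert>
#include <cstdint>

namespace Common::Literals {

// Assumed definitions: each literal scales by a power of 1024.
constexpr std::uint64_t operator""_KiB(unsigned long long n) {
    return n * 1024ULL;
}
constexpr std::uint64_t operator""_MiB(unsigned long long n) {
    return n * 1024_KiB;
}
constexpr std::uint64_t operator""_GiB(unsigned long long n) {
    return n * 1024_MiB;
}

} // namespace Common::Literals

// Callers opt in explicitly, which keeps the global namespace clean elsewhere.
using namespace Common::Literals;

// Hypothetical constant showing the literals in use.
constexpr std::uint64_t staging_buffer_size = 64_MiB;
static_assert(4_KiB == 4096, "1 KiB is 1024 bytes");
```

Because the operators are `constexpr`, the sizes fold at compile time and can be used in `static_assert` and array bounds.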
2021-06-23General: Resolve fmt specifiers to adhere to 8.0.0 API where applicableLioncash2-2/+2
Also removes some deprecated API usages.
2021-06-23maxwell3d: Add missing return in default SizeInBytes() caseLioncash1-0/+1
We were returning '1' in ComponentCount()'s default case but were neglecting to do the same with SizeInBytes().
2021-06-22Reaper: Set minimum cleaning limit on OGL.Fernando Sahmkow1-1/+4
2021-06-22common: fs: Remove [[nodiscard]] attribute on Remove* functionsMorph1-1/+1
There are a lot of scenarios where we don't particularly care whether or not the removal operation succeeds and just simply attempt a removal. As such, removing the [[nodiscard]] attribute is best for these functions.
2021-06-22bootmanager: Use std::stop_source for stopping emulationReinUsesLisp5-8/+8
Use its std::stop_token to abort shader cache loading. Using std::stop_token instead of std::atomic_bool allows the usage of other utilities like std::stop_callback.
2021-06-22vk_master_semaphore: Use jthread for debug threadReinUsesLisp2-19/+8
2021-06-21gl_device: Expand on Mesa driver nameslat9nq1-3/+28
Makes this list a bit more capable at identifying Mesa drivers. Tries to deal with two of the overloaded vendor strings in a more generic fashion.
2021-06-21video_core: Add GPU vendor name to window title barameerj7-4/+66
2021-06-20Reaper: Guarantee correct deletion.Fernando Sahmkow5-2/+23
2021-06-19util_shaders: Specify ASTC decoder memory barrier bitsameerj1-1/+6
2021-06-19astc_decoder.comp: Remove unnecessary LUT SSBOsameerj5-113/+34
We can move them to instead be compile time constants within the shader.
2021-06-19astc: Various robustness enhancements for the gpu decoderameerj5-47/+16
These changes should help in reducing crashes/driver panics that may occur due to synchronization issues between the shader completion and later access of the decoded texture.
2021-06-18vulkan_debug_callback: Skip logging known false-positive validation errorsameerj1-0/+8
Avoids overwhelming the log with validation errors that are not applicable
2021-06-17Reaper: Correct size calculation on Vulkan.Fernando Sahmkow1-5/+3
2021-06-17Reaper: Change memory restrictions on TC depending on host memory on VK.Fernando Sahmkow8-39/+88
2021-06-16Reaper: Address Feedback.Fernando Sahmkow4-19/+41
2021-06-16Reaper: Setup settings and final tuning.Fernando Sahmkow3-32/+38
2021-06-16Reaper: Tune it up to be an smart GC.Fernando Sahmkow5-13/+130
2021-06-16Initial Reaper SetupReinUsesLisp6-56/+226
WIP
2021-06-16vulkan_memory_allocator: Release allocations with no commitsReinUsesLisp2-5/+22
2021-06-16astc_decoder: Fix LDR CEM1 endpoint calculationameerj2-2/+2
Per the spec, L1 is clamped to the value 0xff if it is greater than 0xff. An oversight caused us to take the maximum of L1 and 0xff, rather than the minimum. Huge thanks to wwylele for finding this. Co-Authored-By: Weiyi Wang <wwylele@gmail.com>
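The oversight is easy to see side by side. The helper names below are hypothetical; only the clamp of L1 to 0xff (minimum rather than maximum) is taken from the commit message:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Buggy form: taking the maximum means the result is never below 0xff.
constexpr std::uint32_t ClampL1Buggy(std::uint32_t l1) {
    return std::max<std::uint32_t>(l1, 0xff);
}

// Fixed form: taking the minimum clamps values above 0xff down to 0xff
// and leaves smaller values untouched.
constexpr std::uint32_t ClampL1Fixed(std::uint32_t l1) {
    return std::min<std::uint32_t>(l1, 0xff);
}
```

For an in-range value such as 0x40, the buggy form returns 0xff while the fixed form returns 0x40.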
2021-06-16configure_graphics: Add Accelerate ASTC decoding settingameerj2-2/+11
2021-06-16textures: Reintroduce CPU ASTC decoderameerj4-2/+1592
Users may want to fall back to the CPU ASTC texture decoder due to hangs and crashes that may be caused by keeping the GPU under compute heavy loads for extended periods of time. This is especially the case in games such as Astral Chain which make extensive use of ASTC textures.
2021-06-15texture_cache/util: Avoid relaxed image views on different bytes per pixelReinUsesLisp1-1/+9
Avoids API usage errors on UE4 titles leading to crashes.
2021-06-13cmake: Fix find_program usage for 3.15lat9nq1-1/+4
yuzu requires CMake 3.15 yet find_program was using REQUIRED, which is only available on 3.18 and later. Instead, we check for "<VAR>-NOTFOUND". In addition, check for additional requirements before building libusb or FFmpeg with autotools. Otherwise, CMake configuration will pass yet compilation will fail.
2021-06-11GPUTHread: Remove async reads from Normal Accuracy.Fernando Sahmkow1-18/+6
2021-06-11rasterizer: Update pages in batchesReinUsesLisp1-15/+41
2021-06-10Fix GCC undefined behavior sanitizer.Markus Wick2-0/+6
* Wrong alignment in u64 LOG_DEBUG -> memcpy.
* Huge shift exponent in stride calculation for linear buffer, unused result -> skipped.
* Large shift in buffer cache if word = 0, skip checking for set bits.
None of those were critical, so this should not change any behavior. At least with the assumption that the last one used masking behavior, which always yields continuous_bits = 0.
2021-06-04decoders: Break instead of continuelat9nq1-2/+2
continue causes a memory leak in A Hat in Time.
2021-06-04decoders: Avoid out-of-bounds accesslat9nq1-0/+8
This is not a real fix, so assert here and continue before crashing.
2021-06-01buffer_cache: Simplify uniform disabling logicameerj8-6/+29
2021-05-29video_core: gpu: WaitFence: Do not block threads during shutdown.bunnei2-1/+13
- Fixes a hang on shutdown when NVFlinger thread is waiting on a syncpoint that will never occur. - Commonly observed when stopping emulation in Super Mario Odyssey.
2021-05-29Fix two GCC 11 warnings: Unneeded copies.Markus Wick1-2/+2
std::move created an unneeded copy. Iterating without a reference also created copies.
2021-05-27video_core: rasterizer_cache: Use u16 for cached page count.bunnei2-9/+9
- Greatly reduces the risk of overflow, at the cost of doubling the size of this array.
2021-05-27vulkan_memory_allocator: Allow textures to be allocated in host memoryReinUsesLisp2-31/+43
Allow Vulkan's allocator to use host memory when there's no more device local memory. This delays OOM, but it will eventually still happen.
2021-05-26common: fs: Rework the Common Filesystem interface to make use of std::filesystem (#6270)Morph5-106/+116
* common: fs: fs_types: Create filesystem types
  Contains various filesystem types used by the Common::FS library
* common: fs: fs_util: Add std::string to std::u8string conversion utility
* common: fs: path_util: Add utility functions for paths
  Contains various utility functions for getting or manipulating filesystem paths used by the Common::FS library
* common: fs: file: Rewrite the IOFile implementation
* common: fs: Reimplement Common::FS library using std::filesystem
* common: fs: fs_paths: Add fs_paths to replace common_paths
* common: fs: path_util: Add the rest of the path functions
* common: Remove the previous Common::FS implementation
* general: Remove unused fs includes
* string_util: Remove unused function and include
* nvidia_flags: Migrate to the new Common::FS library
* settings: Migrate to the new Common::FS library
* logging: backend: Migrate to the new Common::FS library
* core: Migrate to the new Common::FS library
* perf_stats: Migrate to the new Common::FS library
* reporter: Migrate to the new Common::FS library
* telemetry_session: Migrate to the new Common::FS library
* key_manager: Migrate to the new Common::FS library
* bis_factory: Migrate to the new Common::FS library
* registered_cache: Migrate to the new Common::FS library
* xts_archive: Migrate to the new Common::FS library
* service: acc: Migrate to the new Common::FS library
* applets/profile: Migrate to the new Common::FS library
* applets/web: Migrate to the new Common::FS library
* service: filesystem: Migrate to the new Common::FS library
* loader: Migrate to the new Common::FS library
* gl_shader_disk_cache: Migrate to the new Common::FS library
* nsight_aftermath_tracker: Migrate to the new Common::FS library
* vulkan_library: Migrate to the new Common::FS library
* configure_debug: Migrate to the new Common::FS library
* game_list_worker: Migrate to the new Common::FS library
* config: Migrate to the new Common::FS library
* configure_filesystem: Migrate to the new Common::FS library
* configure_per_game_addons: Migrate to the new Common::FS library
* configure_profile_manager: Migrate to the new Common::FS library
* configure_ui: Migrate to the new Common::FS library
* input_profiles: Migrate to the new Common::FS library
* yuzu_cmd: config: Migrate to the new Common::FS library
* yuzu_cmd: Migrate to the new Common::FS library
* vfs_real: Migrate to the new Common::FS library
* vfs: Migrate to the new Common::FS library
* vfs_libzip: Migrate to the new Common::FS library
* service: bcat: Migrate to the new Common::FS library
* yuzu: main: Migrate to the new Common::FS library
* vfs_real: Delete the contents of an existing file in CreateFile
  Current usages of CreateFile expect to delete the contents of an existing file, retain this behavior for now.
* input_profiles: Don't iterate the input profile dir if it does not exist
  Silences an error produced in the log if the directory does not exist.
* game_list_worker: Skip parsing file if the returned VfsFile is nullptr
  Prevents crashes in GetLoader when the virtual file is nullptr
* common: fs: Validate paths for path length
* service: filesystem: Open the mod load directory as read only
2021-05-16buffer_cache: Ensure null buffers cannot take the fast uniform bind pathameerj1-1/+4
Fixes a crash in New Pokemon Snap
2021-05-16perf_stats: Rework FPS counter to be more accurateameerj4-0/+9
The FPS counter was based on metrics in the nvdisp swapbuffers call. This metric would be accurate if the gpu thread/renderer were synchronous with the nvdisp service, but that's no longer the case. This commit moves the frame counting responsibility onto the concrete renderers after their frame draw calls, resulting in more meaningful metrics. The displayed FPS is now made up of the average framerate between the previous and most recent update, in order to avoid distracting FPS counter updates when framerate is oscillating between close values. The status bar update frequency was also changed from 2 seconds to 500ms.
2021-05-08texture_cache: Handle out of bound texture blitsameerj8-61/+99
Some games interleave a texture blit using regions which are out-of-bounds. This addresses the interleaving to avoid oob reads from the src texture.
2021-05-06hle: kernel: Rename Process to KProcess.bunnei3-3/+3
2021-04-26gl_device: Intel: Disable texture view formats workaround on mesaA-w-x1-1/+1
2021-04-25vk_texture_cache: Swap R and B channels of color flipped formatameerj1-1/+24
Swaps the Red and Blue channels of the A1B5G5R5_UNORM texture format, which was being incorrectly rendered.
2021-04-25nvhost_vic: Fix device closureameerj2-5/+3
Implements the OnClose method of the nvhost_vic device, and removes the remnants of an older implementation. Also cleans up some of the surrounding code.
2021-04-19texture_cache/util: Fix src being used instead of dst within DeduceBlitImagesLioncash1-1/+1
This line can only ever be reached if src is null, so dereferencing it here is a logic bug that slipped through. Instead, we dereference dst, which is guaranteed to be valid.
2021-04-16Address issuesChloe Marcec1-3/+2
2021-04-15common: Move settings to common from core.bunnei18-18/+18
- Removes a dependency on core and input_common from common.
2021-04-12engine_interface: Add missing virtual destructorLioncash4-4/+5
Eliminates a potential bug vector related to inheritance. Plus, we should generally be specifying the destructor as virtual within purely virtual interfaces to begin with.
2021-04-12vk_master_semaphore: Deduplicate atomic access within IsFree()Lioncash1-1/+1
We can just reuse the already existing KnownGpuTick() to deduplicate the access.
2021-04-12vk_master_semaphore: Add missing const qualifier for IsFree()Lioncash1-1/+1
This member function doesn't modify class state.
2021-04-12vk_texture_cache: Make use of Common::BitCast where applicableLioncash1-5/+6
Also clarify the TODO comment a little more on the lacking implementations for std::bit_cast.
2021-04-12texure_cache/util: Resolve implicit sign conversions with std::reduceLioncash2-11/+15
Amends implicit sign conversions occurring with usages of std::reduce and also relocates it to its own utility function to reduce verbosity a little bit.
2021-04-12query_cache: Make use of std::erase_ifLioncash1-5/+4
Same behavior, but much more straightforward to read.
2021-04-11vk_buffer_cache: Fix offset for NULL vertex buffersJoshua Ashton1-1/+1
The Vulkan spec states: If an element of pBuffers is VK_NULL_HANDLE, then the corresponding element of pOffsets must be zero. https://www.khronos.org/registry/vulkan/specs/1.2-extensions/man/html/vkCmdBindVertexBuffers2EXT.html#VUID-vkCmdBindVertexBuffers2EXT-pBuffers-04112
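A minimal sketch of the rule quoted above, written without the Vulkan headers so it builds on its own. `BufferHandle`, `DeviceSize`, `kNullHandle`, `VertexBinding` and `MakeVertexBinding` are hypothetical stand-ins for `VkBuffer`, `VkDeviceSize`, `VK_NULL_HANDLE` and the buffer cache's binding code:

```cpp
#include <cassert>
#include <cstdint>

// Stand-ins so the sketch compiles without Vulkan headers.
using BufferHandle = std::uint64_t;
using DeviceSize = std::uint64_t;
constexpr BufferHandle kNullHandle = 0; // plays the role of VK_NULL_HANDLE

struct VertexBinding {
    BufferHandle buffer;
    DeviceSize offset;
};

// A null buffer must be bound with offset zero, whatever offset the
// guest state requested; real buffers keep their offset.
constexpr VertexBinding MakeVertexBinding(BufferHandle buffer, DeviceSize offset) {
    return {buffer, buffer == kNullHandle ? DeviceSize{0} : offset};
}
```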
2021-04-11vulkan_device: Enable EXT_robustness2 featuresJoshua Ashton1-0/+9
When this was being made mandatory, the enablement of these features was removed, but this is still needed. Fixes: 757fd1e91716 ("vulkan_device: Require VK_EXT_robustness2")
2021-04-11renderer_vulkan: Check return value of AcquireNextImageJoshua Ashton3-5/+10
We can get into a really bad state by ignoring this leading to device loss and using incorrect resources.
2021-04-07video_core: Use a CV for blocking commands.Markus Wick2-23/+31
There is no need for a busy loop here. Let's just use a condition variable to save some power.
2021-04-07video_core/gpu_thread: Keep the write lock for allocating the fence.Markus Wick2-1/+4
Otherwise the fence might get submitted out-of-order into the queue, which makes testing them pointless. Overhead should be tiny as the mutex is just moved from the queue to the writing code.
2021-04-07video_core/gpu_thread: Implement a ShutDown method.Markus Wick4-14/+27
This was implicitly done by `is_powered_on = false`, however the explicit method allows us to block until the GPU is actually gone. This should fix a race condition while removing the other subsystems while the GPU is still active.
2021-04-07common/threadsafe_queue: Provide Wait() method.Markus Wick1-2/+1
It blocks until there is something to consume in the queue. Use it for the GPU emulation instead of the spin loop. The spin loop only ran while booting the emulator; however, in BOTW that lasts about 1 second.
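One common way to provide a blocking `Wait()` is a mutex plus a condition variable, in the spirit of the gpu_thread commit from the same day. The `BlockingQueue` class below is a hypothetical sketch under that assumption and is not the actual threadsafe_queue implementation:

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <utility>

// Hypothetical queue for illustration only.
template <typename T>
class BlockingQueue {
public:
    void Push(T value) {
        {
            std::lock_guard<std::mutex> lock{mutex};
            items.push(std::move(value));
        }
        // Wake one sleeping consumer; no spinning is needed on either side.
        cv.notify_one();
    }

    // Blocks until there is something to consume in the queue.
    void Wait() {
        std::unique_lock<std::mutex> lock{mutex};
        cv.wait(lock, [this] { return !items.empty(); });
    }

    // Blocks until an element is available, then removes and returns it.
    T PopWait() {
        std::unique_lock<std::mutex> lock{mutex};
        cv.wait(lock, [this] { return !items.empty(); });
        T value = std::move(items.front());
        items.pop();
        return value;
    }

private:
    std::mutex mutex;
    std::condition_variable cv;
    std::queue<T> items;
};
```

The predicate passed to `cv.wait` guards against spurious wakeups, and the consumer thread sleeps in the kernel instead of burning CPU time while the queue is empty.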
2021-04-05vp9: Avoid memcpy with null pointerslat9nq1-7/+9
Avoid sending null pointer to memcpy as reported by Undefined Behaviour Sanitizer. Replaces the std::memcpy calls in SpliceVectors with std::copy calls. Opting to replace all the memcpy's with copy's. Co-authored-by: LC <mathew1800@gmail.com>
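A minimal sketch of the replacement described above. The signature of `SpliceVectors` here is an assumption and may differ from the real function in vp9; what it illustrates is that `std::copy` over an empty range is a no-op, whereas passing a null pointer to `std::memcpy` is undefined behavior even when the size is zero, and an empty vector's `data()` may be null:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical signature; concatenates two byte vectors.
std::vector<std::uint8_t> SpliceVectors(const std::vector<std::uint8_t>& first,
                                        const std::vector<std::uint8_t>& second) {
    std::vector<std::uint8_t> out(first.size() + second.size());
    // Iterator-based copies never dereference anything for empty inputs.
    const auto middle = std::copy(first.begin(), first.end(), out.begin());
    std::copy(second.begin(), second.end(), middle);
    return out;
}
```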
2021-03-30nvdrv: Cleanup CDMA Processor on device closureChloe Marcec2-5/+11
Brings us a step closer to unifying all channels to share a common interface.
2021-03-30vulkan_common: enable OpenGL interop on other UnicesJan Beich2-5/+5
2021-03-25astc_decoder: Refactor for style and more efficient memory useameerj9-2256/+502
2021-03-24gl_device: unblock async shaders on other Unix systemsJan Beich1-1/+1
Mesa is the primary OpenGL provider on all FreeDesktop systems. For example, iris is used on Intel GPU + FreeBSD by default.
2021-03-21gl_device: Block async shaders on AMD and Intellat9nq1-1/+13
Currently, the Windows versions of the Intel OpenGL driver and the AMD proprietary OpenGL driver do not properly support asynchronous shader compilation (and in fact degrade when it is enabled). This blocks specifically those drivers from using this feature. This affects AMDGPU-PRO on Linux, and AMD's and Intel's OpenGL drivers on Windows.
2021-03-13astc_decoder: Reimplement LayersRodrigo Locatti5-142/+161
Reimplements the approach to decoding layers in the compute shader. Fixes multilayer astc decoding when using Vulkan.
2021-03-13astc_decoder: Fix out of bounds memory accessameerj1-2/+10
Resolves a crash with some anomalous textures found in Astral Chain.
2021-03-13renderer_vulkan: Accelerate ASTC decodingameerj11-57/+426
Co-Authored-By: Rodrigo Locatti <reinuseslisp@airmail.cc>
2021-03-13host_shaders: Modify shader cmake integration to allow for larger shadersameerj4-8/+27
Using a raw string to encapsulate the entire shader code limits us to shaders of size less than 2KB. This change overcomes this limitation.
2021-03-13renderer_opengl: Accelerate ASTC texture decoding with a compute shaderameerj6-2/+1598
ASTC texture decoding is currently handled by a CPU decoder for GPUs without native ASTC decoding support (most desktop GPUs). This is the cause for noticeable performance degradation in titles which use the format extensively. This commit adds support to accelerate ASTC decoding using a compute shader on OpenGL for GPUs without native support.
2021-03-13video_core: rasterizer_accelerated: Fix un/signed mismatch.bunnei1-1/+2
2021-03-04texture_cache: Blacklist BGRA8 copies and views on OpenGLameerj9-28/+80
In order to force the BGRA8 conversion on Nvidia using OpenGL, we need to forbid texture copies and views with other formats. This commit also adds a boolean relating to this, as this needs to be done only for the OpenGL API; Vulkan must remain unchanged.
2021-03-04renderer_opengl: Swizzle BGR textures on copyameerj5-2/+132
OpenGL does not natively support BGR internal formats, which causes many BGR textures to render incorrectly, with Red and Blue channels swapped. This commit aims to address this by swizzling the blue and red channels on texture copies when a BGR format is encountered.
2021-03-03video_core: rasterizer_accelerated: Fix delta check ordering.bunnei1-3/+3
2021-03-03video_core: rasterizer_accelerated: Improve error handling & fix implicit conversion.bunnei1-4/+8
2021-03-03video_core: rasterizer_accelerated: Use a flat array instead of interval_map for cached pages.bunnei2-44/+32
- Uses a fixed 64MB for the cache instead of an ever growing map.
- Slightly faster by using atomics instead of a single mutex for access.
- Thanks to Rodrigo for the idea.
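A minimal sketch of the flat-array idea: one atomic counter per page, indexed directly by page number, so no mutex or interval map is needed. The `CachedPages` class, the 4 KiB page size and the tiny page count are assumptions for illustration; the `u16` counter width follows the later rasterizer_cache commit in this log rather than this one:

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical class; the real table covers the whole guest address space.
class CachedPages {
public:
    static constexpr std::size_t kPageBits = 12;   // assumed 4 KiB pages
    static constexpr std::size_t kNumPages = 4096; // small range for the sketch

    // Returns true if the page just became cached (count went 0 -> 1).
    // The caller must keep addr inside the covered range.
    bool Increment(std::uint64_t addr) {
        return counts[addr >> kPageBits].fetch_add(1, std::memory_order_relaxed) == 0;
    }

    // Returns true if the page just became uncached (count went 1 -> 0).
    bool Decrement(std::uint64_t addr) {
        return counts[addr >> kPageBits].fetch_sub(1, std::memory_order_relaxed) == 1;
    }

private:
    // Fixed-size table: memory use is constant and lookups are O(1).
    std::array<std::atomic<std::uint16_t>, kNumPages> counts{};
};
```

Each update is a single atomic read-modify-write on one slot, so threads touching different pages do not contend with each other.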
2021-03-02buffer_cache: Heuristically decide to skip cache on uniform buffersReinUsesLisp2-11/+37
Some games benefit from skipping caches (Pokémon Sword), and others don't (Animal Crossing: New Horizons). Add a heuristic to decide this at runtime. The cache hit ratio has to be ~98% or better to not skip the cache. There are 16 frames of buffer.
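A minimal sketch of such a heuristic, assuming a ring of per-frame hit and access counters. The `UniformCacheHeuristic` class and its method names are hypothetical, and the exact 98/100 comparison is an assumption standing in for the "~98%" threshold; only the 16-frame window and the approximate ratio come from the commit message:

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper; not the buffer cache's real bookkeeping.
class UniformCacheHeuristic {
public:
    static constexpr std::size_t kFrames = 16; // "16 frames of buffer"

    // Record one uniform buffer bind and whether the cache already had it.
    void RecordAccess(bool hit) {
        ++shots[frame];
        if (hit) {
            ++hits[frame];
        }
    }

    // Advance to the next frame slot and clear it for reuse.
    void EndFrame() {
        frame = (frame + 1) % kFrames;
        hits[frame] = 0;
        shots[frame] = 0;
    }

    // Skip the cache when fewer than 98% of recent accesses were hits.
    bool ShouldSkipCache() const {
        std::uint64_t total_hits = 0;
        std::uint64_t total_shots = 0;
        for (std::size_t i = 0; i < kFrames; ++i) {
            total_hits += hits[i];
            total_shots += shots[i];
        }
        if (total_shots == 0) {
            return false; // no data yet: keep using the cache
        }
        return total_hits * 100 < total_shots * 98;
    }

private:
    std::array<std::uint32_t, kFrames> hits{};
    std::array<std::uint32_t, kFrames> shots{};
    std::size_t frame = 0;
};
```

Averaging over several frames keeps the decision from flipping on a single unusual frame.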
2021-03-01gpu_thread: Remove Async NVDEC placeholdersameerj3-26/+8
This commit removes early placeholders for an implementation of async nvdec. With recent changes to the source code, the placeholders are no longer accurate, and can cause a nullptr dereference due to the nature of the cdma_pusher lifetime.
2021-02-24Implement glDepthRangeIndexeddNVKelebek13-1/+12
2021-02-23vk_command_pool: Reduce the command pool size from 4096 to 4ReinUsesLisp1-1/+1
This allows drivers to reuse memory more easily and preallocate less. The optimal number has been measured booting Pokémon Sword.
2021-02-23video_core: add missing header after 468bd9c1b0f9Jan Beich1-0/+1
src/video_core/shader_notify.cpp: In member function 'void VideoCore::ShaderNotify::MarkShaderComplete()':
src/video_core/shader_notify.cpp:33:10: error: 'unique_lock' is not a member of 'std'
   33 |     std::unique_lock lock{mutex};
      |          ^~~~~~~~~~~
src/video_core/shader_notify.cpp:6:1: note: 'std::unique_lock' is defined in header '<mutex>'; did you forget to '#include <mutex>'?
    5 | #include "video_core/shader_notify.h"
  +++ |+#include <mutex>
    6 |
src/video_core/shader_notify.cpp: In member function 'void VideoCore::ShaderNotify::MarkSharderBuilding()':
src/video_core/shader_notify.cpp:38:10: error: 'unique_lock' is not a member of 'std'
   38 |     std::unique_lock lock{mutex};
      |          ^~~~~~~~~~~
src/video_core/shader_notify.cpp:38:10: note: 'std::unique_lock' is defined in header '<mutex>'; did you forget to '#include <mutex>'?
2021-02-20gl_disk_shader_cache: Log total shader entries count on game loadMorph1-0/+4
2021-02-19hle: kernel: Migrate PageHeap/PageTable to KPageHeap/KPageTable.bunnei1-1/+1
2021-02-16vk_rasterizer: Fix loading shader addresses twiceReinUsesLisp1-1/+0
This was recently introduced on a wrongly rebased commit.
2021-02-15Review 1Kelebek12-3/+3
2021-02-15Implement texture offset support for TexelFetch and TextureGather and add offsets for TldsKelebek13-9/+34
Formatting
2021-02-14yuzu: Various frontend improvements to avoid crashes and improve experience on Linux.bunnei1-0/+1
2021-02-13vk_resource_pool: Load GPU tick once and compare with itReinUsesLisp2-8/+8
Other minor style improvements. Rename free_iterator to hint_iterator, to describe better what it does.
2021-02-13vk_update_descriptor: Inline and improve code for binding buffersReinUsesLisp4-24/+24
Allow compilers with our settings to inline hot code.
2021-02-13fixed_pipeline_cache: Use dirty flags to lazily update keyReinUsesLisp7-56/+103
Use dirty flags to avoid building the pipeline key from scratch on each draw call. This saves a bit of unnecessary work on each draw call.
2021-02-13gl_texture_cache: Lazily create non-sRGB texture views for sRGB formatsameerj3-7/+41
This creates non-sRGB texture views for sRGB texture formats to allow for interfacing with these views in compute shaders using imageLoad and imageStore. Co-Authored-By: Rodrigo Locatti <reinuseslisp@airmail.cc>
2021-02-13 rebase, fix name shadowing, more constameerj4-11/+10
2021-02-13Address PR feedbackameerj2-8/+5
Co-Authored-By: LC <712067+lioncash@users.noreply.github.com>
2021-02-13 streamline cdma_pusher/command_classesameerj1-13/+5
2021-02-13 streamline cdma_pusher/command_classesameerj5-85/+34
2021-02-13nvdec cleanupameerj7-42/+31
2021-02-13vk_master_semaphore: Mark gpu_tick atomic operations with relaxed orderReinUsesLisp1-4/+4
2021-02-13vk_staging_buffer_pool: Inline tick testsReinUsesLisp2-1/+7
Load the current tick to a local variable, moving it out of an atomic and allowing us to compare the value without going through a pointer each time. This should make the loop more optimizable.
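As a rough illustration of the idea, the sketch below reads the atomic tick once into a local and compares against that local inside the loop. `StagingBuffer`, `FindFreeBuffer`, and the relaxed ordering are assumptions chosen for the example, not the actual pool code.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical entry: the last GPU tick that used this buffer.
struct StagingBuffer {
    std::uint64_t tick = 0;
};

// Returns the index of the first buffer the GPU is done with, or buffers.size().
std::size_t FindFreeBuffer(const std::vector<StagingBuffer>& buffers,
                           const std::atomic<std::uint64_t>& gpu_tick) {
    // One atomic load up front; the loop then compares against a plain local,
    // which the compiler is free to keep in a register.
    const std::uint64_t known_tick = gpu_tick.load(std::memory_order_relaxed);
    for (std::size_t i = 0; i < buffers.size(); ++i) {
        if (buffers[i].tick <= known_tick) {
            return i;
        }
    }
    return buffers.size();
}
```

Reading the tick once can only make the result slightly stale: a buffer that became free during the loop is simply found on a later call.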
2021-02-13gl_stream_buffer/vk_staging_buffer_pool: Fix size checkReinUsesLisp2-2/+2
Fix a tragic off-by-one condition that causes Vulkan's stream buffer to think it's always full, using fallback memory. The OpenGL backend was also affected by this bug to a lesser extent.
2021-02-13vulkan_device: Require VK_EXT_robustness2ReinUsesLisp2-37/+14
We are already using robustness2 features without requiring it explicitly, causing potential crashes on drivers without the extension. Requiring this at boot allows better diagnostics for it and formalizes our usage of the extension.
2021-02-13video_core: Fix clang build issuesReinUsesLisp2-8/+5
2021-02-13vk_staging_buffer_pool: Fix softlock when stream buffer overflowsReinUsesLisp2-19/+20
There was still a code path that could wait on a timeline semaphore tick that would never be signalled. While we are at it, make use of more STL algorithms.
2021-02-13vk_buffer_cache: Add support for null index buffersReinUsesLisp2-4/+40
Games can bind a null index buffer (size=0) where all indices are evaluated as zero. VK_EXT_robustness2 doesn't support this and all drivers segfault when a null index buffer is passed to vkCmdBindIndexBuffer. Work around this by creating a 4 byte buffer and filling it with zeroes. If it's read out of bounds, robustness takes care of returning zeroes as indices.
2021-02-13buffer_cache: Add extra bytes to guest SSBOsReinUsesLisp1-1/+7
Bind extra bytes beyond the guest API's bound range. This is due to some games like Astral Chain operating out of bounds. Binding the whole map range would be technically correct, but games have large maps that make this approach unaffordable for now.
2021-02-13Merge branch 'bytes-to-map-end' into new-bufcache-wipReinUsesLisp1-0/+2
2021-02-13vk_staging_buffer_pool: Get a staging buffer instead of waitingReinUsesLisp2-9/+18
Avoids idling while the GPU finishes its work, and fixes an issue where we'd wait forever if a single command buffer (logic tick) used all the data.
2021-02-13renderer_opengl: Remove interopReinUsesLisp8-122/+10
Remove unused interop code from the OpenGL backend.
2021-02-13gl_buffer_cache: Drop interop based parameter buffer workaroundsReinUsesLisp3-65/+45
Sacrifice runtime performance to avoid generating kernel exceptions on Windows due to our abusive aliasing of interop buffer objects.
2021-02-13buffer_cache: Heuristically detect stream buffersReinUsesLisp2-6/+33
Detect when a memory region has been joined several times and increase the size of the created buffer on those instances. The buffer is assumed to be a "stream buffer", increasing its size should stop us from constantly recreating it and fragmenting memory.
2021-02-13buffer_cache: Split CreateBuffer in separate functionsReinUsesLisp1-29/+52
Allow adding functionality to each function without making CreateBuffer more complex.
2021-02-13buffer_cache: Skip cache on small uploads on VulkanReinUsesLisp3-9/+18
Ports from OpenGL the optimization to skip small 3D uniform buffer uploads. This will take advantage of the previously introduced stream buffer. Fixes instances where the staging buffer offset was being ignored.
2021-02-13vk_staging_buffer_pool: Add stream buffer for small uploadsReinUsesLisp15-127/+298
This uses a ring buffer similar to OpenGL's stream buffer for small uploads. This stops us from allocating several small buffers, reducing memory fragmentation and improving cache locality. It uses dedicated allocations when possible.
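A simplified sketch of the ring-buffer allocation pattern, under the assumption that small uploads are sub-allocated from one fixed-size region and wrap to the start when they no longer fit. `StreamRing` and `kInvalid` are hypothetical names, and the GPU synchronization a real stream buffer needs before reusing a region is deliberately left out.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical sub-allocator; tracks offsets only, no GPU synchronization.
class StreamRing {
public:
    static constexpr std::size_t kInvalid = static_cast<std::size_t>(-1);

    explicit StreamRing(std::size_t size_) : size{size_} {}

    // Returns the offset at which `bytes` can be written, or kInvalid when the
    // request can never fit and the caller should use a separate buffer.
    std::size_t Request(std::size_t bytes) {
        if (bytes > size) {
            return kInvalid;
        }
        if (offset + bytes > size) {
            offset = 0;  // wrap around; exactly filling the ring does not wrap
        }
        const std::size_t result = offset;
        offset += bytes;
        return result;
    }

private:
    std::size_t size;
    std::size_t offset = 0;
};
```

Note the strict `>` in the wrap check: a request that ends exactly at the end of the ring still fits.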
2021-02-13vulkan_device: Enable robustBufferAccessReinUsesLisp1-1/+2
Fix regression on Pascal on Animal Crossing: New Horizons, fixing a validation error.
2021-02-13video_core: Reimplement the buffer cacheReinUsesLisp67-2607/+2514
Reimplement the buffer cache using cached bindings and page level granularity for modification tracking. This also drops the usage of shared pointers and virtual functions from the cache. - Bindings are cached, allowing to skip work when the game changes few bits between draws. - OpenGL Assembly shaders no longer copy when a region has been modified from the GPU to emulate constant buffers, instead GL_EXT_memory_object is used to alias sub-buffers within the same allocation. - OpenGL Assembly shaders stream constant buffer data using glProgramBufferParametersIuivNV, from NV_parameter_buffer_object. In theory this should save one hash table resolve inside the driver compared to glBufferSubData. - A new OpenGL stream buffer is implemented based on fences for drivers that are not Nvidia's proprietary, due to their low performance on partial glBufferSubData calls synchronized with 3D rendering (that some games use a lot). - Most optimizations are shared between APIs now, allowing Vulkan to cache more bindings than before, skipping unnecessary work. This commit adds the necessary infrastructure to use Vulkan objects from OpenGL. Overall, it improves performance and fixes some bugs present on the old cache. There are still some edge cases hit by some games that harm performance on some vendors, these are planned to be fixed in later commits.
2021-02-13vulkan_common: Expose interop and headless devicesReinUsesLisp4-21/+100
2021-02-13vulkan_common: Make interop extensions mandatoryReinUsesLisp1-0/+6
2021-02-13vulkan_device: Enable robust buffersReinUsesLisp1-2/+4
2021-02-13vulkan_device: Use designated initializers for featuresReinUsesLisp1-60/+59
2021-02-13vulkan_wrapper: Add memory barrier pipeline barrier helperReinUsesLisp1-0/+6
2021-02-13vulkan_device: Fix formatting of constantsReinUsesLisp1-10/+6
2021-02-13vulkan_wrapper: Add interop functionsReinUsesLisp2-1/+41
2021-02-13vulkan_instance: Initialize Vulkan instance in a separate threadReinUsesLisp1-1/+5
Work around an issue on Nvidia where creating a Vulkan instance from an active OpenGL thread disables threaded optimization on the driver. This optimization is important to have good performance on Nvidia OpenGL.
2021-02-13vulkan_wrapper: Pull Windows symbolsReinUsesLisp1-0/+11
2021-02-13gpu: Report renderer errors with exceptionsReinUsesLisp24-227/+158
Instead of using a two step initialization to report errors, initialize the GPU renderer and rasterizer on the constructor and report errors through std::runtime_error.
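The pattern described above can be sketched as follows. `Renderer` and `TryCreateRenderer` are hypothetical stand-ins; the only detail taken from the message is that failures surface as `std::runtime_error` from the constructor rather than through a separate init step.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical renderer: construction either succeeds fully or throws.
class Renderer {
public:
    explicit Renderer(bool backend_available) {
        if (!backend_available) {
            throw std::runtime_error("backend initialization failed");
        }
    }
};

// Caller side: there is no half-initialized object to check afterwards.
std::string TryCreateRenderer(bool backend_available) {
    try {
        Renderer renderer{backend_available};
        static_cast<void>(renderer);
        return "ok";
    } catch (const std::runtime_error& error) {
        return error.what();
    }
}
```

With a single construction step, an object that exists is always usable, so callers no longer need to remember to call and check a separate initialization function.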
2021-02-13buffer_base: Add support for cached CPU writesReinUsesLisp1-61/+145
Some games usually write memory pages currently used by the GPU, causing rendering issues (e.g. flashing geometry and shadows on Link's Awakening). To work around this issue, guest CPU writes are delayed until the command buffer finishes processing, but the pages are updated immediately. The overall behavior is: - CPU writes are cached until they are flushed, they update the page state, but don't change the modification state. Cached writes stop pages from being flushed, in case games have meaningful data in it. - Command processing writes (e.g. push constants) update the page state and are marked to the command processor as dirty. They don't remove the state of cached writes.
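One way to picture the two states described above is a pair of per-page bitsets, where a cached write is recorded separately and only becomes a modification when flushed. This is an illustrative model with hypothetical names (`PageTracker`, `kPages`), not the layout used by buffer_base.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Hypothetical per-buffer page tracker with a fixed page count.
class PageTracker {
public:
    static constexpr std::size_t kPages = 64;

    // A guest CPU write that arrives while the GPU may still use the page.
    void CachedCpuWrite(std::size_t page) {
        assert(page < kPages);
        cached_cpu.set(page);
    }

    // Called once command processing has finished: cached writes now count
    // as real CPU modifications.
    void FlushCachedWrites() {
        cpu_modified |= cached_cpu;
        cached_cpu.reset();
    }

    bool HasCachedWrite(std::size_t page) const {
        return cached_cpu.test(page);
    }

    bool IsCpuModified(std::size_t page) const {
        return cpu_modified.test(page);
    }

private:
    std::bitset<kPages> cpu_modified;
    std::bitset<kPages> cached_cpu;
};
```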
2021-02-13maxwell_to_gl: Remove unused codeameerj2-36/+3
Removes unused declarations in maxwell_to_gl.h
2021-02-09gl_rasterizer: Remove unused variablesLioncash1-3/+0
Resolves warnings on clang 12
2021-02-09texture_cache/util: Remove unused functionsLioncash1-34/+0
Silences a few warnings on clang 12.
2021-02-08video_core: Delete mortonChloe Marcec3-2/+0
morton.h & morton.cpp are not used anywhere and are just empty files
2021-02-07renderer_opengl: Update OpenGL backend version requirement to 4.6Morph1-1/+1
2021-02-05Address reviewer commentslat9nq1-1/+1
2021-02-05CMake: Port citra-emu/citra FindFFmpeg.cmakelat9nq1-2/+2
Also renames related CMake variables to match both the Find*FFmpeg* and variables defined within the file. Fixes odd errors produced by the old FindFFmpeg. Citra's FindFFmpeg is slightly modified here: adds Citra's copyright at the beginning, renames FFmpeg_INCLUDES to FFmpeg_INCLUDE_DIR, disables a few components in _FFmpeg_ALL_COMPONENTS, and adds the missing avutil component to the comment above.
2021-02-05CMake: Implement YUZU_USE_BUNDLED_FFMPEGlat9nq2-7/+6
For Linux, instructs CMake to use the FFmpeg submodule in externals. This is HEAVILY based on our usage of the late Unicorn. Minimal change to MSVC as it uses the yuzu-emu/ext-windows-bin. MinGW now targets the same ext-windows-bin libraries as MSVC for FFmpeg. Adds FFMPEG_LIBRARIES to WIN32 and simplifies video_core/CMakeLists.txt a bit.
2021-02-02video_core: host_shaders: Don't pass --quiet to glslangValidator if unavailablelat9nq1-1/+19
Prevents CMake from calling `glslangValidator` with `--quiet` when it is not available, i.e. on older downstream versions from Ubuntu.
2021-01-28vk_scheduler: Fix unaligned placement new expressionsReinUsesLisp1-6/+6
We were accidentally creating an object in an unaligned memory address. Fix this by manually aligning the offset.
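A minimal sketch of the general technique, assuming commands are constructed with placement new into a byte array at a running offset. `CommandChunk`, `Emplace`, and `AlignUp` are hypothetical names; the point is only that the offset is rounded up to `alignof(T)` before the object is created.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>
#include <type_traits>
#include <utility>

// Rounds value up to a multiple of alignment (which must be a power of two).
constexpr std::size_t AlignUp(std::size_t value, std::size_t alignment) {
    return (value + alignment - 1) & ~(alignment - 1);
}

// Hypothetical chunk that stores heterogeneous objects back to back.
class CommandChunk {
public:
    template <typename T, typename... Args>
    T* Emplace(Args&&... args) {
        static_assert(std::is_trivially_destructible<T>::value,
                      "this sketch never runs destructors");
        static_assert(alignof(T) <= alignof(std::max_align_t),
                      "storage is only aligned to max_align_t");
        // Align the offset first so the object starts at a valid address.
        const std::size_t aligned = AlignUp(offset, alignof(T));
        if (aligned + sizeof(T) > sizeof(data)) {
            return nullptr;  // chunk is full
        }
        T* const object = new (data + aligned) T(std::forward<Args>(args)...);
        offset = aligned + sizeof(T);
        return object;
    }

private:
    alignas(std::max_align_t) unsigned char data[256];
    std::size_t offset = 0;
};
```

Emplacing a `char` followed by a `double` shows the effect: without the `AlignUp` step the `double` would be created at offset 1.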
2021-01-27vulkan_device: Blacklist Intel from float16 math (#5798)Rodrigo Locatti1-0/+5
Astral Chain crashes Intel's SPIR-V compiler when using fp16. Disable this while the vendor works on a fix.
2021-01-25Revert "Start of Integer flags implementation"ReinUsesLisp3-59/+3
This reverts #4713. The implementation in that PR is not accurate. It does not reflect the behavior seen in hardware.
2021-01-25vk_graphics_pipeline: Fix narrowing conversion on MSVCReinUsesLisp1-2/+2
2021-01-24vk_texture_cache: Support image store on sRGB images with VkImageViewUsageCreateInfoReinUsesLisp3-38/+72
Vulkan 1.0 didn't support creating sRGB image views on an ABGR8 VkImage with storage usage bits. VK_KHR_maintenance2 addressed this allowing to reduce the usage bits on a VkImageView. To allow image store on non-sRGB image views when the VkImage is created with sRGB, always create VkImages without sRGB and add the sRGB format on the view.
2021-01-25vulkan_device: Lift VK_EXT_extended_dynamic_state blacklist on RDNAReinUsesLisp1-23/+0
It seems to be safe to use this on new drivers.
2021-01-24cmake: Enforce -Warray-bounds and -Wmissing-field-initializers globallyReinUsesLisp1-2/+0
2021-01-24host_shaders/cmake: Pass --quiet to glslang to keep it quietReinUsesLisp1-1/+1
Silences noisy builds on toolchains.
2021-01-24video_core/cmake: Enforce -Warray-bounds and -Wmissing-field-initializersReinUsesLisp1-0/+2
2021-01-24video_core: Silence -Wmissing-field-initializers warningsReinUsesLisp5-25/+56
2021-01-24maxwell_3d: Silence array bounds warningsReinUsesLisp2-35/+35
2021-01-24maxwell_to_vk: Silence -Wextra warnings about using different enum typesReinUsesLisp2-2/+2
2021-01-23shader_ir: Fix comment typoLevi Behunin1-1/+1
2021-01-23video_core/cmake: Properly generate fatal errors on AftermathReinUsesLisp1-2/+2
Fix "message(ERROR ..." to "message(FATAL_ERROR ..." to properly stop cmake when Nsight Aftermath can't be configured.
2021-01-23nsight_aftermath_tracker: Fix build issues when enabledReinUsesLisp2-16/+5
Fixes a bunch of build errors when Nsight Aftermath is properly enabled.
2021-01-23vk_pipeline_cache: Properly bypass VertexA shadersReinUsesLisp1-9/+3
The VertexA stage is not yet implemented, but Vulkan is adding its descriptors, causing a discrepancy in the pushed descriptors and the template. This generally ends up in a driver side crash. Bypass the VertexA stage for now.
2021-01-22video_core/memory_manager: Add BytesToMapEndReinUsesLisp2-2/+27
Track map address sizes in a flat ordered map and add a method to query the number of bytes until the end of a map in a given address.
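A sketch of the query, assuming the "flat ordered map" is a sorted vector of (address, size) pairs searched with binary search. `MapTracker` and its `Map` method are hypothetical; only the name `BytesToMapEnd` comes from the commit.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical tracker: entries are kept sorted by start address.
class MapTracker {
public:
    using Entry = std::pair<std::uint64_t, std::size_t>;  // (address, size)

    void Map(std::uint64_t address, std::size_t size) {
        const auto it = std::lower_bound(
            maps.begin(), maps.end(), address,
            [](const Entry& entry, std::uint64_t addr) { return entry.first < addr; });
        maps.emplace(it, address, size);
    }

    // Bytes from `address` to the end of the map containing it, or 0 if unmapped.
    std::size_t BytesToMapEnd(std::uint64_t address) const {
        auto it = std::upper_bound(
            maps.begin(), maps.end(), address,
            [](std::uint64_t addr, const Entry& entry) { return addr < entry.first; });
        if (it == maps.begin()) {
            return 0;  // address lies below every map
        }
        --it;  // last map starting at or before address
        const std::uint64_t end = it->first + it->second;
        return address < end ? static_cast<std::size_t>(end - address) : 0;
    }

private:
    std::vector<Entry> maps;
};
```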
2021-01-21gl_shader_decompiler: Fix constant buffer size calculationReinUsesLisp1-1/+2
The divide logic was wrong and can cause a uniform buffer size overflow.
2021-01-21video_core/memory_manager: Remove unused CopyBlockUnsafeReinUsesLisp2-8/+0
This function was not being used.
2021-01-21video_core/memory_manager: Flush destination buffer on CopyBlockReinUsesLisp1-0/+4
When we copy into a buffer, it might contain data modified from the GPU on the same pages. Because of this, we have to flush the contents before writing new data. An alternative approach would be to write the data in place, but games can also write data in other ways, invalidating our contents. Fixes geometry in Zombie Panic in Wonderland DX.
2021-01-21video_core/memory_manager: Add GPU address based flush methodReinUsesLisp2-0/+17
Allow flushing rasterizer contents based on a GPU address.
2021-01-21renderer_opengl: Avoid precompiled cache and force NV GL cache directoryReinUsesLisp3-5/+14
Setting __GL_SHADER_DISK_CACHE_PATH we can force the cache directory to be in yuzu's user directory to stop commonly distributed malware from deleting our driver shader cache. And by setting __GL_SHADER_DISK_CACHE_SKIP_CLEANUP we can have an unbounded shader cache size. This has only been implemented on Windows, mostly because previous tests didn't seem to work on Linux. Disable the precompiled cache on Nvidia's driver. There's no need to hide information the driver already has in its own cache.
2021-01-17texture_cache/util: Resolve -Wsign-compare warningLioncash1-1/+1
Resolves a -Wsign-compare warning on Clang.
2021-01-17video_core: Resolve -Wdocumentation warningsLioncash2-4/+3
Silences some -Wdocumentation warnings on Clang.
2021-01-17vulkan_debug_callback: Add missing header guardLioncash1-0/+2
Prevents inclusion issues from occurring.
2021-01-16vk_shader_decompiler: Show comments as OpUndef with a typeReinUsesLisp1-1/+4
Silence the new validation layer error about SPIR-V not allowing OpUndef on an OpTypeVoid, even when the SPIR-V spec doesn't say anything against it. They will be inserted as an undefined int to avoid SPIRV-Cross and validation errors, but only when a debugging tool is attached.
2021-01-15common/common_funcs: Rename INSERT_UNION_PADDING_{BYTES,WORDS} to _NOINITReinUsesLisp6-123/+123
INSERT_PADDING_BYTES_NOINIT is more descriptive of the underlying behavior.
2021-01-15vulkan_memory_allocator: Remove unnecessary 'device' memory from commitsReinUsesLisp2-15/+15
2021-01-15vk_texture_cache: Use Download memory types for texture flushesReinUsesLisp2-5/+10
Use the Download memory type where it matters.
2021-01-15vulkan_memory_allocator: Add allocation support for download typesReinUsesLisp2-55/+91
Implements the allocator logic to handle download memory types. This will try to use HOST_CACHED_BIT when available.
2021-01-15vulkan_memory_allocator: Add "download" memory usage hintReinUsesLisp9-45/+86
Allow users of the allocator to hint memory usage for downloads. This removes the non-descriptive boolean passed for "host visible" or not host visible memory commits, and uses an enum to hint device local, upload and download usages.
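To illustrate why an enum reads better than a boolean here, the sketch below maps a usage hint to preferred memory property flags. The flag constants are stand-ins for the corresponding Vulkan property bits and the exact mapping is an assumption; only the three usages (device local, upload, download) come from the message.

```cpp
#include <cassert>

// Usage hint passed by callers instead of a bare "host visible" boolean.
enum class MemoryUsage { DeviceLocal, Upload, Download };

// Stand-in values for this example; real code would use Vulkan's flag bits.
constexpr unsigned kDeviceLocal = 1u << 0;
constexpr unsigned kHostVisible = 1u << 1;
constexpr unsigned kHostCoherent = 1u << 2;
constexpr unsigned kHostCached = 1u << 3;

// Hypothetical mapping from a usage hint to preferred property flags.
unsigned PreferredFlags(MemoryUsage usage) {
    switch (usage) {
    case MemoryUsage::DeviceLocal:
        return kDeviceLocal;
    case MemoryUsage::Upload:
        return kHostVisible | kHostCoherent;
    case MemoryUsage::Download:
        // Cached host memory makes CPU reads of downloaded data cheaper.
        return kHostVisible | kHostCoherent | kHostCached;
    }
    return 0;  // unreachable for valid enumerators
}
```

A call such as `PreferredFlags(MemoryUsage::Download)` documents intent at the call site, where `Commit(requirements, true)` would not.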
2021-01-15vulkan_common: Move allocator to the common directoryReinUsesLisp11-11/+11
Allow using the abstraction from the OpenGL backend.
2021-01-15renderer_vulkan: Rename Vulkan memory manager to memory allocatorReinUsesLisp15-54/+52
"Memory manager" collides with the guest GPU memory manager, and a memory allocator sounds closer to what the abstraction aims to be.
2021-01-15vk_memory_manager: Improve memory manager and its APIReinUsesLisp13-343/+318
Fix a bug where the memory allocator could leave gaps between commits. To fix this the allocation algorithm was reworked, although it's still short in number of lines of code. Rework the allocation API to self-contained movable objects instead of naively using a unique_ptr to do the job for us. Remove the VK prefix.
2021-01-15common/bit_util: Replace CLZ/CTZ operations with standardized onesLioncash3-5/+5
Makes for less code that we need to maintain.
2021-01-15common/alignment: Rename AlignBits to AlignUpLog2ReinUsesLisp3-11/+11
AlignUpLog2 describes what the function does better than AlignBits.
2021-01-15video_core/cmake: Remove Werror flags already defined code-base wideReinUsesLisp1-2/+0
These flags are already defined in src/cmake.
2021-01-15cmake: Enforce -Wunused-function code-base wideReinUsesLisp1-1/+0
2021-01-15video_core: Enforce -Wunused-functionReinUsesLisp1-0/+1
Stops us from merging code with unused functions in the future. If something is invoked behind conditionally evaluated code in a way that the language can't see it (e.g. preprocessor macros), the potentially unused function should use [[maybe_unused]].
2021-01-15vk_buffer_cache: Remove unused functionReinUsesLisp1-4/+0
2021-01-15vulkan_common: Silence missing initializer warningsReinUsesLisp2-145/+146
Silence warnings explicitly initializing all members on construction.
2021-01-15vulkan_device: Enable shaderStorageImageMultisample conditionallyReinUsesLisp2-18/+20
Fix Vulkan initialization on ANV.
2021-01-15astc: Increase integer encoded vector sizeReinUsesLisp1-1/+1
Invalid ASTC textures seem to write more bytes here, increase the size to something that can't make us push out of bounds.
2021-01-15astc: Return zero on out of bound bitsReinUsesLisp1-17/+22
Avoid out of bound reads on invalid ASTC textures. Games can bind invalid textures that make us read or write out of bounds.
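A small sketch of the defensive pattern, assuming a bit reader that yields zero once it runs past the end of its input instead of touching memory outside the buffer. `BitReader` and its LSB-first bit order are assumptions for the example, not the actual ASTC decoder code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical LSB-first bit reader over a byte buffer.
class BitReader {
public:
    BitReader(const std::uint8_t* data_, std::size_t size_bytes)
        : data{data_}, size_bits{size_bytes * 8} {}

    std::uint32_t ReadBit() {
        if (position >= size_bits) {
            return 0;  // out of bounds: report zero instead of reading
        }
        const std::uint32_t bit = (data[position / 8] >> (position % 8)) & 1u;
        ++position;
        return bit;
    }

    // Reads up to 32 bits, least significant bit first.
    std::uint32_t ReadBits(unsigned count) {
        assert(count <= 32);
        std::uint32_t value = 0;
        for (unsigned i = 0; i < count; ++i) {
            value |= ReadBit() << i;
        }
        return value;
    }

private:
    const std::uint8_t* data;
    std::size_t size_bits;
    std::size_t position = 0;
};
```

With this shape, a malformed texture produces wrong pixels at worst, never an out-of-bounds read.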
2021-01-13vulkan_device: Remove requirement on shaderStorageImageMultisampleReinUsesLisp1-1/+0
yuzu doesn't currently emulate MS image stores. Requiring this makes no sense for now. Fixes ANV not booting any games on Vulkan.
2021-01-13buffer_cache/buffer_base: Add a range tracking buffer containerReinUsesLisp2-0/+496
It keeps track of the modified CPU and GPU ranges on a CPU page granularity, notifying the given rasterizer about state changes in the tracking behavior of the buffer. Use a small vector optimization to store buffers smaller than 256 KiB locally instead of using free store memory allocations.
2021-01-08vk_fence_manager: Use timeline semaphores instead of spin waitsReinUsesLisp3-54/+18
With timeline semaphores we can avoid creating objects. Instead of creating an event, grab the current tick from the scheduler and flush the current command buffer. When the fence has to be queried/waited, we can do so against the master semaphore instead of spinning on an event. If Vulkan supported NVN like events or fences, we could signal from the command buffer and wait for that without splitting things in two separate command buffers.
2021-01-07remove inaccurate referenceAmeer J1-1/+1
Co-authored-by: LC <mathew1800@gmail.com>
2021-01-07fix for nvdec disabled, cleanup host1xameerj2-61/+9
2021-01-07nvdec syncpt incorporationameerj4-17/+16
laying the groundwork for async gpu, although this does not fully implement async nvdec operations
2021-01-07vulkan_library: Common::DynamicLibrary::Open is [[nodiscard]]MerryMage1-1/+1
Ignore the return value on __APPLE__ systems as well
2021-01-07texture_cache: Replace PAGE_SHIFT with PAGE_BITSMerryMage1-6/+6
PAGE_SHIFT is a #define in system headers that leaks into user code on some systems
2021-01-04vk_rasterizer: Skip binding empty descriptor sets on computeReinUsesLisp1-2/+4
Fixes unit tests where compute shaders had no descriptors in the set, making Vulkan drivers crash when binding an empty set.
2021-01-04vulkan_device: Allow creating a device without surfaceReinUsesLisp1-3/+3
2021-01-04renderer_vulkan/nsight_aftermath_tracker: Move to vulkan_commonReinUsesLisp5-30/+21
2021-01-04renderer_vulkan: Move device abstraction to vulkan_commonReinUsesLisp29-29/+31
2021-01-04gl_texture_cache: Avoid format views on Intel and AMDReinUsesLisp11-21/+48
Intel and AMD proprietary drivers are incapable of rendering to texture views of different formats than the original texture. Avoid creating these at a cache level. This will consume more memory, emulating them with copies.
2021-01-04gl_texture_cache: Create base images with sRGBReinUsesLisp2-99/+100
This breaks accelerated decoders trying to imageStore into images with sRGB. The decoders are currently disabled so this won't cause issues at runtime.
2021-01-03renderer_vulkan: Rename VKDevice to DeviceReinUsesLisp52-169/+166
The "VK" prefix predates the "Vulkan" namespace. It was carried around the codebase for consistency. "VKDevice" currently is a bad alias with "VkDevice" (only an upcase character of difference) that can cause confusion. Rename all instances of it.
2021-01-02general: Fix various spelling errorsMorph2-2/+2
2020-12-31vulkan_instance: Allow different Vulkan versions and enforce 1.1ReinUsesLisp7-41/+39
For listing the available physical devices we can use Vulkan 1.0. Now that MoltenVK supports 1.1 we can require it for running games. Add missing documentation.
2020-12-31vk_device: Use an array to report lacking device limitsReinUsesLisp1-13/+17
This makes it easier to add and tune the required device limits.
2020-12-31vk_device: Stop initialization when device is not suitableReinUsesLisp2-61/+39
VKDevice::IsSuitable was not being called. To address this issue, check suitability before initialization and throw an exception if it fails. By doing this, we can deduplicate some code on queue searches. Previously we would first search if a present and graphics queue existed, then on initialization we would search again to find the index.
2020-12-31renderer_vulkan: Remove two step initialization on VKDeviceReinUsesLisp6-31/+10
The Vulkan device abstraction either initializes successfully on the constructor or throws a Vulkan exception.
2020-12-31renderer_vulkan: Throw when enumerating devices failsReinUsesLisp5-33/+21
Report device enumeration errors with exceptions to be consistent with other initialization related function calls. Reduces the amount of code to maintain.
2020-12-31renderer_vulkan: Initialize surface in separate fileReinUsesLisp6-73/+109
Move surface initialization code to a separate file. It's unlikely to use this code outside of Vulkan, but keeping platform-specific code (Win32, Xlib, Wayland) in its own translation unit keeps things cleaner.
2020-12-31renderer_vulkan: Catch and report exceptionsReinUsesLisp1-2/+5
Move more Vulkan code to report errors with exceptions and report them through a log before notifying it with an error boolean for backwards compatibility. In the future we can replace the rasterizer two-step initialization to always use exceptions.
2020-12-31renderer_vulkan: Create debug callback on separate file and throwReinUsesLisp8-79/+88
Initialize debug callbacks (messenger) from a separate file. This allows sharing code with different backends. Change our Vulkan error handling to use exceptions instead of error codes, simplifying the initialization process.
2020-12-31renderer_vulkan: Move instance initialization to a separate fileReinUsesLisp4-111/+176
Simplify Vulkan's backend initialization code by moving it to a separate file, allowing us to initialize a Vulkan instance from different backends.
2020-12-31vulkan_common: Rename renderer_vulkan/wrapper.h to vulkan_common/vulkan_wrapper.hReinUsesLisp51-51/+51
Allows sharing Vulkan wrapper code between different rendering backends.
2020-12-31vulkan_common: Move dynamic library load to a separate fileReinUsesLisp4-31/+59
Allows us to initialize a Vulkan dynamic library from different backends without duplicating code.
2020-12-30half_set: Resolve -Wmaybe-uninitialized warningsLioncash1-7/+7
2020-12-30maxwell_to_vk: Initialize usage variable in SurfaceFormat()Lioncash1-1/+1
Silences a -Wmaybe-uninitialized warning
2020-12-30video_core: Rewrite the texture cacheReinUsesLisp152-8101/+10359
The current texture cache has several points that hurt maintainability and performance. It's easy to break unrelated parts of the cache when doing minor changes. The cache can easily forget valuable information about the cached textures by CPU writes or simply by its normal usage. This commit aims to address those issues.
2020-12-30video_core: Add a delayed destruction ring abstractionReinUsesLisp2-0/+33
2020-12-30host_shaders: Add Vulkan assembler compute shadersReinUsesLisp4-0/+96
2020-12-30host_shaders: Add helper to blit depth stencil fragment shaderReinUsesLisp2-0/+17
2020-12-30host_shaders: Add texture color blit fragment shaderReinUsesLisp2-0/+15
2020-12-30host_shaders: Add shaders to present to the swapchainReinUsesLisp3-0/+36
2020-12-30host_shaders: Add shaders to convert between depth and color imagesReinUsesLisp3-0/+28
2020-12-30host_shaders: Add compute shader to copy BC4 as RG32UI to RGBA8ReinUsesLisp2-0/+71
2020-12-30host_shaders: Add shader to render a full screen triangleReinUsesLisp2-0/+30
2020-12-30host_shaders: Add pitch linear upload compute shaderReinUsesLisp2-0/+87
2020-12-30host_shaders: Add block linear upload compute shadersReinUsesLisp3-0/+249
2020-12-30host_shaders: Add copyright headers to OpenGL present shadersReinUsesLisp2-0/+8
2020-12-30video_core/host_shaders: Add support for prebuilt SPIR-V shadersReinUsesLisp1-16/+37
Add support for building SPIR-V shaders from GLSL and generating headers to include the text of those same GLSL shaders to consume from OpenGL.
2020-12-29gpu: gpu_thread: Ensure MicroProfile is shutdown on exit.bunnei1-0/+3
2020-12-29video_core: gpu_thread: Do not wait when system is powered down.bunnei1-1/+2
2020-12-29video_core: gpu: Implement synchronous mode using threaded GPU.bunnei4-12/+34
2020-12-29video_core: gpu: Refactor out synchronous/asynchronous GPU implementations.bunnei10-289/+130
- We must always use a GPU thread now, even with synchronous GPU.
2020-12-26renderer_vulkan/fixed_pipeline_state: Move enabled bindings to static stateReinUsesLisp3-26/+12
Without using VK_EXT_robustness2, we can't consider the 'enabled' (not null) vertex buffers as dynamic state, as this leads to invalid Vulkan state. Move this to static state that is always hashed and compared in the pipeline key. The bits for enabled vertex buffers are moved into the attribute state bitfield. This is not 'correct' as it's not an attribute state, but that struct has bits to spare, and it's used in an array of 32 elements (the exact same number of vertex buffer bindings).
2020-12-25cmake: Always enable VulkanReinUsesLisp2-79/+66
Removes the unnecessary burden of maintaining separate #ifdef paths and allows us to share generic Vulkan code across APIs.
2020-12-25video_core: Enforce C4715 (not all control paths return a value)ReinUsesLisp1-0/+1
Most of the time people write code that always returns a value, terminates execution, throws an exception, or uses an unconventional jump primitive. This is not always true when we build without asserts on mainline builds. To avoid introducing undefined behavior on our most used builds, enforce this warning signalling an error and stopping the build from shipping.
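The situation the warning guards against can be sketched like this. `Topology` and `VerticesPerPrimitive` are hypothetical; the relevant part is the explicit `return` after the assert, which keeps every control path returning a value even when asserts are compiled out.

```cpp
#include <cassert>

enum class Topology { Points, Lines, Triangles };

// Hypothetical helper with a switch over every enumerator.
unsigned VerticesPerPrimitive(Topology topology) {
    switch (topology) {
    case Topology::Points:
        return 1;
    case Topology::Lines:
        return 2;
    case Topology::Triangles:
        return 3;
    }
    // Reached only for an invalid value. The assert disappears under NDEBUG,
    // so without the return below this path would fall off the end of the
    // function, which is undefined behavior.
    assert(false && "unhandled topology");
    return 0;
}
```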
2020-12-25vk_shader_decompiler: Silence warning when compiling without assertsReinUsesLisp1-0/+1
2020-12-07video_core: Make use of ordered container contains() where applicableLioncash8-16/+13
With C++20, we can use the more concise contains() member function instead of comparing the result of the find() call with the end iterator.
2020-12-07ast: Improve string concat readability in operator()Lioncash1-5/+4
Provides an in-place format string to make it more pleasant to read.
2020-12-07gl_shader_decompiler: Elide unnecessary copies within DeclareConstantBuffers()Lioncash1-1/+1
Resolves a -Wrange-loop-analysis warning.
2020-12-07buffer_block: Mark interface as nodiscard where applicableLioncash1-7/+7
Prevents logic errors from occurring from unused values.
2020-12-07buffer_block: Remove unnecessary includesLioncash1-5/+0
Reduces the amount of dependencies the header pulls in.
2020-12-07shader_ir: std::move node within DeclareAmend()Lioncash1-2/+2
Same behavior, but elides an unnecessary atomic reference count increment and decrement.
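A short sketch of the effect, assuming the node type is a `std::shared_ptr`. `NodeData`, `AmendList`, and `Get` are hypothetical names for this example; only `DeclareAmend` is mentioned in the commit, and its real signature may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

struct NodeData {
    int value;
};
using Node = std::shared_ptr<NodeData>;

// Hypothetical container that takes ownership of nodes passed by value.
class AmendList {
public:
    std::size_t DeclareAmend(Node node) {
        const std::size_t id = amends.size();
        // Moving transfers the pointer without touching the atomic reference
        // count; push_back(node) would increment it, and the parameter's
        // destructor would decrement it again.
        amends.push_back(std::move(node));
        return id;
    }

    const Node& Get(std::size_t id) const {
        return amends.at(id);
    }

private:
    std::vector<Node> amends;
};
```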
2020-12-07video_core: Remove unnecessary enum class casting in logging messagesLioncash33-148/+125
fmt now automatically prints the numeric value of an enum class member by default, so we don't need to use casts any more. Reduces the line noise a bit.
2020-12-07maxwell_3d: Move member variables to end of classLioncash1-31/+32
Follows our established coding style.
2020-12-07maxwell_3d: Resolve -Wdocumentation warningLioncash1-1/+1
Removes a documentation comment for a non-existent member.
2020-12-07maxwell_3d: Remove unused dirty_pointer arrayLioncash1-2/+0
This is unused and removing it shrinks the structure by 3584 bytes.
2020-12-07renderer_vulkan: Add missing `override` specifiercomex1-1/+1
2020-12-07map_interval: Change field order to address uninitialized field warningcomex1-1/+2
Clang complains about `new_chunk`'s constructor using the then-uninitialized `first_chunk` (even though it's just to get a pointer into it).
2020-12-07video_core: Adjust `NUM` macro to avoid Clang warningcomex3-3/+3
The previous definition was: #define NUM(field_name) (sizeof(Maxwell3D::Regs::field_name) / sizeof(u32)) In cases where `field_name` happens to refer to an array, Clang thinks `sizeof(an array value) / sizeof(a type)` is an instance of the idiom where `sizeof` is used to compute an array length. So it thinks the type in the denominator ought to be the array element type, and warns if it isn't, assuming this is a mistake. In reality, `NUM` is not used to get array lengths at all, so there is no mistake. Silence the warning by applying Clang's suggested workaround of parenthesizing the denominator.
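A self-contained sketch of the adjusted macro, using a hypothetical `Regs` struct in place of `Maxwell3D::Regs`. The only change from the definition quoted above is the extra pair of parentheses around the denominator.

```cpp
#include <cassert>
#include <cstdint>

using u32 = std::uint32_t;

// Hypothetical register block standing in for Maxwell3D::Regs.
struct Regs {
    u32 scalar;
    u32 viewport[4];
    struct {
        u32 a;
        u32 b;
    } pair[3];
};

// Size of a field in 32-bit words. The parenthesized denominator tells Clang
// this is not the sizeof(array) / sizeof(element) array-length idiom.
#define NUM(field_name) (sizeof(Regs::field_name) / (sizeof(u32)))

static_assert(NUM(viewport) == 4, "four words");
static_assert(NUM(pair) == 6, "three elements of two words each");
```

For `pair`, the result is 6 rather than the array length 3, which is exactly the case where the array-length reading would be wrong.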
2020-12-05maxwell_dma: Rename RenderEnable::Mode::FALSE and TRUE to avoid name conflictcomex1-5/+7
On Apple platforms, FALSE and TRUE are defined as macros by <mach/boolean.h>, which is included by various system headers. Note that there appear to be no actual users of the names to fix up.
2020-12-05video_core: Resolve more variable shadowing scenarios pt.3Lioncash45-280/+293
Cleans out the rest of the occurrences of variable shadowing and makes any further occurrences of shadowing compiler errors.
2020-12-05video_core: Resolve more variable shadowing scenarios pt.2Lioncash39-296/+305
Migrates the video core code closer to enabling variable shadowing warnings as errors. This primarily sorts out shadowing occurrences within the Vulkan code.
2020-12-05Fix telemetry-related exit crash from use-after-freeFearlessTobi1-3/+3
Co-Authored-By: xperia64 <xperia64@users.noreply.github.com>
2020-12-04codec: Remove deprecated usage of AVCodecContext::refcounted_framesLioncash1-1/+0
This was only necessary for use with the avcodec_decode_video2/avcodec_decode_audio4 APIs, which are also deprecated. Given we use avcodec_send_packet/avcodec_receive_frame, this isn't necessary; this is even indicated directly within the FFmpeg API changes document here on 2017-09-26: https://github.com/FFmpeg/FFmpeg/blob/master/doc/APIchanges#L410 This prevents our code from breaking whenever we update to a newer version of FFmpeg in the future if they ever decide to fully remove this API member.
2020-12-04video_core: Resolve more variable shadowing scenariosLioncash42-206/+219
Resolves variable shadowing scenarios up to the end of the OpenGL code to make it nicer to review. The rest will be resolved in a following commit.
2020-12-03node: Mark member functions as [[nodiscard]] where applicableLioncash1-29/+29
Prevents logic bugs from accidentally ignoring the return value.
2020-12-03node: Eliminate variable shadowingLioncash1-47/+49
2020-12-03vp9/vic: Resolve pessimizing movesLioncash2-11/+11
Removes the usage of moves that don't result in behavior different from a copy, or otherwise would prevent copy elision from occurring.
2020-11-26codec: Fix `pragma GCC diagnostic pop` missing corresponding pushcomex1-0/+1
2020-11-26vk_shader_decompiler: Implement force early fragment testsReinUsesLisp6-11/+19
Force early fragment tests when the 3D method is enabled. The established pipeline cache takes care of recompiling if needed. This is implemented only on Vulkan to avoid invalidating the shader cache on OpenGL.
2020-11-26Limit queue size to 10 framesameerj1-0/+4
Workaround for ZLA, which seems to decode and queue twice as many frames as it displays.
2020-11-26Address PR feedbackameerj4-32/+33
Remove some redundant moves, make deleter match naming guidelines. Co-Authored-By: LC <712067+lioncash@users.noreply.github.com>
2020-11-25Queue decoded frames, cleanup decodersameerj10-338/+227
2020-11-25cleanup unneeded comments and newlinesameerj1-6/+0
2020-11-25Refactor MaxwellToSpirvComparison. Use Common::BitCastameerj3-31/+34
Co-Authored-By: Rodrigo Locatti <reinuseslisp@airmail.cc>
2020-11-25Address PR feedback from Reinameerj5-40/+31
2020-11-25vulkan_renderer: Alpha Test Culling Implementationameerj5-2/+76
Used by various textures in many titles, e.g. SSBU menu.
2020-11-24nvdrv, video_core: Don't index out of bounds when given invalid syncpoint IDcomex1-11/+18
- Use .at() instead of raw indexing when dealing with untrusted indices.
- For the special case of WaitFence with syncpoint id UINT32_MAX, instead of crashing, log an error and ignore. This is what I get when running Super Mario Maker 2.
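A minimal sketch of the two points in this commit message, assuming a hypothetical `SyncpointTable` class and table size (neither is taken from the yuzu sources): bounds-checked access through `.at()`, and an early return for the `UINT32_MAX` id instead of a crash.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <stdexcept>

using u32 = std::uint32_t;

// Hypothetical table size, chosen only for this sketch.
constexpr std::size_t MaxSyncPoints = 192;

class SyncpointTable {
public:
    // .at() throws std::out_of_range for a bad id instead of reading out of bounds.
    u32 GetValue(u32 id) const {
        return values.at(id);
    }

    void Increment(u32 id) {
        ++values.at(id);
    }

    // Returns false for the UINT32_MAX special case, where a real
    // implementation would log an error and ignore the request.
    bool WaitFence(u32 id, u32 threshold) const {
        if (id == std::numeric_limits<u32>::max()) {
            return false;
        }
        return values.at(id) >= threshold;
    }

private:
    std::array<u32, MaxSyncPoints> values{};
};

// Demonstration helper: reports whether a lookup was rejected.
inline bool LookupThrows(const SyncpointTable& table, u32 id) {
    try {
        (void)table.GetValue(id);
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}
```

The exception turns an untrusted index into a recoverable error; how the caller reports it is left open here.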
2020-11-23Overhaul EmuWindow::PollEvents to fix yuzu-cmd calling SDL_PollEvent off main threadcomex2-4/+2
EmuWindow::PollEvents was called from the GPU thread (or the CPU thread in sync-GPU mode) when swapping buffers. It had three implementations:
- In GRenderWindow, it didn't actually poll events, just set a flag and emit a signal to indicate that a frame was displayed.
- In EmuWindow_SDL2_Hide, it did nothing.
- In EmuWindow_SDL2, it did call SDL_PollEvent, but this is wrong because SDL_PollEvent is supposed to be called on the thread that set up video - in this case, the main thread, which was sleeping in a busyloop (regardless of whether sync-GPU was enabled). On macOS this causes a crash.
To fix this:
- Rename EmuWindow::PollEvents to OnFrameDisplayed, and give it a default implementation that does nothing.
- In EmuWindow_SDL2, do not override OnFrameDisplayed, but instead have the main thread call SDL_WaitEvent in a loop.
2020-11-21gl_rasterizer: Remove warning of untested alpha testReinUsesLisp1-4/+0
Alpha test has been proven to only affect the first render target.
2020-11-20async_shaders: emplace threads into the worker thread vectorLioncash1-2/+2
Same behavior, but constructs the threads in place instead of moving them.
2020-11-20async_shaders: Simplify implementation of GetCompletedWork()Lioncash1-2/+1
This is equivalent to moving all the contents and then clearing the vector. This avoids a redundant allocation.
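One way to express "move all the contents, then clear" in a single statement is `std::exchange`. The sketch below is an assumption about the shape of such a function, not the actual yuzu code; `Result` and `CompletedQueue` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical result type for this sketch.
struct Result {
    int id;
};

class CompletedQueue {
public:
    void Push(Result result) {
        finished.push_back(result);
    }

    // Hands the stored results to the caller and leaves `finished` empty,
    // without building a second vector element by element.
    std::vector<Result> GetCompletedWork() {
        return std::exchange(finished, {});
    }

    std::size_t Pending() const {
        return finished.size();
    }

private:
    std::vector<Result> finished;
};
```

Locking is omitted to keep the sketch short; a shared queue would hold its mutex around both operations.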
2020-11-20async_shaders: Simplify moving data into the pending queueLioncash1-13/+8
2020-11-20async_shaders: std::move data within QueueVulkanShader()Lioncash1-2/+2
Same behavior, but avoids redundant copies. While we're at it, we can simplify the pushing of the parameters into the pending queue.
2020-11-20gl_rasterizer: Make floating-point literal a floatLioncash1-1/+1
Gets rid of an unnecessary expansion from float to double.
2020-11-20shader_bytecode: Make use of [[nodiscard]] where applicableLioncash1-73/+79
Ensures that all queried values are made use of.
2020-11-20shader_bytecode: Eliminate variable shadowingLioncash1-15/+17
2020-11-17rasterizer_interface: Make use of [[nodiscard]] where applicableLioncash1-8/+9
2020-11-17render_base: Make use of [[nodiscard]] where applicableLioncash1-11/+11
2020-11-17gpu: Make use of [[nodiscard]] where applicableLioncash1-31/+35
2020-11-11maxwell_3d: Use insert instead of loop push_backReinUsesLisp1-3/+1
This reduces the overhead of bounds checking on each element. It won't reduce the cost of allocation because this vector's capacity is usually large enough to hold whatever we push to it.
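The difference between the two approaches can be sketched as follows; the function names are hypothetical and the element type is assumed to be a 32-bit word.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using u32 = std::uint32_t;

// Appends one element at a time; every push_back re-checks the capacity.
inline void AppendLoop(std::vector<u32>& dst, const u32* src, std::size_t count) {
    for (std::size_t i = 0; i < count; ++i) {
        dst.push_back(src[i]);
    }
}

// Appends the whole range in one call; the capacity is checked once for the range.
inline void AppendRange(std::vector<u32>& dst, const u32* src, std::size_t count) {
    dst.insert(dst.end(), src, src + count);
}
```

Both functions leave `dst` with the same contents; only the number of capacity checks differs.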
2020-11-11maxwell_3d: Move code to separate functionsReinUsesLisp2-151/+124
Deduplicate some code and put it in separate functions so it's easier to understand and profile.
2020-11-07video_core: dma_pusher: Remove integrity check on command lists.bunnei2-28/+1
- This seems to cause softlocks in Breath of the Wild.
2020-11-05General: Fix clang buildLioncash2-5/+4
Allows building on clang to work again
2020-11-02nvdec: Make use of [[nodiscard]] where applicableLioncash7-16/+16
Prevents bugs from occurring where the results of a function are accidentally discarded
2020-11-01video_core: dma_pusher: Add support for integrity checks.bunnei2-0/+27
- Log corrupted command lists, rather than crash.
2020-11-01video_core: dma_pusher: Add support for prefetched command lists.bunnei2-25/+52
2020-11-01video_core: gpu: Implement WaitFence and IncrementSyncPoint.bunnei3-28/+70
2020-10-30vp9: Be explicit with copy and move operatorsLioncash1-0/+18
It's deprecated in the language to autogenerate these if the destructor for a type is specified, so we can explicitly specify how we want these to be generated.
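A small sketch of what being explicit can look like, using a hypothetical `FrameWriter` class. Whether the real type deletes or defaults each operation is not stated in the commit message; the choice below (non-copyable, movable) is an assumption.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// Hypothetical class with a user-declared destructor.
class FrameWriter {
public:
    FrameWriter() = default;
    ~FrameWriter() = default;

    // Spell out the copy and move operations instead of relying on
    // implicit generation alongside a declared destructor.
    FrameWriter(const FrameWriter&) = delete;
    FrameWriter& operator=(const FrameWriter&) = delete;

    FrameWriter(FrameWriter&&) = default;
    FrameWriter& operator=(FrameWriter&&) = default;

private:
    std::vector<unsigned char> buffer;
};
```

Declaring all of them makes the intended semantics visible at the class definition and checkable with type traits.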
2020-10-30vp9: Mark functions with [[nodiscard]] where applicableLioncash2-13/+13
Prevents values from mistakenly being discarded in cases where it's a bug to do so.
2020-10-30vp9: Provide a default initializer for "hidden" memberLioncash1-1/+1
The API of VP9 exposes a WasFrameHidden() function which accesses this member. Given the constructor previously didn't initialize this member, it's a potential vector for an uninitialized read. Instead, we can initialize this to a deterministic value to prevent that from occurring.
2020-10-30vp9: Make some member functions internally linkedLioncash2-58/+54
These helper functions don't directly modify any member state and can be hidden from view.
2020-10-30General: Resolve a few missing initializer warningsLioncash3-2/+10
Resolves a few -Wmissing-initializer warnings.
2020-10-29async_shaders: Increase Async worker thread count for 8+ thread cpusameerj1-8/+9
Adds 1 async worker thread for every 2 available threads above 8
2020-10-29video_core: cdma_pusher: Add missing LOG_DEBUG field in ExecuteCommand.bunnei1-1/+1
2020-10-28shader: Partially implement texture cube array shadowReinUsesLisp3-25/+37
This implements texture cube arrays with shadow comparisons but doesn't fix the asserts related to it. Fixes out of bounds reads on swizzle constructors and makes them use bounds checked ::at instead of the unsafe operator[].
2020-10-28shader/arithmetic: Implement FCMP immediate + register variantReinUsesLisp2-1/+4
Trivially add the encoding for this.
2020-10-28video_core: Enforce -Wredundant-move and -Wpessimizing-moveReinUsesLisp4-4/+5
Silence three warnings and make them errors to avoid introducing more in the future.
2020-10-28video_core: Enforce -Werror=type-limitsReinUsesLisp2-1/+2
Silences one warning and avoids introducing more in the future.
2020-10-27sync_manager: Amend parameter order of calls to SyncptIncr constructorLioncash2-9/+9
Corrects some cases where the arguments would be incorrectly swapped.
2020-10-27h264: Make WriteUe take a u32Lioncash2-7/+8
Enforces the type of the desired value in calling code.
2020-10-27vp9: std::move buffer within ComposeFrameHeader()Lioncash1-1/+1
We can move the buffer here to avoid a heap reallocation
2020-10-27vp9: Remove dead codeLioncash1-6/+0
2020-10-27vp9: Join declarations with assignmentsLioncash1-7/+8
2020-10-27vp9: Remove pessimizing movesLioncash1-2/+2
The move will already occur without std::move.
2020-10-27vp9: Resolve variable shadowingLioncash1-4/+4
2020-10-27nvdec: Tidy up header includesLioncash13-62/+59
Prevents a few unnecessary inclusions.
2020-10-27video_core: NVDEC Implementationameerj30-22/+3311
This commit aims to implement the NVDEC (Nvidia Decoder) functionality, with video frame decoding being handled by the FFmpeg library. The process begins with Ioctl commands being sent to the NVDEC and VIC (Video Image Composer) emulated devices. These allocate the necessary GPU buffers for the frame data, along with providing information on the incoming video data. A Submit command then signals the GPU to process and decode the frame data. To decode the frame, the respective codec's header must be manually composed from the information provided by NVDEC, then sent with the raw frame data to the ffmpeg library. Currently, H264 and VP9 are supported, with VP9 having some minor artifacting issues related mainly to the reference frame composition in its uncompressed header. Async GPU is not properly implemented at the moment. Co-Authored-By: David <25727384+ogniK5377@users.noreply.github.com>
2020-10-21video_core: Conditionally activate relevant compiler warningsLioncash1-2/+4
These compiler flags aren't shared with clang, so specifying these flags unconditionally can lead to a bit of warning spam. While we're in the area, we can also enable -Wunused-but-set-parameter given this is almost always a bug.
2020-10-20gl_arb_decompiler: Implement robust buffer operationsReinUsesLisp3-33/+54
This emulates the behavior we get on GLSL with regular SSBOs with a pointer + length pair. It aims to be consistent with the crashes we might get. Out of bounds stores are ignored. Atomics are ignored and return zero. Reads return zero.
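The rules listed in this commit message can be modelled on the CPU side as below. This is a single-threaded illustration in C++, not GLASM and not the decompiler's output; `RobustBuffer` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using u32 = std::uint32_t;

// Model of a pointer + length pair with robust access semantics.
struct RobustBuffer {
    u32* data;
    std::size_t length; // in 32-bit words

    bool InBounds(std::size_t index) const {
        return index < length;
    }

    // Out-of-bounds reads return zero.
    u32 Load(std::size_t index) const {
        return InBounds(index) ? data[index] : 0;
    }

    // Out-of-bounds stores are ignored.
    void Store(std::size_t index, u32 value) {
        if (InBounds(index)) {
            data[index] = value;
        }
    }

    // Out-of-bounds atomics are ignored and return zero.
    // (Not actually atomic here; this only models the bounds behavior.)
    u32 AtomicAdd(std::size_t index, u32 value) {
        if (!InBounds(index)) {
            return 0;
        }
        const u32 old = data[index];
        data[index] += value;
        return old;
    }
};
```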
2020-10-13vk_graphics_pipeline: Manage primitive topology as fixed stateReinUsesLisp6-26/+7
Vulkan has requirements for primitive topologies that don't play nicely with yuzu's. Since it's only 4 bits, we can move it to fixed state without changing the size of the pipeline key. - Fixes a regression on recent Nvidia drivers on Fire Emblem: Three Houses.
2020-10-09video_core: Enforce -Wclass-memaccessReinUsesLisp2-7/+7
2020-10-09vk_device: Block VK_EXT_extended_dynamic_state for RDNA devicesgoldenx861-0/+24
RDNA devices seem to crash when using VK_EXT_extended_dynamic_state in the latest 20.9.2 proprietary Windows drivers. As a workaround, for now we block device names corresponding to current RDNA released products.
2020-10-08shader/texture: Implement CUBE texture type for TMML and fix arraysReinUsesLisp1-19/+22
TMML takes an array argument that has no known meaning; this one appears as the first component in gpr8 followed by s, t and r. Skip this component when arrays are being used. Also implement CUBE texture types. - Used by Pikmin 3: Deluxe Demo.
2020-10-07renderer_vulkan/wrapper: Fix physical device sortingReinUsesLisp1-13/+35
The old code had a sort function that was invalid and it didn't work as expected when the base vector had a different order (e.g. renderdoc was attached). This sorts devices as expected and fixes a debug assert on MSVC.
2020-10-03video_core: Enforce -Wunused-variable and -Wunused-but-set-variableReinUsesLisp3-4/+7
2020-09-30Remove ext_extended_dynamic_state blacklistMatías Locatti1-8/+0
Latest AMD 20.9.2 driver fixed this, there's no reason to keep it blocked, as the previous stable signed driver release doesn't include the extension.
2020-09-25vk_stream_buffer: Fix initializing Vulkan with NVIDIA on Linuxlat9nq1-1/+2
The previous fix only partially solved the issue, as only certain GPUs that needed 9 MiB or less subtracted would work (i.e. GTX 980 Ti, GT 730). This takes from DXVK's example to divide `heap_size` by 2 to determine `allocable_size`. Additionally tested on my Quadro K4200, which previously required setting it to 12 to boot.
2020-09-25vk_command_pool: Move definition of Pool into the cpp fileLioncash2-4/+6
Allows the implementation details to be changed without recompiling any files that include this header.
2020-09-25vk_command_pool: Make use of override on destructorLioncash1-1/+1
2020-09-25vk_command_pool: Add missing header guardLioncash1-0/+2
2020-09-25More forgetting... duhLevi Behunin1-2/+2
2020-09-25Forgot to apply suggestion here as wellLevi Behunin1-1/+1
2020-09-25Address CommentsLevi Behunin3-25/+34
2020-09-25Start of Integer flags implementationLevi Behunin3-3/+50
2020-09-24arithmetic_integer_immediate: Make use of std::move where applicableLioncash1-16/+19
Same behavior, minus any redundant atomic reference count increments and decrements.
2020-09-24video_core: Fix instances where msbuild always regenerated host shadersReinUsesLisp2-12/+7
When HEADER_GENERATOR was included in the DEPENDS section of custom commands, msbuild assumed this was always modified. Changing this file is not common so we can remove it from there.
2020-09-23shader/registry: Silence a -Wshadow warningLioncash2-5/+5
2020-09-23shader/registry: Remove unnecessary namespace qualifiersLioncash1-5/+3
Using statements already make these unnecessary.
2020-09-23shader/registry: Make use of designated initializers where applicableLioncash1-17/+19
Same behavior, less repetition.
2020-09-23control_flow: emplace elements in place within TryQuery()Lioncash1-6/+6
Places data structures where they'll eventually be moved to to avoid needing to even move them in the first place.
2020-09-23control_flow: Make use of std::move in InsertBranch()Lioncash1-7/+8
Avoids unnecessary atomic increments and decrements.
2020-09-22General: Make use of std::nullopt where applicableLioncash7-32/+29
Allows some implementations to avoid completely zeroing out the internal buffer of the optional, and instead only set the validity byte within the structure. This also makes it consistent how we return empty optionals.
2020-09-20renderer_opengl: Remove emulated mailbox presentationReinUsesLisp5-293/+22
Emulated mailbox presentation was causing performance issues on Nvidia's OpenGL driver. Remove it.
2020-09-19vk_query_cache: Hack counter destructor to avoid reserving queriesReinUsesLisp1-1/+10
This is a hack to destroy all HostCounter instances before the base class destructor is called. The query cache should be redesigned to have a proper ownership model instead of using shared pointers. For now, destroy the host counter hierarchy from the derived class destructor.
2020-09-19renderer_vulkan: Make unconditional use of VK_KHR_timeline_semaphoreReinUsesLisp42-814/+638
This reworks how host<->device synchronization works on the Vulkan backend. Instead of "protecting" resources with a fence and signalling these as free when the fence is known to be signalled by the host GPU, use timeline semaphores. Vulkan timeline semaphores allow us to work on a subset of D3D12 fences. As far as we are concerned, timeline semaphores are a value set by the host or the device that can be waited on by either of them. Taking advantage of this, we can have a monotonically increasing atomic value for each submission to the graphics queue. Instead of protecting resources with a fence, we simply store the current logical tick (the atomic value stored in CPU memory). When we want to know if a resource is free, it can be compared to the current GPU tick. This greatly simplifies resource management code and the free status of resources should have fewer false negatives. To work around bugs in validation layers, when these are attached there's a thread waiting for timeline semaphores.
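The tick comparison described in this commit message can be sketched without any Vulkan calls. `MasterTick` and its members are hypothetical names, and `OnGpuSignal` stands in for reading the timeline semaphore's counter value.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

using u64 = std::uint64_t;

// Hypothetical model of the logical tick / GPU tick pair.
class MasterTick {
public:
    // Logical tick of the submission currently being recorded.
    u64 CurrentTick() const {
        return current_tick.load();
    }

    // Ends the current submission and returns the tick value it will signal.
    u64 Submit() {
        return current_tick.fetch_add(1);
    }

    // Stand-in for observing the semaphore counter reached by the GPU.
    void OnGpuSignal(u64 value) {
        gpu_tick.store(value);
    }

    // A resource tagged with `tick` is free once the GPU has reached that tick.
    bool IsFree(u64 tick) const {
        return gpu_tick.load() >= tick;
    }

private:
    std::atomic<u64> current_tick{1};
    std::atomic<u64> gpu_tick{0};
};
```

A resource only needs to remember one integer (the tick at which it was last used) rather than hold a reference to a fence object.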
2020-09-18fermi_2d: Make use of designated initializersLioncash2-8/+8
Same behavior, less repetition. We can also ensure all members of Config are initialized.
2020-09-17decode/image: Eliminate switch fallthrough in DecodeImage()Lioncash1-0/+1
Fortunately this didn't result in any issues, given the block that code was falling through to would immediately break.
2020-09-17decoder/texture: Eliminate narrowing conversion in GetTldCode()Lioncash1-1/+1
The assignment was previously truncating a u64 value to a bool.
2020-09-16video_core: Enforce -Werror=switchReinUsesLisp7-10/+59
This forces us to fix all -Wswitch warnings in video_core.
2020-09-06video_core: Remove all Core::System references in rendererReinUsesLisp49-629/+566
Now that the GPU is initialized when video backends are initialized, it's no longer needed to query components once the game is running: it can be done when yuzu is booting. This allows us to pass components between constructors and in the process remove all Core::System references in the video backend.
2020-08-31vk_device: Fix driver id check on AMD for VK_EXT_extended_dynamic_stateReinUsesLisp1-6/+9
'driver_id' can only be known on Vulkan 1.1 after creating a logical device. Move the driver id check to disable VK_EXT_extended_dynamic_state after the logical device is successfully initialized. The Vulkan device will have the extension enabled but it will not be used.
2020-08-30externals: Update Xbyak to 5.96Lioncash1-5/+5
I made a request on the Xbyak issue tracker to allow some constructors to be constexpr in order to avoid static constructors from needing to execute for some of our register constants. This request was implemented, so this updates Xbyak so that we can make use of it.
2020-08-29vk_device: Blacklist AMD proprietary from VK_EXT_extended_dynamic_stateReinUsesLisp1-1/+6
Vertex binding's <stride> is bugged on AMD's proprietary drivers when using VK_EXT_extended_dynamic_state. Blacklist it for now while we investigate how to report this issue to AMD.
2020-08-27memory_manager: Make use of [[nodiscard]] in the interfaceLioncash1-17/+17
2020-08-27memory_manager: Make operator+ const qualifiedLioncash1-1/+1
This doesn't modify member state, so it can be marked as const.
2020-08-24async_shaders: Mark getters as const member functionsLioncash2-17/+15
While we're at it, we can also mark them as nodiscard.
2020-08-24memory_manager: Mark IsGranularRange() as a const member functionLioncash2-3/+3
This doesn't modify internal member state, so it can be marked as const.
2020-08-24gl_texture_cache: Take std::string by reference in DecorateViewName()Lioncash2-2/+2
LabelGLObject takes a string_view, so we don't need to make copies of the std::string.
2020-08-24video_core/fence_manager: Remove unnecessary includesLioncash3-9/+4
Avoids pulling in unnecessary things that can cause rebuilds when they aren't required.
2020-08-24video_core/host_shaders: Add CMake integration for string shadersReinUsesLisp7-42/+106
Add the necessary CMake code to copy the contents in a string source shader (GLSL or GLASM) to a header file then consumed by video_core files. This allows editing GLSL in its own files without having to maintain them in source files. For now, only OpenGL presentation shaders are moved, but we can add GLASM presentation shaders and static SPIR-V generation through glslangValidator in the future.
2020-08-24gl_shader_util: Use std::string_view instead of star pointerReinUsesLisp5-9/+21
This allows us to pass any type of string and hint the length of the string to the OpenGL driver.
2020-08-22video_core: Initialize renderer with a GPUReinUsesLisp22-119/+172
Add an extra step in GPU initialization to be able to initialize render backends with a valid GPU instance.
2020-08-21vk_state_tracker: Fix primitive topologyReinUsesLisp3-13/+14
State track the current primitive topology with a regular comparison instead of using dirty flags. This fixes a bug in dirty flags for this particular state and it also avoids unnecessary state changes as this property is stored in a frequently changed bit field.
2020-08-20vk_device: Use Vulkan 1.0 properlyReinUsesLisp5-52/+66
Enable the required capabilities to use Vulkan 1.0 without validation errors and disable those that are not compatible with it.
2020-08-20renderer_vulkan: Create a Vulkan 1.0 instance when 1.1 is not availableReinUsesLisp3-6/+26
This commit doesn't make yuzu compatible with Vulkan 1.0 yet, it only creates a 1.0 instance.
2020-08-18common/telemetry: Migrate namespace into the Common namespaceLioncash2-4/+5
Migrates the Telemetry namespace into the Common namespace to make the code consistent with the rest of our common code.
2020-08-16Remove unneeded newlines, optional Registry in shader paramsameerj5-14/+9
Addressing feedback from Rodrigo
2020-08-16Morph: Update worker allocation commentAmeer J1-1/+1
Co-authored-by: Morph <39850852+Morph1984@users.noreply.github.com>
2020-08-16move thread 1/4 count computation into allocate workers methodameerj4-23/+14
2020-08-16Address feedback, add shader compile notifier, update setting textameerj8-161/+116
2020-08-16Vk Async Worker directly emplace in cacheameerj3-58/+41
2020-08-16Address feedback. Bruteforce delete duplicatesameerj7-80/+116
2020-08-16Vk Async pipeline compilationameerj13-20/+182
2020-08-16common/fileutil: Convert namespace to Common::FSLioncash4-30/+30
Migrates a remaining common file over to the Common namespace, making it consistent with the rest of common files. This also allows for high-traffic FS related code to alias the filesystem function namespace as namespace FS = Common::FS; for more concise typing.
2020-08-15common/compression: Roll back std::span changesLioncash1-1/+2
Seems like not all compilers support std::span yet.
2020-08-14shader/memory: Amend UNIMPLEMENTED_IF_MSG without a messageLioncash1-1/+2
We need to provide a message for this variant of the macro, so we can simply log out the type being used.
2020-08-14macro-interpreter: Resolve -Wself-assign-field warningLioncash1-1/+0
This was assigning the field to itself, which is a no-op. The size doesn't change between its initial assignment and this one, so this is a safe change to make.
2020-08-14vulkan/wrapper: Avoid unnecessary copy in EnumerateInstanceExtensionProperties()Lioncash1-1/+1
Given this is implicitly creating a std::optional, we can move the vector into it.
2020-08-14gl_shader_disk_cache: Make use of std::nullopt where applicableLioncash1-11/+12
Allows the compiler to avoid unnecessarily zeroing out the internal buffer of std::optional on some implementations.
2020-08-14async_shaders: Resolve -Wpessimizing-move warningLioncash1-2/+2
Prevents pessimization of the move constructor (which thankfully didn't actually happen in practice here, given std::thread isn't copyable).
2020-08-14maxwell_3d: Resolve -Wextra-semi warningLioncash1-1/+1
Semicolons after a function definition aren't necessary.
2020-08-13General: Tidy up clang-format warnings part 2Lioncash3-31/+33
2020-08-12gl_shader_cache: Use std::max() for determining num_workersMorph1-1/+1
Does not allocate more threads than available in the host system for boot-time shader compilation and always allocates at least 1 thread if hardware_concurrency() returns 0.
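The "at least 1 thread" guarantee can be sketched as a clamp on the reported thread count. The helper names are hypothetical, and the real code may combine this with other limits.

```cpp
#include <algorithm>
#include <cassert>
#include <thread>

// Clamps the reported hardware thread count to a minimum of one worker.
// hardware_concurrency() may return 0 when the value is not computable.
inline unsigned WorkerCount(unsigned reported) {
    return std::max(reported, 1U);
}

// Worker count for the machine this runs on.
inline unsigned HostWorkerCount() {
    return WorkerCount(std::thread::hardware_concurrency());
}
```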
2020-08-11textures/decoders: Fix block linear to pitch copiesReinUsesLisp3-34/+34
There were two issues with block linear copies. First the swizzling was wrong and this commit reimplements them. The other issue was that these copies are generally used to download render targets from the GPU and yuzu was not downloading them from host GPU memory unless the extreme GPU accuracy setting was selected. This commit enables cached memory reads for all accuracy levels. - Fixes level thumbnails in Super Mario Maker 2.
2020-08-03vulkan: Silence more -Wmissing-field-initializer warningsLioncash6-3/+18
2020-08-03yuzu: Resolve C++20 deprecation warnings related to lambda capturesLioncash1-1/+1
C++20 deprecates capturing the this pointer via the '=' capture. Instead, we replace it or extend the capture specification.
2020-07-28renderer_opengl: Use 1/4 of all threads for async shader compilationMorph1-9/+4
2020-07-26video_core/gpu: Correct the size of the puller registersBilly Laws1-2/+2
The puller register array is made up of u32s; however, the `NUM_REGS` value is the size in bytes, so switch it to avoid making the struct unnecessarily large. Also fix a small typo in a comment.
2020-07-26hle: nvdrv: Rewrite of GPU memory management.bunnei2-500/+204
2020-07-25vulkan: Resolve -Wmissing-field-initializer warningsLioncash2-0/+4
2020-07-25zstd_compression: Make use of std::span in interfacesLioncash1-2/+1
Allows condensing the data and size parameters into a single argument.
2020-07-21surface_params: Make use of designated initializers where applicableLioncash1-38/+46
Provides a convenient way to avoid unnecessary zero initializing.
2020-07-21surface_params: Remove redundant assignmentLioncash1-1/+0
This is a redundant assignment that can be removed.
2020-07-21surface_params: Replace questionable usages of the comma operator with semicolonsLioncash1-9/+9
These are bugs waiting to happen.
2020-07-21video_core: Remove unused variablesLioncash8-33/+5
Silences several compiler warnings about unused variables.
2020-07-21vk_rasterizer: Remove unused variable in Clear()Lioncash1-4/+0
The relevant values are already assigned further down in the lambda, so this can be removed entirely.
2020-07-21compatible_formats: Add missing header guardLioncash1-0/+2
Prevents potential inclusion issues from occurring.
2020-07-21video_core: Allow copy elision to take place where applicableLioncash7-26/+26
Removes const from some variables that are returned from functions, as this allows the move assignment/constructors to execute for them.
2020-07-21video_core: Remove redundant pixel format typeDavid Marcec1-1/+0
We already get the format type before converting shadow formats and during shadow formats.
2020-07-20buffer_cache: Eliminate redundant map lookup in MarkRegionAsWritten()Lioncash1-6/+3
We can make use of emplace()'s return value to determine whether or not we need to perform an increment. emplace() performs no insertion if an element already exists, so this can eliminate a find() call.
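A minimal sketch of the pattern, assuming a hypothetical map from page number to write count (the real container and key types may differ):

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

using u64 = std::uint64_t;

// Inserts a count of 1 for a new page, or increments the existing count.
// emplace() returns {iterator, inserted}, so no separate find() is needed.
inline void MarkWritten(std::unordered_map<u64, int>& counts, u64 page) {
    auto [it, inserted] = counts.emplace(page, 1);
    if (!inserted) {
        ++it->second;
    }
}
```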
2020-07-18gl_arb_decompiler: Use NV_shader_buffer_{load,store} on assembly shadersReinUsesLisp7-110/+173
NV_shader_buffer_{load,store} is a 2010 extension that allows GL applications to use what in Vulkan is known as physical pointers; these are basically C pointers. On GLASM this is exposed through the LOAD/STORE/ATOM instructions. Up until now, assembly shaders were using NV_shader_storage_buffer_object. These work fine, but have a (probably unintended) limitation that forces us to have the limit of a single stage for all shader stages. In contrast, with NV_shader_buffer_{load,store} we can pass GPU addresses to the shader through local parameters (GLASM equivalent uniform constants, or push constants on Vulkan). Local parameters have the advantage of being per stage, allowing us to generate code without worrying about binding overlaps.
2020-07-18Fix style issuesDavid Marcec2-7/+13
2020-07-18vk_device: Fix build error on old MSVC versionsReinUsesLisp1-3/+3
Designated initializers on old MSVC versions fail to build when they take the address of a constant.
2020-07-17Remove duplicate configDavid Marcec1-0/+1
2020-07-17Use conditional varDavid Marcec2-9/+15
2020-07-17Drop max workers from 8->2 for testingDavid Marcec1-1/+1
2020-07-17Rebase for per game settingsDavid Marcec1-1/+1
2020-07-17async shadersDavid Marcec14-58/+571
2020-07-17macro_hle: Remove unnecessary static keywordsLioncash1-7/+4
These functions are already in an anonymous namespace which makes the functions internally linked.
2020-07-17macro_hle: Simplify shift expression in HLE_771BB18C62444DA0()Lioncash1-2/+1
Given the expression involves a 32-bit value, this simplifies down to just: 0x3ffffff. This is likely a remnant from testing that was never cleaned up. Resolves a -Wshift-overflow warning.
2020-07-17macro_hle: Remove unnecessary std::make_pair callsLioncash1-3/+3
The purpose of make_pair is generally to deduce the types within the pair without explicitly specifying the types, so these usages were generally unnecessary, particularly when the type is enforced by the array declaration.
2020-07-17macro: Resolve missing parameter in doxygen commentLioncash1-1/+2
Resolves a -Wdocumentation warning.
2020-07-17wrapper: Make use of designated initializers where applicableLioncash1-56/+64
2020-07-17vk_texture_cache: Make use of designated initializers where applicableLioncash1-96/+135
2020-07-17vk_texture_cache: Amend mismatched access masks and indices in UploadBufferLioncash1-6/+4
Discovered while converting relevant parts of the codebase over to designated initializers.
2020-07-17vk_swapchain: Make use of designated initializers where applicableLioncash1-43/+51
2020-07-17vk_stream_buffer: Make use of designated initializers where applicableLioncash1-19/+16
2020-07-17vk_staging_buffer_pool: Make use of designated initializers where applicableLioncash1-13/+12
2020-07-17vk_shader_util: Make use of designated initializers where applicableLioncash1-7/+7
2020-07-17vk_scheduler: Make use of designated initializers where applicableLioncash1-27/+30
2020-07-17vk_sampler_cache: Make use of designated initializers where applicableLioncash1-24/+27
2020-07-17vk_resource_manager: Make use of designated initializers where applicableLioncash1-15/+14
2020-07-17vk_renderpass_cache: Make use of designated initializers where applicableLioncash1-59/+70
2020-07-17vk_rasterizer: Make use of designated initializers where applicableLioncash1-41/+47
2020-07-17vk_query_cache: Make use of designated initializers where applicableLioncash1-8/+8
2020-07-17vk_pipeline_cache: Make use of designated initializers where applicableLioncash1-31/+35
2020-07-17vk_memory_manager: Make use of designated initializers where applicableLioncash1-7/+6
2020-07-17vk_image: Make use of designated initializers where applicableLioncash1-15/+23
2020-07-17vk_descriptor_pool: Make use of designated initializers where applicableLioncash1-15/+18
2020-07-17vk_graphics_pipeline: Resolve narrowing warningsLioncash1-2/+4
For whatever reason, VK_TRUE and VK_FALSE aren't defined as having a VkBool32 type, so we need to cast to it explicitly.
2020-07-16vk_compute_pipeline: Make use of designated initializers where applicableLioncash1-63/+68
2020-07-16vk_compute_pass: Make use of designated initializers where applicableLioncash1-95/+99
Note: Some barriers can't be converted over yet, as they ICE MSVC.
2020-07-16vk_buffer_cache: Make use of designated initializers where applicableLioncash1-30/+33
Note: An array within CopyFrom() cannot be converted over yet, as it ICEs MSVC when converted over.
2020-07-16decode/other: Implement S2R.LaneIdReinUsesLisp1-2/+1
This maps to host's thread id. - Fixes graphical issues on Paper Mario.
2020-07-16gl_arb_decompiler: Execute BAR even when inside control flowReinUsesLisp1-4/+0
Unlike GLSL, GLASM allows us to call BAR inside control flow. - Fixes graphical artifacts in Paper Mario.
2020-07-16renderer_{opengl,vulkan}: Clamp shared memory to host's limitReinUsesLisp6-9/+42
This stops shaders from failing to build when they exceed the host's shared memory size limit. An error is logged.
2020-07-14shader_cache: Make use of std::erase_ifLioncash1-2/+2
Now that we use C++20, we can also make use of std::erase_if instead of needing to do the erase-remove idiom.
2020-07-14vk_device: Make use of designated initializers where applicableLioncash1-124/+152
Avoids redundant repetitions of variable names, and allows assignment all in one statement.
2020-07-14vk_graphics_pipeline: Make use of designated initializers where applicableLioncash1-198/+223
Avoids redundant variable name repetitions.
2020-07-13video_core: Rearrange pixel format namesReinUsesLisp19-1179/+1077
Normalizes pixel format names to match Vulkan names. Previous to this commit pixel formats had no convention, leading to confusion and potential bugs.
2020-07-13video_core: Fix DXT4 and RGB565ReinUsesLisp7-37/+31
2020-07-13video_core/format_lookup_table: Add formats with existing PixelFormatReinUsesLisp1-1/+9
2020-07-13video_core: Fix B5G6R5_UNORM render target formatReinUsesLisp5-1/+10
2020-07-13video_core: Fix B5G6R5UReinUsesLisp2-2/+2
2020-07-13video_core: Implement RGBA32_SINT render targetReinUsesLisp7-58/+71
2020-07-13video_core: Implement RGBA32_SINT render targetReinUsesLisp7-0/+13
2020-07-13video_core: Implement RGBA16_SINT render targetReinUsesLisp7-0/+13
2020-07-13video_core: Implement RGBA8_SINT render targetReinUsesLisp7-0/+13
2020-07-13video_core: Implement RG32_SINT render targetReinUsesLisp7-0/+13
2020-07-13video_core: Implement RG8_SINT render target and fix RG8_UINTReinUsesLisp7-1/+14
2020-07-13video_core: Implement R8_SINT render targetReinUsesLisp7-0/+13
2020-07-13video_core: Implement R8_SNORM render targetReinUsesLisp7-0/+13
2020-07-13video_core/surface: Remove explicit values on PixelFormat's definitionReinUsesLisp1-80/+80
2020-07-13video_core/surface: Reorder render target to pixel format switchReinUsesLisp1-53/+51
2020-07-13vk_blit_screen: Make use of designated initializers where applicableLioncash1-334/+384
Now that we make use of C++20, we can use designated initializers to make things a little nicer to read.
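A rough sketch of what this kind of conversion looks like. `BufferCreateInfo` is a hypothetical struct that only loosely mimics the shape of a Vulkan create-info structure; it is not taken from the Vulkan headers or from vk_blit_screen.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical create-info struct, for illustration only.
struct BufferCreateInfo {
    std::uint32_t flags;
    std::uint64_t size;
    std::uint32_t usage;
    const void* next;
};

// Before: declare the variable, then repeat its name for every member.
BufferCreateInfo MakeOld(std::uint64_t size) {
    BufferCreateInfo ci;
    ci.flags = 0;
    ci.size = size;
    ci.usage = 0x80;
    ci.next = nullptr;
    return ci;
}

// After: C++20 designated initializers, all in one statement.
// Designators must appear in declaration order.
BufferCreateInfo MakeNew(std::uint64_t size) {
    return {
        .flags = 0,
        .size = size,
        .usage = 0x80,
        .next = nullptr,
    };
}
```

Members left out of a designated initializer list are value-initialized, so forgetting one yields zero rather than an indeterminate value.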
2020-07-13vk_state_tracker: Fix dirty flags for stencil_enable on VK_EXT_extended_dynamic_stateReinUsesLisp1-0/+1
Fixes a regression on any game using stencil on devices with VK_EXT_extended_dynamic_state.
2020-07-10vk_rasterizer: Pass <pSizes> to CmdBindVertexBuffers2EXTReinUsesLisp1-6/+6
This has been fixed in Nvidia's public beta driver 451.74. The previous beta driver will be broken, people using these will have to update.
2020-07-10video_core/textures: Add and use SwizzleSliceToVoxel, and minor style changesReinUsesLisp5-84/+125
Change GOB sizes from free-functions to constexpr constants. Add SwizzleSliceToVoxel, a function that swizzles a 2D array of pixels into a 3D texture and use it for 3D copies.
2020-07-10configuration: implement per-game configurations (#4098)lat9nq11-21/+23
* Switch game settings to use a pointer In order to add full per-game settings, we need to be able to tell yuzu to switch to using either the global or game configuration. Using a pointer makes it easier to switch. * configuration: add new UI without changing existing functionality The new UI also adds General, System, Graphics, Advanced Graphics, and Audio tabs, but as yet they do nothing. This commit keeps yuzu to the same functionality as originally branched. * configuration: Rename files These weren't included in the last commit. Now they are. * configuration: setup global configuration checkbox Global config checkbox now enables/disables the appropriate tabs in the game properties dialog. The use global configuration setting is now saved to the config, defaulting to true. This also addresses some changes requested in the PR. * configuration: swap to per-game config memory for properties dialog Does not set memory going in-game. Swaps to game values when opening the properties dialog, then swaps back when closing it. Uses a `memcpy` to swap. Also implements saving config files, limited to certain groups of configurations so as to not risk setting unsafe configurations. * configuration: change config interfaces to use config-specific pointers When a game is booted, we need to be able to open the configuration dialogs without changing the settings pointer in the game's emulation. A new pointer specific to just the configuration dialogs can be used to separate changes to just those config dialogs without affecting the emulation. * configuration: boot a game using per-game settings Swaps values where needed to boot a game. * configuration: use correct config during emulation Creates a new pointer specifically for modifying the configuration while emulation is in progress. Both the regular configuration dialog and the game properties dialog now use the pointer Settings::config_values to focus edits to the correct struct. 
* settings: split Settings::values into two different structs By splitting the settings into two mutually exclusive structs, it becomes easier, as a developer, to determine how to use the Settings structs after per-game configurations is merged. Other benefits include only duplicating the required settings in memory. * settings: move use_docked_mode to Controls group `use_docked_mode` is set in the input settings and cannot be accessed from the system settings. Grouping it with system settings causes it to be saved with per-game settings, which may make transferring configs more difficult later on, especially since docked mode cannot be set from within the game properties dialog. * configuration: Fix the other yuzu executables and a regression In main.cpp, we have to get the title ID before the ROM is loaded, else the renderer will reflect only the global settings and not the user's game-specific settings. * settings: use a template to duplicate memory for each setting Replaces the type of each variable in the Settings::Values struct with a new class that allows basic data reading and writing. The new struct Settings::Setting duplicates the data in memory and can manage global overrides per each setting. * configuration: correct add-ons config and swap settings when appropriate Any add-ons interaction happens directly through the global values struct. Swapping between structs now also includes copying the necessary global configs that cannot be changed nor saved in per-game settings. General and System config menus now update based on whether it is viewing the global or per-game settings. * settings: restore old values struct No longer needed with the Settings::Setting class template. * configuration: implement hierarchical game properties dialog This sets the appropriate global or local data in each setting. * clang format * clang format take 2 can the docker container save this? 
* address comments and style issues * config: read and write settings with global awareness Adds new functions to read and write settings while keeping the global state in focus. Files now generated per-game are much smaller since often they only need address the global state. * settings: restore global state when necessary Upon closing a game or the game properties dialog, we need to restore all global settings to the original global state so that we can properly open the configuration dialog or boot a different game. * configuration: guard setting values incorrectly This disables setting values while a game is running if the setting is overwritten by a per game setting. * config: don't write local settings in the global config Simple guards to prevent writing the wrong settings in the wrong files. * configuration: add comments, assume less, and clang format No longer assumes that a disabled UI element means the global state is turned off, instead opting to directly answer that question. Still however assumes a game is running if it is in that state. * configuration: fix a logic error Should not be negated * restore settings' global state regardless of accept/cancel Fixes loading a properties dialog and causing the global config dialog to show local settings. * fix more logic errors Fixed the frame limit would set the global setting from the game properties dialog. Also strengthened the Settings::Setting member variables and simplified the logic in config reading (ReadSettingGlobal). * fix another logic error In my efforts to guard RestoreGlobalState, I accidentally negated the IsPowered condition. * configure_audio: set toggle_stretched_audio to tristate * fixed custom rtc and rng seed overwriting the global value * clang format * rebased * clang format take 4 * address my own review Basically revert unintended changes * settings: literal instead of casting "No need to cast, use 1U instead" Thanks, Morph! 
Co-authored-by: Morph <39850852+Morph1984@users.noreply.github.com> * Revert "settings: literal instead of casting " This reverts commit 95e992a87c898f3e882ffdb415bb0ef9f80f613f. * main: fix status buttons reporting wrong settings after stop emulation * settings: Log UseDockedMode in the Controls group This should have happened when use_docked_mode was moved over to the controls group internally. This just reflects this in the log. * main: load settings if the file has a title id In other words, don't exit if the loader has trouble getting a title id. * use a zero * settings: initialize resolution factor with constructor instead of casting * Revert "settings: initialize resolution factor with constructor instead of casting" This reverts commit 54c35ecb46a29953842614620f9b7de1aa9d5dc8. * configure_graphics: guard device selector when Vulkan is global Prevents the user from editing the device selector if Vulkan is the global renderer backend. Also resets the vulkan_device variable when the users switches back-and-forth between global and Vulkan. * address reviewer concerns Changes function variables to const wherever they don't need to be changed. Sets Settings::Setting to final as it should not be inherited from. Sets ConfigurationShared::use_global_text to static. Co-Authored-By: VolcaEM <volcaem@users.noreply.github.com> * main: load per-game settings after LoadROM This prevents `Restart Emulation` from restoring the global settings *after* the per-game settings were applied. Thanks to BSoDGamingYT for finding this bug. * Revert "main: load per-game settings after LoadROM" This reverts commit 9d0d48c52d2dcf3bfb1806cc8fa7d5a271a8a804. * main: only restore global settings when necessary Loading the per-game settings cannot happen after the ROM is loaded, so we have to specify when to restore the global state. Again thanks to BSoD for finding the bug. 
* configuration_shared: address reviewer concerns except operator overrides Dropping operator override usage in next commit. Co-Authored-By: LC <lioncash@users.noreply.github.com> * settings: Drop operator overrides from Setting template Requires using GetValue and SetValue explicitly. Also reverts a change that broke title ID formatting in the game properties dialog. * complete rebase * configuration_shared: translate "Use global configuration" Uses ConfigurePerGame to do so, since its usage, at least as of now, corresponds with ConfigurationShared. * configure_per_game: address reviewer concern As far as I understand, it prevents the program from unnecessarily copying strings. Co-Authored-By: LC <lioncash@users.noreply.github.com> Co-authored-by: Morph <39850852+Morph1984@users.noreply.github.com> Co-authored-by: VolcaEM <volcaem@users.noreply.github.com> Co-authored-by: LC <lioncash@users.noreply.github.com>
2020-07-10vk_stream_buffer: set allocable_size to 9 MiBlat9nq1-1/+1
This solves the crash on Linux systems running the current Linux Long Lived branch nVidia driver.
2020-07-08maxwell_dma: Rename registers to match official docs and reorderReinUsesLisp2-287/+355
Rename registers in the MaxwellDMA class to match Nvidia's official documentation. This one can be found here: https://github.com/NVIDIA/open-gpu-doc/blob/master/classes/dma-copy/clb0b5.h While we are at it, reorganize the code in MaxwellDMA to be separated in different functions.
2020-07-01shader_cache: Fix use-after-free and orphan invalidation cache entriesReinUsesLisp1-29/+41
This fixes some cases where entries could have been removed multiple times reading freed memory. To address this issue this commit removes duplicates from entries marked for removal and sorts out the removal process to fix another use-after-free situation. Another issue fixed in this commit is orphan invalidation cache entries. Previously only the entries that were invalidated in the current operations had its entries removed. This led to more use-after-free situations when these entries were actually invalidated but referenced an object that didn't exist.
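One common way to drop duplicates from a list of entries marked for removal is to sort the list and then collapse adjacent equal elements. The sketch below assumes the marked entries are held as raw pointers in a `std::vector`; the `Entry` type and the function name are hypothetical, and the commit may deduplicate differently.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical cache entry used only for illustration.
struct Entry {
    int id;
};

// Ensures every entry appears at most once, so that a later pass can free
// each entry exactly once instead of touching already freed memory.
void RemoveDuplicates(std::vector<Entry*>& marked) {
    // std::less gives a well-defined total order even for unrelated pointers.
    std::sort(marked.begin(), marked.end(), std::less<Entry*>{});
    marked.erase(std::unique(marked.begin(), marked.end()), marked.end());
}
```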
2020-06-30maxwell_to_gl: Implement MirrorOnceClampOGL using GL_MIRROR_CLAMP_EXTMorph1-0/+6
Like MirrorOnceBorder, this requires the GL_EXT_texture_mirror_clamp extension. This extension is unfortunately not available on Intel's drivers (both Windows proprietary and Linux Mesa). Use GL_MIRROR_CLAMP_TO_EDGE as a fallback if the extension is unavailable.
2020-06-30macro: Add support for "middle methods" on the code cache (#4112)David1-8/+27
Macro code is just uploaded sequentially from a starting address, however that does not mean the entry point for the macro is at that address. This PR adds preliminary support for executing macros in the middle of our cached code.
2020-06-29maxwell_to_gl: Rename VertexType() to VertexFormat()Morph2-4/+5
2020-06-28maxwell_to_vk: Reorder vertex formats and add A2B10G10R10 for all types except floatMorph1-75/+69
2020-06-28maxwell_to_gl: Add 32 bit component sizes to (un)signed scaled formatsMorph1-30/+4
Add 32 bit component sizes to (un)signed scaled formats and group (un)signed normalized, scaled, and integer formats together.
2020-06-27General: Tune the priority of main emulation threads so they have higher priority than less important helper threads.Fernando Sahmkow2-0/+3
2020-06-27General: Correct rebase, sync gpu and context management.Fernando Sahmkow5-2/+25
2020-06-27General: Setup yuzu threads' microprofile, naming and registry.Fernando Sahmkow1-1/+5
2020-06-27General: Recover Prometheus project from harddrive failure Fernando Sahmkow1-2/+3
This commit: Implements CPU Interrupts, Replaces Cycle Timing for Host Timing, Reworks the Kernel's Scheduler, Introduce Idle State and Suspended State, Recreates the bootmanager, Initializes Multicore system.
2020-06-27vk_rasterizer: Use nullptr for <pSizes> in CmdBindVertexBuffers2EXTReinUsesLisp1-6/+6
Disable this temporarily.
2020-06-27vk_pipeline_cache: Avoid hashing and comparing dynamic state when possibleReinUsesLisp6-23/+51
With extended dynamic states, some bytes don't have to be collected from the pipeline key, hence we can avoid hashing and comparing them on lookups.
2020-06-27vulkan/fixed_pipeline_state: Move state out of individual structuresReinUsesLisp4-121/+84
2020-06-27vk_rasterizer: Use VK_EXT_extended_dynamic_stateReinUsesLisp5-46/+356
2020-06-27renderer_vulkan/wrapper: Add VK_EXT_extended_dynamic_state functionsReinUsesLisp2-0/+64
2020-06-27fixed_pipeline_state: Add requirements for VK_EXT_extended_dynamic_stateReinUsesLisp7-155/+143
This moves dynamic state present in VK_EXT_extended_dynamic_state to a separate structure in FixedPipelineState. This structure is at the bottom, allowing us to hash and memcmp it only when the extension is not supported.
2020-06-27vk_device: Enable VK_EXT_extended_dynamic_state when availableReinUsesLisp2-0/+32
2020-06-27texture_cache: Test format compatibility before copyingReinUsesLisp2-6/+21
Avoid illegal copies. This intercepts the last step of a copy to avoid generating validation errors or corrupting the driver on some instances. We can create views and emit copies accordingly in future commits and remove this last-step validation.
2020-06-27video_core/compatible_formats: Table to test if two formats are legal to view or copyReinUsesLisp3-0/+196
Add a flat table to test if it's legal to create a texture view between two formats or copy between them. This table is based on ARB_copy_image and ARB_texture_view. Copies are more permissive than views.
2020-06-26gl_buffer_cache: Copy to buffers created as STREAM_READ before downloadingReinUsesLisp5-18/+24
After marking buffers as resident, Nvidia's driver seems to take a slow path. To work around this issue, copy to a STREAM_READ buffer and then call GetNamedBufferSubData on it. This is a temporary solution until we have asynchronous flushing.
2020-06-25gl_device: Fix IsASTCSupportedDavid Marcec1-1/+1
Other targets were never actually checked
2020-06-25gl_device: Enable NV_vertex_buffer_unified_memory on Turing devicesReinUsesLisp1-19/+1
Once we make sure not to corrupt Nvidia's driver, we can safely use resident buffers on Turing devices. See GitHub pull request #4156
2020-06-24buffer_cache: Use buffer methods instead of cache virtual methodsReinUsesLisp5-99/+90
2020-06-24gl_stream_buffer: Use InvalidateBufferData instead unmap and mapReinUsesLisp2-15/+5
Making the stream buffer resident increases GPU usage significantly on some games. This seems to be addressed by invalidating the stream buffer with InvalidateBufferData instead of using an Unmap + Map (with invalidation flags).
2020-06-24gl_rasterizer: Use NV_vertex_buffer_unified_memory for vertex buffer robustnessReinUsesLisp3-9/+39
Switch games are allowed to bind less data than what they use in a vertex buffer, the expected behavior here is that these values are read as zero. At the moment of writing this only D3D12, OpenGL and NVN through NV_vertex_buffer_unified_memory support vertex buffer with a size limit. In theory this could be emulated on Vulkan creating a new VkBuffer for each (handle, offset, length) tuple and binding the expected data to it. This is likely going to be slow and memory expensive when used on the vertex buffer and we have to do it on all draws because we can't know without analyzing indices when a game is going to read vertex data out of bounds. This is not a problem on OpenGL's BufferAddressRangeNV because it takes a length parameter, unlike Vulkan's CmdBindVertexBuffers that only takes buffers and offsets (the length is implicit in VkBuffer). It isn't a problem on D3D12 either, because D3D12_VERTEX_BUFFER_VIEW on IASetVertexBuffers takes SizeInBytes as a parameter (although I am not familiar with robustness on D3D12). Currently this only implements buffer ranges for vertex buffers, although indices can also be affected. A KHR_robustness profile is not created, but Nvidia's driver reads out of bound vertex data as zero anyway, this might have to be changed in the future. - Fixes SMO random triangles when capturing an enemy, getting hit, or looking at the environment on certain maps.
2020-06-24gl_buffer_cache: Mark buffers as residentReinUsesLisp10-67/+111
Make stream buffer and cached buffers as resident and query their address. This allows us to use GPU addresses for several proprietary Nvidia extensions.
2020-06-24gl_device: Expose NV_vertex_buffer_unified_memory except on TuringReinUsesLisp2-1/+30
Expose NV_vertex_buffer_unified_memory when the driver supports it. This commit adds a function to determine if a GL_RENDERER is a Turing GPU. This is required because on Turing GPUs Nvidia's driver crashes when the buffer is marked as resident or on DeleteBuffers. Without a synchronous debug output (single threaded driver), it's likely that the driver will crash in the first blocking call.
2020-06-24gl_stream_buffer: Always use a non-coherent bufferReinUsesLisp2-14/+10
2020-06-24gl_stream_buffer: Always use persistent memory mapsReinUsesLisp2-30/+14
yuzu no longer supports platforms without persistent maps.
2020-06-24addressed issuesDavid Marcec2-4/+7
2020-06-24clear mme draw modeDavid Marcec1-0/+3
We already draw, so we can clear it
2020-06-24Addressed issuesDavid Marcec5-13/+17
2020-06-24Fix constbuffer for 0217920100488FF7David Marcec1-6/+6
2020-06-24Macro HLE supportDavid Marcec9-10/+209
2020-06-24gl_shader_cache: Avoid use after move for program sizeReinUsesLisp2-6/+7
All programs had a size of zero due to this bug, skipping invalidations. While we are at it, remove some unused forward declarations.
2020-06-23shader/half_set: Implement HSET2_IMMReinUsesLisp2-21/+75
Add HSET2_IMM. Due to the complexity of the encoding avoid using BitField unions and read the relevant bits from the code itself. This is less error prone.
2020-06-22TextureCache: Fix case where layer goes off bound.Fernando Sahmkow1-0/+3
The returned layer is expected to be between 0 and the depth of the surface, anything larger is off bounds.
2020-06-22renderer_vulkan: Update validation layer name and test before enablingReinUsesLisp3-5/+43
Update validation layer string to VK_LAYER_KHRONOS_validation. While we are at it, properly check for available validation layers before enabling them.
2020-06-21gl_shader_decompiler: Enable GL_EXT_texture_shadow_lod if availableMorph1-7/+43
Enable GL_EXT_texture_shadow_lod if available. If this extension is not available, such as on Intel/AMD proprietary drivers, use textureGrad as a workaround.
2020-06-21gl_device: Check for GL_EXT_texture_shadow_lodMorph2-0/+7
2020-06-20macro_jit_x64: Use ecx for shift registerMerryMage1-2/+2
shl/shr only accept cl as their second argument
2020-06-20texture_cache: Fix incorrect address used in a DeduceSurface() callLioncash1-1/+1
Previously the source was being deduced twice in a row.
2020-06-20decode/image: Implement B10G11R11FMorph1-9/+17
- Used by Kirby Star Allies
2020-06-20gl_arb_decompiler: Avoid several string copiesLioncash1-32/+31
Variables that are marked as const cannot have the move constructor invoked when returning from a function (the move constructor requires a non-const variable so it can "steal" the resources from it).
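The underlying rule can be shown without involving return-value optimization, which may elide the copy altogether on some compilers. In this sketch the `Tracker` type and both functions are hypothetical; it only demonstrates that `std::move` on a const object selects the copy constructor.

```cpp
#include <cassert>
#include <utility>

// Hypothetical type that counts how it gets constructed.
struct Tracker {
    static inline int copies = 0;
    static inline int moves = 0;
    Tracker() = default;
    Tracker(const Tracker&) { ++copies; }
    Tracker(Tracker&&) noexcept { ++moves; }
};

struct Counts {
    int copies;
    int moves;
};

// std::move on a const object yields const Tracker&&, which cannot bind to
// Tracker(Tracker&&), so overload resolution picks the copy constructor.
Counts TransferFromConst() {
    Tracker::copies = 0;
    Tracker::moves = 0;
    const Tracker source{};
    Tracker sink{std::move(source)};
    (void)sink;
    return {Tracker::copies, Tracker::moves};
}

// Without const, the same expression selects the move constructor.
Counts TransferFromNonConst() {
    Tracker::copies = 0;
    Tracker::moves = 0;
    Tracker source{};
    Tracker sink{std::move(source)};
    (void)sink;
    return {Tracker::copies, Tracker::moves};
}
```

For a type like std::string the silent fallback to copying means an extra allocation, which is the cost this commit avoids.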
2020-06-20vulkan/wrapper: Remove noexcept from GetSurfaceCapabilitiesKHR()Lioncash2-3/+2
Check() can throw an exception if the Vulkan result isn't successful. We remove the noexcept specifier so that std::terminate isn't outright called, which allows for better debugging (should it ever actually fail).
2020-06-20macro_jit_x64: Correct readability of Compile_ExtractShiftLeftImmediate()Lioncash1-3/+3
Previously dst wasn't being used.
2020-06-20macro_jit_x64: Correct readability of Compile_ExtractShiftLeftRegister()Lioncash1-3/+4
Previously dst wasn't being used.
2020-06-20macro_jit_x64: Remove unused variableLioncash1-2/+1
Removes a completely unused label and marks another variable as unused, given it seems like it has potential uses in the future.
2020-06-20memory_manager: Eliminate variable shadowingLioncash2-24/+28
Renames some variables to prevent ones in inner scopes from shadowing outer-scoped variables. The Copy* functions have no shadowing, but we rename them anyways to remain consistent with the other functions.
2020-06-20macro_jit_x64: Eliminate variable shadowing in Compile_ProcessResult()Lioncash1-2/+2
We can reduce the capture scope so that it's not possible for both "reg" variables to clash with one another. While we're at it, we can prevent unnecessary copies.
2020-06-20buffer_cache: Eliminate local variable shadowingLioncash1-2/+1
We can just make use of the instance in the scope above this one.
2020-06-19macro_jit_x64: Remove unused function ReadMerryMage1-8/+4
2020-06-18vk_rasterizer: Don't preserve contents on full screen clearsReinUsesLisp2-7/+58
There's no need to load contents from the CPU when a clear resets all the contents of the underlying memory. This is already implemented on OpenGL and the texture cache.
2020-06-18vk_update_descriptor: Upload descriptor sets data directlyReinUsesLisp3-42/+30
Instead of copying to a temporary payload before sending the update task to the worker thread, insert elements to the payload directly.
2020-06-18vk_rasterizer: BindTransformFeedbackBuffersEXT accepts a size of type VkDeviceSizeMerryMage1-1/+1
2020-06-18renderer_vulkan: Fix macOS GetBundleDirectory referenceMerryMage1-1/+3
2020-06-18memory_util: boost hashes are size_tMerryMage1-2/+2
* boost::hash_value returns a size_t * boost::hash_combine takes a size_t& argument
2020-06-18Rename PAGE_SHIFT to PAGE_BITSMerryMage2-10/+10
macOS header files #define PAGE_SHIFT
2020-06-18vk_sampler_cache: Emulate GL_LINEAR/NEAREST minification filtersMorph1-2/+4
Emulate GL_LINEAR/NEAREST minification filters using minLod = 0 and maxLod = 0.25 during sampler creation
2020-06-18maxwell_to_vk: Reorder filter cases and correct mipmap_filter=NoneMorph1-17/+15
maxwell_to_vk: Reorder filtering modes to start with None, then Nearest, then Linear. maxwell_to_vk: Logs filter modes under UNREACHABLE_MSG instead of UNIMPLEMENTED_MSG, since any unknown filter modes are invalid and not unimplemented. maxwell_to_vk: Return VK_SAMPLER_MIPMAP_MODE_NEAREST instead of VK_SAMPLER_MIPMAP_MODE_LINEAR when mipmap_filter is None with the description from the VkSamplerCreateInfo(3) man page.
2020-06-18maxwell_to_gl: Miscellaneous changesMorph1-48/+34
maxwell_to_gl: Log unimplemented features under UNIMPLEMENTED_MSG instead of LOG_ERROR to bring into parity with maxwell_to_vk maxwell_to_gl: Deduplicate logging in VertexType(), merging them into one. maxwell_to_gl: Return GL_NEAREST instead of GL_LINEAR if an unknown texture filter mode is encountered. maxwell_to_gl: Log the mipmap filter mode if an unknown value is passed in. maxwell_to_gl: Reorder filtering modes to start with None, then Nearest, then Linear.
2020-06-17macro_jit_x64: Inline Engines::Maxwell3D::GetRegisterValueMerryMage2-6/+18
2020-06-17macro_jit_x64: Optimization implicitly assumes same destinationMerryMage1-1/+2
2020-06-17macro_jit_x64: Should not skip zero registers for certain ALU opsMerryMage1-1/+3
The code generated for these ALU ops assume src_a and src_b are always valid.
2020-06-16gl_device: Reserve at least 4 image bindings for fragment stageMorph1-6/+14
Due to the limitation of GL_MAX_IMAGE_UNITS being low (8) on Intel's and Nvidia's proprietary drivers, we have to reserve an appropriate amount of image bindings for each of the stages. So far games have been observed to use 4 image bindings on the fragment stage (Kirby Star Allies) and 1 on the vertex stage (TWD series). No games thus far in my limited testing used more than 4 images concurrently and across all currently active programs. This fixes shader compilation errors on Kirby Star Allies on OpenGL (GLSL/GLASM)
2020-06-15macro_jit_x64: Remove NEXT_PARAMETERMerryMage1-5/+2
Not required, as PARAMETERS can just be incremented directly.
2020-06-15macro_jit_x64: Remove unused function Compile_WriteCarryMerryMage2-9/+0
2020-06-15macro_jit_x64: Select better registersMerryMage1-8/+8
All registers are now callee-save registers. RBX and RBP selected for STATE and RESULT because these are most commonly accessed; this is to avoid the REX prefix. RBP not used for STATE because there are some SIB restrictions, RBX emits smaller code.
2020-06-15macro_jit_x64: Remove REGISTERSMerryMage1-7/+3
Unnecessary since this is just an offset from STATE.
2020-06-15macro_jit_x64: Remove JITState::parametersMerryMage2-6/+3
This can be passed in as an argument instead.
2020-06-15macro_jit_x64: Remove METHOD_ADDRESS_64MerryMage1-2/+1
Unnecessary variable.
2020-06-15macro_jit_x64: Remove RESULT_64MerryMage2-16/+3
This Reg64 codepath has the exact same behaviour as the Reg32 one.
2020-06-15xbyak_abi: Remove *GPS variants of stack manipulation functionsMerryMage1-6/+6
2020-06-15video_core/macro_jit_x64: Remove initializer in member variableReinUsesLisp1-2/+2
Fix build time issues on gcc. Confirmed through asan that avoiding this initialization is safe.
2020-06-12gl_arb_decompiler: Implement FSwizzleAddReinUsesLisp1-4/+27
2020-06-12gl_arb_decompiler: Implement an assembly shader decompilerReinUsesLisp6-1/+2091
Emit code compatible with NV_gpu_program5. This should emit code compatible with Fermi, but it wasn't tested on that architecture. Pascal has some issues not present on Turing GPUs.
2020-06-09buffer_cache: Avoid passing references of shared pointers and misc style changesReinUsesLisp9-174/+150
Instead of using a shared pointer as the template argument, use the underlying type and manage shared pointers explicitly. This can make removing shared pointers from the cache easier. While we are at it, make some misc style changes and general improvements (like insert_or_assign instead of operator[] + operator=).
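As a small illustration of the insert_or_assign point: operator[] has to default-construct the mapped value before assigning over it, while insert_or_assign constructs or overwrites in a single step and reports which one happened. The `Buffer` type and `Register` function are hypothetical, not the buffer cache's actual code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>

// Hypothetical mapped type without a default constructor.
struct Buffer {
    explicit Buffer(std::size_t size_) : size{size_} {}
    std::size_t size;
};

using BufferMap = std::map<std::uint64_t, Buffer>;

// Returns true if a new entry was inserted, false if an existing one was overwritten.
bool Register(BufferMap& map, std::uint64_t address, std::size_t size) {
    // "map[address] = Buffer{size};" would not compile here, because
    // operator[] requires Buffer to be default-constructible.
    return map.insert_or_assign(address, Buffer{size}).second;
}
```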
2020-06-09gl_rasterizer: Mark vertex buffers as dirty after buffer cache invalidationReinUsesLisp1-1/+10
Vertex buffers bindings become invalid after the stream buffer is invalidated. We were originally doing this, but it got lost at some point. - Fixes Animal Crossing: New Horizons, but it affects everything.
2020-06-09buffer_cache: Return stream buffer invalidation in Map instead of UnmapReinUsesLisp1-7/+9
We have to invalidate whatever cache is being used before uploading the data, hence it makes more sense to return this on Map instead of Unmap.
2020-06-08texture_cache: Port original code management for 2D vs 3D texturesReinUsesLisp2-16/+35
Handle blits to images as 2D, even when they have block depth. - Fixes rendering issues on Luigi's Mansion 3
2020-06-08texture_cache: Simplify blit codeReinUsesLisp1-9/+7
2020-06-08texture_cache: Handle 3D texture blits with one layerReinUsesLisp3-5/+10
2020-06-08texture_cache: Implement rendering to 3D texturesReinUsesLisp10-139/+191
This allows rendering to 3D textures with more than one slice. Applications are allowed to render to more than one slice of a texture using gl_Layer from a VTG shader. This also requires reworking how 3D texture collisions are handled, for now, this commit allows rendering to slices but not to miplevels. When a render target attempts to write to a mipmap, we fall back to the previous implementation (copying or flushing as needed). - Fixes color correction 3D textures on UE4 games (rainbow effects). - Allows Xenoblade games to render to 3D textures directly.
2020-06-07rasterizer_cache: Remove files and includesReinUsesLisp7-269/+3
The rasterizer cache is no longer used. Each cache has its own generic implementation optimized for the cached data.
2020-06-07vk_pipeline_cache: Use generic shader cacheReinUsesLisp5-58/+55
Trivially port the generic shader cache to Vulkan.
2020-06-07gl_shader_cache: Use generic shader cacheReinUsesLisp4-93/+80
Trivially port the generic shader cache to OpenGL.
2020-06-07shader_cache: Implement a generic shader cacheReinUsesLisp2-0/+229
Implement a generic shader cache for fast lookups and invalidations. Invalidations are cheap unless a shader actually has to be invalidated, in which case they are expensive. Use two mutexes instead of one to avoid locking invalidations for lookups and vice versa. When a shader has to be removed, lookups are locked as expected.
2020-06-06gl_device: Black list NVIDIA 443.24 for fast buffer uploadsReinUsesLisp1-2/+10
Skip fast buffer uploads on Nvidia 443.24 Vulkan beta driver on OpenGL. This driver throws the following error when calling BufferSubData or BufferData on buffers that are candidates for fast constant buffer uploads. This is the equivalent of push constants on Vulkan, except that they can access the full buffer. The error: Unknown internal debug message. The NVIDIA OpenGL driver has encountered an out of memory error. This application might behave inconsistently and fail. If this error persists on future drivers, we might have to look deeper into this issue. For now, we can black list it and log it as a temporary solution.
2020-06-06renderer_opengl: Only enable DEBUG_OUTPUT when graphics debugging is enabledReinUsesLisp1-4/+2
Avoids logging when it's not relevant. This can potentially reduce driver's internal thread overhead.
2020-06-05shader/texture: Join separate image and sampler pairs offlineReinUsesLisp16-88/+234
Games using D3D idioms can join images and samplers when a shader executes, instead of baking them into a combined sampler image. This is also possible on Vulkan. One approach to this solution would be to use separate samplers on Vulkan and leave this unimplemented on OpenGL, but we can't do this because there's no consistent way of determining which constant buffer holds a sampler and which one an image. We could in theory find the first bit and if it's in the TIC area, it's an image; but this falls apart when an image or sampler handle uses an index of zero. The used approach is to track for a LOP.OR operation (this is done at an IR level, not at an ISA level), track again the constant buffers used as source and store this pair. Then, outside of shader execution, join the sampler and image pair with a bitwise or operation. This approach won't work on games that truly use separate samplers in a meaningful way. For example, pooling textures in a 2D array and determining at runtime what sampler to use. This invalidates OpenGL's disk shader cache :) - Used mostly by D3D ports to Switch
2020-06-05shader/track: Move bindless tracking to a separate functionReinUsesLisp2-25/+39
2020-06-04Default init labels and use initializer list for macro engineDavid Marcec2-2/+2
2020-06-04gl_rasterizer: Use NV_transform_feedback for XFB on assembly shadersReinUsesLisp3-1/+96
NV_transform_feedback, NV_transform_feedback2 and ARB_transform_feedback3 with NV_transform_feedback interactions allow implementing transform feedbacks as dynamic state. Maxwell implements transform feedbacks as dynamic state, so using these extensions with TransformFeedbackStreamAttribsNV allows us to properly emulate transform feedbacks without having to recompile shaders when the state changes.
2020-06-03Mark parameters as constDavid Marcec8-11/+11
2020-06-02Pass by reference instead of copying parametersDavid Marcec4-7/+9
2020-06-02vk_shader_decompiler: Implement atomic image operationsReinUsesLisp1-40/+24
Implement atomic operations on images. On GLSL these are imageAtomic* functions (e.g. imageAtomicAdd).
2020-06-02vk_rasterizer: Implement storage texelsReinUsesLisp8-52/+120
This is the equivalent of an image buffer on OpenGL. - Used by Octopath Traveler
2020-06-02maxwell_to_vk: Add R16UI image formatReinUsesLisp2-71/+74
- Used by Octopath Traveler
2020-06-01gl_shader_decompiler: Declare gl_Layer and gl_ViewportIndex within gl_PerVertex for vertex and tessellation shadersMorph1-6/+16
2020-06-01gl_shader_decompiler: Fix geometry shader outputs for Intel driversMorph1-13/+15
On Intel's proprietary drivers, gl_Layer and gl_ViewportIndex are not allowed members of gl_PerVertex block, causing the shader to fail to compile. Fix this by declaring these variables outside of gl_PerVertex.
2020-06-01gl_device: Avoid devices with CAVEAT_SUPPORT on ASTCReinUsesLisp2-8/+19
This avoids using Nvidia's ASTC decoder on OpenGL. The last time it was profiled, it was slower than yuzu's decoder. While we are at it, fix a bug in the texture cache when native ASTC is not supported.
2020-06-01glsl: Squash constant buffers into a single SSBO when we hit the limitReinUsesLisp7-79/+173
Avoids compilation errors at the cost of shader build times and runtime performance when a game hits the limit of uniform buffers we can use.
2020-05-31gl_device: Enable compute shaders for Intel proprietary driversMorph3-13/+0
Previously we were disabling compute shaders on Intel's proprietary driver due to broken compute. This has been fixed in the latest Intel drivers. Re-enable compute for Intel proprietary drivers and remove the check for broken compute.
2020-05-30shader/other: Fix hardcoded value in S2R INVOCATION_INFOReinUsesLisp1-1/+1
Geometry shaders built from Nvidia's compiler check for bits[16:23] to be less than or equal to 0 with VSETP to default to a "safe" value of 0x8000'0000 (safe from hardware's perspective). To avoid hitting this path in the shader, return 0x00ff'0000 from S2R INVOCATION_INFO. This seems to be the maximum number of vertices a geometry shader can emit in a primitive.
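The check described above can be modelled on the CPU. This is a sketch under the commit's own description; `MaxVertexField` and `TakesFallbackPath` are hypothetical names, and the real test is performed by a VSETP instruction inside the guest shader.

```cpp
#include <cstdint>

// Value the commit returns from S2R INVOCATION_INFO.
constexpr std::uint32_t kInvocationInfo = 0x00ff'0000;

// Hypothetical helper: extract bits [16:23], the field the shader tests.
constexpr std::uint32_t MaxVertexField(std::uint32_t info) {
    return (info >> 16) & 0xffu;
}

// Hypothetical model of the check: the shader defaults to the "safe" value
// when the field is less than or equal to zero.
constexpr bool TakesFallbackPath(std::uint32_t info) {
    return MaxVertexField(info) == 0;
}
```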
2020-05-30texture_cache: More relaxed reconstructionReinUsesLisp1-13/+9
Only reupload textures when they've not been modified from the GPU.
2020-05-30Favor switch case over jump tableDavid Marcec2-18/+26
Easier to read and will emit a jump table automatically.
2020-05-30Implement macro JITDavid Marcec9-189/+1010
2020-05-30Add xbyak externalDavid Marcec1-1/+1
2020-05-30texture_cache: Only copy textures that were modified from hostReinUsesLisp1-2/+6
2020-05-30texture_cache: Reload textures when number of resources mismatchReinUsesLisp1-0/+9
2020-05-29vk_rasterizer: Skip transform feedbacks when extension is unavailableReinUsesLisp1-0/+7
Avoids calling transform feedback procedures when VK_EXT_transform_feedback is not available.
2020-05-29texture_cache: Handle overlaps with multiple subresourcesReinUsesLisp1-27/+33
Implement more surface reconstruct cases. Allow overlaps with more than one layer and mipmap and copy all of them to the new texture. - Fixes textures moving around objects on Xenoblade games
2020-05-28maxwell_3d: Reduce severity of logs that can be spammedReinUsesLisp1-6/+7
These logs were killing performance on some games when they were spammed. Reduce them to Debug severity.
2020-05-28format_lookup_table: Implement G24S8 format as S8Z24ReinUsesLisp1-1/+2
2020-05-28buffer_cache: Avoid copying twice on certain casesReinUsesLisp1-17/+23
Avoid copying to a staging buffer on non-granular memory addresses. Add a callable argument to StreamBufferUpload to be able to copy to the staging buffer directly from ReadBlockUnsafe.
2020-05-27texture_cache: Use unordered_map::find instead of operator[] on hot codeReinUsesLisp1-15/+19
2020-05-27texture_cache: Use small vector for surface vectorsReinUsesLisp1-9/+10
This avoids most heap allocations when collecting surfaces into a vector.
2020-05-27maxwell_3d: Initialize line widthsReinUsesLisp1-0/+2
Initialize line widths to avoid setting a line width of zero.
2020-05-27maxwell_3d: Initialize polygon modesReinUsesLisp1-0/+2
NVN expects this to be initialized as Fill, otherwise games that never bind a rasterizer state will log an invalid polygon mode.
2020-05-27shader/other: Implement MEMBAR.CTSReinUsesLisp4-9/+27
This silences an assertion we were hitting and uses workgroup memory barriers when the game requests it.
2020-05-26texture_cache: Fix layered null surfacesReinUsesLisp1-1/+3
Null texture cubes were not considered arrays, causing issues on Vulkan and OpenGL when creating views.
2020-05-26gl_texture_cache: Implement small texture view cache for swizzlesReinUsesLisp3-37/+44
This fixes cases where the texture swizzle was applied twice on the same draw to a texture bound to two different slots.
2020-05-26texture_cache: Implement depth stencil texture swizzlesReinUsesLisp3-36/+42
Stop ignoring image swizzles on depth and stencil images. This doesn't fix a known issue on Xenoblade Chronicles 2 where an OpenGL texture changes swizzles twice before being used. A proper fix would be having a small texture view cache for this like we do on Vulkan.
2020-05-26gl_rasterizer: Port front face flip check from VulkanReinUsesLisp1-5/+20
While Vulkan was assuming we had no negative viewports, OpenGL code was assuming we had them. Port the old code from Vulkan to OpenGL, checking if the first viewport is negative before flipping faces. This is not a complete implementation since we only check for the first viewport to be negative. That said, unless a game is using Vulkan, OpenGL and NVN games should be fine here, and we can always compare with our Vulkan backend to see if there's a difference.
2020-05-26fixed_pipeline_state: Remove unnecessary check for front faces flipReinUsesLisp1-2/+1
The check to flip faces when viewports are negative was a leftover from the old OpenGL code. This is not required on Vulkan, where we have negative viewports.
2020-05-26gl_shader_manager: Unbind GLSL program when binding a host pipelineReinUsesLisp1-0/+4
Fixes regression in Link's Awakening caused by 420cc13248350ef5c2d19e0b961cb4185cd16a8a
2020-05-22shader/other: Implement BAR.SYNC 0x0ReinUsesLisp4-0/+33
Trivially implement this particular case of BAR. Unless games use OpenCL or CUDA barriers, we shouldn't hit any other case here.
2020-05-22shader/memory: Implement non-addition operations in REDReinUsesLisp1-2/+1
Trivially implement these instructions. They are used in Astral Chain.
2020-05-22shader/other: Implement thread comparisons (NV_shader_thread_group)ReinUsesLisp4-0/+72
Hardware S2R special registers match gl_Thread*MaskNV. We can trivially implement these using Nvidia's extension on OpenGL or naively stubbing them with the ARB instructions to match. This might cause issues if the host device warp size doesn't match Nvidia's. That said, this is unlikely on proper shaders. Refer to the attached url for more documentation about these flags. https://www.khronos.org/registry/OpenGL/extensions/NV/NV_shader_thread_group.txt
2020-05-22shader_decompiler: Visit source nodes even when they assign to RZReinUsesLisp2-2/+6
Some operations like atomicMin were ignored because their results were being stored to RZ. These operations have a side effect and it was being ignored.
2020-05-22vk_shader_decompiler: Don't assert for void returnsReinUsesLisp1-2/+1
Atomic instructions can be used without returning anything and this is valid code. Remove the assert.
2020-05-21buffer_cache: Remove unused boost headersReinUsesLisp1-2/+0
2020-05-21map_interval: Add interval allocator and drop hackReinUsesLisp4-3/+79
Drop the std::list hack to allocate memory indefinitely. Instead use a custom allocator that keeps references valid until destruction. This allocates fixed chunks of memory and puts pointers in a free list. When an allocation is no longer used, put it back in the free list; this doesn't heap allocate because std::vector doesn't change the capacity. If the free list is empty, allocate a new chunk.
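A minimal sketch of a chunked allocator with a free list, written from the description above and not taken from the commit. `ChunkPool` and its members are hypothetical names, and the element type is assumed to be default constructible.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Objects live in fixed-size chunks that are never moved or freed early, so
// pointers handed out stay valid until the pool itself is destroyed.
template <typename T, std::size_t ChunkSize = 64>
class ChunkPool {
public:
    T* Allocate() {
        if (free_list.empty()) {
            AddChunk();
        }
        T* const slot = free_list.back();
        free_list.pop_back();
        return slot;
    }

    // Return a slot to the free list. AddChunk reserved capacity for every
    // slot, so this push_back does not heap allocate.
    void Release(T* slot) {
        free_list.push_back(slot);
    }

    std::size_t NumChunks() const {
        return chunks.size();
    }

private:
    void AddChunk() {
        chunks.push_back(std::make_unique<T[]>(ChunkSize));
        free_list.reserve(chunks.size() * ChunkSize);
        T* const base = chunks.back().get();
        for (std::size_t i = 0; i < ChunkSize; ++i) {
            free_list.push_back(base + i);
        }
    }

    std::vector<std::unique_ptr<T[]>> chunks;
    std::vector<T*> free_list;
};
```

The only heap allocations happen in `AddChunk`, so a warmed-up pool recycles slots without touching the allocator.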
2020-05-21buffer_cache: Use boost::container::small_vector for maps in rangeReinUsesLisp1-13/+15
Most overlaps in the buffer cache only contain one mapped address. We can avoid close to all heap allocations once the buffer cache is warmed up by using a small_vector with a stack size of one.
2020-05-21buffer_cache: Use boost::intrusive::set for cachingReinUsesLisp6-30/+48
Instead of using boost::icl::interval_map for caching, use boost::intrusive::set. interval_map is intended as a container where the keys can overlap with one another; we don't need this for caching buffers and a std::set-like data structure that allows us to search with lower_bound is enough.
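The lookup pattern can be sketched with the standard library. Here `std::map` stands in for `boost::intrusive::set`, and `RangeMap` and `FindContaining` are hypothetical names. The sketch assumes the cached ranges never overlap, as the message states.

```cpp
#include <cstdint>
#include <map>
#include <optional>

// Keys are start addresses, values are sizes of non-overlapping cached ranges.
using RangeMap = std::map<std::uint64_t, std::uint64_t>;

// Hypothetical helper: return the start address of the range containing addr.
std::optional<std::uint64_t> FindContaining(const RangeMap& ranges, std::uint64_t addr) {
    // lower_bound yields the first range that starts at or after addr.
    auto it = ranges.lower_bound(addr);
    if (it != ranges.end() && it->first == addr) {
        return it->first;
    }
    if (it == ranges.begin()) {
        return std::nullopt; // every range starts after addr
    }
    --it; // the only other candidate is the range starting just before addr
    if (addr < it->first + it->second) {
        return it->first;
    }
    return std::nullopt;
}
```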
2020-05-21buffer_cache: Remove shared pointersReinUsesLisp2-70/+72
Removing shared pointers is a first step to be able to use intrusive objects and keep allocations close to one another in memory.
2020-05-21buffer_cache: Minor style changesReinUsesLisp2-129/+65
Minor style changes. Mostly done so I avoid editing it while doing other changes.
2020-05-19renderer_opengl: Add assembly program code pathsReinUsesLisp12-109/+339
Add code required to use OpenGL assembly programs based on NV_gpu_program5. Decompilation for ARB programs is intended to be added in a follow up commit. This does **not** include ARB decompilation and it's not in a usable state. The intention behind assembly programs is to reduce shader stutter significantly on drivers supporting NV_gpu_program5 (and other required extensions). Currently only Nvidia's proprietary driver supports these extensions. Add a UI option hidden for now to avoid people enabling this option accidentally. This code path has some limitations that OpenGL compatibility doesn't have: - NV_shader_storage_buffer_object is limited to 16 entries for a single OpenGL context state (I don't know if this is an intended limitation, a specification issue or I am missing something). Currently causes issues on The Legend of Zelda: Link's Awakening. - NV_parameter_buffer_object can't bind buffers using an offset different to zero. The workaround used is to copy to a temporary buffer (this doesn't happen often so it's not an issue). On the other hand, it has the following advantages: - Shaders build a lot faster. - We have control over how floating point rounding is done over individual instructions (SPIR-V on Vulkan can't do this). - Operations on shared memory can be unsigned and signed. - Transform feedbacks are dynamic state (not yet implemented). - Parameter buffers (uniform buffers) are per stage, matching NVN and hardware's behavior. - The API to bind and create assembly programs makes sense, unlike ARB_separate_shader_objects.
2020-05-18maxwell_to_vk: Add format B8G8R8A8_SRGBMorph2-2/+3
Add format B8G8R8A8_SRGB and add Attachable capability for B8G8R8A8_UNORM Used by Bravely Default II
2020-05-18OpenGL: Enable Debug Context and Synchronous debugging when graphics debugging is enabled.Fernando Sahmkow1-0/+3
This commit aims to help easing debugging of driver crashes without having to modify existing code.
2020-05-16DmaPusher: Remove dead code in stepDavid Marcec2-9/+1
2020-05-16vk_rasterizer: Match OpenGL's FlushAndInvalidate behaviorReinUsesLisp1-1/+3
Match OpenGL's behavior. This can fix or simplify bisecting issues on Vulkan.
2020-05-13vk_rasterizer: Implement constant attributesReinUsesLisp4-13/+26
Constant attributes (known in OpenGL as disabled attributes) are not supported on Vulkan, even with extensions. To emulate this behavior we return zero on reads from disabled vertex attributes in shader code. This has no caching cost because attribute formats are not dynamic state on Vulkan and we have to store it in the pipeline cache anyway. - Fixes Animal Crossing: New Horizons terrain borders
2020-05-13vk_rasterizer: Remove buffer check in attribute selectionReinUsesLisp1-4/+0
This was a leftover from OpenGL when disabled buffers were not properly emulated. We no longer have to assert this as it is checked in vertex buffer initialization.
2020-05-10gl_shader_decompiler: Properly emulate NaN behaviour on NEReinUsesLisp1-0/+9
"Not equal" operators on GLSL seem to behave as unordered when we expect an ordered comparison. Manually emulate this checking for LGE values (numbers, not-NaNs).
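The difference between the two comparisons can be shown on the CPU, where C++'s `!=` on floats is also unordered. This is a sketch of the idea only, not the GLSL the decompiler emits; `IsLGE` and `OrderedNotEqual` are hypothetical names.

```cpp
#include <cmath>

// Unordered not-equal: true when the values differ or when either is NaN.
// This is what a plain != gives.
inline bool UnorderedNotEqual(float a, float b) {
    return a != b;
}

// LGE check: both operands are numbers (less, greater or equal is decidable).
inline bool IsLGE(float a, float b) {
    return !std::isnan(a) && !std::isnan(b);
}

// Ordered not-equal: true only when both operands are numbers and they differ.
inline bool OrderedNotEqual(float a, float b) {
    return IsLGE(a, b) && a != b;
}
```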
2020-05-10RasterizerCache: Correct documentation.Fernando Sahmkow1-2/+2
2020-05-10VkPipelineCache: Use a null shader on invalid address.Fernando Sahmkow1-2/+1
2020-05-10VideoCore: Use SyncGuestMemory mechanism for Shader/Pipeline Cache invalidation.Fernando Sahmkow3-5/+61
2020-05-09shader_ir: Separate float-point comparisons in ordered and unorderedReinUsesLisp7-135/+163
This allows us to use native SPIR-V instructions without having to manually check for NAN.
2020-05-05Update src/video_core/gpu.cppbunnei1-1/+1
Co-authored-by: David <25727384+ogniK5377@users.noreply.github.com>
2020-05-05Update src/video_core/gpu.cppbunnei1-1/+1
Co-authored-by: David <25727384+ogniK5377@users.noreply.github.com>
2020-05-05vk_sampler_cache: Use VK_EXT_custom_border_color when availableReinUsesLisp3-2/+44
This should fix grass interactions on Breath of the Wild on Vulkan. It is currently untested against validation layers. Nvidia's Windows 443.09 beta driver or Linux 440.66.12 is required for now.
2020-05-04vk_graphics_pipeline: Implement viewport swizzles with NV_viewport_swizzleReinUsesLisp8-0/+84
2020-05-04gl_rasterizer: Implement viewport swizzles with NV_viewport_swizzleReinUsesLisp2-0/+13
2020-05-04maxwell_3d: Add viewport swizzlesReinUsesLisp2-1/+24
2020-05-02vk_graphics_pipeline: Implement rasterizer_enable on VulkanReinUsesLisp3-1/+3
We can simply enable rasterizer discard matching the current pipeline key.
2020-05-02fixed_pipeline_state: explicitly use template keyword after 1f345ebe3a55Jan Beich1-2/+4
In file included from src/video_core/renderer_opengl/renderer_opengl.cpp:25: In file included from src/./video_core/renderer_opengl/gl_rasterizer.h:26: In file included from src/./video_core/renderer_opengl/gl_fence_manager.h:11: src/./video_core/fence_manager.h:91:32: error: use 'template' keyword to treat 'Write' as a dependent template name memory_manager.Write<u32>(current_fence->GetAddress(), current_fence->GetPayload()); ^ template src/./video_core/fence_manager.h:137:32: error: use 'template' keyword to treat 'Write' as a dependent template name memory_manager.Write<u32>(current_fence->GetAddress(), current_fence->GetPayload()); ^ template
2020-04-30maxwell_to_vk: implement missing signed int formatsDan1-2/+14
2020-04-30texture: Implement R8G8UIMorph8-38/+52
- Used by The Walking Dead: The Final Season
2020-04-29vulkan: Remove unnecessary includesLioncash31-59/+3
Reduces some header churn and reduces rebuilds when some header internals change. While we're at it we can also resolve a missing include in buffer_cache.
2020-04-28shader/arithmetic_integer: Fix tracking issue in temporaryReinUsesLisp1-4/+0
This temporary is not needed as we mark Rd.CC + IADD.X as unimplemented. It caused issues when tracking global buffers.
2020-04-28Clang Format and Documentation.Fernando Sahmkow10-10/+20
2020-04-28MaxwellDMA: Optimize micro copies.Fernando Sahmkow3-0/+57
2020-04-28vk_rasterizer: Skip index buffer setup when vertices are zeroReinUsesLisp1-0/+3
Xenoblade 2 invokes a draw call with zero vertices. This is likely due to indirect drawing (glDrawArraysIndirect). This causes a crash in the staging buffer pool when trying to create a buffer with a size of zero. To work around this, skip index buffer setup entirely when the number of indices is zero.
2020-04-28{maxwell_3d,buffer_cache}: Implement memory barriers using 3D registersReinUsesLisp13-16/+57
Drop MemoryBarrier from the buffer cache and use Maxwell3D's register WaitForIdle. To implement this on OpenGL we just call glMemoryBarrier with the necessary bits. Vulkan lacks this synchronization primitive, so we set an event and immediately wait for it. This is not a pretty solution, but it's what Vulkan can do without submitting the current command buffer to the queue (which ends up being more expensive on the CPU).
2020-04-28VideoCore/GPU: Delegate subchannel engines to the dma pusher.Fernando Sahmkow3-4/+49
2020-04-28VideoCore/Engines: Refactor Engines CallMethod.Fernando Sahmkow13-62/+82
2020-04-28maxwell_3d: Fix depth clamping registerReinUsesLisp5-8/+5
Using deko3d as reference: https://github.com/devkitPro/deko3d/blob/4e47ba0013552e592a86ab7a2510d1e7dadf236a/source/maxwell/gpu_3d_state.cpp#L42 We were using bits 3 and 4 to determine depth clamping, but these are the same both enabled and disabled: state->depthClampEnable ? 0x101A : 0x181D The same happens on Nvidia's OpenGL driver, where they do something like this (default capabilities, GL 4.5 compatibility): (state & DEPTH_CLAMP) != 0 ? 0x201a : 0x281c There's always a difference between the first bits in this register, but bit 11 is consistently disabled on both deko3d/NVN and OpenGL. This commit changes yuzu's behaviour to use bit 11 to determine depth clamping. - Fixes depth issues on Super Mario Odyssey's intro.
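The register values quoted above can be checked directly. This sketch uses only the four constants from the message; `IsDepthClampEnabled` is a hypothetical name and not the function in yuzu.

```cpp
#include <cstdint>

constexpr std::uint32_t kOldMask = 0x18;  // bits 3 and 4, the previous heuristic
constexpr std::uint32_t kBit11 = 1u << 11; // 0x800

// Per the commit message, bit 11 is clear whenever depth clamping is enabled.
constexpr bool IsDepthClampEnabled(std::uint32_t reg) {
    return (reg & kBit11) == 0;
}

// Bits 3 and 4 are identical for the enabled and disabled values, so they
// cannot distinguish the two states.
static_assert((0x101Au & kOldMask) == (0x181Du & kOldMask), "deko3d values");
static_assert((0x201Au & kOldMask) == (0x281Cu & kOldMask), "OpenGL values");
```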
2020-04-27texture_cache: Reintroduce preserve_contents accuratelyReinUsesLisp4-41/+81
This reverts commit 94b0e2e5dae4e0bd0021ac2d8fe1ff904a93ee69. preserve_contents proved to be a meaningful optimization. This commit reintroduces it but properly implemented on OpenGL. We have to make sure the clear removes all the previous contents of the image. It's not currently implemented on Vulkan because we can do smart things there that are better introduced in a separate commit.
2020-04-26shader/memory_util: Deduplicate codeReinUsesLisp9-159/+153
Deduplicate code shared between vk_pipeline_cache and gl_shader_cache as well as shader decoder code. While we are at it, fix a bug in gl_shader_cache where compute shaders had a start offset of a stage shader.
2020-04-26shader/arithmetic_integer: Fix edge case and mark IADD.X Rd.CC as unimplementedReinUsesLisp1-1/+6
IADD.X Rd.CC requires some extra logic that is not currently implemented. Abort when this is hit.
2020-04-26shader/arithmetic_integer: Change IAdd to UAdd to avoid signed overflowReinUsesLisp1-2/+2
Signed integer addition overflow might be undefined behavior. It's free to change operations to UAdd and use unsigned integers to avoid potential bugs.
2020-04-26shader/arithmetic_integer: Implement IADD.XReinUsesLisp2-0/+10
IADD.X takes the carry flag and adds it to the result. This is generally used to emulate 64-bit operations with 32-bit registers.
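A short sketch of the 64-bit addition pattern this enables, modelled in plain C++ and not taken from the shader IR. `U64Pair` and `Add64` are hypothetical names.

```cpp
#include <cstdint>

struct U64Pair {
    std::uint32_t lo;
    std::uint32_t hi;
};

// First step (IADD with CC): add the low words and record the carry.
// Second step (IADD.X): add the high words plus that carry.
constexpr U64Pair Add64(U64Pair a, U64Pair b) {
    const std::uint32_t lo = a.lo + b.lo;             // wraps modulo 2^32
    const std::uint32_t carry = lo < a.lo ? 1u : 0u;  // carry out of the low add
    const std::uint32_t hi = a.hi + b.hi + carry;
    return U64Pair{lo, hi};
}
```

Unsigned arithmetic is used throughout, so the wraparound in the low word is well defined.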
2020-04-26shader/arithmetic_integer: Implement CC for IADDReinUsesLisp4-3/+42
2020-04-26decode/register_set_predicate: Implement CCReinUsesLisp1-9/+14
P2R CC takes the state of condition codes and puts them into a register. We already have this implemented for PR (predicates). This commit implements CC over that.
2020-04-26decode/register_set_predicate: Use move for shared pointersReinUsesLisp1-16/+17
Avoid atomic counters used by shared pointers.
2020-04-25vk_rasterizer: Pack texceptions and color formats on invalid formatsReinUsesLisp2-5/+19
Sometimes for unknown reasons NVN games can bind a render target format of 0. This may be a yuzu bug. With the commits before this the formats were specified without being "packed", assuming all formats and texceptions will be written like in the color_attachments vector. To address this issue, iterate all render targets and pack them as they are valid. This way they will match color_attachments. - Fixes validation errors and graphical issues on Breath of the Wild.
2020-04-24Revert: shader_decode: Fix LD, LDG when track constant buffer.Fernando Sahmkow1-14/+6
2020-04-24Fix -Wdeprecated-copy warning.Markus Wick1-0/+1
2020-04-24Fix -Werror=conversion error.Markus Wick1-1/+1
2020-04-23decode/arithmetic_half: Fix HADD2 and HMUL2 absolute and negation bitsReinUsesLisp2-16/+37
The encoding for negation and absolute value was wrong. Extracting is now done manually. Similar instructions having different encodings is the rule, not the exception. To keep sanity and readability I preferred to extract the desired bit manually. This is implemented against nxas: https://github.com/ReinUsesLisp/nxas/blob/8dbc38995711cc12206aa370145a3a02665fd989/table.h#L68 That is itself tested against nvdisasm (Nvidia's official disassembler).
2020-04-23shader/texture: Support multiple unknown sampler propertiesReinUsesLisp2-62/+87
This allows deducing some properties from the texture instruction before asking the runtime. By doing this we can handle type mismatches in some instructions from the renderer instead of the shader decoder. Fixes texelFetch issues with games using 2D texture instructions on a 1D sampler.
2020-04-23shader_ir: Turn classes into data structuresReinUsesLisp13-299/+197
2020-04-23vk_rasterizer: Fix framebuffer creation validation errorsReinUsesLisp1-2/+4
Framebuffer creation was ignoring the number of color attachments.
2020-04-23vk_pipeline_cache: Unify pipeline cache keys into a single operationReinUsesLisp5-47/+59
This allows us to call Common::CityHash and std::memcmp only once for GraphicsPipelineCacheKey. While we are at it, do the same for compute.
2020-04-23vk_renderpass_cache: Pack renderpass cache key to 12 bytesReinUsesLisp4-84/+59
2020-04-23kernel: memory: Improve implementation of device shared memory. (#3707)bunnei1-13/+5
* kernel: memory: Improve implementation of device shared memory. * fixup! kernel: memory: Improve implementation of device shared memory. * fixup! kernel: memory: Improve implementation of device shared memory.
2020-04-23Clang Format.Fernando Sahmkow5-10/+17
2020-04-23GPU: Add Fast GPU Time Option.Fernando Sahmkow1-1/+5
2020-04-23Maxwell3D: Process Macros on MultiMethod.Fernando Sahmkow1-25/+47
2020-04-23DMAPusher: Propagate multimethod writes into the engines.Fernando Sahmkow14-14/+164
2020-04-23vk_pipeline_cache: Fix unintentional memcpy into optionalReinUsesLisp1-2/+4
The intention behind this was to assign a float from a uint32_t, but it was unintentionally being copied directly into the std::optional. Copy to a temporary and assign that temporary to std::optional. This can be replaced with std::bit_cast<float> once we are in C++20.
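A minimal sketch of the corrected pattern, not the code from the commit. `MakeOptionalFloat` is a hypothetical name.

```cpp
#include <cstdint>
#include <cstring>
#include <optional>

// Reinterpret raw register bits as a float and store them in an optional.
std::optional<float> MakeOptionalFloat(std::uint32_t raw) {
    static_assert(sizeof(float) == sizeof(std::uint32_t), "size mismatch");
    float value;
    // Copy into a plain float. Copying into the std::optional object itself
    // would overwrite its storage without marking it as engaged.
    std::memcpy(&value, &raw, sizeof(value));
    std::optional<float> result;
    result = value; // regular assignment engages the optional
    return result;
}
```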
2020-04-23GL_Fence_Manager: use GL_TIMEOUT_IGNORED instead of a loop,Fernando Sahmkow1-2/+1
2020-04-22Address Feedback.Fernando Sahmkow3-24/+18
2020-04-22Async GPU: Correct flushing behavior to be similar to old async GPU behavior.Fernando Sahmkow3-0/+11
2020-04-22MaxwellDMA: Correct copying on accuracy level.Fernando Sahmkow1-2/+7
2020-04-22ShaderCache/PipelineCache: Cache null shaders.Fernando Sahmkow4-8/+31
2020-04-22Address Feedback.Fernando Sahmkow13-132/+117
2020-04-22Fix GCC error.Fernando Sahmkow2-6/+5
2020-04-22QueryCache: Only do async flushes on async gpu.Fernando Sahmkow1-1/+4
2020-04-22Async GPU: Only do reactive flushing on Extreme Level.Fernando Sahmkow1-1/+1
2020-04-22vk_fence_manager: Initial implementationReinUsesLisp8-12/+222
2020-04-22QueryCache: Implement Async Flushes.Fernando Sahmkow5-12/+77
2020-04-22OpenGL: Guarantee writes to Buffers.Fernando Sahmkow3-4/+2
2020-04-22GPU: Implement Flush Requests for Async mode.Fernando Sahmkow6-8/+70
2020-04-22FenceManager: Manage syncpoints and rename fences to semaphores.Fernando Sahmkow11-25/+123
2020-04-22BufferCache: Refactor async managing.Fernando Sahmkow2-10/+27
2020-04-22FenceManager: Implement async buffer cache flushes on High settingsFernando Sahmkow6-10/+69
2020-04-22Rasterizer: Document SignalFence & ReleaseFences and setup skeletons on Vulkan.Fernando Sahmkow5-4/+35
2020-04-22GPU: Fix rebase errors.Fernando Sahmkow2-4/+4
2020-04-22Rasterizer: Disable fence managing in synchronous gpu.Fernando Sahmkow2-1/+11
2020-04-22ThreadManager: Sync async reads on accurate gpu.Fernando Sahmkow9-8/+48
2020-04-22FenceManager: Implement should wait.Fernando Sahmkow2-2/+17
2020-04-22GPU: Implement a Fence Manager.Fernando Sahmkow6-23/+208
2020-04-22OpenGL: Implement Fencing backend.Fernando Sahmkow12-19/+94
2020-04-22TextureCache: Flush linear textures after finishing rendering.Fernando Sahmkow1-2/+8
2020-04-22GPU: Delay Fences.Fernando Sahmkow6-2/+20
2020-04-22BufferCache: Implement OnCPUWrite and SyncGuestHostFernando Sahmkow6-7/+67
2020-04-22GPU: Refactor synchronization on Async GPUFernando Sahmkow11-7/+56
2020-04-22Texture Cache: Implement OnCPUWrite and SyncGuestHostFernando Sahmkow2-3/+63
2020-04-22UI: Replace accurate GPU option for GPU Accuracy LevelFernando Sahmkow3-6/+6
2020-04-22vk_memory_manager: Remove unified memory model flagReinUsesLisp5-35/+6
All drivers (even Intel) seem to have a device local memory type that is not host visible. Remove this flag so all devices follow the same path. This fixes a crash when trying to map to host device local memory on integrated devices.
2020-04-22vk_rasterizer: Add lazy default buffer maker and use it for empty buffersReinUsesLisp3-4/+40
Introduce a default buffer getter that lazily constructs an empty buffer. This is intended to match OpenGL's buffer 0. Use this for disabled vertex and uniform buffers. While we are at it, include vertex buffer usages for staging buffers to silence validation errors.
2020-04-22gl_rasterizer: Fix buffers without sizeReinUsesLisp3-8/+13
On NVN buffers can be enabled but have no size. According to deko3d and the behavior we see in Animal Crossing: New Horizons these buffers get the special address of 0x1000 and limit themselves to 0xfff. Implement buffers without a size by binding a null buffer to OpenGL without a size. https://github.com/devkitPro/deko3d/blob/1d1930beea093b5a663419e93b0649719a3ca5da/source/maxwell/gpu_3d_vbo.cpp#L62-L63
2020-04-21shader/arithmetic_integer: Fix LEA_IMM encodingReinUsesLisp1-2/+2
The operand order in LEA_IMM was flipped compared to nvdisasm. Fix that using nxas as reference: https://github.com/ReinUsesLisp/nxas/blob/8dbc38995711cc12206aa370145a3a02665fd989/table.h#L122
2020-04-20Initialize quad_indexed_pass before uint8_passAmit Prakash Ambasta1-1/+1
Fixes Werror=reorder in gcc
2020-04-19dma_pusher: Remove reliance on the global system instanceLioncash3-6/+11
With this, the video core now has no calls to the global system instance at all.
2020-04-19renderer_vulkan: assume X11 if not Windows/macOS after bf1d66b7c074Jan Beich1-3/+3
Render.Vulkan <Error> video_core/renderer_vulkan/renderer_vulkan.cpp:CreateInstance:131: Presentation not supported on this platform Render.Vulkan <Error> video_core/renderer_vulkan/renderer_vulkan.cpp:CreateSurface:378: Presentation not supported on this platform Core <Critical> core/core.cpp:Load:199: Failed to initialize system (Error 5)!
2020-04-19vulkan/wrapper: Sort physical devicesReinUsesLisp1-1/+20
Sort discrete GPUs over the rest, Nvidia over AMD, AMD over Intel, Intel over the rest. This gives us a somewhat consistent order when Optimus is removed (renderdoc does this when it's attached). This can break the configuration of users with an Intel GPU who manually remove Optimus on yuzu. That said, it's very unlikely to happen.
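One way to express this ordering is a single stable sort with a two-level comparison. This is a sketch, not the commit's code: `DeviceInfo`, `VendorRank` and `SortDevices` are hypothetical names, and treating the discrete check as the higher priority key is an assumption. The PCI vendor IDs are the standard ones for Nvidia, AMD and Intel.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

struct DeviceInfo {
    std::uint32_t vendor_id;
    bool is_discrete;
};

// Lower rank sorts first: Nvidia, then AMD, then Intel, then everything else.
inline int VendorRank(std::uint32_t vendor_id) {
    switch (vendor_id) {
    case 0x10DE: // Nvidia
        return 0;
    case 0x1002: // AMD
        return 1;
    case 0x8086: // Intel
        return 2;
    default:
        return 3;
    }
}

inline void SortDevices(std::vector<DeviceInfo>& devices) {
    // stable_sort keeps the driver's original order for equal devices.
    std::stable_sort(devices.begin(), devices.end(),
                     [](const DeviceInfo& a, const DeviceInfo& b) {
                         if (a.is_discrete != b.is_discrete) {
                             return a.is_discrete; // discrete GPUs first (assumed priority)
                         }
                         return VendorRank(a.vendor_id) < VendorRank(b.vendor_id);
                     });
}
```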
2020-04-19fixed_pipeline_state: Hash and compare the whole structureReinUsesLisp2-105/+9
Pad FixedPipelineState's size to 384 bytes to be a multiple of 16. Compare the whole struct with std::memcmp and hash with CityHash. Using CityHash instead of a naive hash should reduce the number of collisions. Improve used type traits to ensure this operation is safe. With these changes the improvements to the hashable pipeline state are: Optimized structure: hash 89 ns, comparison 103 ns, construction* 164 ns, size 384 bytes. Original structure: hash 148 ns, comparison 174 ns, construction* 281 ns, size 1384 bytes. (* Attribute state initialization is not measured.) These measurements are averages taken with std::chrono::high_resolution_clock on MSVC shipped with Visual Studio 16.6.0 Preview 2.1.
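The whole-struct approach can be sketched as follows. `PackedKey` is a hypothetical stand-in for the real state, and FNV-1a replaces CityHash here only to keep the example self-contained. The type traits are what make byte-wise comparison safe: with no padding bytes, equal values always have equal bytes.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical packed key with no padding bytes.
struct PackedKey {
    std::uint32_t a;
    std::uint16_t b;
    std::uint16_t c;

    // Compare the whole object in a single memcmp call.
    bool operator==(const PackedKey& rhs) const noexcept {
        return std::memcmp(this, &rhs, sizeof(*this)) == 0;
    }

    // Hash every byte of the object (FNV-1a, standing in for CityHash).
    std::uint64_t Hash() const noexcept {
        unsigned char bytes[sizeof(PackedKey)];
        std::memcpy(bytes, this, sizeof(bytes));
        std::uint64_t hash = 14695981039346656037ull;
        for (const unsigned char byte : bytes) {
            hash ^= byte;
            hash *= 1099511628211ull;
        }
        return hash;
    }
};

static_assert(std::is_trivially_copyable_v<PackedKey>, "needed for memcpy and memcmp");
static_assert(std::has_unique_object_representations_v<PackedKey>,
              "padding would make byte-wise comparison unreliable");
```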
2020-04-19fixed_pipeline_state: Pack blending stateReinUsesLisp3-98/+227
Reduce FixedPipelineState's size to 364 bytes.
2020-04-19fixed_pipeline_state: Pack rasterizer stateReinUsesLisp4-163/+155
Reduce FixedPipelineState's size to 600 bytes.
2020-04-19fixed_pipeline_state: Pack depth stencil stateReinUsesLisp3-97/+140
Reduce FixedPipelineState's size to 632 bytes.
2020-04-19fixed_pipeline_state: Pack attribute stateReinUsesLisp6-101/+85
Reduce FixedPipelineState's size from 1384 to 664 bytes
2020-04-18video_core: gl_shader_decompiler: Fix implicit fallthrough errors.bunnei1-0/+1
2020-04-18gl_shader_decompiler: Avoid copies where applicableLioncash1-3/+3
Avoids unnecessary reference count increments where applicable and also avoids reallocating a vector. Unlikely to make a huge difference, but given how trivial of an amendment it is, why not?
2020-04-17video_code: Fix implicit switch fallthrough.Markus Wick1-0/+2
Since yesterday, this breaks the build on linux. So let's fix it.
2020-04-17vk_stream_buffer: Fix out of memory on boot on recent Nvidia driversReinUsesLisp2-33/+48
Nvidia recently introduced a new memory type for data streaming (awesome!), but yuzu was assuming that all heaps had enough memory for the assumed stream buffer size (256 MiB). This worked fine on AMD but Nvidia's new memory heap was smaller than 256 MiB. This commit changes this assumption and allocates a bit less than the size of the preferred heap, with a maximum of 256 MiB (to avoid allocating all system memory on integrated devices). - Fixes a crash on NVIDIA 450.82.0.0
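The sizing rule can be sketched like this. The 256 MiB cap comes from the message; the 9/10 fraction used for "a bit less than the heap" is an assumption, and `StreamBufferSize` is a hypothetical name.

```cpp
#include <algorithm>
#include <cstdint>

constexpr std::uint64_t kMaxStreamBufferSize = 256ull * 1024 * 1024; // 256 MiB

// Take slightly less than the preferred heap (9/10 is an assumed fraction),
// but never more than the 256 MiB cap.
constexpr std::uint64_t StreamBufferSize(std::uint64_t heap_size) {
    return std::min(heap_size / 10 * 9, kMaxStreamBufferSize);
}
```

With this rule a small streaming heap yields a buffer that fits inside it, while a large heap on an integrated device is still limited to 256 MiB.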
2020-04-17Revert "gl_shader_cache: Use CompileDepth::FullDecompile on GLSL"Rodrigo Locatti1-3/+1
2020-04-17video_core: memory_manager: Updates for Common::PageTable changes.bunnei2-67/+34
2020-04-17core: memory: Move to Core::Memory namespace.bunnei2-8/+8
- helpful to disambiguate Kernel::Memory namespace.
2020-04-17General: Resolve warnings related to missing declarationsLioncash4-6/+8
2020-04-17decode/memory: Resolve unused variable warningLioncash1-1/+1
Only the first element of the returned pair is ever used.
2020-04-17decode/texture: Resolve unused variable warnings.Lioncash1-5/+7
Some variables aren't used, so we can remove these. Unfortunately, diagnostics are still reported on structured bindings even when annotated with [[maybe_unused]], so we need to unpack the elements that we want to use manually.
2020-04-17decode/texture: Collapse loop down into std::generateLioncash1-3/+1
Same behavior, less code.
2020-04-17decode/texture: Eliminate trivial missing field initializer warningsLioncash1-3/+4
We can just specify the initializers.
2020-04-17maxwell_3d: Initialize format attributes constant as oneReinUsesLisp1-0/+4
nouveau expects this to be true but it doesn't set it.
2020-04-17vk_compute_pass: Implement indexed quadsReinUsesLisp5-12/+280
Implement indexed quads (GL_QUADS used with glDrawElements*) with a compute pass conversion. The compute shader converts from uint8/uint16/uint32 indices to uint32. The format is passed through push constants to avoid having different variants of the same shader. - Used by Fast RMX - Used by Xenoblade Chronicles 2 (it still has graphical issues due to synchronization issues on Vulkan)
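A CPU sketch of the conversion the compute pass performs on the GPU. `QuadsToTriangles` is a hypothetical name, and the triangle pattern (0, 1, 2, 0, 2, 3) is an assumption about how each quad is split. The template parameter plays the role of the index format passed through push constants.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Expand every 4 quad indices into 6 triangle indices, widening to uint32.
template <typename Index>
std::vector<std::uint32_t> QuadsToTriangles(const std::vector<Index>& quads) {
    static constexpr std::size_t kPattern[6] = {0, 1, 2, 0, 2, 3}; // assumed split
    std::vector<std::uint32_t> triangles;
    triangles.reserve(quads.size() / 4 * 6);
    // Trailing indices that do not form a complete quad are ignored.
    for (std::size_t first = 0; first + 4 <= quads.size(); first += 4) {
        for (const std::size_t offset : kPattern) {
            triangles.push_back(static_cast<std::uint32_t>(quads[first + offset]));
        }
    }
    return triangles;
}
```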
2020-04-16buffer_cache: Return handles instead of pointer to handlesReinUsesLisp14-228/+90
The original idea of returning pointers is that handles can be moved. The problem is that the implementation didn't take that into account and made everything harder to work with. This commit drops pointers to handles and returns the handles themselves. While it is still true that handles can be invalidated, this way we get an old handle instead of a dangling pointer. This problem can be solved in the future with sparse buffers.
2020-04-16decode/shift: Remove unused variable within Shift()Lioncash1-1/+0
Removes a redundant variable that is already satisfied by the IsFull() utility function.
2020-04-16surface_view: Add missing operator!= to ViewParamsLioncash2-0/+5
Provides logical symmetry to the interface.
2020-04-16surface_base: Make IsInside() a const member functionLioncash1-2/+2
This doesn't modify internal state, so this can be made const.
2020-04-16texture_cache/format_lookup_table: Fix incorrect green, blue, and alpha indicesLioncash1-3/+3
Previously these were all using the red component to derive the indices, which is definitely not intentional.
2020-04-16control_flow: Make use of std::move in TryInspectAddress()Lioncash1-3/+3
Eliminates redundant atomic reference count increments and decrements.
2020-04-16video_core: Amend doxygen comment referencesLioncash2-5/+5
Fixes broken documentation references.
2020-04-16decode/image: Fix typo in assert in GetComponentSize()Lioncash1-3/+3
2020-04-16gl_query_cache: Resolve use-after-move in CachedQuery move assignment operatorLioncash1-1/+1
Avoids potential invalid junk data from being read.
2020-04-16decoder/image: Fix incorrect G24R8 component sizes in GetComponentSize()Lioncash1-2/+2
The components' sizes were mismatched. This corrects that.
2020-04-16gl_device: Mark stage_swizzle as constexprLioncash1-1/+1
Previously this was mutable even though it shouldn't be.
2020-04-16track: Eliminate redundant copiesLioncash1-5/+6
Two variables can be references, while two others can be std::moved. Makes for 4 fewer atomic reference count increments and decrements.
2020-04-16CMakeLists: Specify -Wextra on linux buildsLioncash7-14/+20
Allows reporting more cases where logic errors may exist, such as implicit fallthrough cases, etc. We currently ignore unused parameters, since we currently have many cases where this is intentional (virtual interfaces). While we're at it, we can also tidy up any existing code that causes warnings. This also uncovered a few bugs.
2020-04-15CMakeLists: Make -Wreorder a compile-time errorLioncash2-4/+5
This can result in silent logic bugs within code, and given the number of times this kind of warning is caused, they should be flagged at compile time so no new code is submitted with them.
2020-04-15Texture Cache: Read current data when flushing a 3D segment.Fernando Sahmkow1-0/+6
This PR corrects flushing of 3D segments when data of other segments is mixed in; this aims to preserve the data in place.
2020-04-15maxwell_to_vk: Add uint16 vertex formatsReinUsesLisp1-0/+8
2020-04-15maxwell_to_vk: Add missing breaksReinUsesLisp1-0/+2
Avoid invalid fallbacks.
2020-04-15vk_blit_screen: Initialize all members in VkPipelineViewportStateCreateInfoReinUsesLisp1-0/+2
When the dynamic state is specified, pViewports and pScissors are ignored. Quoting the specification: "pViewports is a pointer to an array of VkViewport structures, defining the viewport transforms. If the viewport state is dynamic, this member is ignored." That said, AMD's proprietary driver itself seems to read it regardless of what the specification says.
2020-04-15Texture Cache: Only do buffer copies on accurate GPU. (#3634)Fernando Sahmkow1-1/+3
This is a simple optimization as Buffer Copies are mostly used for texture recycling. They are, however, useful when games abuse undefined behavior but most 3D APIs forbid it.
2020-04-15Revert "gl_shader_decompiler: Implement merges with bitfieldInsert"ReinUsesLisp1-2/+4
This reverts commit 05cf27083608bebd3ee4c38f2f948c8f2030f881. Apparently the first approach using floats instead of bitfieldInsert worked better for Fire Emblem: Three Houses. Reverting to get that behavior back.
2020-04-15shader/arithmetic: Add FCMP_CR variantReinUsesLisp2-3/+6
Adds another variant of FCMP.
2020-04-14gl_rasterizer: Implement constant vertex attributesReinUsesLisp2-2/+6
Credits go to gdkchan from Ryujinx for finding constant attributes are used in retail games.
2020-04-14vk_rasterizer: Default to 1 viewport with a size of 0ReinUsesLisp1-3/+6
Silence validation layer errors.
2020-04-14gl_shader_cache: Use CompileDepth::FullDecompile on GLSLReinUsesLisp1-1/+3
From my testing on a Splatoon 2 shader that takes 3800ms on average to compile, changing to FullDecompile reduces it to 900ms on average. The shader decoder will automatically fall back to a more naive method if it can't use full decompile.
2020-04-14renderer_vulkan: Integrate Nvidia Nsight Aftermath on WindowsReinUsesLisp9-22/+360
Adds optional support for Nsight Aftermath. It is enabled through ENABLE_NSIGHT_AFTERMATH in cmake. A path to the SDK has to be provided by the environment variable NSIGHT_AFTERMATH_SDK. Nsight Aftermath allows an application to generate "minidumps" of the GPU state when a device loss happens. By analysing these on Nsight we can know what a game was doing and why it triggered a device loss. The dump is generated inside %APPDATA%\yuzu\log\gpucrash and this directory is deleted every time a new instance is initialized with Nsight enabled. To enable it on yuzu there has to be a driver and device capable of running Nsight Aftermath on Vulkan. That means only Turing based GPUs on the latest stable driver; beta drivers won't work for now. It is manually enabled in Configuration>Debug>Enable Graphics Debugging because when using all debugging capabilities there is a runtime cost.
2020-04-13gl_texture_cache: Fix layered texture attachment base levelReinUsesLisp1-1/+1
The base level is already included in the texture view. If we specify the base level in the texture again, this will end up in the incorrect level and potentially out of bounds.
2020-04-13renderer_vulkan: Remove Nvidia checkpointsReinUsesLisp4-34/+0
2020-04-13renderer_vulkan: Catch device losses in more placesReinUsesLisp3-21/+29
2020-04-13gl_rasterizer: Implement line widths and smooth linesReinUsesLisp5-2/+33
Implements "legacy" features from OpenGL present on hardware such as smooth lines and line width.
2020-04-13gl_shader_decompiler: Implement merges with bitfieldInsertReinUsesLisp1-4/+2
This also fixes Turing issues but it avoids doing more bitcasts. This should improve the generated code while also avoiding more points where compilers can flush floats.
2020-04-12gl_shader_decompiler: Improve generated code in HMergeH*ReinUsesLisp1-6/+8
Avoiding bitwise expressions, this fixes Turing issues in shaders using half float merges that affected several games.
2020-04-12shader/video: Partially implement VMNMXReinUsesLisp3-0/+116
Implements the common usages for VMNMX. Inputs with a different size than 32 bits are not supported and sign mismatches aren't supported either. VMNMX works as follows: It grabs Ra and Rb and applies a maximum/minimum on them (this is defined by .MX), taking the input sign into account. This result can then be saturated. After the intermediate result is calculated, it applies another operation on it using Rc. These operations are merges, accumulations or another min/max pass. This instruction makes it possible to implement GCN's min3 and max3 instructions (for instance) with a more flexible approach.
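The two-stage shape described above can be sketched for the simplest case: signed 32-bit inputs, no saturation, and a second stage that is itself a min/max pass. The names are hypothetical, and merges and accumulations are deliberately left out of this sketch.

```cpp
#include <algorithm>
#include <cstdint>

enum class MinMax { Min, Max };

constexpr std::int32_t Select(MinMax op, std::int32_t lhs, std::int32_t rhs) {
    return op == MinMax::Min ? std::min(lhs, rhs) : std::max(lhs, rhs);
}

// Stage one combines Ra and Rb; stage two combines that result with Rc.
constexpr std::int32_t TwoStageMinMax(MinMax first, MinMax second, std::int32_t ra,
                                      std::int32_t rb, std::int32_t rc) {
    return Select(second, Select(first, ra, rb), rc);
}
```

With both stages set to Min this behaves like a min3, and with both set to Max like a max3; mixing them gives a clamp-like operation.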
2020-04-12video_core: Add MSAA registers in 3D engine and TICReinUsesLisp2-6/+76
This adds the registers used for multisampling. It doesn't implement anything for now.
2020-04-11texture_cache: Remove preserve_contentsReinUsesLisp3-47/+31
preserve_contents was always true. We can't assume we don't have to preserve clears because scissored and color masked clears exist. This removes preserve_contents and assumes it as true at all times.
2020-04-11renderer_vulkan: Drop Vulkan-HppReinUsesLisp51-2272/+2881
2020-04-10shader/texture: Remove type mismatches management from shader decoderReinUsesLisp1-14/+0
Since commit e22816a5bb we handle type mismatches from the CPU. We don't need to hack our shader decoder due to game bugs anymore. Removed in this commit.
2020-04-09astc: Hard code bit depth changes to 8 and use fast replicateReinUsesLisp1-21/+15
2020-04-09astc: Use boost's static_vector to avoid heap allocationsReinUsesLisp1-10/+14
2020-04-09astc: Implement a fast precompiled alternative for ReplicateReinUsesLisp1-2/+57
2020-04-09astc: Move Replicate to a constexpr LUT when possibleReinUsesLisp1-8/+38
2020-04-09astc: Make InputBitStream constexprReinUsesLisp1-11/+11
2020-04-09astc: OutputBitStream style changes and make it constexprReinUsesLisp1-32/+26
2020-04-09gl_texture_cache: Attach view instead of base texture for layered attachmentsReinUsesLisp1-2/+2
This way we are not ignoring the base layer of the current texture.
2020-04-09VkRasterizer: Eliminate Legacy code.Fernando Sahmkow1-1/+0
2020-04-09Memory: Correct GCC errors.Fernando Sahmkow2-2/+3
2020-04-08Memory: Address Feedback.Fernando Sahmkow3-4/+7
2020-04-08GPUMemoryManager: Improve safety of memory reads.Fernando Sahmkow3-55/+47
2020-04-08video_core/textures: Move GetMaxAnisotropy to cpp fileReinUsesLisp2-19/+23
2020-04-08video_core/texture: Use a LUT to convert sRGB texture bordersReinUsesLisp3-9/+61
This is a reversed look up table extracted from https://gist.github.com/rygorous/2203834#file-gistfile1-cpp-L41-L62 that is used in https://github.com/devkitPro/deko3d/blob/04d4e9e587fa3dc5447b43d273bc45f440226e41/source/maxwell/tsc_generate.cpp#L38 Games usually bind 0xFD expecting a float texture border of 1.0f. The conversion previous to this commit was multiplying the uint8 sRGB texture border color by 255. This is close to 1.0f but when that difference matters, some graphical glitches appear. This look up table is manually changed in the edges, clamping towards 0.0f and 1.0f. While we are at it, move this logic to its own translation unit.
2020-04-07yuzu: Drop SDL2 and Qt frontend Vulkan requirementsReinUsesLisp5-105/+238
Create Vulkan instances and surfaces from the Vulkan backend.
2020-04-07address nit.Nguyen Dac Nam1-1/+1
2020-04-07renderer_vulkan: Query device names from the backendReinUsesLisp3-0/+73
2020-04-07shader/conversion: Implement I2I sign extension, saturation and selectionReinUsesLisp2-14/+101
Reimplements I2I adding sign extension, saturation (clamp source value to the destination), selection and destination sizes that are not 32 bits wide. It doesn't implement CC yet.
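Sign extension and saturation, as named above, can be sketched on plain integers like this. These are hypothetical helpers, not the shader IR code from the commit, and they assume `bits` is between 1 and 32.

```cpp
#include <cstdint>

// Interprets the low `bits` bits of `value` as a two's complement integer.
constexpr std::int32_t SignExtend(std::uint32_t value, unsigned bits) {
    const std::int64_t low = static_cast<std::int64_t>(value) & ((std::int64_t{1} << bits) - 1);
    const std::int64_t sign = std::int64_t{1} << (bits - 1);
    return static_cast<std::int32_t>((low & sign) != 0 ? low - (sign << 1) : low);
}

// Clamps `value` to the range representable by a signed integer of `bits` bits.
constexpr std::int32_t SaturateSigned(std::int32_t value, unsigned bits) {
    const std::int64_t hi = (std::int64_t{1} << (bits - 1)) - 1;
    const std::int64_t lo = -hi - 1;
    const std::int64_t wide = value;
    return static_cast<std::int32_t>(wide < lo ? lo : (wide > hi ? hi : wide));
}
```

Doing the arithmetic in 64 bits keeps every intermediate value in range, so the final narrowing cast never changes the result.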
2020-04-07Apply suggestions from code reviewNguyen Dac Nam1-9/+9
Co-Authored-By: Rodrigo Locatti <reinuseslisp@airmail.cc>
2020-04-06Clang Format.Fernando Sahmkow1-6/+3
2020-04-06Shader/Pipeline Cache: Use VAddr instead of physical memory for addressing.Fernando Sahmkow7-87/+62
2020-04-06Query Cache: Use VAddr instead of physical memory for addressing.Fernando Sahmkow3-23/+22
2020-04-06Buffer Cache: Use vAddr instead of physical memory.Fernando Sahmkow10-106/+129
2020-04-06Texture Cache: Use vAddr instead of physical memory for caching.Fernando Sahmkow5-130/+81
2020-04-06GPU: Setup Flush/Invalidate to use VAddr instead of CacheAddrFernando Sahmkow13-61/+71
2020-04-06shader_decode: SULD.D using std::pair instead of out parameternamkazy2-19/+15
2020-04-06shader_decode: SULD.D avoid duplicate code block.namkazy1-39/+2
2020-04-06shader_decode: SULD.D fix conversion error.namkazy1-3/+3
2020-04-06shader_decode: SULD.D implement bits64 and revert shader ir init method to remove shader stage.namkazy5-46/+105
2020-04-06shader/memory: Implement RED.E.ADDReinUsesLisp5-28/+99
Implements a reduction operation. It's an atomic operation that doesn't return a value. This commit introduces another primitive because some shading languages might have a primitive for reduction operations.
2020-04-06shader/memory: Add "using std::move"ReinUsesLisp1-11/+13
2020-04-06shader/memory: Minor fixes in ATOMReinUsesLisp1-32/+30
2020-04-05silence warning (conversion error)namkazy1-3/+2
2020-04-05shader_decode: SULD.D -> SINT actually same as UNORM.namkazy1-5/+4
2020-04-05shader_decode: SULD.D fix decode SNORM componentnamkazy1-10/+9
2020-04-05clang-formatnamkazy1-2/+2
2020-04-05shader_decode: get sampler descriptor from registry.namkazy1-77/+93
2020-04-05tweaking.namkazy1-3/+3
2020-04-05clang-formatNguyen Dac Nam1-2/+1
2020-04-05cleanup unused paramsnamkazy1-8/+6
2020-04-05cleanup debug code.namkazy1-14/+3
2020-04-05reimplement get component type, uncomment mistaken codenamkazy1-18/+93
2020-04-05remove disable optimizenamkazy1-2/+0
2020-04-05[wip] reimplement SULD.Dnamkazy1-22/+229
2020-04-05add shader stage when init shader irnamkazy4-9/+12
2020-04-05clang-fixNguyen Dac Nam1-1/+1
2020-04-05shader: image - import PredConditionNguyen Dac Nam1-0/+1
2020-04-05shader: SULD.D bits32 implement more complex method.Nguyen Dac Nam1-4/+28
2020-04-05shader: SULD.D import StoreTypeNguyen Dac Nam1-0/+1
2020-04-05shader: implement SULD.D bits32Nguyen Dac Nam1-11/+27
2020-04-04shader/other: Add error message for some S2R registersReinUsesLisp1-0/+6
2020-04-04shader_bytecode: Rename MOV_SYS to S2RReinUsesLisp2-5/+5
2020-04-04shader_bytecode: Add encoding for BARReinUsesLisp1-0/+2
2020-04-04shader_ir: Add error message for EXIT.FCSM_TRReinUsesLisp1-0/+3
2020-04-04shader_bytecode: Add encoding for VOTE.VTGReinUsesLisp1-0/+2
2020-04-04Revert "Merge pull request #3499 from ReinUsesLisp/depth-2d-array"ReinUsesLisp1-4/+2
This reverts commit 41905ee467b24172ba93e3fcd665bb4e4806a45a, reversing changes made to 35145bd529c3517e2c366efc764a762092d96edf. It causes regressions in several games.
2020-04-02shader/memory: Silence no return value warningReinUsesLisp1-0/+3
Silences a warning about control paths not all returning a value.
2020-04-02shader_decompiler: Remove FragCoord.w hack and change IPA implementationReinUsesLisp4-68/+74
Credits go to gdkchan and Ryujinx. The pull request used for this can be found here: https://github.com/Ryujinx/Ryujinx/pull/1082 yuzu was already using the header for interpolation, but it was missing the FragCoord.w multiplication described in the linked pull request. This commit finally removes the FragCoord.w == 1.0f hack from the shader decompiler. While we are at it, this commit renames some enumerations to match Nvidia's documentation (linked below) and fixes component declaration order in the shader program header (z and w were swapped). https://github.com/NVIDIA/open-gpu-doc/blob/master/Shader-Program-Header/Shader-Program-Header.html
2020-04-01gl_texture_cache: Fix software ASTC fallbackReinUsesLisp1-7/+12
2020-04-01vk_device: Add missing ASTC queriesReinUsesLisp1-14/+29
2020-04-01video_core: Use native ASTC when availableReinUsesLisp10-281/+176
2020-04-01gl_device: Detect if ASTC is reported and expose itReinUsesLisp2-0/+31
2020-04-01renderer_vulkan/wrapper: Add vkEnumerateInstanceExtensionProperties wrapperReinUsesLisp2-0/+17
2020-04-01renderer_vulkan/wrapper: Add command buffer handleReinUsesLisp1-0/+192
2020-04-01renderer_vulkan/wrapper: Add physical device handleReinUsesLisp2-0/+123
2020-04-01renderer_vulkan/wrapper: Add device handleReinUsesLisp2-0/+277
2020-04-01renderer_vulkan/wrapper: Add swapchain handleReinUsesLisp2-0/+15
2020-04-01renderer_vulkan/wrapper: Add fence handleReinUsesLisp1-0/+17
2020-04-01renderer_vulkan/wrapper: Add device memory handleReinUsesLisp1-0/+15
2020-04-01renderer_vulkan/wrapper: Add pool handlesReinUsesLisp2-0/+47
2020-04-01renderer_vulkan/wrapper: Add buffer and image handlesReinUsesLisp2-0/+24
2020-04-01renderer_vulkan/wrapper: Add queue handleReinUsesLisp2-0/+36
2020-04-01renderer_vulkan/wrapper: Add instance handleReinUsesLisp2-0/+87
2020-03-31gl_rasterizer: Mark cleared textures as dirtyReinUsesLisp1-2/+5
Fixes a potential edge case where cleared textures read from the CPU were not flushed.
2020-03-31clang-formatNguyen Dac Nam1-2/+1
2020-03-31shader_decode: fix by suggestionNguyen Dac Nam1-27/+22
2020-03-30clang-formatnamkazy1-3/+3
2020-03-30gl_decompiler: min/max op not implemented yetnamkazy1-0/+4
2020-03-30shader_decode: ATOM/ATOMS: add function to avoid code repetitionnamkazy2-70/+53
2020-03-30shader_decode: merge GlobalAtomicOp to AtomicOpnamkazy1-13/+1
2020-03-30shader_decode: implement ATOM operation for S32 and U32Nguyen Dac Nam1-6/+39
2020-03-30clang-formatnamkazy1-3/+3
2020-03-30shader_decode: implement ATOMS instr partial.Nguyen Dac Nam1-10/+42
2020-03-30vk_decompiler: add atomic op and handler function.Nguyen Dac Nam1-6/+25
2020-03-30gl_decompiler: add atomic opNguyen Dac Nam1-0/+16
2020-03-30shader: node - update correct commentNguyen Dac Nam1-15/+15
2020-03-30shader_decode: add Atomic op for common usageNguyen Dac Nam1-1/+15
2020-03-28shader_bytecode: Fix I2I_IMM encodingReinUsesLisp1-1/+1
2020-03-28renderer_vulkan/wrapper: Address feedbackReinUsesLisp1-3/+24
2020-03-28shader/lea: Simplify generated LEA codeReinUsesLisp1-3/+2
2020-03-27shader/lea: Fix op_a and op_b usagesReinUsesLisp1-2/+2
They were swapped.
2020-03-27shader/lea: Remove const and use move when possibleReinUsesLisp1-11/+5
2020-03-27renderer_vulkan/wrapper: Add owning handlesReinUsesLisp1-0/+18
2020-03-27renderer_vulkan/wrapper: Add pool allocations owning templated classReinUsesLisp1-0/+81
2020-03-27renderer_vulkan/wrapper: Add owning handle templated classReinUsesLisp1-0/+144
2020-03-27renderer_vulkan/wrapper: Add destroy and free overload setReinUsesLisp2-0/+133
2020-03-27renderer_vulkan/wrapper: Add dispatch table and loadersReinUsesLisp2-0/+283
2020-03-27renderer_vulkan/wrapper: Add exception classReinUsesLisp2-0/+34
2020-03-27renderer_vulkan/wrapper: Add ToString function for VkResultReinUsesLisp3-0/+91
2020-03-27renderer_vulkan/wrapper: Add Vulkan wrapper and a span helperReinUsesLisp2-0/+84
The intention behind a Vulkan wrapper is to drop Vulkan-Hpp. The issues with Vulkan-Hpp are: - Regular breaks of the API. - Copy constructors that do the same as the aggregates (fixed recently) - External dynamic dispatch that is hard to remove - Alias KHR handles with non-KHR handles making it impossible to use smart handles on Vulkan 1.0 instances with extensions that were included on Vulkan 1.1. - Dynamic dispatchers silently change size depending on preprocessor definitions. Different files will have different dispatch definitions, generating all kinds of hard to debug memory issues. In other words, Vulkan-Hpp is not "production ready" for our needs and this wrapper aims to replace it without losing RAII and exception safety.
2020-03-27engines/const_buffer_engine_interface: Store image format typeReinUsesLisp1-4/+10
This information is required to properly implement SULD.B. It might also be handy for all image operations, since it would allow us to implement them on devices that require the image format to be specified (on desktop, this would be AMD on OpenGL and Intel on OpenGL and Vulkan).
2020-03-27maxwell_to_vk: implement signedscaled vertex formatsDan1-0/+20
2020-03-26Address review and fix broken yuzu-tester buildJames Rowe3-5/+7
2020-03-26shader/conversion: Fix F2F rounding operations with different sizesReinUsesLisp1-5/+10
Rounding operations only matter when the conversion size of source and destination is the same, i.e. .F16.F16, .F32.F32 and .F64.F64. When there is a mismatch (.F16.F32), these bits are used for IEEE rounding; we don't emulate this because GLSL and SPIR-V don't support configuring it per operation.
2020-03-26gl_rasterizer: Update stencil test regardless of it being disabledReinUsesLisp1-5/+1
2020-03-26gl_rasterizer: Synchronize stencil testing on clearsReinUsesLisp1-0/+1
2020-03-25Frontend/GPU: Refactor context managementJames Rowe16-97/+129
Changes the GraphicsContext to be managed by the GPU core. This eliminates the need for the frontends to fool around with tricky MakeCurrent/DoneCurrent calls that are dependent on the settings (such as async gpu option). This also refactors out the need to use QWidget::fromWindowContainer as that caused issues with focus and input handling. Now we use a regular QWidget and just access the native windowHandle() directly. Another change is removing the debug tool setting in FrameMailbox. Instead of trying to block the frontend until a new frame is ready, the core will now take over presentation and draw directly to the window if the renderer detects that it's hooked by NSight or RenderDoc. Lastly, since it was in the way, I removed ScopeAcquireWindowContext and replaced it with a simple subclass in GraphicsContext that achieves the same result.
2020-03-23xmad: fix clang build errormakigumo1-4/+5
2020-03-22apply replay logic to all writes. remove replay from MacroInterpreter::Send (@fincs)namkazy2-12/+9
2020-03-22maxwell_3d: change declaration ordernamkazy1-1/+3
2020-03-22maxwell_3d: init shadow_statenamkazy1-0/+2
2020-03-22gl_rasterizer: Use transformed viewport for depth rangesReinUsesLisp1-4/+6
Implement depth ranges using the transformed viewport instead of the generic one. This matches the current Vulkan implementation but doesn't support negative depth ranges. An update to glad is required for this.
2020-03-22maxwell_3d: this seems more correct.namkazy1-2/+2
2020-03-22maxwell_3d: update comments for shadow ram usagenamkazy3-2/+6
2020-03-22macro_interpreter: write hw value when shadow ram requestedNguyen Dac Nam1-0/+6
2020-03-22maxwell_3d: track shadow ram ctrl and hw reg valueNguyen Dac Nam1-0/+10
2020-03-22maxwell_3d: implement MME shadow RAMNguyen Dac Nam1-1/+14
2020-03-19vk_texture_cache: Silence misc warningsReinUsesLisp1-3/+3
2020-03-19vk_staging_buffer_pool: Silence unused constant warningReinUsesLisp1-1/+1
2020-03-19vk_rasterizer: Remove unused variableReinUsesLisp1-2/+0
2020-03-19vk_pipeline_cache: Remove unused variableReinUsesLisp1-1/+0
2020-03-19maxwell_to_vk: Silence -Wswitch warningReinUsesLisp1-0/+2
2020-03-19gl_shader_decompiler: Remove deprecated function and its usagesReinUsesLisp1-11/+8
2020-03-19gl_rasterizer: Silence misc warningsReinUsesLisp1-7/+2
2020-03-19kepler_compute: Remove unused variablesReinUsesLisp1-8/+0
2020-03-18astc: Fix clang build issuesReinUsesLisp1-12/+12
2020-03-18gl_shader_decompiler: Don't redeclare gl_VertexID and gl_InstanceIDReinUsesLisp1-8/+0
2020-03-16renderer_opengl: Move some logic to an anonymous namespaceReinUsesLisp1-151/+151
2020-03-16renderer_opengl: Detect Nvidia Nsight as a debugging toolReinUsesLisp3-7/+22
Use getenv to detect Nsight.
2020-03-16gl_shader_decompiler: Implement legacy varyingsReinUsesLisp1-6/+57
Legacy varyings are special attributes carried over in hardware from the OpenGL 1 and OpenGL 2 days. These were generally used instead of the generic attributes we use today. They are deprecated or removed from most APIs, but Nvidia still ships them in hardware. To implement these, this commit maps them 1:1 to OpenGL compatibility.
2020-03-16shader/shader_ir: Track usage in input attribute and of legacy varyingsReinUsesLisp3-34/+64
2020-03-16shader/shader_ir: Fix clip distance usage storesReinUsesLisp1-2/+1
2020-03-16shader/shader_ir: Change declare output attribute to a switchReinUsesLisp1-9/+9
2020-03-15maxwell_to_vk: Implement RG32 and RGB32 integer vertex formatsReinUsesLisp1-0/+4
2020-03-15vk_rasterizer: Implement layered clearsReinUsesLisp1-2/+2
2020-03-15vk_shader_decompiler: fix linux buildmakigumo1-1/+1
2020-03-15vk_rasterizer: Fix vertex range assertReinUsesLisp1-1/+1
End can be equal to start in CalculateVertexArraysSize. This is quite common when the vertex size is zero.
2020-03-15vk_rasterizer: Reimplement clears with vkCmdClearAttachmentsReinUsesLisp4-45/+53
2020-03-14renderer_opengl: Keep presentation frames in lock-step when GPU debugging.bunnei1-1/+32
- Fixes renderdoc with OpenGL renderer.
2020-03-14gl_device: Add option to check GL_EXT_debug_tool.bunnei2-0/+6
2020-03-14DirtyFlags: relax need to set render_targets as dirty Fernando Sahmkow4-13/+0
The texture cache already takes care of setting a render target to dirty when invalidated.
2020-03-14PageTable: move backing addresses to a child class as the CPU page table does not need them.Fernando Sahmkow1-1/+1
This PR aims to reduce the memory usage in the CPU page table by moving GPU specific parameters into a child class. This saves 1 GB of memory for most games.
2020-03-14astc: Fix typos from search and replaceReinUsesLisp1-3/+3
2020-03-14astc: Minor changes to InputBitStreamReinUsesLisp1-28/+34
2020-03-14astc: Pass val in Replicate by copyReinUsesLisp1-1/+1
2020-03-14astc: Call std::vector::reserve on decodedColorValues to avoid reallocatingReinUsesLisp1-0/+2
2020-03-14clang-formatNguyen Dac Nam1-2/+1
2020-03-14nitNguyen Dac Nam1-1/+1
2020-03-14astc: Call std::vector::reserve on texelWeightValues to avoid reallocatingReinUsesLisp1-0/+2
2020-03-14astc: Create a LUT at compile time for encoding valuesReinUsesLisp1-7/+19
2020-03-14astc: Make IntegerEncodedValue a trivial structureReinUsesLisp1-212/+177
2020-03-14astc: Make IntegerEncodedValue constructor constexprReinUsesLisp1-5/+6
2020-03-14astc: Make IntegerEncodedValue trivially copyableReinUsesLisp1-9/+2
2020-03-14astc: Rename C types to common_typesReinUsesLisp1-79/+78
2020-03-14astc: Move Popcnt to an anonymous namespace and make it constexprReinUsesLisp1-9/+13
2020-03-14astc: Use common types instead of stdint.h integer typesReinUsesLisp1-284/+282
2020-03-14astc: Use 'enum class' instead of 'enum' for EIntegerEncodingReinUsesLisp1-25/+25
2020-03-13vk/gl_shader_decompiler: Silence assertion on computeReinUsesLisp2-6/+12
2020-03-13vk_shader_decompiler: Fix default varying regressionReinUsesLisp1-2/+6
2020-03-13maxwell_3d: Add padding words to XFB entriesReinUsesLisp1-2/+4
Use INSERT_UNION_PADDING_WORDS instead of alignas to ensure a size requirement.
2020-03-13gl_shader_decompiler: Fix implicit conversion errorsReinUsesLisp1-3/+3
2020-03-13vk_shader_decompiler: Fix implicit type conversionRodrigo Locatti1-1/+1
Co-Authored-By: Mat M. <mathew1800@gmail.com>
2020-03-13vk_rasterizer: Implement transform feedback binding zeroReinUsesLisp2-0/+46
2020-03-13vk_shader_decompiler: Add XFB decorations to generic varyingsReinUsesLisp1-16/+89
2020-03-13vk_device: Enable VK_EXT_transform_feedback when availableReinUsesLisp2-7/+40
2020-03-13vk_device: Shrink formatless capability name sizeReinUsesLisp3-26/+23
2020-03-13shader/transform_feedback: Expose buffer strideReinUsesLisp3-1/+4
2020-03-13vk_shader_decompiler: Use registry for specializationReinUsesLisp4-31/+37
2020-03-13gl_rasterizer: Implement transform feedback bindingsReinUsesLisp3-10/+83
2020-03-13gl_shader_decompiler: Decorate output attributes with XFB layoutReinUsesLisp1-29/+105
We sometimes have to slice attributes in different parts. This is needed for example in instances where the game feedbacks 3 components but writes 4 from the shader (something that is possible with GL_NV_transform_feedback).
2020-03-13shader/transform_feedback: Add host API friendly TFB builderReinUsesLisp3-0/+138
2020-03-13nit & remove some optional paramNguyen Dac Nam1-10/+11
2020-03-13shader_decode: implement XMAD mode CSfuNguyen Dac Nam1-9/+41
2020-03-13fix formattingmakigumo1-1/+1
2020-03-13maxwell_to_vk: add vertex format eA2B10G10R10UnormPack32makigumo1-1/+3
2020-03-13clang-formatNguyen Dac Nam1-4/+8
2020-03-13Apply suggestions from code reviewNguyen Dac Nam1-5/+5
Co-Authored-By: Mat M. <mathew1800@gmail.com>
2020-03-13shader_decode: BFE add ref of reverse parallel method.Nguyen Dac Nam1-0/+3
2020-03-13shader_decode: implement BREV on BFENguyen Dac Nam1-6/+25
Implement parallel bit reversal following: https://graphics.stanford.edu/~seander/bithacks.html#ReverseParallel
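For reference, the parallel bit-reversal technique from the linked page looks like this on a plain 32-bit integer. This is a standalone sketch with a hypothetical function name; the commit itself builds the equivalent operations as shader IR nodes rather than calling a function like this.

```cpp
#include <cstdint>

// Reverses the bit order of a 32-bit word by swapping progressively larger groups.
constexpr std::uint32_t ReverseBits32(std::uint32_t v) {
    v = ((v >> 1) & 0x55555555u) | ((v & 0x55555555u) << 1); // swap adjacent bits
    v = ((v >> 2) & 0x33333333u) | ((v & 0x33333333u) << 2); // swap bit pairs
    v = ((v >> 4) & 0x0F0F0F0Fu) | ((v & 0x0F0F0F0Fu) << 4); // swap nibbles
    v = ((v >> 8) & 0x00FF00FFu) | ((v & 0x00FF00FFu) << 8); // swap bytes
    v = (v >> 16) | (v << 16);                               // swap half-words
    return v;
}
```

Each step uses only shifts, masks and ORs, which is why it maps directly onto shader operations without needing a loop.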
2020-03-13shader_bytecode: update BFE instructions struct.Nguyen Dac Nam1-8/+3
2020-03-13node_helper: add IBitfieldExtract caseNguyen Dac Nam1-0/+2
2020-03-13shader_decode: Reimplement BFE instructionsNguyen Dac Nam1-25/+27
2020-03-13gl_shader_decompiler: Initialize gl_Position on vertex shadersReinUsesLisp1-0/+4
2020-03-13gl_shader_decompiler: Add missing {} on smem GLSL emissionReinUsesLisp1-1/+1
2020-03-13video_core: Implement RGBA16_SNORMReinUsesLisp8-69/+84
Implement RGBA16_SNORM with the current API. Nothing special here.
2020-03-12texture_cache: Report incompatible textures as blackReinUsesLisp1-2/+39
Some games bind incompatible texture types to certain types. For example Astral Chain binds a 2D texture with 1 layer (non-array) to a cubemap slot (that's how it's used in the shader). After testing this in hardware, the expected "undefined behavior" is to report all pixels as black. We already have a path for reporting black textures in the texture cache. When texture types are incompatible, this commit binds this kind of texture. This is done on the API agnostic texture cache so no extra code has to be inserted on OpenGL or Vulkan. As a side effect, this fixes invalidations of ASTC textures on Astral Chain. This happened because yuzu detected a cube texture and forced 6 faces, generating a texture larger than what the TIC reported.
2020-03-12texture_cache/surface_params: Force depth=1 on 2D texturesReinUsesLisp1-2/+4
Sometimes games will sample a 2D array TIC with a 2D access in the shader. This causes bad interactions with the rest of the texture cache. To emulate what the game wants to do, force a depth=1 on 2D textures (not 2D arrays) and let the texture cache handle the rest.
2020-03-12gl_shader_decompiler: Add layer component to texelFetchReinUsesLisp1-6/+9
TexelFetch was not emitting the array component generating invalid GLSL.
2020-03-12gl_shader_decompiler: Fix regression in render target declarationsReinUsesLisp1-12/+2
A previous commit introduced a way to declare as few render targets as possible. Turns out this introduced a regression in some games.
2020-03-11gl_shader_manager: Fix interaction between graphics and computeReinUsesLisp4-29/+39
After a compute shader was set to the pipeline, no graphics shader was invoked again. To address this use glUseProgram to bind compute shaders (without state tracking) and call glUseProgram(0) when transitioning out of it back to the graphics pipeline.
2020-03-10gl_rasterizer: Implement polygon modes and fill rectanglesReinUsesLisp7-2/+99
2020-03-09engines/maxwell_3d: Add TFB registers and store them in shader registryReinUsesLisp4-6/+45
2020-03-09shader/registry: Address feedbackReinUsesLisp3-13/+18
2020-03-09gl_shader_decompiler: Add identifier to decompiled codeReinUsesLisp3-8/+16
2020-03-09gl_shader_decompiler: Roll back to GLSL core 430ReinUsesLisp1-1/+1
RenderDoc won't build shaders if we use GLSL compatibility.
2020-03-09const_buffer_engine_interface: Store component typesReinUsesLisp4-46/+27
This is required for Vulkan. Sampling integer textures with float handles is illegal.
2020-03-09yuzu/loading_screen: Remove unused shader progress modeReinUsesLisp1-1/+0
2020-03-09gl_shader_cache: Reduce registry consistency to debug assertReinUsesLisp1-3/+1
Registry consistency is something that practically can't happen and it has a measurable runtime cost. Reduce it to a DEBUG_ASSERT.
2020-03-09shader/registry: Cache tessellation stateReinUsesLisp3-3/+10
2020-03-09shader/registry: Store graphics and compute metadataReinUsesLisp8-75/+176
Store information that GLSL forces us to provide but that is dynamic state in hardware (workgroup sizes, primitive topology, shared memory size).
2020-03-09video_core: Rename "const buffer locker" to "registry"ReinUsesLisp15-93/+98
2020-03-09gl_shader_cache: Rework shader cache and remove post-specializationsReinUsesLisp17-1092/+544
Instead of pre-specializing shaders and then post-specializing them, drop the latter and only "specialize" the shader while decoding it.
2020-03-08textures: Fix anisotropy hackReinUsesLisp1-14/+16
Previous code could generate an anisotropy value way higher than x16.
2020-03-08vk_rasterizer: fix typo on SetupGraphicsImagesNguyen Dac Nam1-1/+1
This should use the Maxwell3D engine. Fixes some GPU errors in Kirby and maybe other games.
2020-03-06vk_rasterizer: Support disabled uniform buffersReinUsesLisp2-1/+9
2020-03-06maxwell_to_vk: Remove Storage capability for A1B5G5R5UReinUsesLisp1-1/+1
2020-02-29nit: move comment to right place.Nguyen Dac Nam1-2/+2
2020-02-28video_core/dirty_flags: Address feedbackReinUsesLisp1-4/+4
2020-02-28renderer_opengl: Fix edge-case where alpha testing might cull presentationReinUsesLisp2-0/+7
2020-02-28gl_texture_cache: Remove blending disable on blitsReinUsesLisp1-5/+0
Blending doesn't affect blits, but rasterizer discard does. Update the comments accordingly.
2020-02-28gl_rasterizer: Don't disable blending on clearsReinUsesLisp1-4/+0
Blending doesn't affect clears.
2020-02-28dirty_flags: Deduplicate code between OpenGL and VulkanReinUsesLisp5-77/+73
2020-02-28vk_rasterizer: Pass Maxwell registers to dynamic updatesReinUsesLisp2-26/+21
2020-02-28state_tracker: Remove type traits with named structuresReinUsesLisp4-18/+22
2020-02-28vk_state_tracker: Implement dirty flags for stencil propertiesReinUsesLisp3-0/+21
2020-02-28vk_state_tracker: Implement dirty flags for depth boundsReinUsesLisp3-0/+14
2020-02-28vk_state_tracker: Implement dirty flags for blend constantsReinUsesLisp3-0/+14
2020-02-28vk_state_tracker: Implement dirty flags for depth biasReinUsesLisp3-0/+17
2020-02-28vk_state_tracker: Implement dirty flags for scissorsReinUsesLisp3-0/+14
2020-02-28vk_state_tracker: Initial implementationReinUsesLisp10-52/+198
Add support for render targets and viewports.
2020-02-28gl_rasterizer: Remove num vertex buffers magic numberReinUsesLisp1-2/+4
2020-02-28gl_rasterizer: Only apply polygon offset clamp if enabledReinUsesLisp1-3/+6
2020-02-28gl_state_tracker: Implement dirty flags for depth clamp enablingReinUsesLisp3-3/+15
2020-02-28gl_rasterizer: Disable scissor 0 when scissor is not used on clearReinUsesLisp1-0/+3
2020-02-28gl_rasterizer: Notify depth mask changes on clearReinUsesLisp2-1/+6
2020-02-28gl_rasterizer: Minor sort changes to clearingReinUsesLisp1-11/+9
2020-02-28maxwell_3d: Use two tables instead of three for dirty flagsReinUsesLisp1-1/+1
2020-02-28gl_state_tracker: Track state of index buffersReinUsesLisp4-5/+23
2020-02-28gl_state_tracker: Implement dirty flags for clip controlReinUsesLisp5-15/+31
2020-02-28gl_state_tracker: Implement dirty flags for point sizesReinUsesLisp3-4/+25
2020-02-28gl_state_tracker: Implement dirty flags for fragment color clampReinUsesLisp3-2/+14
2020-02-28gl_state_tracker: Implement dirty flags for logic opReinUsesLisp4-2/+22
2020-02-28gl_state_tracker: Implement dirty flags for sRGBReinUsesLisp5-2/+21
2020-02-28gl_state_tracker: Implement dirty flags for rasterize enableReinUsesLisp5-2/+21
2020-02-28gl_state_tracker: Implement dirty flags for multisampleReinUsesLisp3-0/+13
2020-02-28gl_state_tracker: Implement dirty flags for alpha testingReinUsesLisp4-6/+24
2020-02-28gl_state_tracker: Implement dirty flags for polygon offsetsReinUsesLisp4-2/+24
2020-02-28gl_state_tracker: Implement dirty flags for primitive restartReinUsesLisp3-5/+19
2020-02-28gl_state_tracker: Implement dirty flags for stencil testingReinUsesLisp4-3/+29
2020-02-28gl_state_tracker: Implement depth dirty flagsReinUsesLisp4-6/+31
2020-02-28gl_state_tracker: Implement dirty flags for front face and cullingReinUsesLisp4-7/+38
2020-02-28gl_state_tracker: Implement dirty flags for blendingReinUsesLisp5-14/+67
2020-02-28gl_state_tracker: Implement dirty flags for clip distances and shadersReinUsesLisp7-14/+43
2020-02-28gl_state_tracker: Add dirty flags for buffers and divisorsReinUsesLisp4-22/+56
2020-02-28maxwell_3d: Change write dirty flags to a bitsetReinUsesLisp3-16/+16
2020-02-28gl_state_tracker: Implement dirty flags for vertex formatsReinUsesLisp4-9/+44
2020-02-28gl_state_tracker: Implement dirty flags for color masksReinUsesLisp4-9/+53
2020-02-28gl_state_tracker: Implement dirty flags for scissorsReinUsesLisp5-10/+58
2020-02-28gl_state_tracker: Implement dirty flags for viewportsReinUsesLisp4-9/+54
2020-02-28renderer_opengl: Reintroduce dirty flags for render targetsReinUsesLisp9-13/+195
2020-02-28maxwell_3d: Flatten cull and front face registersReinUsesLisp8-50/+47
2020-02-28video_core: Reintroduce dirty flags infrastructureReinUsesLisp9-1/+71
2020-02-28gl_state: Remove completelyReinUsesLisp13-152/+4
2020-02-28gl_state: Remove program trackingReinUsesLisp9-94/+62
2020-02-28gl_state: Remove framebuffer trackingReinUsesLisp7-82/+23
2020-02-28gl_state: Remove image trackingReinUsesLisp5-24/+12
2020-02-28gl_state: Remove texture and sampler trackingReinUsesLisp5-60/+8
2020-02-28gl_state: Remove blend state trackingReinUsesLisp5-104/+28
2020-02-28gl_state: Remove stencil test trackingReinUsesLisp4-92/+18
2020-02-28gl_state: Remove clip control trackingReinUsesLisp5-19/+8
2020-02-28gl_state: Remove clip distances trackingReinUsesLisp4-29/+3
2020-02-28gl_state: Remove rasterizer disable trackingReinUsesLisp6-13/+8
2020-02-28gl_state: Remove viewport and depth range trackingReinUsesLisp7-101/+39
2020-02-28gl_state: Remove scissor test trackingReinUsesLisp6-69/+12
2020-02-28gl_state: Remove color mask trackingReinUsesLisp4-40/+12
2020-02-28gl_state: Remove clamp framebuffer color trackingReinUsesLisp3-17/+6
This commit doesn't reset it for screen draws because clamping doesn't change anything there.
2020-02-28gl_state: Remove multisample trackingReinUsesLisp3-16/+2
2020-02-28gl_state: Remove framebuffer sRGB trackingReinUsesLisp6-21/+25
2020-02-28gl_state: Remove VAO cache and trackingReinUsesLisp10-153/+53
2020-02-28gl_state: Remove depth clamp trackingReinUsesLisp4-25/+13
2020-02-28gl_state: Remove depth trackingReinUsesLisp4-34/+7
2020-02-28gl_state: Remove primitive restart trackingReinUsesLisp3-18/+2
2020-02-28gl_state: Remove logic op trackerReinUsesLisp4-24/+5
2020-02-28gl_state: Remove blend color trackingReinUsesLisp3-18/+1
2020-02-28gl_state: Remove polygon offset trackingReinUsesLisp4-39/+7
2020-02-28gl_state: Remove alpha test trackingReinUsesLisp4-21/+4
2020-02-28gl_state: Remove cull mode trackingReinUsesLisp4-19/+4
2020-02-28gl_state: Remove front face trackingReinUsesLisp4-6/+5
2020-02-28gl_state: Remove point size trackingReinUsesLisp3-22/+4
2020-02-28gl_rasterizer: Add oglEnablei helperReinUsesLisp1-0/+4
2020-02-28gl_rasterizer: Add OpenGL enable/disable helperReinUsesLisp1-0/+4
2020-02-28gl_rasterizer: Remove dirty flagsReinUsesLisp18-457/+7
2020-02-28renderer_opengl: Fix SRGB presentation frame tracking.bunnei2-5/+2
- Fixes SRGB in Super Smash Bros. Ultimate.
2020-02-28shader_decode: Fix LD, LDG when track constant bufferNguyen Dac Nam1-4/+12
2020-02-28shader_decode: keep it search on all codeNguyen Dac Nam1-4/+12
Fixes the LD and LDG opcodes in Pokemon Sword, which previously could not find the constant buffer. It is unclear whether this has any visual effect.
2020-02-28Create an "Advanced" tab in the graphics configuration tab and add anisotropic filtering levels.Morph2-2/+24
2020-02-28renderer_opengl: Reduce swap chain size to 3.bunnei1-3/+2
2020-02-27shader: FMUL switch to using LUT (#3441)Nguyen Dac Nam1-19/+14
* shader: add FmulPostFactor LUT table
* shader: FMUL apply LUT
* Update src/video_core/engines/shader_bytecode.h
  Co-Authored-By: Mat M. <mathew1800@gmail.com>
* nit: mistype
* clang-format & add missing import
* shader: remove post factor LUT.
* shader: move post factor LUT to function and fix incorrect order.
* clang-format
* shader: FMUL: add static to post factor LUT
* nit: typo
Co-authored-by: Mat M. <mathew1800@gmail.com>
2020-02-27renderer_opengl: Use more concise lock syntax.bunnei1-4/+4
2020-02-27renderer_opengl: Move Frame/FrameMailbox to OpenGL namespace.bunnei2-36/+42
2020-02-26vk_swapchain: Silence TOCTOU race conditionReinUsesLisp1-9/+12
It's possible that the window is resized between the moment we ask for its size and the moment a swapchain is created, causing validation issues. To work around this Vulkan issue, request the capabilities again just before creating the swapchain, making the race condition less likely.
2020-02-26renderer_opengl: Create gl_framebuffer_data if empty.bunnei1-1/+2
2020-02-26frontend: qt: bootmanager: Vulkan: Restore support for VK backend.bunnei2-9/+14
2020-02-26core: frontend: Refactor scope_acquire_window_context to scope_acquire_context.bunnei1-2/+2
2020-02-26renderer_opengl: Add texture mailbox support for presenter thread.bunnei3-35/+268
2020-02-26renderer_opengl: Add OGLRenderbuffer to resource/state management.bunnei4-0/+62
2020-02-25video_core/surface: Add R32_SINT render target formatReinUsesLisp2-0/+3
2020-02-25video_core/gpu: Remove unused functionsReinUsesLisp2-71/+0
2020-02-24vk_shader_decompiler: Implement indexed texturesReinUsesLisp6-54/+99
Implement accessing textures through an index. It uses the same interface as OpenGL, the main difference is that Vulkan bindings are forced to be arrayed (the binding index doesn't change for stacked textures in SPIR-V).
2020-02-24shader: Simplify indexed sampler usagesReinUsesLisp2-20/+8
2020-02-24video_core: Implement more scaled attribute formatsReinUsesLisp3-4/+40
While changing this, fix the assert in vk_shader_decompiler. We now know scaled formats are expected to be float in shader attributes.
2020-02-21shader/texture: Fix illegal 3D texture assertReinUsesLisp1-1/+1
Fix typo in the illegal 3D texture assert logic. We care about catching arrayed 3D textures or 3D shadow textures, not regular 3D textures.
2020-02-21nit: add const to where it need.Nguyen Dac Nam1-14/+14
2020-02-21shader: implement LOP3 fast replace for old functionNguyen Dac Nam1-36/+58
ref: https://devtalk.nvidia.com/default/topic/1070081/cuda-programming-and-performance/reverse-lut-for-lop3-lut/
2020-02-21vk_device: remove left over from other branchNguyen Dac Nam1-1/+0
2020-02-20clang-formatNguyen Dac Nam1-1/+1
2020-02-20shader_decompiler: only add StorageImageReadWithoutFormat when availableNguyen Dac Nam1-1/+4
2020-02-20video_core: memory_manager: Flush/invalidate asynchronously on Unmap.bunnei1-1/+10
- Minor perf improvement.
2020-02-19shader_decompiler: add check in case of device not support ShaderStorageImageReadWithoutFormatNguyen Dac Nam1-0/+4
2020-02-19vk_device: setup shaderStorageImageReadWithoutFormatNguyen Dac Nam1-0/+5
2020-02-19vk_device: add check for shaderStorageImageReadWithoutFormatNguyen Dac Nam1-0/+7
2020-02-19shader_conversion: I2F : add Assert for case src_size is ShortNguyen Dac Nam1-0/+3
2020-02-19fix warningNguyen Dac Nam1-1/+1
2020-02-19clang-format fixNguyen Dac Nam1-1/+1
2020-02-19shader_conversion: add conversion I2F for ShortNguyen Dac Nam1-9/+6
2020-02-19vk_shader: add Capability StorageImageReadWithoutFormatNguyen Dac Nam1-0/+1
2020-02-19vk_shader: Implement function ImageLoad (Used by Kirby Start Allies)Nguyen Dac Nam1-2/+6
2020-02-18fixups mistake auto commit.Nguyen Dac Nam1-9/+0
2020-02-18Update code structureNguyen Dac Nam1-0/+7
Co-Authored-By: Mat M. <mathew1800@gmail.com>
2020-02-18add vertex UnsignedInt size RGBANguyen Dac Nam1-0/+2
2020-02-18add eBc2SrgbBlock to formatsNguyen Dac Nam1-0/+1
2020-02-18vulkan: add DXT23_SRGBNguyen Dac Nam1-1/+1
2020-02-18renderer_vulkan: Add the rest of case for TryConvertBorderColorNguyen Dac Nam1-3/+10
2020-02-16texture_cache: Implement layered framebuffer attachmentsReinUsesLisp8-51/+74
Layered framebuffer attachments are a feature that allows applications to attach layered textures to a single attachment. The layer that fragments are written to is chosen from the shader using gl_Layer.
2020-02-16vk_shader_decompiler: Implement Layer output attributeReinUsesLisp1-6/+30
SPIR-V's Layer is GLSL's gl_Layer. It lets the application choose from a shader stage (vertex, tessellation or geometry) which framebuffer layer the output fragments are written to.
2020-02-16texture_cache: Avoid matches in 3D texturesReinUsesLisp1-8/+11
Code before this commit was trying to match 3D textures with another target. Fix that.
2020-02-16surface_base: Implement texture buffer flushesReinUsesLisp2-0/+11
Implement downloads to guest memory from texture buffers on the generic cache and OpenGL.
2020-02-15Revert "video_core: memory_manager: Use GPU interface for cache functions."bunnei3-9/+14
2020-02-15texture: Implement R32IReinUsesLisp6-34/+46
2020-02-15shader/texture: Allow 2D shadow arrays and simplify codeReinUsesLisp1-43/+28
Shadow sampler 2D arrays are supported on OpenGL, so there's no reason to forbid these. Enable textureLod usage on these. Minor style changes.
2020-02-14maxwell_3d: Unify draw methodsReinUsesLisp6-36/+6
Pass instanced state of a draw invocation as an argument instead of having two separate virtual methods.
2020-02-14query_cache: Address feedbackReinUsesLisp2-16/+18
2020-02-14query_cache: Fix ambiguity in CacheAddr getterReinUsesLisp1-4/+5
2020-02-14query_cache: Add a recursive mutex for concurrent usageReinUsesLisp1-0/+6
2020-02-14vk_query_cache: Implement generic query cache on VulkanReinUsesLisp11-20/+327
2020-02-14query_cache: Abstract OpenGL implementationReinUsesLisp4-339/+394
Abstract the current OpenGL implementation into the VideoCommon namespace and reimplement it on top of that. Doing this avoids repeating code and logic in the Vulkan implementation.
2020-02-14gl_query_cache: Optimize query cacheReinUsesLisp6-79/+217
Use a custom cache instead of relying on a ranged cache.
2020-02-14gl_query_cache: Implement host queries using a deferred cacheReinUsesLisp7-86/+328
Instead of waiting immediately for executed commands, defer the query until the guest CPU reads it. This way we get closer to what the guest program is doing. To achieve this we have to build a dependency queue, because host APIs (like OpenGL and Vulkan) use ranged queries instead of counters like NVN. Waiting for queries implicitly uses fences and this requires a command being queued, otherwise the driver will lock waiting until a timeout. To fix this when there are no commands queued, we explicitly call glFlush.
2020-02-14gl_rasterizer: Sort method declarationsReinUsesLisp1-16/+15
2020-02-14gl_rasterizer: Add queued commands counterReinUsesLisp2-0/+16
Keep track of the queued OpenGL commands that can signal a fence if waited on. As a side effect, we avoid calls to glFlush when no commands are queued.
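The counter idea in this message can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: QueuedCommandCounter, NoteCommandQueued and FlushIfPending are hypothetical names, and the flush callback stands in for glFlush so the sketch builds without an OpenGL context.

```cpp
#include <cstddef>
#include <functional>
#include <utility>

// Hypothetical sketch: counts queued commands and only flushes when
// something is actually pending.
class QueuedCommandCounter {
public:
    // `flush` stands in for glFlush (assumption made for this sketch).
    explicit QueuedCommandCounter(std::function<void()> flush) : flush_{std::move(flush)} {}

    // Call after queueing a command that can signal a fence.
    void NoteCommandQueued() {
        ++num_queued_;
    }

    // Flushes only if commands are pending; returns whether a flush happened.
    bool FlushIfPending() {
        if (num_queued_ == 0) {
            return false;
        }
        flush_();
        num_queued_ = 0;
        return true;
    }

private:
    std::function<void()> flush_;
    std::size_t num_queued_ = 0;
};
```

Calling FlushIfPending with nothing queued is a no-op, which is the side effect the message describes.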
2020-02-14maxwell_3d: Slow implementation of passed samples (query 21)ReinUsesLisp8-17/+201
Implements GL_SAMPLES_PASSED by waiting immediately for queries.
2020-02-14gl_resource_manager: Add managed query classReinUsesLisp2-0/+42
2020-02-14gl_rasterizer: Use the least generic OpenGL draw function possibleReinUsesLisp1-8/+28
This may help some implementations.
2020-02-14vk_shader_decompiler: Fix vertex id and instance idReinUsesLisp1-4/+13
Vulkan's VertexIndex and InstanceIndex don't match with hardware. This is because Nvidia implements gl_VertexID and gl_InstanceID. The math that relates these is: gl_VertexIndex = gl_BaseVertex + gl_VertexID and gl_InstanceIndex = gl_BaseInstance + gl_InstanceID. To emulate it using what Vulkan's SPIR-V offers (the *Index variants), this commit subtracts gl_Base* from gl_*Index to obtain the OpenGL and hardware equivalent.
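The subtraction described in this message can be written out on the CPU as a small sketch. It takes the relations in the message at face value and assumes the second one is meant to read gl_InstanceIndex = gl_BaseInstance + gl_InstanceID. The function names are hypothetical; the actual commit emits SPIR-V rather than C++.

```cpp
#include <cstdint>

// Hypothetical helper: recover the hardware-style vertex id from
// Vulkan's VertexIndex by removing the base vertex.
constexpr std::int32_t VertexIdFromIndex(std::int32_t vertex_index, std::int32_t base_vertex) {
    return vertex_index - base_vertex;
}

// Hypothetical helper: recover the hardware-style instance id from
// Vulkan's InstanceIndex by removing the base instance.
constexpr std::uint32_t InstanceIdFromIndex(std::uint32_t instance_index,
                                            std::uint32_t base_instance) {
    return instance_index - base_instance;
}
```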
2020-02-13GPU: Address Feedback.Fernando Sahmkow2-11/+10
2020-02-10GPU: Implement GPU Clock correctly.Fernando Sahmkow3-2/+17
2020-02-10Maxwell3D: Correct query reporting.Fernando Sahmkow2-51/+58
2020-02-08gpu_thread: Use MPSCQueue for GPU commands.bunnei1-1/+1
- Necessary for multiple service threads.
2020-02-08video_core: memory_manager: Use GPU interface for cache functions.bunnei3-14/+9
2020-02-05shader/decode: Fix constant buffer offsetsReinUsesLisp3-5/+5
Some instances were using cbuf34.offset instead of cbuf34.GetOffset(). This returned an invalid offset. Address those instances and rename offset to "shifted_offset" to avoid future bugs.
2020-02-05maxwell_to_gl: Implement R8G8_USCALEDReinUsesLisp1-0/+8
2020-02-05maxwell_to_gl: Reduce unimplemented formats to LOG_ERRORReinUsesLisp1-8/+4
2020-02-04vk_rasterizer: Use noexcept variants of std::bitsetReinUsesLisp1-4/+5
Removes bounds checking from "texceptions" instances.
2020-02-04gl_rasterizer: Implement GL_POINT_SPRITEReinUsesLisp4-1/+9
OpenGL core defaults to GL_POINT_SPRITE, meanwhile on OpenGL compatibility we have to explicitly enable it. This fixes gl_PointCoord's behaviour.
2020-02-02maxwell_3d: Fix stencil back maskReinUsesLisp1-3/+3
2020-02-02shader: Remove curly braces initializers on shared pointersReinUsesLisp5-12/+12
2020-02-02shader/shift: Implement SHIFT_RIGHT_{IMM,R}ReinUsesLisp1-26/+58
Shifts a pair of registers to the right and returns the low register.
2020-02-02shader/shift: Implement SHF_LEFT_{IMM,R}ReinUsesLisp2-10/+89
Shifts a pair of registers to the left and returns the high register.
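The two shift descriptions above (right shift returns the low register, left shift returns the high register) correspond to a funnel shift over a 64-bit pair. A minimal sketch, assuming the shift amount is simply masked to 0..31; the real instructions have additional modes that are not modeled here, and the function names are hypothetical:

```cpp
#include <cstdint>

// Treat {high, low} as one 64-bit value, shift right, return the low word.
constexpr std::uint32_t FunnelShiftRight(std::uint32_t low, std::uint32_t high, unsigned shift) {
    const std::uint64_t pair = (std::uint64_t{high} << 32) | low;
    return static_cast<std::uint32_t>(pair >> (shift & 31u));
}

// Treat {high, low} as one 64-bit value, shift left, return the high word.
constexpr std::uint32_t FunnelShiftLeft(std::uint32_t low, std::uint32_t high, unsigned shift) {
    const std::uint64_t pair = (std::uint64_t{high} << 32) | low;
    return static_cast<std::uint32_t>((pair << (shift & 31u)) >> 32);
}
```

With a shift of zero, the right variant returns the low word unchanged and the left variant returns the high word unchanged.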
2020-01-30gl_rasterizer: Fix instanced draw arraysReinUsesLisp2-106/+28
glDrawArrays was being used when the draw had a base instance specified. This commit removes the draw parameters abstraction and fixes the mentioned issue.
2020-01-29yuzu: Implement Vulkan frontendReinUsesLisp3-1/+280
Adds a Qt and SDL2 frontend for Vulkan. It also finishes the missing bits on Vulkan initialization.
2020-01-29settings: Add settings for graphics backendReinUsesLisp1-2/+4
2020-01-29shader/other: Fix skips for SYNC and BRKReinUsesLisp1-2/+2
2020-01-29shader/other: Stub S2R LaneIdReinUsesLisp1-1/+4
2020-01-29buffer_cache: Delay buffer destructionsReinUsesLisp1-1/+4
Delay buffer destruction some extra frames to avoid destroying buffers that are still being used from older frames. This happens on Nvidia's driver with mailbox.
2020-01-28gl_shader_decompiler: Remove UNIMPLEMENTED for gl_PointSizeReinUsesLisp1-1/+0
This was implemented by a previous commit and it's no longer required.
2020-01-28gl_texture_cache: Silence implicit sign cast warningsReinUsesLisp1-3/+6
2020-01-27shader/bfi: Implement register-constant buffer variantReinUsesLisp2-2/+7
It's the same as the variant that was implemented, but it takes the operands from another source.
2020-01-27shader/arithmetic: Implement FCMPReinUsesLisp2-1/+17
Compares the third operand with zero, then selects between the first and second.
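A rough scalar model of the behaviour described above, assuming the comparison is supplied as a predicate. The name Fcmp and the predicate parameter are illustrative only and do not reflect how the shader IR represents the instruction:

```cpp
#include <functional>

// Compare the third operand against zero with `cmp`, then select
// the first operand on success and the second otherwise.
template <typename Compare>
constexpr float Fcmp(float first, float second, float third, Compare cmp) {
    return cmp(third, 0.0f) ? first : second;
}
```

For example, Fcmp(a, b, c, std::less<float>{}) yields a when c is negative and b otherwise.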
2020-01-27texture_cache/surface_base: Fix layered break downReinUsesLisp1-1/+1
Layered breakdowns were passing "layer" as a "depth" parameter. This commit addresses that.
2020-01-27gl_texture_cache: Properly implement depth/stencil samplingReinUsesLisp1-4/+27
This addresses the long standing issue of compatibility vs. core profiles on OpenGL, properly implementing depth vs. stencil sampling depending on the texture swizzle.
2020-01-26shader/memory: Implement ATOM.ADDReinUsesLisp5-39/+86
ATOM operates atomically on global memory. For now only add ATOM.ADD since that's what was found in commercial games. This asserts for ATOM.ADD.S32 (handling the others as unimplemented), although ATOM.ADD.U32 shouldn't be any different. This change forces us to change the default type on SPIR-V storage buffers from float to uint. We could also alias the buffers, but it's simpler for now to just use uint. While we are at it, abstract the code to avoid repetition.
2020-01-25Shader_IR: Address feedback.Fernando Sahmkow10-36/+40
2020-01-25shader/memory: Implement STL.S16 and STS.S16ReinUsesLisp1-3/+10
2020-01-25shader/memory: Implement unaligned LDL.S16 and LDS.S16ReinUsesLisp1-5/+3
2020-01-25shader/memory: Move unaligned load/store to functionsReinUsesLisp1-18/+27
2020-01-25shader/memory: Implement LDL.S16 and LDS.S16ReinUsesLisp1-12/+23
2020-01-24Shader_IR: Change name of TrackSampler function so it does not confuse with the type.Fernando Sahmkow3-7/+10
2020-01-24Shader_IR: Corrections, styling and extras.Fernando Sahmkow1-2/+4
2020-01-24Shader_IR: Correct Custom Variable assignment.Fernando Sahmkow2-0/+4
2020-01-24Shader_IR: Propagate bindless index into the GL compiler.Fernando Sahmkow5-24/+54
2020-01-24Shader_IR: Implement Injectable Custom Variables to the IR.Fernando Sahmkow5-1/+70
2020-01-24GL Backend: Introduce indexed samplers into the GL backendFernando Sahmkow2-10/+39
2020-01-24Shader_IR: deduce size of indexed samplersFernando Sahmkow4-8/+60
2020-01-24Shader_IR: Setup Indexed Samplers on the IRFernando Sahmkow1-20/+46
2020-01-24Shader_IR: Implement initial code for tracking indexed samplers.Fernando Sahmkow4-0/+139
2020-01-24Shader_IR: Address FeedbackFernando Sahmkow5-35/+37
2020-01-24Shader_IR: Allow constant access of guest driver.Fernando Sahmkow7-1/+18
2020-01-24Shader_IR: Address FeedbackFernando Sahmkow4-21/+29
2020-01-24Guest_driver: Correct compiling errors in GCC.Fernando Sahmkow2-1/+5
2020-01-24Shader_IR: Store Bound buffer on Shader UsageFernando Sahmkow5-5/+41
2020-01-24GPU: Implement guest driver profile and deduce texture handler sizes.Fernando Sahmkow13-0/+127
2020-01-24vk_shader_decompiler: Disable default values on unwritten render targetsReinUsesLisp3-19/+16
Some games like The Legend of Zelda: Breath of the Wild assign render targets without writing them from the fragment shader. This generates Vulkan validation errors. To silence these, I previously introduced a commit that set "vec4(0, 0, 0, 1)" for these attachments. The problem is that this is not what games expect. This commit reverts that change.
2020-01-21gl_shader_cache: Disable fastmath on NvidiaReinUsesLisp1-0/+4
2020-01-20vk_blit_screen: Address feedbackReinUsesLisp4-22/+25
2020-01-20vk_blit_screen: Initial implementationReinUsesLisp3-0/+745
This abstraction takes care of presenting accelerated and non-accelerated or "framebuffer" images to the Vulkan swapchain.
2020-01-19vk_shader_decompiler: Implement UAtomicAdd (ATOMS) on SPIR-VReinUsesLisp1-3/+11
Also updates sirit to include atomic instructions.
2020-01-18gl_state: Use bool instead of GLbooleanReinUsesLisp2-3/+3
This fixes template resolution considering GLboolean an integer instead of a bool.
2020-01-18vk_graphics_pipeline: Set front facing properlyReinUsesLisp2-2/+2
Front face was being forced to a certain value when cull face is disabled. Set a default value on initialization and drop the forcefully set front facing value with culling disabled.
2020-01-18vk_rasterizer: Address feedbackReinUsesLisp2-25/+32
2020-01-18gl_shader_decompiler: Fix decompilation of condition codesReinUsesLisp1-27/+5
Use Visit instead of reimplementing it. Fixes unimplemented negations for condition codes.
2020-01-17vk_rasterizer: Implement Vulkan's rasterizerReinUsesLisp3-1/+1386
This abstraction is Vulkan's equivalent to OpenGL's rasterizer. It takes care of joining all parts of the backend and rendering accordingly on demand.
2020-01-17renderer_vulkan: Add header as placeholderReinUsesLisp2-0/+73
2020-01-16vk_texture_cache: Address feedbackReinUsesLisp2-22/+8
2020-01-16shader/memory: Implement ATOMS.ADD.U32ReinUsesLisp5-3/+74
2020-01-16format_lookup_table: Fix ZF32_X24S8 component typesReinUsesLisp1-1/+1
Component types for ZF32_X24S8 were using UNORM. Drivers will set FLOAT, UINT, UNORM, UNORM; causing a format mismatch. This commit addresses that.
2020-01-16vk_texture_cache: Fix typo in commentaryRodrigo Locatti1-1/+1
Co-Authored-By: MysticExile <30736337+MysticExile@users.noreply.github.com>
2020-01-16maxwell_3d: Make dirty_pointers privateLioncash1-2/+2
This isn't used outside of the class itself, so we can make it private for the time being.
2020-01-15gl_state: Implement PROGRAM_POINT_SIZEReinUsesLisp4-2/+13
For gl_PointSize to have effect we have to activate GL_PROGRAM_POINT_SIZE.
2020-01-15renderer_opengl/utils: Remove unused header inclusionsLioncash1-3/+0
Nothing from these headers is used, so they can be removed.
2020-01-15renderer_opengl/utils: Forward declare private structsLioncash2-12/+16
Keeps the definitions hidden and allows changes to the structs without needing to recompile all users of classes containing said structs.
2020-01-14gl_texture_cache: Use local variables to simplify DownloadTextureReinUsesLisp1-6/+4
2020-01-14gl_texture_cache: Fix format for RGBX16FReinUsesLisp1-1/+1
2020-01-14gl_texture_cache: Use Snorm internal format for RG8SReinUsesLisp1-1/+1
2020-01-14gl_texture_cache: Use Snorm internal format for ABGR8SReinUsesLisp1-1/+1
2020-01-14control_flow: Silence -Wreorder warning for CFGRebuildStateLioncash1-1/+1
Organizes the initializer list in the same order that the variables would actually be initialized in.
2020-01-14gl_shader_cache: Remove unused STAGE_RESERVED_UBOS constantLioncash1-3/+0
Given this isn't used, this can be removed entirely.
2020-01-14gl_shader_cache: std::move entries in CachedShader constructorLioncash1-3/+4
Avoids several reallocations of std::vector instances where applicable.
2020-01-14gl_shader_cache: Remove unused entries variable in BuildShader()Lioncash1-1/+0
Eliminates a few unnecessary constructions of std::vectors.
2020-01-14vk_texture_cache: Implement generic texture cache on VulkanReinUsesLisp4-1/+733
It currently ignores PBO linearizations since these should be dropped as soon as possible on OpenGL.
2020-01-14texture_cache/surface_params: Make GetNumLayers publicReinUsesLisp1-4/+5
2020-01-11vk_compute_pass: Address feedbackRodrigo Locatti1-0/+2
Comment hardcoded SPIR-V modules.
2020-01-10maxwell_to_vk: Implement GL_CLAMP hacking Nvidia's driverReinUsesLisp3-6/+11
Nvidia's driver defaults invalid enumerations to GL_CLAMP. Vulkan doesn't expose GL_CLAMP through its API, but we can hack it on Nvidia's driver using the internal driver defaults.
2020-01-09shader_ir/texture: Simplify AOFFI codeReinUsesLisp1-10/+6
2020-01-09shader_ir/memory: Implement u16 and u8 for STG and LDGReinUsesLisp2-34/+52
Using the same technique we used for u8 on LDG, implement u16. In the case of STG, load memory and insert the value we want to set into it with bitfieldInsert. Then set that value.
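The read-modify-write store described above can be sketched on the CPU like this. BitfieldInsert mirrors GLSL's bitfieldInsert for widths between 1 and 31 bits. StoreU16 is a hypothetical helper that assumes 32-bit word storage, a 2-byte aligned address, and little-endian placement of halfwords within a word:

```cpp
#include <cstdint>

// Equivalent of GLSL bitfieldInsert(base, insert, offset, bits) for 0 < bits < 32.
constexpr std::uint32_t BitfieldInsert(std::uint32_t base, std::uint32_t insert, unsigned offset,
                                       unsigned bits) {
    const std::uint32_t mask = ((1u << bits) - 1u) << offset;
    return (base & ~mask) | ((insert << offset) & mask);
}

// Emulate a 16-bit store on memory that is only addressable as 32-bit words:
// load the containing word, insert the halfword, then write the word back.
inline void StoreU16(std::uint32_t* words, std::uint32_t byte_address, std::uint16_t value) {
    const std::uint32_t word_index = byte_address / 4;
    const unsigned bit_offset = (byte_address % 4) * 8;
    words[word_index] = BitfieldInsert(words[word_index], value, bit_offset, 16);
}
```

The same pattern works for u8 by inserting 8 bits instead of 16.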
2020-01-08vk_compute_pass: Add compute passes to emulate missing Vulkan featuresReinUsesLisp3-0/+416
This currently only supports quad arrays and u8 indices. In the future we can replace quad arrays with a table written from the CPU, but this was used to bootstrap the other passes' helpers and it was left in the code. The blob code is generated from the "shaders/" directory. Read the instructions there to know how to generate the SPIR-V.
2020-01-08vk_shader_util: Add helper to build SPIR-V shadersReinUsesLisp3-0/+53
2020-01-07vk_pipeline_cache: Initial implementationReinUsesLisp2-1/+460
Given a pipeline key, this cache returns a pipeline abstraction (for graphics or compute).
2020-01-07vk_graphics_pipeline: Initial implementationReinUsesLisp4-0/+395
This abstraction represents the state of the 3D engine at a given draw. Instead of changing individual bits of the pipeline, as is done in APIs like D3D11, OpenGL and NVN, on Vulkan we are forced to put everything together into a single, immutable object. It takes advantage of the few dynamic states Vulkan offers.
2020-01-07vk_compute_pipeline: Initial implementationReinUsesLisp4-0/+219
This abstraction represents a Vulkan compute pipeline.
2020-01-07vk_pipeline_cache: Add file and define descriptor update template fillerReinUsesLisp3-0/+67
This function allows us to share code between compute and graphics pipelines compilation.
2020-01-07fixed_pipeline_state: Add depth clampReinUsesLisp2-10/+18
2020-01-07vk_rasterizer: Add placeholderReinUsesLisp2-0/+14
2020-01-06vk_renderpass_cache: Initial implementationReinUsesLisp3-0/+199
The renderpass cache is used to avoid creating renderpasses on each draw. The hashed structure is not currently optimized.
2020-01-06vk_update_descriptor: Initial implementationReinUsesLisp3-1/+146
The update descriptor is used to store in flat memory a large chunk of staging data used to update descriptor sets through templates. It provides a push interface to easily insert descriptors following the current pipeline. The order used in the descriptor update template has to be implicitly followed. We can catch bugs here using validation layers.
2020-01-06vk_stream_buffer/vk_buffer_cache: Avoid halting and use generic cacheReinUsesLisp4-62/+340
Before this commit, once the stream buffer was full (no more bytes to write before looping), it waited for all previous operations to finish. This was a temporary solution and had a noticeable performance penalty (from what a profiler showed). To avoid this, mark usages of the stream buffer with fences and, once it loops, wait for them to be signaled. On average this will never wait. Each fence knows where its usage finishes, resulting in a non-paged stream buffer. On the other side, the buffer cache is reimplemented using the generic buffer cache. It makes use of the staging buffer pool and the new stream buffer.
2020-01-06vk_memory_manager: Misc changesReinUsesLisp2-88/+142
* Allocate memory in discrete, exponentially increasing chunks until the 128 MiB threshold. Allocations larger than that increase linearly by 256 MiB (depending on the required size). This allows using small allocations for small resources.
* Move memory maps to a RAII abstraction. To optimize for debugging tools (like RenderDoc) users will map/unmap on usage. If this ever becomes a noticeable overhead (from my profiling it doesn't) we can transparently move to persistent memory maps without harming the API, getting optimal performance for both gameplay and debugging.
* Improve messages on exceptional situations.
* Fix typos "requeriments" -> "requirements".
* Small style changes.
2020-01-06vk_buffer_cache: Temporarily remove buffer cacheReinUsesLisp2-226/+0
This is intended for a follow up commit to avoid circular dependencies.
2020-01-04Shader_IR: Address FeedbackFernando Sahmkow5-38/+19
2020-01-04Shader_IR: Implement TXD Array.Fernando Sahmkow1-5/+12
This commit extends the compilation of TXD to support array samplers on TXD.
2020-01-03Update src/video_core/renderer_vulkan/vk_descriptor_pool.cppRodrigo Locatti1-1/+1
Co-Authored-By: Mat M. <mathew1800@gmail.com>
2020-01-03yuzu: Remove Maxwell debuggerReinUsesLisp4-239/+0
This was carried from Citra and wasn't really used on yuzu. It also adds some runtime overhead. This commit removes it from yuzu's codebase.
2020-01-01vk_descriptor_pool: Initial implementationReinUsesLisp3-0/+147
Create a large descriptor pool where we allocate all our descriptors from. It has to be wide enough to support any pipeline, hence its large numbers. If the descriptor pool is filled, we allocate more memory at that moment. This way we can take advantage of permissive drivers like Nvidia's that allocate more descriptors than what the spec requires.
2019-12-30Shader_IR: add the ability to amend code in the shader ir.Fernando Sahmkow5-3/+72
This commit introduces a mechanism by which shader IR code can be amended and extended. This is useful for tracking algorithms where certain information can be derived before the track, such as indexes to array samplers.
2019-12-30vk_image: Avoid unnecessary equalsRodrigo Locatti1-1/+1
2019-12-30video_core: Block in WaitFence.Markus Wick2-4/+8
This function is called rarely and blocks quite often for a long time. So don't waste power and let the CPU sleep. This might also increase the performance as the other cores might be allowed to clock higher.
2019-12-29vk_staging_buffer_pool: Initialize last epoch to zeroRodrigo Locatti1-1/+1
2019-12-26gl_rasterizer: Allow rendering without fragment shaderReinUsesLisp2-0/+7
Rendering without a fragment shader is usually used in depth-only passes.
2019-12-25vk_staging_buffer_pool: Add a staging pool for temporary operationsReinUsesLisp3-0/+212
The job of this abstraction is to provide staging buffers for temporary operations. Think of image uploads or buffer uploads to device memory. It automatically deletes unused buffers.
2019-12-25vk_image: Add an image object abstractionReinUsesLisp3-0/+192
This object's job is to contain an image and manage its transitions. Since Nvidia hardware doesn't know what a transition is but Vulkan requires them anyway, we have to track the state of image subresources individually. To avoid the overhead of tracking each subresource in images with many subresources (think of cubemap arrays with several mipmaps), this commit tracks when subresources have diverged. As long as this doesn't happen we can check the state of the first subresource (that will be shared with all subresources) and update accordingly. Image transitions are deferred to the scheduler command buffer.
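The divergence tracking described above might look roughly like the following. This is a simplified sketch under stated assumptions: layouts are plain integers, subresources are flattened into one index range, the image has at least one subresource, and ImageStateTracker with its methods is a hypothetical name rather than the class added by the commit.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical sketch: per-subresource layout tracking with a fast path
// for the common case where every subresource shares the same layout.
class ImageStateTracker {
public:
    ImageStateTracker(std::size_t num_subresources, int initial_layout)
        : layouts_(num_subresources, initial_layout) {}

    // True when some subresource in [first, first + count) is not in `layout`.
    bool NeedsTransition(std::size_t first, std::size_t count, int layout) const {
        if (!diverged_) {
            // All subresources share one layout, so checking the first is enough.
            return layouts_[0] != layout;
        }
        for (std::size_t i = first; i < first + count; ++i) {
            if (layouts_[i] != layout) {
                return true;
            }
        }
        return false;
    }

    // Records a transition and updates the divergence flag.
    void Transition(std::size_t first, std::size_t count, int layout) {
        const bool whole_image = first == 0 && count == layouts_.size();
        const bool changes_layout = NeedsTransition(first, count, layout);
        for (std::size_t i = first; i < first + count; ++i) {
            layouts_[i] = layout;
        }
        if (whole_image) {
            diverged_ = false;  // everything shares one layout again
        } else if (changes_layout) {
            diverged_ = true;  // a subset now differs from the rest
        }
    }

    bool IsDiverged() const {
        return diverged_;
    }

private:
    std::vector<int> layouts_;
    bool diverged_ = false;
};
```

Once diverged, the sketch stays on the slow path until a whole-image transition restores a shared layout, which is a conservative simplification.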
2019-12-24fixed_pipeline_state: Define symmetric operator!= and mark as noexceptReinUsesLisp2-40/+92
Marks as noexcept Hash, operator== and operator!= for consistency.
2019-12-23fixed_pipeline_state: Define structure and loadersReinUsesLisp3-0/+528
The intention behind this hashable structure is to describe the state of fixed function pipeline state that gets compiled to a single graphics pipeline state object. This is all dynamic state in OpenGL but Vulkan wants it in an immutable state, even if hardware can edit it freely. In this commit the structure is defined in an unoptimized state (it uses booleans, has paddings and many data entries that can be packed to single integers). This is intentional as an initial implementation that is easier to debug, implement and review. It will be optimized in later stages, or it might change if Vulkan gets more dynamic states.
2019-12-23maxwell_3d: Add depth bounds registersReinUsesLisp1-6/+14
2019-12-23maxwell_to_gl: Implement missing primitive topologiesReinUsesLisp1-4/+18
Many of these topologies are exclusively available in OpenGL.
2019-12-22Texture Cache: Improve documentationFernando Sahmkow2-4/+5
2019-12-22Texture Cache: Address FeedbackFernando Sahmkow2-11/+11
2019-12-22Texture Cache: Add HLE methods for building 3D textures within the GPU in certain scenarios.Fernando Sahmkow4-1/+143
This commit adds a series of HLE methods for handling 3D textures in general. This helps games that generate 3D textures on every frame and may reduce loading times for certain games.
2019-12-21gl_shader_cache: Update commentary for shared memoryReinUsesLisp1-9/+6
Remove false commentary. Not dividing the size of shared memory by 4 is not a hack; it describes the number of integers, not bytes. While we are at it, sort the generated code to put preprocessor lines at the top.
2019-12-21gl_shader_cache: Remove unused entry in GetPrimitiveDescriptionReinUsesLisp1-11/+9
2019-12-21vk_shader_decompiler: Use Visit instead of reimplementing itReinUsesLisp1-23/+1
ExprCondCode visit implements the generic Visit. Use this instead of that one. As an intended side effect this fixes unwritten memory usages in cases when a negation of a condition code is used.
2019-12-20shader/p2r: Implement P2R PrReinUsesLisp1-1/+15
P2R dumps predicate or condition codes state to a register. This is useful for unit testing.
2019-12-20shader/r2p: Refactor P2R to support P2RReinUsesLisp2-17/+33
2019-12-19vk_resource_manager: Add entry to VKFence to test its usageReinUsesLisp1-0/+8
2019-12-19vk_reosurce_manager: Add assert for releasing fencesReinUsesLisp1-0/+1
Notify the programmer when a request to release a fence is invalid because the fence is already free.
2019-12-19vk_resource_manager: Implement VKFenceWatch move constructorReinUsesLisp2-0/+32
This allows us to put VKFenceWatch inside a std::vector without storing it in heap. On move we have to signal the fences where the new protected resource is, adding some overhead.
2019-12-19vk_device: Add entry to catch device lossesReinUsesLisp3-1/+40
VK_NV_device_diagnostic_checkpoints allows us to push data to a Vulkan queue and then query it even after a device loss. This allows us to push the current pipeline object and see what was the call that killed the device.
2019-12-19vk_shader_decompiler: Fix full decompilationReinUsesLisp1-3/+5
When full decompilation was enabled, labels were not being inserted and instructions were misused. Fix these bugs.
2019-12-19vk_shader_decompiler: Skip NDC correction when it is nativeReinUsesLisp2-1/+2
Avoid changing gl_Position when the NDC used by the game is [0, 1] (Vulkan's native).
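As a rough sketch of what such a correction does, assuming the non-native case is an OpenGL-style clip space where depth spans [-w, w]: the depth is remapped to [0, w], and the remap is skipped when the game already uses [0, 1]. `Vec4` and `CorrectNdcDepth` are hypothetical names; the real decompiler emits SPIR-V rather than running C++.

```cpp
// Hypothetical clip-space position.
struct Vec4 {
    float x;
    float y;
    float z;
    float w;
};

// Remap depth from [-w, w] to [0, w] unless the source range is already [0, w].
Vec4 CorrectNdcDepth(Vec4 position, bool depth_is_zero_to_one) {
    if (!depth_is_zero_to_one) {
        position.z = (position.z + position.w) * 0.5f;
    }
    return position;
}
```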
2019-12-19vk_shader_decompiler: Normalize output fragment attachmentsReinUsesLisp2-12/+12
Some games write from fragment shaders to a nonexistent framebuffer attachment or they don't write to one when it exists in the framebuffer. Fix this by skipping writes or adding zeroes.
2019-12-19vk_device: Add query for RGBA8UintReinUsesLisp1-0/+1
2019-12-19vk_shader_decompiler: Update sirit and implement Texture AOFFIReinUsesLisp1-22/+30
2019-12-18gl_rasterizer: Implement RASTERIZE_ENABLEReinUsesLisp6-4/+28
RASTERIZE_ENABLE is the opposite of GL_RASTERIZER_DISCARD. Implement it naturally using this. NVN games expect rasterize to be enabled by default, reflect that in our initial GPU state.
2019-12-18shader/memory: Implement LDG.U8 and unaligned U8 loadsReinUsesLisp1-6/+32
LDG can load single bytes instead of full integers or packs of integers. These have the advantage of loading bytes that are not aligned to 4 bytes. To emulate these, this commit gets the byte being referenced (by doing "address & 3") and then uses that to extract the byte from the loaded integer: result = bitfieldExtract(loaded_integer, (address % 4) * 8, 8)
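The extraction formula can be tried out on the CPU with a small sketch. `BitfieldExtract` imitates the unsigned GLSL builtin of the same name, and `LoadUnalignedU8` is a hypothetical helper name; the sketch assumes `loaded_integer` is the aligned 32-bit word that contains `address`.

```cpp
#include <cassert>
#include <cstdint>

// Imitates unsigned GLSL bitfieldExtract: take `bits` bits starting at `offset`.
std::uint32_t BitfieldExtract(std::uint32_t value, std::uint32_t offset, std::uint32_t bits) {
    assert(bits < 32 && offset + bits <= 32);
    return (value >> offset) & ((1u << bits) - 1u);
}

// Hypothetical helper: pick the byte that `address` refers to out of the
// aligned word that was loaded.
std::uint32_t LoadUnalignedU8(std::uint32_t loaded_integer, std::uint32_t address) {
    return BitfieldExtract(loaded_integer, (address % 4) * 8, 8);
}
```

For a loaded word of 0xaabbccdd, an address with remainder 0 selects 0xdd and an address with remainder 3 selects 0xaa.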
2019-12-18shader/conversion: Implement byte selector in I2FReinUsesLisp1-2/+13
I2F's byte selector is used to choose what bytes to convert to float. e.g. if the input is 0xaabbccdd and the selector is ".B3" it will convert 0xaa. The default (when it's not shown in nvdisasm) is ".B0", in that example the default would convert 0xdd to float.
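A minimal sketch of the selector semantics described above, treating the selected byte as unsigned (signed inputs would additionally need sign extension, which is left out here). `ConvertSelectedByte` is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>

// selector 0 corresponds to ".B0" (lowest byte), selector 3 to ".B3" (highest byte).
float ConvertSelectedByte(std::uint32_t input, unsigned selector) {
    assert(selector < 4);
    const std::uint32_t byte = (input >> (selector * 8)) & 0xffu;
    return static_cast<float>(byte);
}
```

With the input 0xaabbccdd from the message, selector 3 converts 0xaa (170) and selector 0 converts 0xdd (221).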
2019-12-18shader/texture: Properly shrink unused entries in size mismatchesReinUsesLisp1-4/+9
When an image format mismatches we were inserting zeroes to the texture itself. This was not handling cases where the mismatch uses fewer coordinates than the guest shader code. Address that by resizing the vector.
2019-12-18gl_shader_decompiler: Add missing DeclareImagesReinUsesLisp1-0/+1
2019-12-18shader_bytecode: Fix TLD4S encodingReinUsesLisp1-1/+1
2019-12-16renderer_vulkan/shader: Add helper GLSL shadersReinUsesLisp4-0/+122
These shaders are used to specify code that is not dynamically generated in the Vulkan backend. Instead of packing it inside the build system, it's manually built and copied to the C++ file to avoid adding unnecessary build time dependencies. quad_array should be dropped in the future since it can be emulated with a memory pool generated from the CPU.
2019-12-16shader/texture: Implement TLD4.PTPReinUsesLisp5-56/+120
2019-12-16shader/texture: Enable arrayed TLD4ReinUsesLisp1-1/+0
2019-12-16gl_shader_decompiler: Rename "sepparate" to "separate"ReinUsesLisp1-3/+3
2019-12-16shader/texture: Implement AOFFI for TLD4SReinUsesLisp1-13/+18
2019-12-16shader/texture: Remove unnecesary parenthesisReinUsesLisp1-2/+2
2019-12-13maxwell_to_vk: Improve image format table and add more formatsReinUsesLisp2-89/+127
A1B5G5R5 uses A1R5G5B5. This is flipped with image view swizzles; flushing is still not properly implemented on Vulkan for this particular format.
2019-12-13maxwell_to_vk: Implement more vertex formatsReinUsesLisp1-7/+81
2019-12-13maxwell_to_vk: Implement more primitive topologiesReinUsesLisp2-2/+11
Add an extra argument to query device capabilities in the future. The intention behind this is to use native quads, quad strips, line loops and polygons if these are released for Vulkan.
2019-12-13maxwell_to_vk: Approach GL_CLAMP closer to the GL specReinUsesLisp3-9/+17
The OpenGL spec defines GL_CLAMP's formula similarly to CLAMP_TO_EDGE and CLAMP_TO_BORDER depending on the filter mode used. It doesn't exactly behave like this, but it's the closest we can get with what Vulkan offers without emulating it by injecting shader code.
2019-12-13maxwell_to_vk: Use VK_EXT_index_type_uint8 when availableReinUsesLisp2-4/+7
2019-12-13vk_scheduler: Delegate commands to a worker thread and state trackReinUsesLisp2-37/+311
Introduce a worker thread approach for delegating Vulkan work derived from dxvk's approach. https://github.com/doitsujin/dxvk Now that the scheduler is what handles all Vulkan work related to command streaming, store state tracking in itself. This way we can know when to reupload Vulkan dynamic state to the queue (since this one is invalidated between command buffers unlike NVN). We can also store the renderpass state and graphics pipeline bound to avoid redundant binds and renderpass begins/ends.
2019-12-12Shader_IR: Correct TLD4S Depth Compare.Fernando Sahmkow2-9/+16
2019-12-12Shader_Ir: Correct TLD4S encoding and implement f16 flag.Fernando Sahmkow3-11/+15
2019-12-12Gl_Shader_compiler: Correct Depth Compare for Texture Gather operations.Fernando Sahmkow1-8/+21
2019-12-12Shader_Ir: default failed tracks on bindless samplers to null values.Fernando Sahmkow2-24/+77
2019-12-11Gl_Rasterizer: Skip Tesselation Control and Eval stages as they are un implemented.Fernando Sahmkow1-0/+8
This commit ensures the OGL backend does not execute tessellation shader stages as they are currently unimplemented.
2019-12-11Added missing includeJoel Holdsworth1-0/+1
2019-12-11gl_device: Enable compute shaders for Intel Mesa driversReinUsesLisp1-1/+4
Previously we naively checked for "Intel" in GL_VENDOR, but this includes both Intel's proprietary driver and the mesa driver. Re-enable compute shaders for mesa.
2019-12-11gl_shader_cache: Add missing new-line on emitted GLSLReinUsesLisp1-2/+2
Add missing new-line. This caused shaders using local memory and shared memory to inject a preprocessor GLSL line after an expression (resulting in invalid code). It looked like this:
    shared uint smem[8];#define LOCAL_MEMORY_SIZE 16
It should look like this (addressed by this commit):
    shared uint smem[8];
    #define LOCAL_MEMORY_SIZE 16
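The failure mode is easy to reproduce with plain string building. The sketch below is not the gl_shader_cache code; `BuildPreamble` and its parameters are hypothetical, and it only shows why each emitted declaration has to end with its own new-line before a preprocessor directive follows.

```cpp
#include <string>

// Hypothetical helper that emits the two lines from the commit message.
std::string BuildPreamble(unsigned shared_words, unsigned local_memory_size) {
    std::string source;
    // Without the trailing '\n' here, the #define below would land on the same line.
    source += "shared uint smem[" + std::to_string(shared_words) + "];\n";
    source += "#define LOCAL_MEMORY_SIZE " + std::to_string(local_memory_size) + '\n';
    return source;
}
```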
2019-12-11Maxwell3D: Implement Depth Mode.Fernando Sahmkow4-8/+15
This commit finishes adding depth mode that was reverted before due to other unresolved issues.
2019-12-10shader: Implement MEMBAR.GLReinUsesLisp5-1/+46
Implement using memoryBarrier in GLSL and OpMemoryBarrier on SPIR-V.
2019-12-10vk_shader_decompiler: Fix build issues on old gcc versionsReinUsesLisp1-2/+3
2019-12-10vk_shader_decompiler: Reduce YNegate's severityReinUsesLisp1-1/+1
2019-12-10shader_ir/other: Implement S2R InvocationIdReinUsesLisp4-0/+9
2019-12-10vk_shader_decompiler: Misc changesReinUsesLisp2-697/+1648
Update Sirit and its usage in vk_shader_decompiler. Highlights: - Implement tessellation shaders - Implement geometry shaders - Implement some missing features - Use native half float instructions when available.
2019-12-10shader: Keep track of shaders using warp instructionsReinUsesLisp2-0/+8
2019-12-10shader_ir/memory: Implement patch storesReinUsesLisp4-20/+38
2019-12-09vk_device: Misc changesReinUsesLisp2-117/+276
- Setup more features and requirements. - Improve logging for missing features. - Collect telemetry parameters. - Add queries for more image formats. - Query push constants limits. - Optionally enable some extensions.
2019-12-09externals: Update Vulkan-HeadersReinUsesLisp2-2/+14
2019-12-07vk_swapchain: Add support for swapping sRGBReinUsesLisp2-24/+31
We don't know until the game is running if it's using an sRGB color space or not. Add support for hot-swapping swapchain surface formats.
2019-12-07maxwell_3d: Add tessellation tess level registersReinUsesLisp1-1/+6
2019-12-07maxwell_3d: Add tessellation mode registerReinUsesLisp1-1/+28
2019-12-07maxwell_3d: Add patch vertices registerReinUsesLisp1-1/+4
2019-12-07shader_bytecode: Remove corrupted characterReinUsesLisp1-1/+1
2019-11-29texture_cache/surface_base: Fix out of bounds texture viewsReinUsesLisp1-7/+4
Some texture views were being created out of bounds (with more layers or mipmaps than what the original texture has). This is because of a miscalculation in mipmap bounding. end_layer and end_mipmap are out of bounds (e.g. layer 6 in a cubemap), there's no need to add one more there. Fixes OpenGL errors and Vulkan crashes on Splatoon 2.
2019-11-29gl_framebuffer_cache: Optimize framebuffer keyReinUsesLisp3-46/+60
Pack color attachment enumerations into a single u32. To determine the number of buffers, the highest color attachment with a shared pointer that doesn't point to null is used.
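One way such packing can look, as a sketch under assumptions: eight render targets and three bits per attachment index are guesses made for illustration, and `PackAttachments` / `UnpackAttachment` are hypothetical names rather than the functions in gl_framebuffer_cache.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kNumRenderTargets = 8;  // assumption
constexpr unsigned kBitsPerAttachment = 3;    // assumption: indexes 0..7

std::uint32_t PackAttachments(const std::array<std::uint8_t, kNumRenderTargets>& attachments) {
    std::uint32_t packed = 0;
    for (std::size_t i = 0; i < attachments.size(); ++i) {
        assert(attachments[i] < (1u << kBitsPerAttachment));
        packed |= static_cast<std::uint32_t>(attachments[i]) << (i * kBitsPerAttachment);
    }
    return packed;
}

std::uint8_t UnpackAttachment(std::uint32_t packed, std::size_t index) {
    assert(index < kNumRenderTargets);
    return static_cast<std::uint8_t>((packed >> (index * kBitsPerAttachment)) & 0x7u);
}
```

A single integer makes the key cheap to hash and compare, which is the point of the optimization.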
2019-11-29gl_rasterizer: Re-enable framebuffer cache for clear buffersReinUsesLisp3-32/+15
2019-11-29renderer_opengl: Make ScreenRectVertex's constructor constexprReinUsesLisp1-12/+7
2019-11-29renderer_opengl: Remove C castsReinUsesLisp1-4/+5
2019-11-29renderer_opengl: Use explicit binding for presentation shadersReinUsesLisp2-34/+20
2019-11-29renderer_opengl: Drop macros for message decorationsReinUsesLisp1-21/+26
2019-11-29renderer_opengl: Move static definitions to anonymous namespaceReinUsesLisp1-62/+66
2019-11-29renderer_opengl: Move commentaries to header fileReinUsesLisp2-20/+13
2019-11-27video_core/gpu_thread: Tidy up SwapBuffers()Lioncash1-2/+1
We can just use std::nullopt and std::make_optional to make this a little bit less noisy.
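A small sketch of the pattern, using a hypothetical `FramebufferConfig` and `CopyIfPresent` rather than the actual SwapBuffers() signature:

```cpp
#include <optional>

// Hypothetical payload type.
struct FramebufferConfig {
    unsigned width;
    unsigned height;
};

// Copies the pointed-to value into an optional, or yields an empty optional.
std::optional<FramebufferConfig> CopyIfPresent(const FramebufferConfig* framebuffer) {
    return framebuffer ? std::make_optional(*framebuffer) : std::nullopt;
}
```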
2019-11-27video_core/const_buffer_locker: Make use of std::tie in HasEqualKeys()Lioncash1-2/+3
Tidies it up a little bit visually.
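The std::tie idiom compares several members in one expression. The struct below is a hypothetical stand-in; its member names and types are assumptions and not the real const_buffer_locker layout.

```cpp
#include <cstdint>
#include <map>
#include <tuple>

// Hypothetical state with three keyed containers.
struct KeyedState {
    std::map<std::uint32_t, std::uint32_t> keys;
    std::map<std::uint32_t, std::uint32_t> bound_samplers;
    std::map<std::uint32_t, std::uint32_t> bindless_samplers;

    // Tuples of references compare member by member, left to right.
    bool HasEqualKeys(const KeyedState& rhs) const {
        return std::tie(keys, bound_samplers, bindless_samplers) ==
               std::tie(rhs.keys, rhs.bound_samplers, rhs.bindless_samplers);
    }
};
```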
2019-11-27video_core/const_buffer_locker: Remove unused includesLioncash2-2/+2
2019-11-27video_core/const_buffer_locker: Remove #pragma once from cpp fileLioncash1-2/+0
Silences a compiler warning.
2019-11-27core/memory: Migrate over RasterizerMarkRegionCached() to the Memory classLioncash1-2/+2
This is only used within the accelerated rasterizer in two places, so this is also a very trivial migration.
2019-11-27core/memory: Migrate over GetPointer()Lioncash4-7/+8
With all of the interfaces ready for migration, it's trivial to migrate over GetPointer().
2019-11-27core: Prepare various classes for memory read/write migrationLioncash5-7/+21
Amends a few interfaces to be able to handle the migration over to the new Memory class by passing the class by reference as a function parameter where necessary. Notably, within the filesystem services, this eliminates two ReadBlock() calls by using the helper functions of HLERequestContext to do that for us.
2019-11-26gl_shader_decompiler: Fix casts from fp32 to f16ReinUsesLisp1-1/+2
Casts from f32 to f16 zero the higher half of the target register.
2019-11-25gl_device: Deduce indexing bug from device instead of heuristicReinUsesLisp2-48/+2
The heuristic to detect AMD's driver was not working properly since it also included Intel. Instead of using heuristics to detect it, compare the GL_VENDOR string.
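A vendor check of this kind reduces to a string comparison. In the sketch below the vendor literal is an assumption about what AMD's proprietary driver reports, and `IsAmdProprietaryVendor` is a hypothetical name; in real code the argument would come from glGetString(GL_VENDOR).

```cpp
#include <string_view>

// Assumption: the proprietary AMD driver reports this exact vendor string.
constexpr std::string_view kAmdVendor = "ATI Technologies Inc.";

bool IsAmdProprietaryVendor(std::string_view gl_vendor) {
    return gl_vendor == kAmdVendor;
}
```

An exact comparison avoids the false positives that a capability-based heuristic produced for Intel.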
2019-11-24gl_texture_cache: Apply sRGB on blitsReinUsesLisp1-0/+1
glBlitFramebuffer keeps in mind GL_FRAMEBUFFER_SRGB's state. Enable this depending on the target surface pixel format.
2019-11-23gpu_thread: Don't spin wait if there are no GPU commands.bunnei1-17/+15
2019-11-23gl_device: Reserve base bindings on limited devicesReinUsesLisp1-36/+76
SSBOs and other resources are limited per pipeline on Intel and AMD. Heuristically reserve resources per stage having in mind the reported OpenGL limits.
2019-11-23gl_state: Skip null texture bindsReinUsesLisp1-1/+5
glBindTextureUnit doesn't support null textures. Skip binding these.
2019-11-23gl_rasterizer: Disable compute shaders on IntelReinUsesLisp3-0/+12
Intel's proprietary driver enters in a corrupt state when compute shaders are executed. For now, disable these.
2019-11-23gl_shader_cache: Hack shared memory sizeReinUsesLisp1-2/+3
The current shared memory size seems to be smaller than what the game actually uses. This makes Nvidia's driver consistently blow up; in the case of FE3H it made it explode on Qt's SwapBuffers while SDL2 worked just fine. For now keep this hack since it's still progress over the previous hardcoded shared memory size.
2019-11-23gl_shader_decompiler: Normalize image bindingsReinUsesLisp3-33/+19
2019-11-23gl_shader_decompiler: Normalize cbuf bindingsReinUsesLisp2-10/+6
Stage and compute shaders were using a different binding counter. Normalize these.
2019-11-23gl_rasterizer: Add missing cbuf counter reset on computeReinUsesLisp1-0/+2
2019-11-23gl_shader_cache: Remove dynamic BaseBinding specializationReinUsesLisp16-192/+200
2019-11-23video_core: Unify ProgramType and ShaderStage into ShaderTypeReinUsesLisp22-289/+262
2019-11-23gl_rasterizer: Bind graphics images to draw commandsReinUsesLisp4-33/+54
Images were not being bound to draw invocations because these would require a cache invalidation.
2019-11-23gl_shader_cache: Specialize local memory size for compute shadersReinUsesLisp6-21/+32
Local memory size in compute shaders was stubbed with an arbitrary size. This commit specializes local memory size from guest GPU parameters.
2019-11-23gl_shader_cache: Specialize shared memory sizeReinUsesLisp5-29/+25
Shared memory was being declared with an undefined size. Specialize from guest GPU parameters the compute shader's shared memory size.
2019-11-23gl_shader_cache: Specialize shader workgroupReinUsesLisp6-68/+74
Drop the usage of ARB_compute_variable_group_size and specialize compute shaders instead. This permits compute to run on AMD and Intel proprietary drivers.
2019-11-23shader/texture: Handle TLDS texture type mismatchesReinUsesLisp1-1/+10
Some games like "Fire Emblem: Three Houses" bind 2D textures to offsets used by instructions of 1D textures. To handle the discrepancy this commit uses the texture type from the binding and modifies the emitted code IR to build a valid backend expression. E.g.: Bound texture is 2D and instruction is 1D, the emitted IR samples a 2D texture in the coordinate ivec2(X, 0).
2019-11-23shader/texture: Deduce texture buffers from lockerReinUsesLisp9-174/+107
Instead of specializing shaders to separate texture buffers from 1D textures, use the locker to deduce them while they are being decoded.
2019-11-20buffer_cache: Remove brace initialized for objects with default constructorReinUsesLisp1-10/+10
2019-11-20Texture_Cache: Redo invalid Surfaces handling.Fernando Sahmkow3-32/+101
This commit aims to redo the full setup of invalid textures and guarantee correct behavior across backends in the case of finding one by using black dummy textures that match the target of the expected texture.
2019-11-20shader/other: Reduce DEPBAR log severityReinUsesLisp1-1/+1
While DEPBAR is stubbed it doesn't change anything from our end. Shading languages handle what this instruction does implicitly. We are not getting anything out of this log except noise.
2019-11-20gl_shader_gen: Apply default value to gl_PositionReinUsesLisp1-0/+1
Nvidia has sane default output values for varyings, but the other vendors don't apply these. To properly emulate this we would have to analyze the shader header. For the time being, apply the same default Nvidia applies so we get the same behaviour on non-Nvidia drivers.
2019-11-18Shader_IR: Address FeedbackFernando Sahmkow3-11/+9
2019-11-15format_lookup_table: Address feedbackReinUsesLisp2-30/+24
format_lookup_table: Drop bitfields format_lookup_table: Use std::array for definition table format_lookup_table: Include <limits> instead of <numeric>
2019-11-15texture_cache: Use a table instead of switch for texture formatsReinUsesLisp9-261/+290
Use a large flat array to look up texture formats. This allows us to properly implement formats with different component types. It should also be faster.
2019-11-14texture_cache: Drop abstracted ComponentTypeReinUsesLisp8-294/+158
Abstracted ComponentType was not being used in a meaningful way. This commit drops its usage. There is one place where it was being used to test compatibility between two cached surfaces, but this one is implied in the pixel format. Removing the component type test doesn't change the behaviour.
2019-11-14correct the implementation of RGBA16UIgreggameplayer1-0/+2
2019-11-14Shader_IR: Implement TXD instruction.Fernando Sahmkow5-8/+120
2019-11-14Shader_IR: Implement FLO instruction.Fernando Sahmkow5-0/+35
2019-11-14Shader_Bytecode: Add encodings for FLO, SHF and TXDFernando Sahmkow1-0/+18
2019-11-13maxwell_3d: Fix stencil_back_func_mask offsetReinUsesLisp1-3/+3
stencil_back_func_mask and stencil_back_mask were misplaced. This commit addresses that issue.
2019-11-11video_core: Enable sign conversion warningsRodrigo Locatti1-1/+1
Enable sign conversion warnings but don't treat them as errors.
2019-11-08video_core: Treat implicit conversions as errorsReinUsesLisp1-0/+6
2019-11-08video_core: Silence implicit conversion warningsReinUsesLisp9-53/+62
2019-11-08gl_shader_cache: Fix locker constructorsReinUsesLisp1-2/+4
Properly pass engine when a shader is being constructed from memory.
2019-11-08gl_shader_cache: Enable extensions only when availableReinUsesLisp1-6/+14
Silence GLSL compilation warnings.
2019-11-08gl_shader_decompiler: Add safe fallbacks when ARB_shader_ballot is not availableReinUsesLisp3-5/+28
2019-11-08shader_ir/warp: Implement FSWZADDReinUsesLisp5-0/+44
2019-11-08gl_shader_decompiler: Reimplement shuffles with platform agnostic intrinsicsReinUsesLisp5-122/+49
2019-11-07GLSLDecompiler: Correct Texture Gather Offset.Fernando Sahmkow1-1/+1
This commit corrects the argument ordering in textureGatherOffset.
2019-11-07buffer_cache: Add missing includes (#3079)Morph1-0/+4
`boost::make_iterator_range` is available when `boost/range/iterator_range.hpp` is included. Also include `boost/icl/interval_map.hpp` and `boost/icl/interval_set.hpp`.
2019-11-07gl_rasterizer: Remove front facing hackReinUsesLisp1-12/+0
2019-11-07gl_shader_decompiler: Fix typo "y_negate"->"y_direction"ReinUsesLisp1-1/+1
2019-11-07gl_shader_manager: Remove unused variable in SetFromRegsReinUsesLisp1-1/+0
2019-11-07gl_rasterizer: Emulate viewport flipping with ARB_clip_controlReinUsesLisp7-76/+52
Emulates negative y viewports with ARB_clip_control. This allows us to more easily emulate pipelines with tessellation and/or geometry shader stages. It also avoids corrupting games with transform feedbacks and negative viewports (gl_Position.y was being modified).
2019-11-07shader/control_flow: Specify constness on caller lambdasRodrigo Locatti1-11/+12
Update src/video_core/shader/control_flow.cpp Co-Authored-By: Mat M. <mathew1800@gmail.com>
2019-11-07shader/control_flow: Use callable template instead of std::functionReinUsesLisp1-6/+5
2019-11-07shader/control_flow: Abstract repeated code chunks in BRX trackingReinUsesLisp1-93/+101
Move copied and pasted for loops into a common templated function.
2019-11-07shader/control_flow: Silence Intellisense cast warningsReinUsesLisp1-1/+1
2019-11-07shader/control_flow: Remove brace initializer in std containersReinUsesLisp1-9/+9
These containers have a default constructor.
2019-11-07shader/decode: Reduce severity of arithmetic rounding warningsReinUsesLisp6-15/+17
2019-11-07shader/arithmetic: Reduce RRO stub severityReinUsesLisp1-1/+2
2019-11-07shader/texture: Remove NODEP warningsReinUsesLisp1-35/+0
These warnings don't offer meaningful information while decoding shaders. Remove them.
2019-11-04common_func: Use std::array for INSERT_PADDING_* macros.bunnei7-108/+110
- Zero initialization here is useful for determinism.
2019-11-02gl_rasterizer: Re-enable stream buffer memory due to global memoryReinUsesLisp1-14/+8
Global memory is still using the stream buffer when it shouldn't. As a temporary fix re-enable the stream buffer on compute.
2019-11-02gl_rasterizer: Upload constant buffers with glNamedBufferSubDataReinUsesLisp6-19/+84
Nvidia's OpenGL driver maps gl(Named)BufferSubData with some requirements to a fast path. This path has an extra memcpy but updates the buffer without orphaning or waiting for previous calls. It can be seen as a better model for "push constants" that can upload a whole UBO instead of 256 bytes. This path has some requirements established here: http://on-demand.gputechconf.com/gtc/2014/presentations/S4379-opengl-44-scene-rendering-techniques.pdf#page=24 Instead of using the stream buffer, this commit moves constant buffer uploads to calls of glNamedBufferSubData and from my testing it brings a performance improvement. This is disabled when the vendor is not Nvidia since it brings performance regressions.
2019-10-31Shader_IR: Fix regression on TLD4Fernando Sahmkow2-5/+4
Originally on the last commit I thought TLD4 acted the same as TLD4S and didn't have a mask. It actually does have a component mask. This commit corrects that.
2019-10-30Shader_IR: Fix TLD4 and add Bindless Variant.Fernando Sahmkow3-11/+55
This commit fixes an issue where not all 4 results of tld4 were being written, the color component was defaulted to red, among other things. It also implements the bindless variant.
2019-10-30gl_state: Use std::array::fill instead of std::fillRodrigo Locatti1-1/+1
Co-Authored-By: Mat M. <mathew1800@gmail.com>
2019-10-30gl_state: Move dirty checks to individual apply calls instead of ApplyReinUsesLisp2-66/+74
This requires removing constness from some methods, but for consistency it's removed in all methods.
2019-10-30gl_state: Remove ApplyDefaultStateReinUsesLisp3-17/+1
OpenGL has default values we can trust. Remove these.
2019-10-30gl_state: Change SetDefaultViewports to use default constructorReinUsesLisp1-13/+2
2019-10-30gl_state: Minor style changesReinUsesLisp1-3/+5
2019-10-30gl_state: Remove unused Citra TextureUnitsReinUsesLisp1-23/+0
2019-10-30gl_state: Move initializers from constructor to class declarationReinUsesLisp2-170/+75
2019-10-30shader/node: Unpack bindless texture encodingReinUsesLisp8-142/+116
Bindless textures were using u64 to pack the buffer and offset from where they come from. Drop this in favor of separated entries in the struct. Remove the usage of std::set in favor of std::list (it's not std::vector to avoid reference invalidations) for samplers and images.
2019-10-28maxwell_3d/kepler_compute: Remove unused arguments in GetTextureReinUsesLisp5-37/+20
2019-10-28video_core/textures: Remove unused index entry in FullTextureInfoReinUsesLisp2-2/+0
2019-10-28maxwell_3d: Remove unused method GetStageTexturesReinUsesLisp2-42/+0
2019-10-27Video_Core: Implement texture format E5B9G9R9_SHAREDEXP.Fernando Sahmkow4-5/+22
This commit implements the E5B9G9R9 Texture format into the general system and OpenGL backend.
2019-10-27maxwell_3d: Silence implicit conversion warningsReinUsesLisp2-24/+25
While we are at it, unify types for dirty reg pointers.
2019-10-27rasterizer_accelerated: Add intermediary for GPU rasterizersReinUsesLisp5-45/+98
Add an intermediary class that implements common functions across GPU accelerated rasterizers. This avoids code repetition on different backends.
2019-10-27astc: Silence implicit conversion warningsReinUsesLisp1-7/+8
2019-10-26Shader_IR: Address Feedback.Fernando Sahmkow8-55/+65
2019-10-25Shader_IR: Clang formatFernando Sahmkow1-2/+1
2019-10-25gl_shader_cache: Implement locker variants invalidationReinUsesLisp4-44/+104
2019-10-25gl_shader_disk_cache: Store and load fast BRXReinUsesLisp6-50/+210
2019-10-25const_buffer_locker: Minor style changesReinUsesLisp2-152/+76
2019-10-25gl_shader_decompiler: Move entries to a separate functionReinUsesLisp15-722/+420
2019-10-25Shader_IR: Implement Fast BRX and allow multi-branches in the CFG.Fernando Sahmkow1-1/+1
2019-10-25Shader_IR: Correct typo in Consistent method.Fernando Sahmkow2-2/+2
2019-10-25Shader_IR: allow lookup of texture samplers within the shader_ir for instructions that don't provide itFernando Sahmkow9-46/+363
2019-10-25Shader_IR: Implement Fast BRX and allow multi-branches in the CFG.Fernando Sahmkow7-130/+258
2019-10-25Shader_Cache: setup connection of ConstBufferLockerFernando Sahmkow10-43/+82
2019-10-25VideoCore: Unify const buffer accessing along engines and provide ConstBufferLocker class to shaders.Fernando Sahmkow10-11/+168
2019-10-25Shader_IR: Implement BRX tracking.Fernando Sahmkow1-0/+113
2019-10-24shader_bytecode: Make Matcher constexpr capableLioncash1-13/+13
Greatly shrinks the amount of generated code for GetDecodeTable(). Collapses an assembly output of 9000+ lines down to ~3621 with Clang, and 6513 down to ~2616 with GCC, given it's now allowed to construct all the entries as a sequence of constant data.
2019-10-24shader_ir: Use std::array with pair instead of unordered_mapLioncash1-53/+67
Given the overall size of the maps is very small, we can use arrays of pairs here instead of always heap allocating a new map every time the functions are called. Given the small size of the maps, the difference in container lookups is negligible, especially given the entries are already sorted.
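The trade-off can be illustrated with a constant table and a linear search. The table contents and the names `kTable` and `Lookup` are made up for the sketch; the real tables in shader_ir hold different keys and values.

```cpp
#include <algorithm>
#include <array>
#include <cstdint>
#include <optional>
#include <utility>

// Hypothetical key/value table built at compile time; no heap allocation involved.
constexpr std::array<std::pair<std::uint32_t, std::uint32_t>, 4> kTable{{
    {0x08u, 1u},
    {0x10u, 2u},
    {0x18u, 3u},
    {0x20u, 4u},
}};

// Linear search; for a handful of entries this is comparable to a hash lookup.
std::optional<std::uint32_t> Lookup(std::uint32_t key) {
    const auto it = std::find_if(kTable.begin(), kTable.end(),
                                 [key](const auto& entry) { return entry.first == key; });
    if (it == kTable.end()) {
        return std::nullopt;
    }
    return it->second;
}
```

An unordered_map built inside the function would allocate and hash on every call, which is the cost the change avoids.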
2019-10-24video_core/shader: Resolve instances of variable shadowingLioncash6-11/+12
Silences a few -Wshadow warnings.
2019-10-22Shader_Ir: Fix TLD4S from using a component mask.Fernando Sahmkow2-5/+5
TLD4S always outputs 4 values, the previous code checked a component mask and omitted those values that weren't part of it. This commit corrects that and makes sure all 4 values are set.
2019-10-22shader_ir/memory: Ignore global memory when tracking failsReinUsesLisp2-18/+26
When constant buffer tracking fails and we are blasting through asserts, ignore global memory operations instead of invoking undefined behaviour. In the case of LDG this means filling the destination registers with zeroes; for STG this means ignoring the instruction as a whole. The default behaviour is still to abort execution on failure.
2019-10-20maxwell_3d: Reduce FlushMMEInlineDraw logging to TraceReinUsesLisp1-1/+1
2019-10-18video_core/shader/ast: Make ShowCurrentState() and SanityCheck() const member functionsLioncash2-5/+5
These can also trivially be made const member functions, with the addition of a few consts.
2019-10-18video_core/shader/ast: Make ASTManager::Print a const member functionLioncash2-3/+3
Given all visiting functions never modify the nodes, we can trivially make this a const member function.
2019-10-18video_core/shader/ast: Make ExprPrinter members privateLioncash1-1/+2
This member already has an accessor, so there's no need for it to be public.
2019-10-18video_core/shader/ast: Make Indent() return a string_viewLioncash1-14/+24
The returned string is simply a substring of our constexpr tabs string_view, so we can just use a string_view here as well, since the original string_view is guaranteed to always exist. Now the function is fully non-allocating.
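A minimal sketch of a non-allocating indent helper in that spirit. The width of the backing string, the two spaces per level, and the names `kTabs` and `Indent` as free functions are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Backing storage with static lifetime; every returned view points into it.
constexpr std::string_view kTabs = "                                ";
constexpr std::size_t kSpacesPerLevel = 2;  // assumption

std::string_view Indent(std::size_t scope) {
    const std::size_t width = scope * kSpacesPerLevel;
    assert(width <= kTabs.size());
    return kTabs.substr(0, width);  // a view, not a copy
}
```

Because the literal outlives every caller, returning a view into it is safe and no std::string is ever constructed.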
2019-10-18video_core/shader/ast: Make Indent() privateLioncash1-9/+9
It's never used outside of this class, so we can narrow its scope down.
2019-10-18video_core/shader/ast: Rename Ident() to Indent()Lioncash1-13/+13
This can be confusing, given "ident" is generally used as a shorthand for "identifier".
2019-10-18video_core/shader/ast: Make use of fmt where applicableLioncash1-14/+14
Makes a few strings nicer to read and also eliminates a bit of string churn with operator+.
2019-10-18vk_shader_decompiler: Mark operator() function parameters as const referencesLioncash1-21/+23
These parameters aren't actually modified in any way, so they can be made const references.
2019-10-18Fermi2D: Use a different formula for delimiting blit areas.Fernando Sahmkow1-14/+28
2019-10-17video_core/macro_interpreter: Make definitions of most private enums/unions hiddenLioncash2-72/+79
This allows the implementation of these types to change without requiring a rebuild of everything that includes the macro interpreter header.
2019-10-17Fermi2D: limit blit area to only available areaFernando Sahmkow1-4/+14
Normally OpenGL does not care if the areas exceed the texture regions, but other backends such as Vulkan do care about the limits of these areas. This PR crops the areas of the blit so that they don't surpass the limits of the textures. This should help Vulkan and faulty OpenGL drivers.
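The cropping step can be sketched as a clamp of the rectangle to the surface size. This is a simplification with hypothetical names (`Rect`, `CropToSurface`); it only clamps one rectangle and does not show how the other side of the blit would be adjusted to match.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical blit rectangle in texels.
struct Rect {
    std::uint32_t left;
    std::uint32_t top;
    std::uint32_t right;
    std::uint32_t bottom;
};

// Clamp every edge so the rectangle never exceeds the surface dimensions.
Rect CropToSurface(Rect rect, std::uint32_t width, std::uint32_t height) {
    rect.left = std::min(rect.left, width);
    rect.right = std::min(rect.right, width);
    rect.top = std::min(rect.top, height);
    rect.bottom = std::min(rect.bottom, height);
    return rect;
}
```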
2019-10-16video_core/surface: Add missing break in PixelFormatFromTextureFormat()Lioncash1-0/+1
Prevents fallthrough into the following case.
2019-10-16vk_shader_decompiler: Resolve fallthrough within ExprDecompiler's ExprCondCode operator()Lioncash1-0/+3
This would previously result in NeverExecute and UnusedIndex being treated as regular predicates.
2019-10-16gl_shader_decompiler: Resolve fallthrough within ExprDecompiler's ExprCondCode operator()Lioncash1-0/+3
This would previously result in NeverExecute and UnusedIndex being treated as regular predicates.
2019-10-16texture_cache: Avoid unnecessary surface copies within PickStrategy() and TryReconstructSurface()Lioncash1-2/+2
We can take these by const reference and avoid making unnecessary copies, preventing some atomic reference count increments and decrements.
2019-10-16control_flow: Silence truncation warningsLioncash2-4/+4
This can be trivially fixed by making the input size a size_t. CFGRebuildState's constructor parameter is already a std::size_t, so this just makes the size type fully conform with it.
2019-10-16gl_shader_decompiler: Make ExprDecompiler's GetResult() a const member functionLioncash1-1/+1
This is only ever used to read, but not write, the resulting string, so we can enforce this by making it a const member function.
2019-10-16gl_shader_decompiler: Use a std::string_view with GetDeclarationWithSuffix()Lioncash1-1/+1
This allows the function to be completely non-allocating for inputs of all sizes (i.e. there's no heap cost for an input to convert to a std::string_view).
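A minimal sketch of the idea, assuming a hypothetical helper named WithSuffix (this is not the real GetDeclarationWithSuffix() signature): taking std::string_view parameters means callers can pass string literals or std::string objects without a temporary std::string being built for the argument. The returned string still allocates as usual.

```cpp
#include <string>
#include <string_view>

// Hypothetical helper: joins a name and a suffix with an underscore.
// Both inputs are views, so passing a literal or a std::string creates no copy.
std::string WithSuffix(std::string_view name, std::string_view suffix) {
    std::string result;
    result.reserve(name.size() + 1 + suffix.size());
    result.append(name);
    result.push_back('_');
    result.append(suffix);
    return result;
}
```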
2019-10-16gl_shader_decompiler: Fold flow_var constant into GetFlowVariable()Lioncash1-3/+1
This is only ever used within this function, so we can narrow its scope down.
2019-10-16gl_shader_decompiler: Mark ASTDecompiler/ExprDecompiler parameters as const references where applicableLioncash1-21/+21
These member functions don't actually modify the input parameter, so we can make this explicit with the use of const.
2019-10-16gl_shader_decompiler: Pass by reference to GenerateTextureArgument()Lioncash1-2/+2
Avoids an unnecessary atomic reference count increment and decrement.
2019-10-16gl_shader_decompiler: Use std::holds_alternative within GenerateTexture()Lioncash1-1/+1
This only ever queries if the type exists within the variant, but doesn't actually do anything with the return value. We can just use std::holds_alternative for this use case.
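As an illustration of the pattern (the Meta alias below is a hypothetical stand-in, not the shader IR's actual variant), std::holds_alternative answers the "is this alternative active?" question directly, whereas std::get_if returns a pointer that would then go unused:

```cpp
#include <string>
#include <variant>

// Hypothetical variant type, for illustration only.
using Meta = std::variant<int, std::string>;

// Only the presence of the alternative matters, so no pointer is needed.
bool IsTextMeta(const Meta& meta) {
    return std::holds_alternative<std::string>(meta);
}
```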
2019-10-16shader/node: std::move Meta instance within OperationNode constructorLioncash1-1/+1
Allows usages of the constructor to avoid an unnecessary copy.
2019-10-16gl_shader_decompiler: Avoid unnecessary copies of MetaImageLioncash1-4/+4
MetaImage contains a std::vector, so copying here could result in unnecessary reallocations. Given the operation lives throughout the entire scope, this is safe to do.
2019-10-15maxwell_3d: Silence truncation warningsLioncash1-1/+2
A trivial warning caused by using u32 instead of size_t for the argument types.
2019-10-15video_core/gpu: Remove use of the global system accessorLioncash1-1/+1
We can just make use of the reference member variable instead of accessing the global system instance.
2019-10-15video_core/texture_cache: Amend Doxygen referencesLioncash1-57/+78
Amends the doxygen comments so that they properly resolve. While we're at it, we can correct some typos and fix up some of the comments' formatting in order to make them slightly nicer to read.
2019-10-15common: Rename binary_find.h to algorithm.hLioncash2-3/+3
Makes the header more general for other potential algorithms in the future. While we're at it, include a missing <functional> include to satisfy the use of std::less.
2019-10-11AsyncGpu: Address FeedbackFernando Sahmkow2-2/+2
2019-10-09Surfaces: Implement R4G4B4A4U format.Fernando Sahmkow4-24/+41
2019-10-09Surfaces: Implement ASTC 6x6 10x10 12x12 8x6 6x5Fernando Sahmkow4-70/+185
2019-10-07shader/half_set_predicate: Fix HSETP2 for constant buffersReinUsesLisp1-0/+2
HSETP2 when used with a constant buffer parses the second operand type as F32. This is not configurable.
2019-10-07shader/half_set_predicate: Reduce DEBUG_ASSERT to LOG_DEBUGReinUsesLisp1-1/+2
2019-10-06gl_shader_disk_cache: Properly ignore existing cacheReinUsesLisp2-16/+17
Previously old entries were appended to the file even if the shader cache was ignored at boot. Address that issue.
2019-10-05video_core/control_flow: Eliminate variable shadowing warningsLioncash1-6/+6
2019-10-05video_core/control_flow: Eliminate pessimizing movesLioncash1-5/+8
These can inhibit the ability of a compiler to perform RVO.
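A small sketch of what a pessimizing move looks like, using a hypothetical BuildLabels function rather than anything from control_flow.cpp:

```cpp
#include <string>
#include <vector>

// Hypothetical function, for illustration only.
std::vector<std::string> BuildLabels() {
    std::vector<std::string> labels;
    labels.emplace_back("entry");
    labels.emplace_back("exit");
    // Returning the local by name lets the compiler construct it directly in
    // the caller's storage (NRVO). Writing `return std::move(labels);` instead
    // would turn the operand into an rvalue expression and rule that out.
    return labels;
}
```

Even without elision, a local returned by name is treated as an rvalue first, so the explicit std::move gains nothing here.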
2019-10-05video_core/ast: Unindent most of IsFullyDecompiled() by one levelLioncash1-12/+12
2019-10-05video_core/ast: Make ShowCurrentState() take a string_view instead of std::stringLioncash2-2/+2
Allows the function to be non-allocating in terms of the output string.
2019-10-05video_core/ast: Eliminate variable shadowing warningsLioncash1-3/+3
2019-10-05video_core/ast: Replace std::string with a constexpr std::string_viewLioncash1-3/+1
Same behavior, but without the need to heap allocate
2019-10-05video_core/ast: Default the move constructor and assignment operatorLioncash2-26/+2
This is behaviorally equivalent and also fixes a bug where some members weren't being moved over.
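The kind of bug this avoids can be illustrated with a hypothetical type (Block is not a name from the AST code): a hand-written move constructor must list every member, and a member added later is easy to forget, whereas a defaulted one always moves all of them.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical type, for illustration only.
struct Block {
    std::string name;
    std::vector<int> nodes;

    Block() = default;
    // Defaulted operations move every member, including ones added later.
    Block(Block&&) noexcept = default;
    Block& operator=(Block&&) noexcept = default;
};
```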
2019-10-05video_core/{ast, expr}: Organize forward declarationLioncash2-10/+10
Keeps them alphabetically sorted for readability.
2019-10-05video_core/expr: Supply operator!= along with operator==Lioncash2-1/+32
Provides logical symmetry to the interface.
2019-10-05video_core/{ast, expr}: Use std::move where applicableLioncash4-45/+47
Avoids unnecessary atomic reference count increments and decrements.
2019-10-05video_core/ast: Supply const accessors for data where applicableLioncash2-37/+41
Provides const equivalents of data accessors for use within const contexts.
2019-10-05maxwell_3d: Add dirty flags for depth bounds valuesReinUsesLisp2-1/+10
This is useful in Vulkan where we want to update depth bounds without caring if it's enabled or disabled through vkCmdSetDepthBounds.
2019-10-05GL_Renderer: Remove lefting snippet.Fernando Sahmkow1-2/+0
2019-10-05Gl_Rasterizer: Protect CPU Memory mapping from multiple threads.Fernando Sahmkow2-0/+4
2019-10-05Core: Wait for GPU to be idle before shutting down.Fernando Sahmkow6-0/+17
2019-10-05Nvdrv: Do framelimiting only in the CPU ThreadFernando Sahmkow1-3/+0
2019-10-05GPU_Async: Correct fences, display events and more.Fernando Sahmkow4-19/+17
This commit uses guest fences on the vSync event instead of the artificial fake fence we had. It also keeps signaling display events while the game is loading, as the OS is supposed to send buffers to vSync during that time.
2019-10-05Texture_Cache: Blit Deduction corrections and simplifications.Fernando Sahmkow1-18/+20
2019-10-05TextureCache: Add the ability to deduce if two textures are depth on blit.Fernando Sahmkow1-2/+142
2019-10-05Shader_ir: Address feedbackFernando Sahmkow6-65/+24
2019-10-05Shader_Ir: Address Feedback and clang format.Fernando Sahmkow4-68/+68
2019-10-05vk_shader_decompiler: Correct Branches inside conditionals.Fernando Sahmkow1-1/+11
2019-10-05vk_shader_decompiler: Clean code and be const correct.Fernando Sahmkow2-8/+6
2019-10-05Shader_IR: clean up AST handling and add documentation.Fernando Sahmkow1-2/+6
2019-10-05Shader_IR: Correct OutwardMoves for IfsFernando Sahmkow1-22/+11
2019-10-05vk_shader_compiler: Don't enclose branches with if(true) to avoid crashing AMDFernando Sahmkow1-16/+33
2019-10-05gl_shader_decompiler: Refactor and address feedback.Fernando Sahmkow1-17/+18
2019-10-05Shader_IR: corrections and clang-formatFernando Sahmkow2-70/+64
2019-10-05vk_shader_compiler: Correct SPIR-V AST DecompilingFernando Sahmkow1-4/+11
2019-10-05Shader_IR: allow else derivation to be optional.Fernando Sahmkow7-10/+18
2019-10-05vk_shader_compiler: Implement the decompiler in SPIR-VFernando Sahmkow3-23/+301
2019-10-05Shader_IR: mark labels as unused for partial decompile.Fernando Sahmkow2-3/+9
2019-10-05Shader_Ir: Refactor Decompilation process and allow multiple decompilation modes.Fernando Sahmkow13-82/+334
2019-10-05gl_shader_decompiler: Implement AST decompilingFernando Sahmkow11-63/+358
2019-10-05shader_ir: Declare Manager and pass it to appropriate programs.Fernando Sahmkow7-104/+214
2019-10-05shader_ir: Corrections to outward movements and misc stuffsFernando Sahmkow6-58/+306
2019-10-05shader_ir: Add basic goto eliminationFernando Sahmkow2-38/+484
2019-10-05shader_ir: Initial Decompile SetupFernando Sahmkow6-5/+510
2019-10-01gl_rasterizer: Fix polygon offset unitsReinUsesLisp1-1/+3
For some reason hardware divides polygon offset units by two. This is visible since drivers multiply the application requested polygon offset by two.
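Read literally, the message implies that the guest register holds twice the value the application requested, so the host call needs half of it. A sketch of that arithmetic, with a hypothetical helper name and under the assumption that the factor is exactly two:

```cpp
// Hypothetical helper: converts the units value found in the guest register
// into the value to hand to the host API (assumes an exact factor of two).
constexpr float HostPolygonOffsetUnits(float guest_units) {
    return guest_units / 2.0f;
}
```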
2019-09-24gl_shader_decompiler: Add tailing return for HUnpack2ReinUsesLisp1-0/+2
2019-09-24gl_shader_decompiler: Fix clang build issuesReinUsesLisp1-26/+23
2019-09-22Maxwell3D: Corrections and refactors to MME instance refactorFernando Sahmkow4-44/+46
2019-09-22Fix clang-formatFearlessTobi2-2/+2
2019-09-22fermi_2d: Lower surface copy log severity to DEBUGFearlessTobi1-1/+1
2019-09-22video_core: Implement RGBX16F PixelFormatFearlessTobi7-22/+37
2019-09-21gl_shader_decompiler: Use uint for images and fix SUATOMReinUsesLisp7-188/+93
In the process remove implementation of SUATOM.MIN and SUATOM.MAX as these require a distinction between U32 and S32. These have to be implemented with imageCompSwap loop.
2019-09-21shader/image: Implement SULD and remove irrelevant codeReinUsesLisp10-47/+110
* Implement SULD as float.
* Remove conditional declaration of GL_ARB_shader_viewport_layer_array.
2019-09-21shader_bytecode: Add SULD encodingReinUsesLisp1-0/+2
2019-09-21Shader_IR: ICMP corrections and fixesFernando Sahmkow2-6/+11
2019-09-21Mark DrawArrays as LOG_TRACEDavid Marcec1-1/+1
There's no reason to clog logs with DrawArray.
2019-09-20Rasterizer: Correct introduced bug where a conditional render wouldn't stop a draw call from executingFernando Sahmkow1-10/+16
2019-09-20Shader_IR: Implement ICMP.Fernando Sahmkow2-0/+37
2019-09-19Rasterizer: Refactor and simplify DrawBatch Interface.Fernando Sahmkow4-35/+16
2019-09-19Rasterizer: Address Feedback and concerns.Fernando Sahmkow1-11/+11
2019-09-19Rasterizer: Refactor draw calls, remove deadcode and clean up.Fernando Sahmkow3-106/+68
2019-09-19VideoCore: Corrections to the MME Inliner and removal of hacky instance management.Fernando Sahmkow6-31/+81
2019-09-19Video Core: initial Implementation of InstanceDraw PackagingFernando Sahmkow7-11/+192
2019-09-17shader_ir/warp: Implement SHFLReinUsesLisp6-9/+182
2019-09-17maxwell_to_gl: Fix mipmap filteringReinUsesLisp1-2/+2
OpenGL texture filters follow GL_<texture_filter>_MIPMAP_<mipmap_filter> but we were using them in the opposite way.
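The naming rule can be made concrete with a small mapping function. The enum and function names are hypothetical; the numeric constants are the standard values of GL_NEAREST, GL_LINEAR and the four GL_*_MIPMAP_* enums from the OpenGL headers, spelled with a k prefix so the sketch compiles without them.

```cpp
#include <cstdint>

// Hypothetical guest-side enums, for illustration only.
enum class Filter { Nearest, Linear };
enum class MipmapFilter { None, Nearest, Linear };

constexpr std::uint32_t kNearest = 0x2600;              // GL_NEAREST
constexpr std::uint32_t kLinear = 0x2601;               // GL_LINEAR
constexpr std::uint32_t kNearestMipmapNearest = 0x2700; // GL_NEAREST_MIPMAP_NEAREST
constexpr std::uint32_t kLinearMipmapNearest = 0x2701;  // GL_LINEAR_MIPMAP_NEAREST
constexpr std::uint32_t kNearestMipmapLinear = 0x2702;  // GL_NEAREST_MIPMAP_LINEAR
constexpr std::uint32_t kLinearMipmapLinear = 0x2703;   // GL_LINEAR_MIPMAP_LINEAR

// The first word of the GL name is the texture filter, the last is the mipmap filter.
constexpr std::uint32_t MinFilter(Filter filter, MipmapFilter mipmap) {
    const bool nearest = filter == Filter::Nearest;
    switch (mipmap) {
    case MipmapFilter::None:
        return nearest ? kNearest : kLinear;
    case MipmapFilter::Nearest:
        return nearest ? kNearestMipmapNearest : kLinearMipmapNearest;
    case MipmapFilter::Linear:
        return nearest ? kNearestMipmapLinear : kLinearMipmapLinear;
    }
    return kLinear;
}
```

Swapping the two arguments, which is the mistake the message describes, would turn a linear texture filter with nearest mipmap selection into GL_NEAREST_MIPMAP_LINEAR.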
2019-09-17gl_rasterizer: Remove unused code paths from ConfigureFramebuffersReinUsesLisp4-121/+33
2019-09-15maxwell_3d: Update firmware 4 call stub commentaryRodrigo Locatti1-1/+2
2019-09-13vk_device: Add miscellaneous features and minor style changesReinUsesLisp3-111/+258
* Increase minimum Vulkan requirements
* Require VK_EXT_vertex_attribute_divisor
* Require depthClamp, samplerAnisotropy and largePoints features
* Search and expose VK_KHR_uniform_buffer_standard_layout
* Search and expose VK_EXT_index_type_uint8
* Search and expose native float16 arithmetics
* Track current driver with VK_KHR_driver_properties
* Query and expose SSBO alignment
* Query more image formats
* Improve logging overall
* Minor style changes
* Minor rephrasing of commentaries
2019-09-13video_core/surface: Add function to detect sRGB surfacesReinUsesLisp2-0/+22
This is required for proper conversion to RGBA8_UNORM or RGBA8_SRGB surfaces when a backend can target both native and converted ASTC.
2019-09-11renderer_opengl: Fix rebase mistakeReinUsesLisp1-1/+1
2019-09-11shader/image: Implement SUATOM and fix SUSTReinUsesLisp7-69/+329
2019-09-11gl_rasterizer: Correct sRGB Fix regressionFernando Sahmkow1-0/+12
2019-09-11renderer_opengl: Fix sRGB blitsReinUsesLisp6-43/+10
Removes the sRGB hack of tracking if a frame used an sRGB rendertarget to apply at least once to blit the final texture as sRGB. Instead of doing this apply sRGB if the presented image has sRGB. Also enable sRGB by default on Maxwell3D registers as some games seem to assume this.
2019-09-06gl_shader_decompiler: Avoid writing output attribute when unimplementedReinUsesLisp1-10/+14
2019-09-06gl_shader_decompiler: Keep track of written images and mark them as modifiedReinUsesLisp7-62/+92
2019-09-06texture_cache: Minor changesReinUsesLisp4-19/+17
2019-09-06gl_rasterizer: Apply textures and images stateReinUsesLisp1-0/+2
2019-09-06gl_rasterizer: Add samplers to compute dispatchesReinUsesLisp2-3/+36
2019-09-06gl_rasterizer: Minor code changesReinUsesLisp2-20/+31
2019-09-06gl_state: Split textures and samplers into two arraysReinUsesLisp4-91/+39
2019-09-06gl_rasterizer: Implement image bindingsReinUsesLisp5-32/+106
2019-09-06gl_state: Add support for glBindImageTexturesReinUsesLisp2-0/+24
2019-09-06texture_cache: Pass TIC to texture cacheReinUsesLisp4-27/+25
2019-09-06kepler_compute: Implement texture queriesReinUsesLisp5-5/+99
2019-09-06gl_rasterizer: Split SetupTexturesReinUsesLisp2-22/+38
2019-09-05gl_shader_decompiler: Implement shared memoryReinUsesLisp1-0/+23
2019-09-05shader_ir: Implement LD_SReinUsesLisp1-10/+13
Loads from shared memory.
2019-09-05shader_ir: Implement ST_SReinUsesLisp4-11/+45
This instruction writes to a memory buffer shared with threads within the same work group. It is known as "shared" memory in GLSL.
2019-09-04gl_shader_decompiler: Fixup slow pathReinUsesLisp1-1/+1
2019-09-04gl_rasterizer: Fix stencil testingReinUsesLisp1-11/+11
* Fix stencil dirty flags tracking when stencil is disabled
* Attach stencil on clears (previously it only attached depth)
* Attach stencil on drawing regardless of stencil testing being enabled
2019-09-04Revert "Revert #2466" and stub FirmwareCall 4ReinUsesLisp3-4/+19
2019-09-04shader/shift: Implement SHR wrapped and clamped variantsReinUsesLisp2-6/+17
Nvidia defaults to wrapped shifts, but this is undefined behaviour on OpenGL's spec. Explicitly mask/clamp according to what the guest shader requires.
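The difference between the two variants can be sketched on the host side as below. The function names are hypothetical, and the clamp limit of 31 is an assumption made for this sketch rather than something the message states.

```cpp
#include <algorithm>
#include <cstdint>

// Wrapped: only the low five bits of the shift amount are used.
constexpr std::uint32_t ShrWrapped(std::uint32_t value, std::uint32_t shift) {
    return value >> (shift & 31u);
}

// Clamped: the shift amount saturates (at 31 here, by assumption).
constexpr std::uint32_t ShrClamped(std::uint32_t value, std::uint32_t shift) {
    return value >> std::min(shift, 31u);
}
```

Both forms keep the shift amount below the bit width, which avoids the out-of-range shift that the OpenGL specification leaves undefined.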
2019-09-04maxwell_3d: Avoid moving macro_paramsReinUsesLisp4-12/+24
2019-09-04gl_shader_cache: Remove special casing for geometry shadersReinUsesLisp2-80/+9
Now that ProgramVariants holds the primitive topology we no longer need to keep track of individual geometry shaders topologies.
2019-09-04half_set_predicate: Fix predicate assignmentsReinUsesLisp1-10/+9
2019-09-04gl_device: Disable precise in fragment shaders on bugged driversReinUsesLisp3-15/+43
2019-09-04gl_shader_decompiler: Fixup AMD's slow path typeReinUsesLisp1-1/+1
2019-09-04gl_shader_decompiler: Rework GLSL decompiler type systemReinUsesLisp1-416/+505
The GLSL decompiler's type system was broken. We converted all return values to float, except in some cases where we couldn't and so implicitly broke the rule of returning floats (e.g. for bools or bool pairs). Instead of doing this, introduce a class Expression that knows what type a return value has; when a consumer wants to use the string, it asks for it with a required type, and a runtime error is emitted if the types are incompatible. This has the disadvantage that there's more C++ code, but we can emit better GLSL code that's easier to read.
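A heavily simplified sketch of the idea follows. The member names and the use of an exception are assumptions made for illustration; the real class in gl_shader_decompiler.cpp may convert between compatible types and report errors differently.

```cpp
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical type tags, for illustration only.
enum class Type { Bool, Float, Int };

// Pairs a piece of generated GLSL with the type it evaluates to.
class Expression {
public:
    Expression(std::string code_, Type type_) : code{std::move(code_)}, type{type_} {}

    // Returns the GLSL text if it already has the required type.
    // This sketch treats every mismatch as an error.
    const std::string& As(Type required) const {
        if (required != type) {
            throw std::runtime_error("incompatible expression type");
        }
        return code;
    }

private:
    std::string code;
    Type type;
};
```

The point of the design is that the consumer states the type it expects, so a mismatch is detected where the string is used rather than surfacing later as invalid GLSL.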
2019-09-01maxwell_3d: Fix macro binding cursorReinUsesLisp2-10/+4
2019-08-30video_core: Silent miscellaneous warnings (#2820)Rodrigo Locatti23-48/+22
* texture_cache/surface_params: Remove unused local variable
* rasterizer_interface: Add missing documentation commentary
* maxwell_dma: Remove unused rasterizer reference
* video_core/gpu: Sort member declaration order to silent -Wreorder warning
* fermi_2d: Remove unused MemoryManager reference
* video_core: Silent unused variable warnings
* buffer_cache: Silent -Wreorder warnings
* kepler_memory: Remove unused MemoryManager reference
* gl_texture_cache: Add missing override
* buffer_cache: Add missing include
* shader/decode: Remove unused variables
2019-08-30gl_buffer_cache: Add missing includeReinUsesLisp1-0/+1
RasterizerInterface was considered an incomplete object by clang.
2019-08-28shader_ir/conversion: Split int and float selector and implement F2F H1ReinUsesLisp2-19/+24
2019-08-28shader_ir/conversion: Implement F2I F16 Ra.H1ReinUsesLisp2-6/+17
2019-08-28float_set_predicate: Add missing negation bit for the second operandReinUsesLisp2-4/+6
2019-08-21shader_ir: Implement VOTEReinUsesLisp11-1/+162
Implement VOTE using Nvidia's intrinsics. Documentation about these can be found here: https://developer.nvidia.com/reading-between-threads-shader-intrinsics
Instead of using portable ARB instructions I opted to use Nvidia intrinsics because these are the closest we have to how Tegra X1 hardware renders.
To stub VOTE on non-Nvidia drivers (including nouveau) this commit simulates a GPU with a warp size of one, returning what is meaningful for the instruction being emulated:
* anyThreadNV(value) -> value
* allThreadsNV(value) -> value
* allThreadsEqualNV(value) -> true
ballotARB, also known as "uint64_t(activeThreadsNV())", emits "VOTE.ANY Rd, PT, PT;" on nouveau's compiler. This doesn't exactly match Nvidia's code, "VOTE.ALL Rd, PT, PT;", which is emulated with activeThreadsNV() by this commit. In theory this shouldn't really matter since .ANY, .ALL and .EQ affect the predicates (set to PT in those cases) and not the registers.
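The warp-size-one fallback described in this message can be modelled in plain C++ as below. The function names are hypothetical host-side stand-ins for the GLSL intrinsics, not code from the decompiler.

```cpp
// With a single thread per warp, "any" and "all" both reduce to the thread's
// own value, and every thread trivially agrees with itself.
constexpr bool AnyThread(bool value) {
    return value;
}

constexpr bool AllThreads(bool value) {
    return value;
}

constexpr bool AllThreadsEqual(bool /*value*/) {
    return true;
}
```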
2019-08-21Buffer Cache: Address Feedback.Fernando Sahmkow2-7/+6
2019-08-21Buffer_Cache: Implement flushing.Fernando Sahmkow2-1/+30
2019-08-21Buffer_Cache: Implement barriers.Fernando Sahmkow1-0/+4
2019-08-21Buffer_Cache: Optimize and track written areas.Fernando Sahmkow2-12/+104
2019-08-21BufferCache: Rework mapping caching.Fernando Sahmkow2-49/+76
2019-08-21Buffer_Cache: Fixes and optimizations.Fernando Sahmkow2-68/+38
2019-08-21Video_Core: Implement a new Buffer CacheFernando Sahmkow9-327/+560
2019-08-21renderer_opengl: Implement RGB565 framebuffer formatReinUsesLisp3-3/+9
2019-08-21renderer_opengl: Use block linear swizzling for CPU framebuffersReinUsesLisp3-150/+33
2019-08-21renderer_opengl: Use VideoCore pixel formatReinUsesLisp3-23/+11
2019-08-21gpu: Change optional<reference_wrapper<T>> to T* for FramebufferConfigReinUsesLisp10-31/+21
2019-08-04shader_ir: Implement NOPReinUsesLisp2-0/+13
2019-08-04half_set_predicate: Fix HSETP2_C constant buffer offsetReinUsesLisp1-1/+1
2019-07-26GPU: Flush commands on every dma pusher step.Fernando Sahmkow6-0/+15
This commit ensures that the host gpu is constantly fed with commands to work with, while the guest gpu keeps producing the rest of the commands. This reduces syncing time between host and guest gpu.
2019-07-26decode/half_set_predicate: Fix predicatesReinUsesLisp1-3/+3
2019-07-26MaxwellDMA: Fixes, corrections and relaxations.Fernando Sahmkow3-23/+36
This commit fixes offsets on linear -> tiled copies, corrects z pos for tiled -> linear copies, corrects the bytes_per_pixel calculation in tiled -> linear copies and relaxes some limitations set by the latest DMA fixes and refactors.
2019-07-22shader/decode: Implement S2R TicReinUsesLisp3-0/+15
2019-07-20Shader_Ir: Implement F16 Variants of F2F, F2I, I2F.Fernando Sahmkow5-18/+75
This commit takes care of implementing the F16 Variants of the conversion instructions and makes sure conversions are done.
2019-07-20Maxwell3D: Reorganize and address feedbackFernando Sahmkow3-20/+33
2019-07-20Shader_Ir: Change Debug Asserts for Log WarningsFernando Sahmkow3-10/+17
2019-07-20shader/half_set_predicate: Fix HSETP2 implementationReinUsesLisp4-44/+23
2019-07-20shader/half_set_predicate: Implement missing HSETP2 variantsReinUsesLisp2-19/+49
2019-07-19video_core/control_flow: Provide operator!= for types with operator==Lioncash1-4/+21
Provides operational symmetry for the respective structures.
2019-07-19video_core/control_flow: Prevent sign conversion in TryGetBlock()Lioncash1-1/+1
The return value is a u32, not an s32, so this would result in an implicit signedness conversion.
2019-07-19video_core/control_flow: Remove unnecessary BlockStack copy constructorLioncash1-2/+1
This is the default behavior of the copy constructor, so it doesn't need to be specified. While we're at it we can make the other non-default constructor explicit.
2019-07-19video_core/control_flow: Use std::move where applicableLioncash1-10/+15
Results in less work being done where avoidable.
2019-07-19video_core/control_flow: Use the prefix variant of operator++ for iteratorsLioncash1-2/+2
Same thing, but potentially allows a standard library implementation to pick a more efficient codepath.
2019-07-19video_core/control_flow: Use empty() member function for checking emptinessLioncash1-2/+2
It's what it's there for.
2019-07-19video_core: Resolve -Wreorder warningsLioncash2-4/+3
Ensures that the constructor members are always initialized in the order that they're declared in.
2019-07-19video_core/control_flow: Make program_size for ScanFlow() a std::size_tLioncash2-5/+4
Prevents a truncation warning from occurring with MSVC. Also the internal data structures already treat it as a size_t, so this is just a discrepancy in the interface.
2019-07-19video_core/control_flow: Place all internally linked types/functions within an anonymous namespaceLioncash1-1/+2
Previously, quite a few functions were being linked with external linkage.
2019-07-19video_core/shader/decode: Prevent sign-conversion warningsLioncash1-2/+2
Makes it explicit that the conversions here are intentional.
2019-07-18Shader_Ir: correct clang formatFernando Sahmkow1-2/+2
2019-07-18GPU: Add missing puller methods.Fernando Sahmkow2-14/+15
This adds some missing puller methods. We don't assert them as these are nop operations for us.
2019-07-18MaxwellDMA/KeplerCopy: Downgrade DMA log message to Trace.Fernando Sahmkow1-1/+1
This log was just to know which games used DMA. It's no longer important.
2019-07-18Gl_Texture_Cache: Remove assert on component type in GetFormatTupleFernando Sahmkow1-1/+0
Textures can have different component types in different orders. This assert was completely imprecise, and such checks are better handled case by case within the texture cache.
2019-07-18Shader_Ir: Downgrade precision and rounding asserts to debug asserts.Fernando Sahmkow5-10/+10
This commit reduces the severity of asserts for FP precision and rounding, as these are well known and have little to no consequence for the GPU's accuracy.
2019-07-18gl_shader_decompiler: Rename bufferImage to imageBufferReinUsesLisp1-1/+1
The online OpenGL documentation is wrong. The type definition is imageBuffer.
2019-07-18gl_shader_cache: Fix newline on buffer preprocessor definitionsReinUsesLisp1-2/+6
2019-07-18textures: Fix texture buffer size calculationReinUsesLisp1-1/+1
2019-07-18gl_texture_cache: Do not set texture parameters to buffersReinUsesLisp1-0/+3
2019-07-18gl_texture_cache: Add missing break in CreateTextureReinUsesLisp1-0/+1
2019-07-17GL_State: Feedback and fixesFernando Sahmkow4-14/+27
2019-07-17Maxwell3D: Address FeedbackFernando Sahmkow5-17/+13
2019-07-17Texture_Cache: Rebase FixesFernando Sahmkow1-6/+0
2019-07-17GL_Rasterizer: Corrections to Clearing.Fernando Sahmkow4-12/+28
2019-07-17Maxwell3D: Correct marking dirtiness on CB uploadFernando Sahmkow1-0/+1
2019-07-17GL_Rasterizer: Rework RenderTarget/DepthBuffer clearingFernando Sahmkow3-7/+63
2019-07-17Maxwell3D: Implement State Dirty Flags.Fernando Sahmkow6-44/+199
2019-07-17Maxwell3D: Rework CBData UploadFernando Sahmkow2-8/+45
2019-07-17Maxwell3D: Rework the dirty system to be more consistent and scalableFernando Sahmkow10-80/+211
2019-07-17maxwell3d: Implement Conditional RenderingFernando Sahmkow3-2/+100
Conditional Rendering takes care of conditionally clearing or drawing depending on a set of queries. This PR implements the query checks to establish whether things can be rendered or not.
2019-07-17shader_ir: std::move Node instance where applicableLioncash4-60/+67
These are std::shared_ptr instances underneath the hood, which means copying them isn't as cheap as a regular pointer. Particularly so on weakly-ordered systems. This avoids atomic reference count increments and decrements where they aren't necessary for the core set of operations.
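A minimal sketch of the pattern, using hypothetical NodeData and Operation types rather than the shader IR's real ones: a by-value std::shared_ptr parameter that is moved into its final home transfers ownership without touching the atomic reference count.

```cpp
#include <memory>
#include <utility>

// Hypothetical node payload, for illustration only.
struct NodeData {
    int value;
};

using Node = std::shared_ptr<NodeData>;

struct Operation {
    // Moving the parameter into the member avoids an atomic increment
    // followed by an atomic decrement when the parameter is destroyed.
    explicit Operation(Node operand_) : operand{std::move(operand_)} {}

    Node operand;
};
```

Callers that no longer need their own handle can also pass it with std::move, in which case the count stays at one for the whole transfer.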
2019-07-17shader_ir: Rename Get/SetTemporal to Get/SetTemporaryLioncash5-36/+36
This is more accurate in terms of describing what the functions are actually doing. Temporal relates to time, not the setting of a temporary itself.
2019-07-17shader_ir: Remove unused includesLioncash1-3/+0
Removes unnecessary header dependencies.
2019-07-16Shader_Ir: Correct tracking to track from right to leftFernando Sahmkow1-2/+2
2019-07-16shader/decode/other: Correct branch indirect argument within BRA handlingLioncash1-1/+1
This appears to have been a copy/paste error introduced within 8a6fc529a968e007f01464abadd32f9b5eb0a26c
2019-07-16gl_shader_cache: Fix clang-format issuesReinUsesLisp2-4/+2
2019-07-15gl_shader_decompiler: Stub local memory sizeReinUsesLisp1-8/+14
2019-07-15gl_shader_cache: Address review commentariesReinUsesLisp4-13/+12
2019-07-15gl_shader_cache: Address CI issuesReinUsesLisp2-3/+3
2019-07-15gl_rasterizer: Implement compute shadersReinUsesLisp15-136/+350
2019-07-15shader: Allow tracking of indirect buffers without variable offsetReinUsesLisp6-35/+36
While changing this code, simplify tracking code to allow returning the base address node, this way callers don't have to manually rebuild it on each invocation.
2019-07-14Texture_Cache: Address FeedbackFernando Sahmkow3-13/+17
2019-07-14Texture_Cache: Remove some unprecise fallback case and clang formatFernando Sahmkow2-13/+5
2019-07-14Texture_Cache: Force Framebuffer reset if an active render target is unregistered.Fernando Sahmkow3-10/+36
2019-07-14GPU: Add a microprofile for macro interpreterFernando Sahmkow2-1/+6
2019-07-14GL_State: Add a microprofile timer to OpenGL state.Fernando Sahmkow1-0/+4
2019-07-14Gl_Texture_Cache: Measure Buffer Copy TimesFernando Sahmkow1-0/+2
2019-07-14Texture_Cache: Correct Linear Structural Match.Fernando Sahmkow1-3/+6
2019-07-11gl_shader_decompiler: Fix gl_PointSize redeclarationReinUsesLisp1-1/+1
2019-07-11gl_shader_decompiler: Fix conditional usage of GL_ARB_shader_viewport_layer_arrayReinUsesLisp1-2/+3
2019-07-09shader_ir: Add comments on missing instruction.Fernando Sahmkow2-2/+9
Also shows Nvidia's address space on comments.
2019-07-09prefer system reference over global accessorMichael Scire3-9/+13
2019-07-09shader_ir: limit exploration to best known program size.Fernando Sahmkow1-1/+1
2019-07-09control_flow: Correct block breaking algorithm.Fernando Sahmkow1-17/+17
2019-07-09control_flow: Assert shaders bigger than limit.Fernando Sahmkow1-0/+2
2019-07-09control_flow: Address feedback.Fernando Sahmkow1-89/+37
2019-07-09shader_ir: Correct parsing of scheduling instructions and correct sizingFernando Sahmkow2-13/+30
2019-07-09shader_ir: Correct max sizingFernando Sahmkow2-2/+2
2019-07-09shader_ir: Remove unnecessary constructors and use optional for ScanFlow resultFernando Sahmkow3-28/+17
2019-07-09shader_ir: Corrections, documenting and asserting control_flowFernando Sahmkow3-52/+54
2019-07-09shader_ir: Unify blocks in decompiled shaders.Fernando Sahmkow7-58/+85
2019-07-09shader_ir: Decompile Flow StackFernando Sahmkow4-11/+206
2019-07-09shader_ir: propagate shader size to the IRFernando Sahmkow6-17/+28
2019-07-09shader_ir: Implement BRX & BRA.CCFernando Sahmkow6-4/+76
2019-07-09shader_ir: Remove the old scanner.Fernando Sahmkow2-77/+0
2019-07-09shader_ir: Implement a new shader scannerFernando Sahmkow4-16/+473
2019-07-09gl_rasterizer: Amend documentation comment for ConfigureFramebuffers()Lioncash1-7/+9
must_reconfigure isn't a parameter for this function any more, so it can be replaced with current_state. While we're at it, we can make the parameters of the declaration match the same name as the ones in the definition.
2019-07-09Prevent merging of device mapped memory blocks.Michael Scire1-1/+23
This sets the DeviceMapped attribute for GPU-mapped memory blocks, and prevents merging device mapped blocks. This prevents memory mapped by the GPU from having its backing address changed by block coalescing.
2019-07-08gl_shader_decompiler: Implement gl_ViewportIndex and gl_Layer in vertex shadersReinUsesLisp10-40/+136
This commit implements gl_ViewportIndex and gl_Layer in vertex and geometry shaders. In the case it's used in a vertex shader, it requires ARB_shader_viewport_layer_array. This extension is available on AMD and Nvidia devices (mesa and proprietary drivers), but not available on Intel on any platform. At the moment of writing this description I don't know if this is a hardware limitation or a driver limitation. In the case that ARB_shader_viewport_layer_array is not available, writes to these registers on a vertex shader are ignored, with the appropriate logging.
2019-07-07Delete decode_integer_set.cppTobias1-0/+0
2019-07-07shader/texture: Add F16 support for TLDSReinUsesLisp2-2/+9
2019-07-07vk_sampler_cache: Remove unused includesLioncash1-3/+0
These are no longer used within this header, so they can be removed.
2019-07-07video_core: Add missing override specifiersLioncash2-4/+4
2019-07-07vk_scheduler: Drop execution context in favor of viewsReinUsesLisp6-50/+60
Instead of passing an execution context by copy throughout the whole Vulkan call hierarchy, use a command buffer view and fence view approach. This internally dereferences the command buffer or fence, preventing the user from using an outdated version of it in normal usage. It is still possible to store an outdated one if it is cast to VKFence& or vk::CommandBuffer. While changing this file, add an extra parameter for Flush and Finish to allow releasing the fence from these calls.
2019-07-06buffer_cache: Avoid [[nodiscard]] to make clang-format happyReinUsesLisp1-5/+4
2019-07-06buffer_cache: Try to fix MinGW buildReinUsesLisp1-1/+1
2019-07-06gl_rasterizer: Fix nullptr dereference on disabled buffersReinUsesLisp3-5/+5
2019-07-06gl_rasterizer: Minor style changesReinUsesLisp4-32/+22
2019-07-06gl_rasterizer: Fix vertex and index data invalidationsReinUsesLisp4-8/+67
2019-07-06gl_buffer_cache: Implement with generic buffer cacheReinUsesLisp8-291/+92
2019-07-06buffer_cache: Implement a generic buffer cacheReinUsesLisp2-0/+301
Implements a templated class with a similar approach to our current generic texture cache. It is designed to be compatible with Vulkan and OpenGL.
2019-07-06gl_buffer_cache: Remove global system gettersReinUsesLisp3-9/+14
2019-07-06gl_device: Query SSBO alignmentReinUsesLisp2-0/+6
2019-07-06gl_buffer_cache: Implement flushingReinUsesLisp2-2/+11
2019-07-06gl_rasterizer: Drop gl_global_cache in favor of gl_buffer_cacheReinUsesLisp7-206/+35
2019-07-06gl_buffer_cache: Rework to support internalized buffersReinUsesLisp3-65/+174
2019-07-06gl_buffer_cache: Store in CachedBufferEntry the used buffer handleReinUsesLisp2-23/+30
2019-07-06gl_buffer_cache: Return used buffer from Upload functionReinUsesLisp4-36/+35
2019-07-06gl_rasterizer: Add some commentariesReinUsesLisp1-0/+5
2019-07-06gl_rasterizer: Make DrawParameters rasterizer instance constReinUsesLisp1-1/+1
2019-07-06gl_rasterizer: Move index buffer uploading to its own methodReinUsesLisp2-7/+18
2019-07-05NVServices: Styling, define constructors as explicit and correctionsFernando Sahmkow4-24/+24
2019-07-05NVFlinger: Correct GCC compile errorFernando Sahmkow2-6/+6
2019-07-05NVServices: Make NVEvents Automatic according to documentation.Fernando Sahmkow2-3/+6
2019-07-05GPU: Correct Interrupts to interrupt on syncpt/value instead of event, mirroring hardwareFernando Sahmkow5-29/+23
2019-07-05gpu_asynch: Simplify synchronization to a simpler consumer->producer scheme.Fernando Sahmkow2-47/+3
2019-07-05nv_host_ctrl: Make Sync GPU variant always return synced result.Fernando Sahmkow4-5/+11
2019-07-05Async GPU: do invalidate as synced operationFernando Sahmkow1-6/+1
Async GPU: Always invalidate synced.
2019-07-05Gpu: use an std mutex instead of a spin_lock to guard syncpointsFernando Sahmkow2-6/+6
2019-07-05Gpu: Mark areas as protected.Fernando Sahmkow2-0/+13
2019-07-05nv_services: Stub CtrlEventSignalFernando Sahmkow2-1/+14
2019-07-05Gpu: Implement Hardware Interrupt Manager and manage GPU interruptsFernando Sahmkow5-4/+21
2019-07-05video_core: Implement GPU side SyncpointsFernando Sahmkow3-2/+51
2019-07-05texture_cache: Address FeedbackFernando Sahmkow4-12/+13
2019-07-05texture_cache: Correct Texture Buffer UploadingFernando Sahmkow3-2/+18
2019-07-04gl_shader_cache: Make CachedShader constructor privateZach Hilman2-5/+5
Fixes missing review comments introduced.
2019-07-01rasterizer_cache: Protect inherited caches from submission levelFernando Sahmkow3-1/+5
2019-06-30texture_cache: Pack sibling queries inside a methodReinUsesLisp1-6/+8
2019-06-30texture_cache: Use std::vector reservation for sampled_texturesReinUsesLisp1-17/+10
2019-06-30texture_cache: Style changesReinUsesLisp3-17/+13
2019-06-29texture_cache: Use std::array for siblings_tableReinUsesLisp1-10/+13
2019-06-29texture_cache: Address feedbackReinUsesLisp4-30/+13
2019-06-26texture_cache: Correct variable naming.Fernando Sahmkow1-3/+3
2019-06-26gl_texture_cache: Correct assertsFernando Sahmkow2-2/+2
2019-06-26texture_cache: Corrections, documentation and assertsFernando Sahmkow1-42/+42
2019-06-26surface_params: Corrections, asserts and documentation.Fernando Sahmkow2-43/+58
2019-06-25copy_params: use constexpr for constructorFernando Sahmkow1-3/+4
2019-06-25gl_texture_cache: Corrections and fixesFernando Sahmkow2-13/+9
2019-06-25gl_resource_manager: Correct MakeStreamCopyFernando Sahmkow2-3/+2
2019-06-25texture_cache: Query MemoryManager from the systemFernando Sahmkow5-20/+7
2019-06-24texture_cache: Include "core/core.h"ReinUsesLisp1-4/+1
2019-06-24gl_texture_cache: Explicitly add indirect includeReinUsesLisp1-0/+1
2019-06-24texture_cache/surface_view: Address feedbackReinUsesLisp1-1/+0
2019-06-24texture_cache/surface_base: Address feedbackReinUsesLisp2-2/+10
2019-06-24video_core/surface: Address feedbackReinUsesLisp1-2/+2
2019-06-24decode/texture: Address feedbackReinUsesLisp1-0/+1
2019-06-24renderer_opengl/utils: Remove unused includes and unused forward declarationReinUsesLisp1-4/+0
2019-06-24gl_texture_cache: Address some feedbackReinUsesLisp1-2/+4
2019-06-24gl_shader_disk_cache: Address feedbackReinUsesLisp2-4/+8
2019-06-24gl_shader_decompiler: Address feedbackReinUsesLisp1-11/+12
2019-06-24shader_bytecode: Include missing <array>ReinUsesLisp1-0/+1
2019-06-21texture_cache: Style and CorrectionsFernando Sahmkow7-71/+75
2019-06-21shader_cache: Correct versioning and size calculation.Fernando Sahmkow2-2/+7
2019-06-21texture_cache: Eliminate linear textures fallthroughFernando Sahmkow1-4/+0
2019-06-21texture_cache: Correct format R16U as siblingFernando Sahmkow2-1/+2
2019-06-21texture_cache: Implement texception detection and texture barriers.Fernando Sahmkow2-7/+40
2019-06-21texture_cache: Corrections to buffers and shadow formats use.Fernando Sahmkow1-10/+34
2019-06-21texture_cache: Implement Irregular Views in surfacesFernando Sahmkow2-4/+24
2019-06-21surface: Correct format S8Z24Fernando Sahmkow4-9/+5
2019-06-21texture_cache: Initialize all siblings to invalid pixel format.Fernando Sahmkow1-6/+15
2019-06-21gl_texture_cache: Use Stream Buffers instead of Persistant for Buffer Copies.Fernando Sahmkow3-5/+4
2019-06-21gl_texture_cache: Correct Image BlitFernando Sahmkow1-1/+1
2019-06-21decoders: correct block calculationFernando Sahmkow7-29/+41
2019-06-21texture_cache: Use siblings textures on Rebuild and fix possible error on blittingFernando Sahmkow2-11/+24
2019-06-21texture_cache: Remove old rasterizer cacheFernando Sahmkow2-1956/+0
2019-06-21texture_cache: Implement siblings texture formats.Fernando Sahmkow2-12/+31
2019-06-21fermi2d: Correct Origin ModeFernando Sahmkow1-5/+10
2019-06-21texture_cache: correct texture buffer on surface paramsFernando Sahmkow1-4/+11
2019-06-21texture_cache: eliminate accelerated depth->color/color->depth copies due to driver instability.Fernando Sahmkow4-22/+6
2019-06-21texture_cache: correct mutex locksFernando Sahmkow1-4/+4
2019-06-21shader_ir: Fix image copy rebase issuesFernando Sahmkow1-2/+7
2019-06-21texture_cache: Don't Image Copy if component types differFernando Sahmkow1-1/+2
2019-06-21texture_cache: move some large methods to cpp filesFernando Sahmkow4-139/+135
2019-06-21texture_cache: Optimize GetSurface and use references on functions that don't change a surface.Fernando Sahmkow3-12/+12
2019-06-21texture_cache: Implement Buffer Copy and detect Turing GPUs Image CopiesFernando Sahmkow8-12/+148
2019-06-21texture_cache uncompress-compress is untopological.Fernando Sahmkow5-19/+53
This causes conflicts between uncompressed and compressed textures to be recycled automatically. It also limits the number of mipmaps a texture can have if it goes above its limit.
2019-06-21texture_cache: Correct copying between compressed and uncompressed formatsFernando Sahmkow3-10/+27
2019-06-21texture_cache: Only load on recycle with accurate GPU.Fernando Sahmkow1-2/+3
Testing so far has proven this to be quite safe as texture memory read added a 2-5ms load to the current cache.
2019-06-21Fix rebase errorsFernando Sahmkow3-3/+13
2019-06-21texture_cache: Handle uncontinuous surfaces.Fernando Sahmkow4-21/+82
2019-06-21texture_cache: return null surface on invalid addressFernando Sahmkow1-0/+12
2019-06-21texture_cache: Add checks for texture buffers.Fernando Sahmkow1-2/+16
2019-06-21texture_cache: Fermi2D reform and implement View MirageFernando Sahmkow11-77/+125
This also does some fixes on compressed textures reinterpret and on the Fermi2D engine in general.
2019-06-21gl_shader_decompiler: Implement image binding settingsReinUsesLisp5-24/+52
2019-06-21shader: Implement bindless imagesReinUsesLisp3-2/+40
2019-06-21shader: Decode SUST and implement backing image functionalityReinUsesLisp8-3/+282
2019-06-21gl_rasterizer: Track texture buffer usageReinUsesLisp6-74/+119
2019-06-21video_core: Make ARB_buffer_storage a required extensionReinUsesLisp3-8/+5
2019-06-21gl_rasterizer_cache: Use texture buffers to emulate texture buffersReinUsesLisp5-11/+35
2019-06-21maxwell_3d: Partially implement texture buffers as 1D texturesReinUsesLisp4-10/+24
2019-06-21gl_shader_decompiler: Allow 1D textures to be texture buffersReinUsesLisp1-4/+38
2019-06-21shader: Implement texture buffersReinUsesLisp3-0/+62
2019-06-21texture_cache: loose TryReconstructSurface when accurate GPU is not on.Fernando Sahmkow3-4/+20
Also corrects some asserts.
2019-06-21texture_cache: Document the most important methods.Fernando Sahmkow1-8/+87
2019-06-21texture_cache: Try to Reconstruct Surface on bigger than overlap.Fernando Sahmkow1-4/+11
This fixes clouds in SMO Cap Kingdom and lens on Cloud Kingdom. Also moved accurate_gpu setting check to Pick Strategy
2019-06-21texture_cache: Implement Guard mechanismFernando Sahmkow2-1/+12
2019-06-21texture_cache: General FixesFernando Sahmkow8-47/+170
Fixed ASTC mipmaps loading
Fixed alignment on OpenGL upload/download
Fixed Block Height Calculation
Removed unalign_height
2019-06-21surface_params: Ensure pitch is always written to avoid surface leaksReinUsesLisp1-0/+2
2019-06-21gl_framebuffer_cache: Use a hashed struct to cache framebuffersReinUsesLisp6-62/+148
2019-06-21texture_cache return invalid buffer on deactivated color_maskFernando Sahmkow2-2/+9
2019-06-21engine_upload: Addapt to new Texture CacheFernando Sahmkow2-5/+5
2019-06-21surface_params: Optimize CreateForTextureReinUsesLisp2-72/+76
Instead of using Common::AlignUp, use Common::AlignBits to align the texture compression factor.
2019-06-21gl_texture_cache: Make main views be proxy textures instead of a full view.Fernando Sahmkow2-11/+25
2019-06-21texture_cache: Add ASync ProtectionsFernando Sahmkow1-0/+10
2019-06-21Remove Framebuffer reconfiguration and restrict rendertarget protectionFernando Sahmkow4-39/+27
2019-06-21texture_cache: Implement GPU Dirty FlagsFernando Sahmkow1-15/+22
2019-06-21texture_cache: Optimize GetMipBlockHeight and GetMipBlockDepthFernando Sahmkow1-13/+6
2019-06-21texture_cache: Implement L1_Inner_cacheFernando Sahmkow1-13/+30
2019-06-21video_core: Use un-shifted block sizes to avoid integer divisionsReinUsesLisp9-60/+73
Instead of storing all block width, height and depths in their shifted form: block_width = 1U << block_shift; Store them like they are provided by the emulated hardware (their block_shift form). This way we can avoid doing the costly Common::AlignUp operation to align texture sizes and drop CPU integer divisions with bitwise logic (defined in Common::AlignBits).
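The difference between the two alignment strategies described above can be sketched as follows. This is a minimal illustration, not the project's code: `AlignUpSketch` and `AlignBitsSketch` are hypothetical stand-ins for Common::AlignUp and Common::AlignBits, whose real signatures are not shown in this log.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a generic align-up helper.
// Works for any non-zero size, but needs a modulo (an integer division).
constexpr std::uint32_t AlignUpSketch(std::uint32_t value, std::uint32_t size) {
    return (value % size == 0) ? value : value - value % size + size;
}

// Hypothetical stand-in for a shift-based helper.
// Takes log2 of the alignment, so it only needs an add and two shifts.
constexpr std::uint32_t AlignBitsSketch(std::uint32_t value, std::uint32_t align_log2) {
    return ((value + (1U << align_log2) - 1) >> align_log2) << align_log2;
}

// Checks that both helpers agree whenever the alignment is a power of two.
inline void CheckAlignmentsAgree() {
    for (std::uint32_t shift = 0; shift <= 5; ++shift) {
        for (std::uint32_t value = 0; value <= 256; ++value) {
            assert(AlignBitsSketch(value, shift) == AlignUpSketch(value, 1U << shift));
        }
    }
}
```

Keeping the block sizes in their shifted (log2) form means the second helper can be used directly, without first converting back from `1U << block_shift`.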
2019-06-21texture_cache: Change internal cache from lists to vectorsReinUsesLisp1-6/+7
2019-06-21Reduce amount of size calculations.Fernando Sahmkow7-88/+86
2019-06-21texture_cache: Correct premature texceptionsFernando Sahmkow4-14/+51
Due to our current infrastructure, it is possible for a mipmap to be set as a render target before a texception of that mipmap's superset is set afterwards. This is problematic as we rely on texture views to set up texceptions and protecting render targets for 3D texture rendering. One simple solution is to configure framebuffers after texture setup, but this brings other problems. This solution forces a reconfiguration of the framebuffers after such an event happens.
2019-06-21texture_cache: Implement guest flushingFernando Sahmkow3-10/+29
2019-06-21Fixes to mipmap's process and reconstruct processFernando Sahmkow2-3/+3
2019-06-21surface_base: Add parenthesis to EmplaceOverview's predicateReinUsesLisp1-3/+2
2019-06-21Texture Cache: Implement Blitting and Fermi CopiesFernando Sahmkow7-100/+93
2019-06-21surface_view: Add constructor for ViewParamsReinUsesLisp3-39/+23
2019-06-21surface_base: Split BreakDown into layered and non-layered variantsReinUsesLisp1-45/+48
2019-06-21surface_base: Silence truncation warnings and minor renames and reorderingReinUsesLisp2-32/+37
2019-06-21copy_params: Use constructor instead of C-like initializationReinUsesLisp3-47/+39
2019-06-21Correct Mipmaps View method in Texture CacheFernando Sahmkow3-32/+29
2019-06-21Change texture_cache chaching from GPUAddr to CacheAddrFernando Sahmkow7-101/+60
This also reverses the changes to make invalidation and flushing through the GPU address.
2019-06-21Corrections to Structural MatchingFernando Sahmkow2-24/+53
The texture will now be reconstructed if the width only matches on GoB alignment.
2019-06-21Implement Texture Cache V2Fernando Sahmkow6-381/+568
2019-06-21Correct Surface Base and Views for new Texture CacheFernando Sahmkow7-380/+466
2019-06-21Add OGLTextureViewFernando Sahmkow2-0/+43
2019-06-21Deglobalize Memory Manager on texture cahe and Implement Invalidation and Flushing using GPUVAddrFernando Sahmkow4-1/+20
2019-06-21texture_cache: Remove execution context copies from the texture cacheReinUsesLisp7-168/+59
This is done to simplify the OpenGL implementation, it is needed for Vulkan.
2019-06-21gl_texture_cache: Implement fermi copiesReinUsesLisp5-2/+105
2019-06-21texture_cache: Split texture cache into different filesReinUsesLisp12-876/+965
2019-06-21texture_cache: Move staging buffer into a generic implementationReinUsesLisp4-181/+211
2019-06-21texture_cache: Flush 3D textures in the order they are drawnReinUsesLisp5-19/+44
2019-06-21gl_texture_cache: Minor changesReinUsesLisp5-140/+185
2019-06-21gl_texture_cache: Add copy from multiple overlaps into a single surfaceReinUsesLisp3-6/+84
2019-06-21gl_texture_cache: Attach surface textures instead of viewsReinUsesLisp3-20/+32
2019-06-21gl_texture_cache: Add fast copy pathReinUsesLisp4-7/+60
2019-06-21gl_texture_cache: Initial implementationReinUsesLisp9-47/+809
2019-06-18core: Remove unused CiTrace source filesLioncash1-1/+0
These source files have been unused for the entire lifecycle of the project. They're a hold-over from Citra and only add to the build time of the project, so they can be removed. There's also likely no way this would ever work in yuzu in its current form without revamping quite a bit of it, given how different the GPU on the Switch is compared to the 3DS.
2019-06-12gl_device: Fix TestVariableAoffi testReinUsesLisp1-1/+2
This test is intended to be invalid GLSL, but it was invalid in two places instead of one. The intention is to use a non-immediate parameter in a textureOffset-like function. The problem is that this shader was being compiled as a separable shader object and the text was writing to gl_Position without a redeclaration, which is invalid GLSL. Address that issue by using a user-defined output attribute.
2019-06-09GPUVM: Correct GPU VM virtual address spaceFernando Sahmkow1-2/+2
2019-06-08kepler_compute: Use std::array for cbuf infoReinUsesLisp1-2/+3
2019-06-08kepler_compute: Fix block_dim_x encodingReinUsesLisp1-1/+1
2019-06-08gl_shader_cache: Use static constructors for CachedShader initializationReinUsesLisp2-52/+53
2019-06-08gl_rasterizer: Remove unused parameters in descriptor uploadsReinUsesLisp2-8/+6
2019-06-08video_core/engines: Move ConstBufferInfo out of Maxwell3DReinUsesLisp6-49/+64
2019-06-07shader: Split SSY and PBK stackReinUsesLisp4-27/+78
Hardware testing revealed that SSY and PBK push to a different stack, allowing code like this:
    SSY label1;
    PBK label2;
    SYNC;
    label1: PBK;
    label2: EXIT;
2019-06-07shader/node: Minor changesReinUsesLisp1-50/+54
Reflect std::shared_ptr nature of Node on initializers and remove constant members in nodes. Add some commentaries.
2019-06-07shader: Move Node declarations out of the shader IR headerReinUsesLisp4-493/+518
Analysis passes do not have a good reason to depend on shader_ir.h to work on top of nodes. This splits node-related declarations to their own file and leaves the IR in shader_ir.h
2019-06-06shader: Use shared_ptr to store nodes and move initialization to fileReinUsesLisp35-248/+296
Instead of storing unique_ptr instances in a vector and returning raw pointers to them, use shared_ptr. While changing initialization code, move it to a separate file when possible. This is a first step to allow code analysis and node generation beyond the ShaderIR class.
2019-06-05core/core_timing_util: Use std::chrono types for specifying time unitsLioncash1-1/+1
Makes the interface more type-safe and consistent in terms of return values.
2019-06-04shader_bytecode: Mark EXIT as flow instructionFernando Sahmkow1-1/+1
2019-06-03gl_shader_decompiler: Remove guest "position" varyingReinUsesLisp2-36/+21
"position" was being written but not read anywhere besides geometry shaders, where it had the same value as gl_Position. This commit replaces "position" with gl_Position, reducing the complexity of our code and the emitted GLSL code.
2019-05-30gl_shader_cache: Store a system class and drop global accessorsReinUsesLisp2-7/+9
2019-05-30gl_shader_cache: Add commentaries explaining the intention in shaders creationReinUsesLisp1-0/+2
2019-05-30gl_shader_cache: Flip if condition in GetStageProgram to reduce indentationReinUsesLisp1-25/+26
2019-05-30gl_buffer_cache: Remove unused ReserveMemory methodReinUsesLisp2-13/+0
2019-05-30maxwell_to_gl: Use GL_CLAMP to emulate Clamp wrap modeReinUsesLisp3-7/+4
2019-05-30gl_rasterizer: Move alpha testing to the OpenGL pipelineReinUsesLisp8-71/+33
Removes the alpha testing code from each fragment shader invocation.
2019-05-30gl_rasterizer: Use GL_QUADS to emulate quads renderingReinUsesLisp6-132/+5
2019-05-27gl_device: Add commentary to AOFFI unit test source codeReinUsesLisp1-0/+1
The intention behind this commit is to hint someone inspecting an apitrace dump to ignore this ill-formed GLSL code.
2019-05-27gl_shader_gen: Always declare extensions after the version declarationReinUsesLisp2-7/+5
This addresses a bug on geometry shaders where code was being written before all #extension declarations were done. Ref to #2523
2019-05-26vk_device: Let formats array type be deducedReinUsesLisp1-33/+33
2019-05-26vk_shader_decompiler: Misc fixesReinUsesLisp2-45/+67
Fix missing OpSelectionMerge instruction. This caused device losses on most hardware; Intel didn't care. Fix [-1;1] -> [0;1] depth conversions. Conditionally use VK_EXT_scalar_block_layout. This allows us to use non-std140 layouts on UBOs. Update external Vulkan headers.
2019-05-26vk_device: Enable features when available and misc changesReinUsesLisp2-43/+151
Keeps track of native ASTC support, VK_EXT_scalar_block_layout availability and SSBO range. Check for independentBlend and vertexPipelineStoresAndAtomics as required features and always enable them. Use vk::to_string format to log Vulkan enums. Style changes.
2019-05-25renderer_opengl/utils: Use a std::string_view with LabelGLObject()Lioncash2-10/+10
Uses a std::string_view instead of a std::string, given the pointed to string isn't modified and is only used in a formatting operation. This is nice because a few usages directly supply a string literal to the function, allowing these usages to otherwise not heap allocate, unlike the std::string overloads. While we're at it, we can combine the address formatting into a single formatting call.
2019-05-24gl_shader_decompiler: Use an if based cbuf indexing for broken driversReinUsesLisp1-3/+20
The following code is broken on AMD's proprietary GLSL compiler:
```glsl
uint idx = ...;
vec4 values = ...;
float some_value = values[idx & 3];
```
It indexes the wrong components. To work around this, the following pessimized code is emitted when that bug is present:
```glsl
uint idx = ...;
vec4 values = ...;
float some_value;
if ((idx & 3) == 0) some_value = values.x;
if ((idx & 3) == 1) some_value = values.y;
if ((idx & 3) == 2) some_value = values.z;
if ((idx & 3) == 3) some_value = values.w;
```
2019-05-24gl_device: Add test to detect broken component indexingReinUsesLisp2-0/+60
Component indexing on AMD's proprietary driver is broken. This commit adds a test to detect when we are on a driver that can't successfully manage component indexing. It dispatches a dummy draw with just one vertex shader that writes to an indexed SSBO from the GPU with data sent through uniforms, it then reads that data from the CPU and compares the expected output.
2019-05-23shader/shader_ir: Make Comment() take a std::string by valueLioncash2-3/+3
This allows for forming comment nodes without making unnecessary copies of the std::string instance. e.g. previously: Comment(fmt::format("Base address is c[0x{:x}][0x{:x}]", cbuf->GetIndex(), cbuf_offset)); would result in a copy of the string being created, as CommentNode() takes a std::string by value (a const ref passed to a value parameter results in a copy). Now, only one instance of the string is ever moved around: fmt::format returns a std::string by value, which is a prvalue (and can be treated like an rvalue), so it's moved into Comment's string parameter; we then move it into the CommentNode constructor, which in turn moves the string into its member variable.
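The by-value-then-move chain described above can be sketched as follows. This is a simplified illustration under stated assumptions: `CommentNodeSketch` and `MakeComment` are hypothetical names, not the actual ShaderIR declarations.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical, simplified stand-in for a comment node in the shader IR.
class CommentNodeSketch {
public:
    // Sink parameter: taken by value, then moved into the member.
    explicit CommentNodeSketch(std::string text_) : text{std::move(text_)} {}

    const std::string& GetText() const {
        return text;
    }

private:
    std::string text;
};

// Hypothetical stand-in for Comment(). Because the parameter is taken by
// value, a temporary passed by the caller is moved in rather than copied,
// and is then moved again into the node's constructor.
inline CommentNodeSketch MakeComment(std::string text) {
    return CommentNodeSketch(std::move(text));
}

// Small usage example: the temporary string is never copied.
inline void CommentExample() {
    const CommentNodeSketch node = MakeComment(std::string("Base address is c[0x1][0x20]"));
    assert(node.GetText() == "Base address is c[0x1][0x20]");
}
```

Had `MakeComment` taken `const std::string&`, passing that reference on to the by-value constructor parameter would have forced a copy.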
2019-05-23shader/decode/*: Add missing newline to files lacking themLioncash18-18/+18
Keeps the shader code file endings consistent.
2019-05-23shader/decode/*: Eliminate indirect inclusionsLioncash6-1/+5
Amends cases where we were using things that were indirectly being satisfied through other headers. This way, if those headers change and eliminate dependencies on other headers in the future, we don't have cascading compilation errors.
2019-05-22shader/decode/memory: Remove left in debug pragmaLioncash1-2/+0
2019-05-21renderer_opengl/gl_shader_decompiler: Remove redundant name specification in format stringLioncash1-1/+1
This accidentally slipped through a rebase.
2019-05-21gl_shader_cache: Fix clang strict standard build issuesReinUsesLisp3-9/+13
2019-05-21gl_shader_cache: Use shared contexts to build shaders in parallelReinUsesLisp6-47/+103
2019-05-21shader/memory: Implement ST (generic memory)ReinUsesLisp2-21/+36
2019-05-21shader/memory: Implement LD (generic memory)ReinUsesLisp3-15/+38
2019-05-20shader: Implement S2R Tid{XYZ} and CtaId{XYZ}ReinUsesLisp4-15/+69
2019-05-20gl_shader_decompiler: Make GetSwizzle constexprReinUsesLisp1-7/+7
2019-05-20gl_shader_decompiler: Tidy up minor remaining cases of unnecessary std::string concatenationLioncash1-21/+20
2019-05-20gl_shader_decompiler: Replace individual overloads with the fmt-based oneLioncash1-28/+16
Gets rid of the need to special-case brace handling depending on the overload used, and makes it consistent across the board with how fmt handles them. Strings with compile-time deducible strings are directly forwarded to std::string's constructor, so we don't need to worry about the performance difference here, as it'll be identical.
2019-05-20gl_shader_decompiler: Utilize fmt overload of AddLine() where applicableLioncash1-136/+152
2019-05-19Revert #2466Fernando Sahmkow1-1/+3
This reverts a tested behavior on delay slots not exiting if the exit flag is set. Currently new tests are required in order to ensure this behavior.
2019-05-19gl_shader_decompiler: Add AddLine() overload that forwards to fmtLioncash1-0/+11
In a lot of places throughout the decompiler, string concatenation via operator+ is used quite heavily. This is usually fine, when not heavily used, but when used extensively, can be a problem. operator+ creates an entirely new heap allocated temporary string and given we perform expressions like: std::string thing = a + b + c + d; this ends up with a lot of unnecessary temporary strings being created and discarded, which kind of thrashes the heap more than we need to. Given we utilize fmt in some AddLine calls, we can make this a part of the ShaderWriter's API. We can make an overload that simply acts as a passthrough to fmt. This way, whenever things need to be appended to a string, the operation can be done via a single string formatting operation instead of discarding numerous temporary strings. This also has the benefit of making the strings themselves look nicer and makes it easier to spot errors in them.
2019-05-19Dma_pusher: ASSERT on empty command_listFernando Sahmkow1-0/+7
This is a measure to avoid crashes on command list reading as an empty command_list is considered a NOP.
2019-05-19shader/shader_ir: Remove unnecessary inline specifiersLioncash1-2/+2
constexpr internally links by default, so the inline specifier is unnecessary.
2019-05-19shader/shader_ir: Simplify constructors for OperationNodeLioncash1-15/+6
Many of these constructors don't even need to be templated. The only ones that need to be templated are the ones that actually make use of the parameter pack. Even then, since std::vector accepts an initializer list, we can supply the parameter pack directly to it instead of creating our own copy of the list, then copying it again into the std::vector.
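A minimal sketch of the constructor layout described above, assuming a much simplified node type. `OperationNodeSketch` and `NodeSketch` are hypothetical names; the real class stores IR nodes rather than integers.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical operand type; the real IR uses node handles instead.
using NodeSketch = int;

class OperationNodeSketch {
public:
    // No operands: a plain, non-template constructor is enough.
    explicit OperationNodeSketch(int code_) : code{code_} {}

    // Operands already collected: also non-template.
    OperationNodeSketch(int code_, std::vector<NodeSketch> operands_)
        : code{code_}, operands{std::move(operands_)} {}

    // Only this overload needs a parameter pack. The pack is expanded
    // straight into the vector's initializer list, with no intermediate copy.
    template <typename... Args>
    OperationNodeSketch(int code_, NodeSketch first, Args&&... rest)
        : code{code_}, operands{first, std::forward<Args>(rest)...} {}

    int GetCode() const {
        return code;
    }

    std::size_t GetOperandCount() const {
        return operands.size();
    }

    NodeSketch GetOperand(std::size_t index) const {
        return operands.at(index);
    }

private:
    int code{};
    std::vector<NodeSketch> operands;
};
```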
2019-05-19shader/shader_ir: Remove unnecessary template parameter packs from Operation() overloads where applicableLioncash1-2/+0
These overloads don't actually make use of the parameter pack, so they can be turned into regular non-template function overloads.
2019-05-19shader/shader_ir: Mark tracking functions as const member functionsLioncash2-8/+11
These don't actually modify instance state, so they can be marked as const member functions
2019-05-19shader/shader_ir: Place implementations of constructor and destructor in cpp fileLioncash2-5/+9
Given the class contains quite a lot of non-trivial types, place the constructor and destructor within the cpp file to avoid inlining construction and destruction code everywhere the class is used.
2019-05-19gl_shader_gen: std::move objects where applicableLioncash1-7/+7
Avoids performing copies into the pair being returned. Instead, we can just move the resources into the pair, avoiding the need to make copies of both the std::string and ShaderEntries struct.
2019-05-19gl_shader_disk_cache: in-class initialize virtual file offset of ShaderDiskCacheOpenGLLioncash2-5/+3
Given the offset is assigned a fixed value in the constructor, we can just assign it directly and get rid of the need to write the name of the variable again in the constructor initializer list.
2019-05-19gl_shader_disk_cache: Default ShaderDiskCacheOpenGL's destructor in the cpp fileLioncash2-0/+3
Given the disk shader cache contains non-trivial types, we should default it in the cpp file in order to prevent inlining of the complex destruction logic.
2019-05-19gl_shader_disk_cache: Make hash specializations noexceptLioncash1-2/+2
The standard library expects hash specializations that don't throw exceptions. Make this explicit in the type to allow selection of better code paths if possible in implementations.
2019-05-19gl_shader_disk_cache: Remove redundant code string construction in LoadDecompiledEntry()Lioncash1-2/+2
We don't need to load the code into a vector and then construct a string over the data. We can just create a string with the necessary size ahead of time, and read the data directly into it, getting rid of an unnecessary heap allocation.
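The idea of reading straight into a pre-sized string can be sketched as below. This is an illustration only: `ReadCodeSketch` is a hypothetical helper that uses a standard stream in place of the project's virtual file type.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: reads `size` bytes directly into `code`.
// The string is sized up front, so no temporary vector is allocated
// and no second copy is made when building the string.
inline bool ReadCodeSketch(std::istream& file, std::size_t size, std::string& code) {
    code.resize(size);
    if (size == 0) {
        return true;
    }
    file.read(&code[0], static_cast<std::streamsize>(size));
    return static_cast<std::size_t>(file.gcount()) == size;
}

// Small usage example with an in-memory stream.
inline void ReadCodeExample() {
    std::istringstream file("void main() {}");
    std::string code;
    const bool ok = ReadCodeSketch(file, 4, code);
    assert(ok && code == "void");
}
```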
2019-05-19gl_shader_disk_cache: Make variable non-const in decompiled entry caseLioncash1-1/+1
std::move does nothing when applied to a const variable. Resources can't be moved if the object is immutable. With this change, we don't end up making several unnecessary heap allocations and copies.
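The effect described above can be observed with a small tracking type. This is a self-contained sketch; `TrackedSketch` and the two helper functions are hypothetical and exist only to show which constructor is selected.

```cpp
#include <cassert>
#include <utility>

// Hypothetical type that records how it was constructed.
struct TrackedSketch {
    TrackedSketch() = default;
    TrackedSketch(const TrackedSketch&) : copied_from_other{true} {}
    TrackedSketch(TrackedSketch&&) noexcept : moved_from_other{true} {}

    bool copied_from_other = false;
    bool moved_from_other = false;
};

// std::move on a const object yields `const TrackedSketch&&`, which cannot
// bind to the move constructor's `TrackedSketch&&` parameter, so overload
// resolution falls back to the copy constructor.
inline bool MoveFromConstCopies() {
    const TrackedSketch source{};
    TrackedSketch dest{std::move(source)};
    return dest.copied_from_other && !dest.moved_from_other;
}

// With a non-const source, the move constructor is selected as intended.
inline bool MoveFromNonConstMoves() {
    TrackedSketch source{};
    TrackedSketch dest{std::move(source)};
    return dest.moved_from_other && !dest.copied_from_other;
}

// Small usage example asserting both behaviours.
inline void MoveExample() {
    assert(MoveFromConstCopies());
    assert(MoveFromNonConstMoves());
}
```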
2019-05-19gl_shader_disk_cache: Special-case boolean handlingLioncash2-24/+37
Booleans don't have a guaranteed size, but we still want to have them integrate into the disk cache system without needing to actually use a different type. We can do this by supplying non-template overloads for the bool type. Non-template overloads always have precedence during function resolution, so this is safe to provide. This gets rid of the need to smatter ternary conditionals, as well as the need to use u8 types to store the value in.
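A rough sketch of the overload approach described above, assuming an in-memory byte buffer. `FileSketch` and its members are hypothetical names, not the disk cache's actual interface.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical in-memory stand-in for the cache file.
class FileSketch {
public:
    // Generic path: writes the raw bytes of any trivially copyable value.
    template <typename T>
    void Save(const T& value) {
        static_assert(std::is_trivially_copyable<T>::value, "T must be trivially copyable");
        const auto* bytes = reinterpret_cast<const std::uint8_t*>(&value);
        data.insert(data.end(), bytes, bytes + sizeof(T));
    }

    // Non-template overload: preferred over the template for bool arguments.
    // Booleans are always stored as exactly one byte.
    void Save(bool value) {
        Save(static_cast<std::uint8_t>(value ? 1 : 0));
    }

    // Generic path: reads the raw bytes of any trivially copyable value.
    template <typename T>
    bool Load(T& value) {
        static_assert(std::is_trivially_copyable<T>::value, "T must be trivially copyable");
        if (read_offset + sizeof(T) > data.size()) {
            return false;
        }
        std::memcpy(&value, data.data() + read_offset, sizeof(T));
        read_offset += sizeof(T);
        return true;
    }

    // Non-template overload: reads one byte and converts it back to bool.
    bool Load(bool& value) {
        std::uint8_t raw{};
        if (!Load(raw)) {
            return false;
        }
        value = raw != 0;
        return true;
    }

    std::size_t Size() const {
        return data.size();
    }

private:
    std::vector<std::uint8_t> data;
    std::size_t read_offset = 0;
};
```

Callers can then pass `bool` members directly, without ternaries or `u8` shadow fields at each call site.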
2019-05-18gl_rasterizer: Limit OpenGL point size to a minimum of 1ReinUsesLisp1-1/+3
2019-05-18maxwell_to_gl: Add TriangleFan primitive topologyReinUsesLisp1-0/+2
2019-05-17gl_rasterizer: Pass the right number of array quad vertices countReinUsesLisp1-2/+2
2019-05-14maxwell_3d: reduce sevirity of different component formats assert.Fernando Sahmkow1-1/+1
This was reduced due to happening on most games and at such constant rate that it affected performance heavily for the end user. In general, we are well aware of the assert and an implementation is already planned.
2019-05-14video_core/engines/engine_upload: Amend constructor initializer list orderLioncash1-1/+1
Silences a -Wreorder warning.
2019-05-14video_core/engines/engine_upload: Default destructor in the cpp fileLioncash2-1/+3
Avoids inlining destruction logic where applicable, and also makes forward declarations not cause unexpected compilation errors depending on where the State class is used.
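The pattern of defaulting a destructor in the cpp file can be sketched in a single translation unit as below. `EngineSketch` and `UploadStateSketch` are hypothetical names, and the "header" and "cpp" halves are marked only by comments.

```cpp
#include <cassert>
#include <memory>

// ---- What would live in the header ----

// Forward declaration only; the full definition is not visible here.
struct UploadStateSketch;

class EngineSketch {
public:
    EngineSketch();
    // Declared here, defined where UploadStateSketch is a complete type.
    // An implicitly inline destructor would need the complete type in
    // every file that destroys an EngineSketch.
    ~EngineSketch();

    int Value() const;

private:
    std::unique_ptr<UploadStateSketch> state;
};

// ---- What would live in the cpp file ----

struct UploadStateSketch {
    int value = 42;
};

EngineSketch::EngineSketch() : state{std::make_unique<UploadStateSketch>()} {}

// The destruction code is emitted once, here.
EngineSketch::~EngineSketch() = default;

int EngineSketch::Value() const {
    return state->value;
}

// Small usage example: construction and destruction both work even though
// the class definition only saw a forward declaration of its member type.
inline void EngineExample() {
    EngineSketch engine;
    assert(engine.Value() == 42);
}
```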
2019-05-14video_core/engines/engine_upload: Remove unnecessary const on parameters in function declarationsLioncash1-2/+2
These only apply in the definition of the function. They can be omitted from the declaration.
2019-05-14video_core/engines/engine_upload: Remove unnecessary includesLioncash2-2/+2
2019-05-14video_core/engines/maxwell3d: Get rid of three magic values in CallMethod()Lioncash1-3/+3
We can use the named constant instead of using 32 directly.
2019-05-14video_core/engines/maxwell_3d: Simplify for loops into ranged for loops within InitializeRegisterDefaults()Lioncash1-15/+15
Lessens the amount of code that needs to be read, and gets rid of the need to introduce an indexing variable. Instead, we just operate on the objects directly.
2019-05-14video_core/engines/maxwell_3d: Add is_trivially_copyable_v check for RegsLioncash1-0/+2
std::memset is used to clear the entire register structure, which requires that the Regs struct be trivially copyable (otherwise undefined behavior is invoked). This prevents the case where a non-trivial type is potentially added to the struct.
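A minimal sketch of the guard described above, using a hypothetical and much smaller register struct. The commit uses `std::is_trivially_copyable_v`; the equivalent `::value` form is used here so the example also builds before C++17.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical, heavily reduced register block.
struct RegsSketch {
    std::uint32_t reg_array[8];
    std::uint32_t viewport_width;
};

// Compilation fails if someone later adds a non-trivial member
// (for example a std::string), which would make the memset below invalid.
static_assert(std::is_trivially_copyable<RegsSketch>::value,
              "RegsSketch must be trivially copyable to be cleared with memset");

// Clears every register to zero in one call.
inline void ResetRegs(RegsSketch& regs) {
    std::memset(&regs, 0, sizeof(regs));
}

// Small usage example.
inline void ResetExample() {
    RegsSketch regs;
    regs.viewport_width = 1280;
    ResetRegs(regs);
    assert(regs.viewport_width == 0);
}
```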
2019-05-14yuzu: Remove explicit types from locks where applicableLioncash2-2/+2
With C++17's deduction guides, the type doesn't need to be explicitly specified within locking primitives anymore.
2019-05-14video_core/gpu_thread: Remove redundant copy constructor for CommandDataContainerLioncash1-6/+0
std::move within a copy constructor (on a data member that isn't mutable) will always result in a copy. Because of that, the behavior of this copy constructor is identical to the one that would be generated automatically by the compiler, so we can remove it.
2019-05-12GPU/MMEInterpreter: Ignore the 'exit' flag when it's executed inside a delay slot.Sebastian Valle1-3/+3
It seems instructions marked with the 'exit' flag will not cause an exit when executed within a delay slot. This was hwtested by fincs.
2019-05-10video_core/memory_manager: Mark IsBlockContinuous() as a const member functionLioncash2-4/+4
Corrects the typo in its name and marks the function as a const member function, given it doesn't actually modify memory manager state.
2019-05-10video_core/memory_manager: Mark the constructor as explicitLioncash1-1/+1
Prevents implicit converting constructions of the memory manager.
2019-05-10video_core/memory_manager: Default the destructor within the cpp fileLioncash2-0/+3
Makes the class less surprising when it comes to forward declaring the type, and also prevents inlining the destruction code of the class, given it contains non-trivial types.
2019-05-10video_core/memory_manager: Amend doxygen commentsLioncash1-7/+7
Corrects references to non-existent parameters and corrects typos.
2019-05-10video_core/memory_manager: Remove superfluous const from function declarationsLioncash1-7/+7
These are able to be omitted from the declaration of functions, since they don't do anything at the type system level. The definitions of the functions can retain the use of const though, since they make the variables immutable in the implementation of the function where they're used.
2019-05-10video_core/renderer_opengl/gl_shader_cache: Correct member initialization orderLioncash1-1/+1
Silences a -Wreorder warning.
2019-05-10video_core/shader/decode/texture: Remove unused variable from GetTld4Code()Lioncash1-1/+0
2019-05-10renderer_vulkan/vk_shader_decompiler: Remove unused variable from DeclareInternalFlags()Lioncash1-1/+0
2019-05-10video_core/renderer_opengl/gl_shader_decompiler: Remove unused Composite() functionLioncash1-11/+0
This isn't used at all, so it can be removed.
2019-05-10video_core/renderer_opengl/gl_rasterizer_cache: Remove unused variable in UploadGLMipmapTexture()Lioncash1-1/+0
This variable is unused entirely, so it can be removed.
2019-05-10video_core/gpu_thread: Remove unused local variableLioncash1-1/+1
Instead of retrieving the data from the std::variant instance, we can just check if the variant contains that type of data. This is essentially the same behavior, only it returns a bool indicating whether or not the type in the variant is currently active, instead of actually retrieving the data.
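The check described above can be sketched with `std::holds_alternative` (C++17). The command types below are hypothetical placeholders, not the GPU thread's real command set.

```cpp
#include <cassert>
#include <variant>

// Hypothetical command payloads.
struct SubmitListSketch {
    int entries;
};
struct EndProcessingSketch {};

using CommandSketch = std::variant<SubmitListSketch, EndProcessingSketch>;

// Only asks which alternative is active. No pointer or reference to the
// payload is retrieved, so there is no unused local variable.
inline bool IsEndCommand(const CommandSketch& command) {
    return std::holds_alternative<EndProcessingSketch>(command);
}

// Small usage example.
inline void CommandExample() {
    const CommandSketch submit{SubmitListSketch{3}};
    const CommandSketch end{EndProcessingSketch{}};
    assert(!IsEndCommand(submit));
    assert(IsEndCommand(end));
}
```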
2019-05-10video_core/textures/astc: Remove unused variablesLioncash1-6/+2
Silences a few compilation warnings.
2019-05-07Correct possible error on Rasterizer CachesFernando Sahmkow1-1/+2
There was a weird bug that could happen if the object died directly and the cache address wasn't stored.
2019-05-04shader/decode/texture: Remove unused variableLioncash1-1/+0
This isn't used anywhere, so we can get rid of it.
2019-05-04gl_rasterizer: Silence unused variable warningLioncash1-2/+2
Makes use of src, so it's not considered unused.
2019-05-03shader_ir/other: Implement IPA.IDXReinUsesLisp2-5/+9
2019-05-03gl_shader_decompiler: Skip physical unused attributesReinUsesLisp1-18/+27
2019-05-03shader_ir/memory: Assert on non-32 bits ALD.PHYSReinUsesLisp1-0/+3
2019-05-03shader: Add physical attributes commentariesReinUsesLisp4-4/+8
2019-05-03gl_shader_decompiler: Implement GLSL physical attributesReinUsesLisp2-66/+101
2019-05-03shader_ir/memory: Implement physical input attributesReinUsesLisp4-6/+32
2019-05-03gl_shader_decompiler: Abstract generic attribute operationsReinUsesLisp1-29/+26
2019-05-03gl_shader_decompiler: Declare all possible varyings on physical attribute usageReinUsesLisp4-27/+88
2019-05-03shader: Remove unused AbufNode Ipa modeReinUsesLisp6-35/+14
2019-05-03shader_ir/memory: Emit AL2P IRReinUsesLisp2-0/+22
2019-05-03shader_bytecode: Add AL2P decodingReinUsesLisp1-2/+15
2019-05-01Refactors and name corrections.Fernando Sahmkow6-35/+35
2019-05-01gl_shader_disk_cache: Skip stored shader variants instead of assertingReinUsesLisp1-1/+4
Instead of asserting on already stored shader variants, silently skip them. This shouldn't be happening but when a shader is invalidated and it is not stored in the shader cache, this assert would hit and save that shader anyways when the asserts are disabled.
2019-05-01Fix Layered ASTC TexturesFernando Sahmkow1-1/+3
By adding the missing layer offset in ASTC compression.
2019-04-26shader_ir: Move Sampler index entry in operator< to sort declarationsReinUsesLisp1-2/+2
2019-04-26shader_ir: Add missing entry to Sampler operator< comparisonReinUsesLisp1-2/+3
2019-04-26shader_ir/texture: Fix sampler const buffer key shiftReinUsesLisp1-1/+1
2019-04-23Re added new lines at the end of filesFreddyFunk2-2/+2
2019-04-23gl_shader_disk_cache: Compress precompiled shader cache file with Zstandardunknown1-6/+10
2019-04-23gl_shader_disk_cache: Use VectorVfsFile for the virtual precompiled shader cache fileunknown3-101/+168
2019-04-23gl_shader_disk_cache: Remove per shader compressionunknown2-45/+11
2019-04-23Fixes and Corrections to DMA EngineFernando Sahmkow2-37/+57
2019-04-23Add Swizzle Parameters to the DMA engineFernando Sahmkow2-2/+27
2019-04-23Add Documentation Headers to all the GPU EnginesFernando Sahmkow5-0/+29
2019-04-23Corrections and stylingFernando Sahmkow5-6/+9
2019-04-23Implement Maxwell3D Data UploadFernando Sahmkow2-3/+32
2019-04-23Introduce skeleton of the GPU Compute Engine.Fernando Sahmkow3-8/+202
2019-04-23Revamp Kepler Memory to use a subengine to manage uploadsFernando Sahmkow6-93/+134
2019-04-21Rasterizer Cache: Use a temporary storage for Surfaces loading/flushing.Fernando Sahmkow4-18/+30
This PR should heavily reduce memory usage since temporary buffers are no longer stored per Surface but instead managed by the Rasterizer Cache.
2019-04-21Correct Half Float operations on const buffers and implement saturation.Fernando Sahmkow2-15/+16
2019-04-20Apply Position Y DirectionFernando Sahmkow1-0/+3
2019-04-20RasterizerCache Redesign: Flush Fernando Sahmkow6-17/+26
Flushing is now the responsibility of child caches instead of the cache object. This change allows each specific cache to pass extra parameters on flushing and gives more flexibility.
2019-04-20Make ReadBlockUnsafe and WriteBlockUnsafe ignore invalid pages.Fernando Sahmkow1-4/+12
2019-04-19gl_state: Fix samplers memory corruptionReinUsesLisp1-3/+5
It was possible for "samplers" to be read without being written. This addresses that.
2019-04-18video_core: Silence -Wswitch warningsReinUsesLisp10-77/+106
2019-04-17Implement IsBlockContinousFernando Sahmkow2-2/+13
This detects when a GPU memory block is not continuous within host CPU memory.
2019-04-16Apply Const correctness to SwizzleKepler and replace u32 for size_t on iterators.Fernando Sahmkow2-9/+12
2019-04-16Use ReadBlockUnsafe for fetching DMA CommandListsFernando Sahmkow2-4/+2
2019-04-16Document unsafe versions and add BlockCopyUnsafeFernando Sahmkow3-16/+45
2019-04-16Use ReadBlockUnsafe for Shader CacheFernando Sahmkow1-5/+7
2019-04-16Use ReadBlockUnsafe on TIC and TSC readingFernando Sahmkow2-2/+4
Use ReadBlockUnsafe on TIC and TSC reading as memory is never flushed from host GPU there.
2019-04-16GPU MemoryManager: Implement ReadBlockUnsafe and WriteBlockUnsafeFernando Sahmkow2-0/+34
2019-04-16Use WriteBlock and ReadBlock.Fernando Sahmkow1-10/+6
2019-04-16Implement Block Linear copies in Kepler Memory.Fernando Sahmkow3-5/+38
2019-04-16vk_shader_decompiler: Add missing operationsReinUsesLisp1-0/+7
2019-04-16shader_ir/decode: Fix half float pre-operations and remove MetaHalfArithmeticReinUsesLisp9-85/+72
Operations done before the main half float operation (like HAdd) were managing a packed value instead of the unpacked one. Adding an unpacked operation allows us to drop the per-operand MetaHalfArithmetic entry, simplifying the code overall.
2019-04-16gl_shader_decompiler: Fix MrgH0 decompilationReinUsesLisp1-2/+2
GLSL decompilation for HMergeH0 was wrong. This addresses that issue.
2019-04-16shader_ir/decode: Implement half float saturationReinUsesLisp5-8/+31
2019-04-16shader_ir/decode: Reduce severity of unimplemented half-float FTZReinUsesLisp3-3/+9
2019-04-16renderer_opengl: Implement half float NaN comparisonsReinUsesLisp3-36/+59
2019-04-16shader_ir: Avoid using static on heap-allocated objectsReinUsesLisp1-5/+4
Using static here might be faster at runtime, but it adds a heap allocation called before main.
2019-04-16Do some corrections in conversion shader instructions.Fernando Sahmkow2-23/+73
Corrects encodings for I2F, F2F, I2I and F2I. Implements immediate variants of all four conversion types. Adds assertions for unimplemented features.
2019-04-15Correct Kepler Memory on Linear Pushes.Fernando Sahmkow2-16/+48
2019-04-15Support compressed formats on linear textures.Fernando Sahmkow1-2/+5
2019-04-15Correct Pitch in Fermi2DFernando Sahmkow1-4/+1
2019-04-14gl_shader_decompiler: Use variable AOFFI on supported hardwareReinUsesLisp10-71/+102
2019-04-14shader_ir: Implement STG, keep track of global memory usage and flushReinUsesLisp11-89/+186
2019-04-12video_core/gpu: Create threads separately from initializationLioncash9-14/+47
Like with CPU emulation, we generally don't want to fire off the threads immediately after the relevant classes are initialized, we want to do this after all necessary data is done loading first. This splits the thread creation into its own interface member function to allow controlling when these threads in particular get created.
2019-04-11gl_rasterizer_cache: Relax restrictions on FastCopySurface and FastLayeredCopySurfaceFernando Sahmkow1-4/+10
2019-04-11gl_shader_manager: Move code to source file and minor clean upReinUsesLisp2-34/+61
2019-04-10gl_rasterizer: Apply just the needed state on ClearReinUsesLisp1-4/+4
2019-04-10gl_device: Implement interface and add uniform offset alignmentReinUsesLisp5-13/+70
2019-04-10vk_shader_decompiler: Implement flow primitivesReinUsesLisp1-5/+82
2019-04-10vk_shader_decompiler: Implement most common texture primitivesReinUsesLisp1-8/+65
2019-04-10vk_shader_decompiler: Implement texture decompilation helper functionsReinUsesLisp1-0/+32
2019-04-10vk_shader_decompiler: Implement Assign and LogicalAssignReinUsesLisp1-2/+64
2019-04-10vk_shader_decompiler: Implement non-OperationCode visitsReinUsesLisp1-7/+129
2019-04-10vk_shader_decompiler: Implement OperationCode decompilation interfaceReinUsesLisp1-1/+411
2019-04-10vk_shader_decompiler: Implement VisitReinUsesLisp1-1/+50
2019-04-10vk_shader_decompiler: Implement labels tree and flowReinUsesLisp1-0/+71
2019-04-10vk_shader_decompiler: Implement declarationsReinUsesLisp1-3/+457
2019-04-10vk_shader_decompiler: Declare and stub interface for a SPIR-V decompilerReinUsesLisp3-0/+127
2019-04-10video_core: Add sirit as optional dependency with VulkanReinUsesLisp1-1/+4
sirit is a runtime assembler for SPIR-V
2019-04-10Remove bounding in LD_CFernando Sahmkow1-2/+1
2019-04-09Correct Fermi Copy on Linear Textures.Fernando Sahmkow1-0/+4
2019-04-09Implement Texture Format ZF32_X24S8.Fernando Sahmkow1-0/+2
2019-04-09Correct depth compare with color formats for R32FFernando Sahmkow1-2/+17
2019-04-08gl_backend: Align Pixel StorageFernando Sahmkow2-4/+12
This commit makes sure GL reads on the correct pack size for the respective texture buffer.
2019-04-08Correct LOP_IMN encodingFernando Sahmkow1-1/+1
2019-04-08Correct XMAD mode, psl and high_b on different encodings.Fernando Sahmkow2-9/+33
2019-04-08Adapt Bindless to work with AOFFIFernando Sahmkow1-7/+18
2019-04-08Move ConstBufferAccessor to Maxwell3d, correct mistakes and clang format.Fernando Sahmkow9-44/+25
2019-04-08Fix bad rebaseFernando Sahmkow1-2/+1
2019-04-08Fix TMMLFernando Sahmkow1-5/+7
2019-04-08Simplify ConstBufferAccessorFernando Sahmkow5-53/+22
2019-04-08Refactor GetTextureCode and GetTexCode to use an optional instead of optional parametersFernando Sahmkow2-34/+33
2019-04-08Implement TXQ_BFernando Sahmkow2-2/+10
2019-04-08Implement TMML_BFernando Sahmkow1-5/+10
2019-04-08Corrections to TEX_BFernando Sahmkow2-4/+37
2019-04-08Fixes to Const Buffer Accessor and FormattingFernando Sahmkow3-10/+10
2019-04-08Implement Bindless Handling on SetupTextureFernando Sahmkow4-18/+34
2019-04-08Unify both sampler types.Fernando Sahmkow4-22/+48
2019-04-08Implement Bindless Samplers and TEX_B in the IR.Fernando Sahmkow4-16/+77
2019-04-08Implement Const Buffer AccessorFernando Sahmkow5-2/+65
2019-04-07Permit a Null Shader in case of a bad host_ptr.Fernando Sahmkow1-0/+4
2019-04-06maxwell_3d: Reduce severity of ProcessSyncPointReinUsesLisp1-2/+2
2019-04-06video_core/textures/convert: Replace include with a forward declarationLioncash2-1/+5
Avoids dragging in a direct dependency in a header.
2019-04-06video_core/textures/texture: Remove unnecessary includesLioncash6-2/+5
Nothing in this header relies on common_funcs or the memory manager. This gets rid of reliance on indirect inclusions in the OpenGL caches.
2019-04-06memory_manager: Improved implementation of read/write/copy block.bunnei3-12/+84
- Fixes graphical issues with Chocobo's Mystery Dungeon EVERY BUDDY! - Fixes a crash with Mario Tennis Aces
2019-04-06video_core/macro_interpreter: Remove assertion within FetchParameter()Lioncash1-2/+1
We can just use .at(), which essentially does the same thing, but with less code.
2019-04-06video_core/macro_interpreter: Simplify GetRegister()Lioncash1-11/+6
Given we already ensure nothing can set the zeroth register in SetRegister(), we don't need to check if the index is zero and special case it. We can just access the register normally, since it's already going to be zero. We can also replace the assertion with .at() to perform the equivalent behavior inline as part of the API.
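A minimal sketch of this reasoning, assuming a hypothetical `RegisterFile` class with eight registers (the real interpreter's layout is not shown in this log). Note that `.at()` reports a bad index by throwing `std::out_of_range`, whereas an assertion would abort.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>

// Hypothetical stand-in for the interpreter's register file.
class RegisterFile {
public:
    // Writes to register 0 are dropped, so it always reads back as zero.
    void SetRegister(std::size_t index, std::uint32_t value) {
        if (index == 0) {
            return;
        }
        registers.at(index) = value; // throws std::out_of_range on a bad index
    }

    // No special case for index 0 is needed: nothing can have modified it.
    std::uint32_t GetRegister(std::size_t index) const {
        return registers.at(index);
    }

private:
    std::array<std::uint32_t, 8> registers{}; // value-initialized to zero
};
```

Because the invariant is enforced on the write path, the read path stays a single bounds-checked access.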
2019-04-06video_core/memory_manager: Make Read() a const qualified member functionLioncash2-6/+6
Given this doesn't actually alter internal state, this can be made a const member function.
2019-04-06video_core/memory_manager: Make ReadBlock() a const qualified member functionLioncash2-2/+2
Now, since we have a const qualified variant of GetPointer(), we can put it to use in ReadBlock() to retrieve the source pointer that is passed into memcpy. Now block reading may be done from a const context.
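The pattern can be sketched as follows. This `MemoryManager` is a hypothetical, heavily simplified stand-in backed by a flat 16-byte array; only the const and non-const overload pairing is the point.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical, simplified memory manager for illustration only.
class MemoryManager {
public:
    // Mutable access for non-const callers.
    std::uint8_t* GetPointer(std::size_t addr) {
        return &backing.at(addr);
    }

    // Read-only access, selected when called on a const object.
    const std::uint8_t* GetPointer(std::size_t addr) const {
        return &backing.at(addr);
    }

    // Can be const because it only needs the read-only overload above.
    void ReadBlock(std::size_t src_addr, void* dest, std::size_t size) const {
        assert(src_addr < backing.size() && size <= backing.size() - src_addr);
        std::memcpy(dest, GetPointer(src_addr), size);
    }

private:
    std::array<std::uint8_t, 16> backing{};
};
```

Inside the const `ReadBlock()`, overload resolution picks the const `GetPointer()`, so no `const_cast` is needed.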
2019-04-06video_core/memory_manager: Add a const qualified variant of GetPointer()Lioncash2-2/+17
Allows retrieving read-only pointers from a const context externally.
2019-04-06video_core/memory_manager: Make FindFreeRegion() a const member functionLioncash2-10/+11
This doesn't modify internal state, so it can be made a const member function.
2019-04-06video_core/memory_manager: Make GpuToCpuAddress() a const member functionLioncash2-3/+3
This doesn't modify any internal state, so it can be made a const member function to allow its use in const contexts.
2019-04-06Implement SyncPoint Register in the GPU.Fernando Sahmkow2-1/+27
2019-04-06video_core/gpu_thread: Silence truncation warning in ThreadManager's constructorLioncash1-1/+1
Since c5d41fd812d7eb1a04f36b76c08fe971cee0868c callback parameters were changed to use an s64 to represent late cycles instead of an int, so this was causing a truncation warning to occur here. Changing it to s64 is sufficient to silence the warning.
2019-04-06video_core/engines: Make memory manager members privateLioncash9-13/+14
These aren't used externally by anything, so they can be made private data members.
2019-04-06video_core/engines: Remove unnecessary inclusions where applicableLioncash10-9/+25
Replaces header inclusions with forward declarations where applicable and also removes unused headers within the cpp file. This reduces a few more dependencies on core/memory.h
2019-04-06renderer_opengl/utils: Skip empty bindsReinUsesLisp1-0/+3
2019-04-06gl_rasterizer: Use ARB_multi_bind to update SSBOsReinUsesLisp2-9/+9
2019-04-06gl_rasterizer: Use ARB_multi_bind to update UBOs across stagesReinUsesLisp4-22/+58
2019-04-05gl_shader_decompiler: Rename GenerateTemporal() to GenerateTemporary()Lioncash1-12/+12
Temporal generally indicates a relation to time, but this is just creating a temporary, so this isn't really an accurate name for what the function is actually doing.
2019-04-05gl_shader_decompiler: Fix TXQ typesReinUsesLisp1-2/+3
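TXQ returns integer types. Shaders usually do: R0 = TXQ(); // => int, followed by R0 = static_cast<float>(R0); If we don't treat the TXQ result as an integer, the register ends up holding the bit pattern of a float, and the shader's own int-to-float conversion then reads those bits as an integer, resulting in a corrupted number.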
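A small sketch of why the type matters, assuming a simplified model in which a shader register is just 32 raw bits. All names here are hypothetical and do not come from the decompiler.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical model: a shader register is 32 raw bits.
using Register = std::uint32_t;

// Returns the bit pattern of a float without changing it.
std::uint32_t FloatBitsToUint(float value) {
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof(bits));
    return bits;
}

// Wrong: the integer result is numerically converted to float before being stored.
Register StoreTxqAsFloat(std::int32_t size) {
    return FloatBitsToUint(static_cast<float>(size));
}

// Right: the integer bits are stored unchanged.
Register StoreTxqAsInt(std::int32_t size) {
    return static_cast<Register>(size);
}

// The shader's own conversion: reads the register as a signed integer.
float ShaderIntToFloat(Register reg) {
    return static_cast<float>(static_cast<std::int32_t>(reg));
}
```

With a texture size of 256, the integer path yields 256.0f, while the float path feeds the bit pattern 0x43800000 into the conversion and produces a value above one billion.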
2019-04-04video_core/renderer_opengl: Remove unnecessary includesLioncash13-24/+4
Quite a few unused includes have built up over time, particularly on core/memory.h. Removing these includes means the source files including those files will no longer need to be rebuilt if they're changed, making compilation slightly faster in this scenario.
2019-04-04gl_state: Rework to enable individual appliesReinUsesLisp3-339/+324
2019-04-03shader_ir/memory: Reduce severity of LD_L cache management and log itReinUsesLisp2-2/+9
2019-04-03shader_ir/memory: Reduce severity of ST_L cache management and log itReinUsesLisp2-3/+11
2019-04-03gl_shader_decompiler: Return early when an operation is invalidReinUsesLisp1-1/+6
2019-04-02gl_sampler_cache: Port sampler cache to OpenGLReinUsesLisp5-123/+82
2019-04-02video_core: Abstract vk_sampler_cache into a templated classReinUsesLisp5-58/+101
2019-04-02gpu_thread: Improve synchronization by using CoreTiming.bunnei3-51/+65
2019-04-01general: Use deduction guides for std::lock_guard and std::unique_lockLioncash4-17/+17
Since C++17, the introduction of deduction guides for locking facilities means that we no longer need to hardcode the mutex type into the locks themselves, making it easier to switch mutex types, should it ever be necessary in the future.
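As an illustration, the two spellings look like this. The `Counter` class is hypothetical; compiling it requires C++17 or later.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical class used only to show both lock spellings.
class Counter {
public:
    // C++17: the mutex type is deduced from the constructor argument.
    int Increment() {
        std::lock_guard lock{mutex};
        return ++value;
    }

    // Pre-C++17 spelling: the mutex type is written out by hand.
    int IncrementExplicit() {
        std::lock_guard<std::mutex> lock{mutex};
        return ++value;
    }

private:
    std::mutex mutex;
    int value = 0;
};
```

If `mutex` were later changed to another mutex type, only `IncrementExplicit()` would need editing.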
2019-03-31gl_shader_decompiler: Hide local definitions inside an anonymous namespaceReinUsesLisp1-6/+8
2019-03-31shader_ir/decode: Silence implicit sign conversion warningMat M1-2/+2
Co-Authored-By: ReinUsesLisp <reinuseslisp@airmail.cc>
2019-03-30gl_shader_decompiler: Add AOFFI backing implementationReinUsesLisp1-38/+85
2019-03-30shader_ir/decode: Implement AOFFI for TEX and TLD4ReinUsesLisp2-27/+94
2019-03-30shader_ir: Implement immediate register trackingReinUsesLisp2-1/+19
2019-03-29common/zstd_compression: simplify decompression interfaceunknown1-3/+2
2019-03-29gl_shader_disk_cache: Fixup clang formatunknown1-2/+3
2019-03-29gl_shader_disk_cache: Use Zstandard for compressionunknown1-6/+6
2019-03-29gl_shader_disk_cache: Use LZ4HC with compression level 9 instead of compression level 12 for less compression timeunknown1-3/+3
2019-03-29Addressed feedbackunknown1-6/+6
2019-03-29gl_shader_disk_cache: Use better compression for transferable and precompiled shader disk cache filesunknown1-2/+2
2019-03-29data_compression: Move LZ4 compression from video_core/gl_shader_disk_cache to common/data_compressionunknown2-39/+9
2019-03-29vk_swapchain: Implement a swapchain managerReinUsesLisp3-1/+305
2019-03-28gl_shader_manager: Remove unnecessary gl_shader_manager inclusionLioncash1-2/+0
This isn't used at all in the OpenGL shader cache, so we can remove its include here, meaning one less file needs to be recompiled if any changes ever occur within that header. core/memory.h is also not used within this file at all, so we can remove it as well.
2019-03-28gl_shader_manager: Move using statement into the cpp fileLioncash2-4/+4
Avoids introducing Maxwell3D into the namespace for everything that includes the header.
2019-03-28gl_shader_manager: Remove reliance on global accessor within MaxwellUniformData::SetFromRegs()Lioncash3-9/+9
We can just pass in the Maxwell3D instance instead of going through the system class to get at it. This also lets us simplify the interface a little bit. Since we pass in the Maxwell3D context now, we only really need to pass the shader stage index value in.
2019-03-27gl_shader_manager: Amend Doxygen string for MaxwellUniformDataLioncash1-3/+3
Previously only one line of the whole comment was in proper Doxygen formatting.
2019-03-27gpu_thread: Remove unused dma_pusher class member variable from ThreadManagerLioncash2-5/+2
The pusher instance is only ever used in the constructor of the ThreadManager for creating the thread that the ThreadManager instance contains. Aside from that, the member is unused, so it can be removed.
2019-03-27gl_rasterizer: Remove unused reference member variable from RasterizerOpenGLLioncash3-9/+5
This member variable is no longer being used, so it can be removed, removing a dependency on EmuWindow from the rasterizer's interface.
2019-03-27video_core: Amend constructor initializer list order where applicableLioncash6-14/+14
Specifies the members in the same order that initialization would take place in. This also silences -Wreorder warnings.
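A short sketch of why the order matters, using a hypothetical `StagingBuffer` class: members are always initialized in declaration order, whatever order the initializer list is written in.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical class for illustration.
class StagingBuffer {
public:
    // The initializer list matches the declaration order below, so reading
    // `size` while constructing `buffer` is safe and -Wreorder stays quiet.
    explicit StagingBuffer(std::size_t size_) : size{size_}, buffer(size) {}

    std::size_t Size() const {
        return buffer.size();
    }

private:
    std::size_t size;                  // declared first, initialized first
    std::vector<unsigned char> buffer; // declared second, may depend on `size`
};
```

Had `buffer` been declared before `size`, it would be constructed from an uninitialized `size` even with the initializer list written as above, which is the kind of mistake the warning exists to flag.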
2019-03-27video_core: Add missing override specifiersLioncash3-4/+4
Ensures that the signatures will always match with the base class. Also silences a few compilation warnings.
2019-03-27video_core/gpu: Amend typo in GPU member variable nameLioncash2-7/+8
smaphore -> semaphore
2019-03-22video_core: Implement API agnostic view based texture cacheReinUsesLisp3-0/+974
Implements an API-agnostic, view-based texture cache. Classes defined here are intended to be inherited by the API implementation and used in API-specific code. This implementation exposes protected virtual functions to be called from the implementer. Before executing any surface copy method (defined in API-specific code), it tries to detect whether the overlapping surface is a superset and, if it is, creates a view. A view references a subset of a surface; it can also be a superset view (the same as referencing the whole texture). The current code manages 1D, 1D array, 2D, 2D array, cube map and cube map array textures with layer and mipmap level views. Views of 3D texture slices are not implemented. If the view attempt fails, the fast path (defined in the implementer) is invoked with the overlapping textures. If that also fails (returning nullptr), the texture is flushed and reloaded.
2019-03-22Revert "Devirtualize Register/Unregister and use a wrapper instead."bunnei3-8/+12
- Fixes graphical issues from transitions in Super Mario Odyssey.
2019-03-21memory_manager: Cleanup FindFreeRegion.bunnei2-12/+6
2019-03-21memory_manager: Use Common::AlignUp in public interface as needed.bunnei1-11/+22
2019-03-21memory_manager: Bug fixes and further cleanup.bunnei2-73/+72
2019-03-21maxwell_dma: Check for valid source in destination before copy.bunnei1-0/+10
- Avoid a crash in Octopath Traveler.
2019-03-21memory_manager: Add protections for invalid GPU addresses.bunnei2-22/+43
- Avoid a crash in Xenoblade Chronicles 2.
2019-03-21gl_rasterizer_cache: Check that backing memory is valid before creating a surface.bunnei2-15/+12
- Fixes a crash in Puyo Puyo Tetris.
2019-03-21gpu: Rewrite virtual memory manager using PageTable.bunnei10-201/+472
2019-03-21gpu: Move GPUVAddr definition to common_types.bunnei13-31/+24
2019-03-17gl_rasterizer: Skip zero addr/sized regions on flush/invalidate.bunnei1-0/+6
2019-03-16memory: Simplify rasterizer cache operations.bunnei1-2/+1
2019-03-16video_core: Refactor to use MemoryManager interface for all memory access.bunnei19-186/+194
# Conflicts: # src/video_core/engines/kepler_memory.cpp # src/video_core/engines/maxwell_3d.cpp # src/video_core/morton.cpp # src/video_core/morton.h # src/video_core/renderer_opengl/gl_global_cache.cpp # src/video_core/renderer_opengl/gl_global_cache.h # src/video_core/renderer_opengl/gl_rasterizer_cache.cpp
2019-03-15gpu: Use host address for caching instead of guest address.bunnei24-288/+384
2019-03-13video_core/morton: Use enum to describe MortonCopyPixels128 modeReinUsesLisp3-7/+10
2019-03-13video_core/morton: Remove unused parameter in MortonSwizzleReinUsesLisp3-8/+7
2019-03-13video_core/morton: Remove clang-format off when it's not neededReinUsesLisp1-133/+129
2019-03-13video_core/morton: Remove unused functionsReinUsesLisp1-39/+0
2019-03-13video_core/texture: Fix up sampler lod biasReinUsesLisp1-1/+1
2019-03-13vk_sampler_cache: Use operator== instead of memcmpMat M1-1/+1
Co-Authored-By: ReinUsesLisp <reinuseslisp@airmail.cc>
2019-03-13vk_sampler_cache: Implement a sampler cacheReinUsesLisp4-1/+140
2019-03-12video_core/texture: Add a raw representation of TSCEntryReinUsesLisp1-24/+29
2019-03-11renderer_opengl/gl_global_cache: Replace indexing for assignment with insert_or_assignLioncash2-3/+3
The previous code had some minor issues with it, really not a big deal, but amending it is basically 'free', so I figured, "why not?". With the standard container maps, when map[key] = thing; is done, this can cause potentially undesirable behavior in certain scenarios. In particular, if there's no value associated with the key, then the map constructs a default initialized instance of the value type. In this case, since the value type is a std::shared_ptr (as a type alias), this will construct an empty std::shared_ptr and then assign over it (with objects that are quite large, or that actively heap allocate, this can be extremely undesirable). We also make the function take the region by value, as we can avoid a copy (and by extension with std::shared_ptr, a copy causes an atomic reference count increment) in certain scenarios when ownership isn't a concern (i.e. when ReserveGlobalRegion is called with an rvalue, no copy occurs at all). So, it's more-or-less a "free" gain without many downsides.
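A minimal sketch of both points, assuming a hypothetical `Region` type and a simplified `GlobalCache`. Only the name `ReserveGlobalRegion` comes from the commit message; the `Find` helper and everything else are illustrative.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <unordered_map>
#include <utility>

// Hypothetical region type and alias for illustration.
struct Region {
    int id;
};
using GlobalRegion = std::shared_ptr<Region>;

class GlobalCache {
public:
    // By-value parameter: an rvalue argument is moved in without touching the
    // reference count; an lvalue argument is copied exactly once.
    void ReserveGlobalRegion(std::uint64_t addr, GlobalRegion region) {
        // Inserts or overwrites directly, with no default-constructed placeholder.
        reserve.insert_or_assign(addr, std::move(region));
    }

    // Illustrative lookup helper; returns an empty pointer when nothing is stored.
    GlobalRegion Find(std::uint64_t addr) const {
        const auto it = reserve.find(addr);
        if (it == reserve.end()) {
            return nullptr;
        }
        return it->second;
    }

private:
    std::unordered_map<std::uint64_t, GlobalRegion> reserve;
};
```

Passing an lvalue raises the reference count once (the copy into the parameter), while passing a temporary such as the result of `std::make_shared` transfers ownership straight into the map.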
2019-03-11renderer_opengl/gl_global_cache: Append missing override specifiersLioncash1-2/+2
Two of the functions here are overridden functions, so we can append these specifiers to make it explicit.
2019-03-11gl_rasterizer: Use system instance passed from argumentReinUsesLisp2-29/+31
2019-03-09gl_rasterizer: Encapsulate sampler queries into methodsReinUsesLisp3-64/+72
2019-03-09gl_rasterizer: Minor logger changesReinUsesLisp1-19/+13
2019-03-08dma_pusher: Store command_list_header by copyReinUsesLisp1-1/+1
Instead of holding a reference that will get invalidated by dma_pushbuffer.pop(), hold it as a copy. This doesn't have any performance cost since CommandListHeader is 8 bytes long.
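The hazard can be sketched with a plain `std::queue`. The `CommandListHeader` layout and the `PopAndRead` helper are hypothetical simplifications, not the actual dma_pusher code.

```cpp
#include <cassert>
#include <cstdint>
#include <queue>

// Hypothetical 8-byte header, mirroring the size mentioned in the commit message.
struct CommandListHeader {
    std::uint64_t raw;
};
static_assert(sizeof(CommandListHeader) == 8, "header is expected to be 8 bytes");

// Copies the front header before popping. Binding a reference to front() here
// would leave it dangling once pop() destroys the element.
std::uint64_t PopAndRead(std::queue<CommandListHeader>& pushbuffer) {
    assert(!pushbuffer.empty());
    const CommandListHeader header = pushbuffer.front(); // copy, not a reference
    pushbuffer.pop();
    return header.raw;
}
```

Copying eight bytes costs about as much as holding a reference, so the safer form comes at no measurable expense.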
2019-03-07video_core/gpu_thread: Remove unimplemented WaitForIdle function prototypeLioncash1-3/+0
This function didn't have a definition, so we can remove it to prevent accidentally attempting to use it.
2019-03-07video_core/gpu_thread: Amend constructor initializer list orderLioncash1-2/+2
Moves the data members to satisfy the order they're declared as in the constructor initializer list. Silences a -Wreorder warning.
2019-03-07video_core/gpu: Make GPU's destructor virtualLioncash3-3/+3
Because of the recent separation of GPU functionality into sync/async variants, we need to mark the destructor virtual to provide proper destruction behavior, given we use the base class within the System class. Prior to this, it was undefined behavior whether or not the destructor in the derived classes would ever execute.
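A minimal sketch of the situation, with hypothetical class bodies. Only the idea of a base GPU class with derived variants, owned through a base-class pointer, is taken from the commit message.

```cpp
#include <cassert>
#include <memory>

// Base class owned through a base pointer, so its destructor must be virtual.
class GPU {
public:
    virtual ~GPU() = default;
};

// Hypothetical derived variant that records when its destructor runs.
class GPUAsynch : public GPU {
public:
    explicit GPUAsynch(bool& destroyed_) : destroyed{destroyed_} {}
    ~GPUAsynch() override {
        destroyed = true;
    }

private:
    bool& destroyed;
};

// Destroys a derived object through a base-class pointer and reports whether
// the derived destructor ran.
bool DerivedDestructorRuns() {
    bool destroyed = false;
    {
        std::unique_ptr<GPU> gpu = std::make_unique<GPUAsynch>(destroyed);
    }
    return destroyed;
}
```

Without `virtual` on `~GPU()`, deleting through the base pointer would be undefined behavior, and the derived destructor might never execute.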
2019-03-07gpu_thread: Fix deadlock with threading idle state check.bunnei2-7/+11
2019-03-07gpu_thread: (HACK) Ignore flush on FlushAndInvalidateRegion.bunnei1-3/+1
2019-03-07gpu: Always flush.bunnei2-13/+6
2019-03-07gpu: Refactor a/synchronous implementations into their own classes.bunnei7-63/+155
2019-03-07gpu: Move command processing to another thread.bunnei7-10/+353
2019-03-07gpu: Refactor command and swap buffers interface for asynch.bunnei2-3/+22
2019-03-07gpu: Refactor to take RendererBase instead of RasterizerInterface.bunnei2-17/+22
2019-03-06video_core/engines: Remove unnecessary includesLioncash10-11/+11
Removes a few unnecessary dependencies on core-related machinery, such as the core.h and memory.h, which reduces the amount of rebuilding necessary if those files change. This also uncovered some indirect dependencies within other source files. This also fixes those.
2019-03-05video_core/surface: Remove obsolete TODO in PixelFormatFromRenderTargetFormat()Lioncash1-2/+0
This isn't needed anymore, according to Hexagon
2019-03-04video_core/renderer_opengl: Replace direct usage of global system object accessorsLioncash2-11/+17
We already pass a reference to the system object to the constructor of the renderer, so we can just use that instead of using the global accessor functions.
2019-03-04maxwell_to_vk: Initial implementationReinUsesLisp4-3/+553
2019-03-02vk_buffer_cache: Fix clang-formatReinUsesLisp1-3/+3
2019-03-02fuck git for ruining my day, I will learn but I will not forgivebunnei1-1/+1
2019-03-01vk_buffer_cache: Implement a buffer cacheReinUsesLisp3-0/+205
This buffer cache is just like OpenGL's buffer cache with some minor style changes. It uses VKStreamBuffer.
2019-02-28gl_rasterizer: Remove texture unbinding after dispatching a draw callReinUsesLisp1-12/+0
Unbinding was required when OpenGL delete operations didn't unbind a resource if it was bound. This is no longer needed and can be removed.
2019-02-28gl_state: Fixup multibind bugReinUsesLisp1-2/+2
2019-02-28Devirtualize Register/Unregister and use a wrapper instead.Fernando Sahmkow3-12/+8
2019-02-28Corrections and redesign.Fernando Sahmkow2-51/+51
2019-02-28Fix linux compile error.Fernando Sahmkow1-1/+1
2019-02-28Remove NotifyFrameBuffer as we are doing a texception pass every drawcall.Fernando Sahmkow2-25/+0
2019-02-28Remove certain optimizations that caused texception to fail in certain scenarios.Fernando Sahmkow3-24/+1
2019-02-28Bug fixes and formattingFernando Sahmkow2-3/+4
2019-02-28rasterizer_cache_gl: Implement Texception PassFernando Sahmkow3-0/+51
2019-02-28rasterizer_cache_gl: Implement Partial Reinterpretation of Surfaces.Fernando Sahmkow2-0/+100
2019-02-28rasterizer_cache: mark reinterpreted surfaces and add ability to reload marked surfaces on next use.Fernando Sahmkow2-0/+78
2019-02-28rasterizer_cache_gl: Notify on framebuffer changeFernando Sahmkow2-4/+23
2019-02-28rasterizer_cache: Expose FlushObject to Child classes and allow redefining of Register and UnregisterFernando Sahmkow1-11/+11
2019-02-27gl_rasterizer_cache: Create texture views for array discrepanciesReinUsesLisp3-32/+42
When a texture is sampled in a shader with a different array mode than the cached state, create a texture view and bind that to the shader instead.
2019-02-27vk_memory_manager: Reorder constructor initializer list in terms of member declaration orderLioncash1-1/+1
Reorders members in the order that they would actually be initialized in. Silences a -Wreorder warning.
2019-02-27gl_rasterizer: Reorder constructor initializer list in terms of member declaration orderLioncash1-2/+2
Orders the members in the order they would actually be initialized in. Silences a -Wreorder warning.
2019-02-27gl_shader_disk_cache: Remove #pragma once from cpp fileLioncash1-2/+0
This is only necessary in headers. Silences a warning with clang.
2019-02-27common/math_util: Move contents into the Common namespaceLioncash9-23/+23
These types are within the common library, so they should be within the Common namespace.
2019-02-27gl_rasterizer_cache: Move format conversion to its own fileReinUsesLisp7-136/+175
2019-02-27decoders: Minor style changesReinUsesLisp2-14/+8
2019-02-26renderer_opengl: Update pixel format trackingReinUsesLisp1-0/+1
2019-02-26maxwell_3d: Use std::bitset to manage dirty flagsReinUsesLisp4-52/+51
2019-02-26vk_stream_buffer: Remove copy code pathReinUsesLisp2-53/+18
2019-02-26shader/decode: Remove extras from MetaTextureReinUsesLisp4-40/+65
2019-02-26shader/decode: Split memory and texture instructions decodingReinUsesLisp6-501/+537
2019-02-25shader/track: Resolve variable shadowing warningsLioncash1-5/+5
2019-02-24vk_stream_buffer: Implement a stream bufferReinUsesLisp3-1/+200
This manages two kinds of streaming buffers: one for unified memory models and one for dedicated GPUs. The first one skips the copy from the staging buffer to the real buffer, since it creates a unified buffer. This implementation waits for all fences to finish their operation before "invalidating". This is suboptimal since it should allocate another buffer or start searching from the beginning. There is room for improvement here. This could also handle AMD's "pinned" memory (a heap with 256 MiB) that seems to be designed for buffer streaming.
2019-02-24vk_resource_manager: Minor VKFenceWatch changesReinUsesLisp2-7/+7
2019-02-24vk_memory_manager: Fixup commit interval allocationReinUsesLisp1-2/+1
VKMemoryCommitImpl was using as the end of its interval "begin + end". That ended up wasting memory.
2019-02-24gl_rasterizer_cache: Fixup parameter order in layered swizzleReinUsesLisp1-1/+1
2019-02-22vk_scheduler: Implement a schedulerReinUsesLisp3-1/+132
The scheduler abstracts command buffer and fence management with an interface that's able to do OpenGL-like operations on Vulkan command buffers. It returns by value a command buffer and fence that have to be used for subsequent operations until Flush or Finish is executed; after that, the current execution context (the pair of command buffers and fences) gets invalidated and a new one must be fetched. Thankfully, validation layers will quickly detect if this is skipped, throwing an error due to modifications to a sent command buffer.
2019-02-19video_core/dma_pusher: Simplify Step() logic.Markus Wick2-81/+77
As fetching command list headers and the list of command headers is a fixed 1:1 relation now, they can be implemented within a single call. This cleans up the Step() logic quite a bit.
2019-02-19video_core/dma_pusher: Fetch the full list of headers at once.Markus Wick2-48/+58
Fetching every u32 from memory leads to a big overhead. So let's fetch all of them as a block if possible. This reduces the Memory::* calls by the dma_pusher by a factor of 10.
2019-02-19vk_memory_manager: Implement memory managerReinUsesLisp3-0/+342
A memory manager object handles the memory allocations for a device. It allocates chunks of Vulkan memory objects and then suballocates.
2019-02-16video_core: Remove usages of System::GetInstance() within the enginesLioncash8-22/+48
Avoids the use of the global accessor in favor of explicitly making the system a dependency within the interface.
2019-02-16core_timing: Convert core timing into a classLioncash3-3/+4
Gets rid of the largest set of mutable global state within the core. This also paves a way for eliminating usages of GetInstance() on the System class as a follow-up. Note that no behavioral changes have been made, and this simply extracts the functionality into a class. This also has the benefit of making dependencies on the core timing functionality explicit within the relevant interfaces.
2019-02-15renderer_opengl: respect the sRGB colorspace for the screenshot featurefearlessTobi1-1/+2
Previously, we were completely ignoring for screenshots whether the game uses RGB or sRGB. This resulted in screenshot colors that looked off for some titles.
2019-02-15gl_state: Synchronize gl_state even when state is disabledReinUsesLisp1-83/+61
There are some potential edge cases where gl_state may fail to track the state if a related state changes while the toggle is disabled or it didn't change. This addresses that.
2019-02-14vk_resource_manager: Implement a command buffer pool with VKFencedPoolReinUsesLisp2-1/+59
2019-02-14vk_resource_manager: Add VKFencedPool interfaceReinUsesLisp2-0/+83
Handles a pool of resources protected by fences. Manages resource overflow allocating more resources. This class is intended to be used through inheritance.
2019-02-14vk_resource_manager: Implement VKResourceManager and fence allocatorReinUsesLisp2-0/+85
CommitFence iterates a pool of fences until one is found. If all fences are being used at the same time, allocate more.
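The allocation strategy described here can be sketched roughly as follows. `Fence`, `FenceAllocator`, the `in_use` flag and the growth step of 4 are all assumptions for illustration; the real implementation works with Vulkan fence objects and may search the pool differently.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical fence: only tracks whether it is currently committed.
struct Fence {
    bool in_use = false;
};

class FenceAllocator {
public:
    // Returns a free fence, growing the pool when every fence is busy.
    Fence& CommitFence() {
        for (auto& fence : fences) {
            if (!fence->in_use) {
                fence->in_use = true;
                return *fence;
            }
        }
        // All fences are in use: allocate a new batch and hand out its first entry.
        const std::size_t old_size = fences.size();
        for (std::size_t i = 0; i < GROW_STEP; ++i) {
            fences.push_back(std::make_unique<Fence>());
        }
        Fence& fence = *fences[old_size];
        fence.in_use = true;
        return fence;
    }

    std::size_t Size() const {
        return fences.size();
    }

private:
    static constexpr std::size_t GROW_STEP = 4; // assumed batch size
    // unique_ptr keeps fence addresses stable when the vector reallocates.
    std::vector<std::unique_ptr<Fence>> fences;
};
```

Storing the fences behind `std::unique_ptr` means references handed out earlier stay valid after the pool grows.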
2019-02-14vk_resource_manager: Implement VKFenceWatchReinUsesLisp2-0/+68
A fence watch is used to keep track of the usage of a fence and protect a resource or set of resources without having to inherit from their handlers.
2019-02-14vk_resource_manager: Implement VKFenceReinUsesLisp2-0/+131
Fences take ownership of objects, protecting them from GPU-side or driver-side concurrent access. They must be committed from the resource manager. Their usage flow is: commit the fence from the resource manager, protect resources with it and use them, send the fence to an execution queue and Wait for it if needed and then call Release. Used resources will automatically be signaled when they are free to be reused.
2019-02-14vk_resource_manager: Add VKResource interfaceReinUsesLisp3-1/+43
VKResource is an interface that gets signaled by a fence when it is free to be reused.
2019-02-14shader_decompiler: Improve Accuracy of Attribute Interpolation.Fernando Sahmkow6-38/+74
2019-02-13rasterizer_cache_gl: Only do fast layered copy on the same format. AsFernando Sahmkow1-1/+5
glCopyImageSubData does not support different formats.
2019-02-13vk_device: Abstract device handling into a classReinUsesLisp3-1/+351
VKDevice contains all the data required to manage and initialize a physical device. Its intention is to be passed across Vulkan objects to query device-specific data (for example the logical device and the dispatch loader).
2019-02-13renderer_opengl: Remove reference to global system instanceLioncash1-3/+3
We already store a reference to the system instance that the renderer is created with, so we don't need to refer to the system instance via Core::System::GetInstance()
2019-02-12gl_rasterizer_cache: Remove unnecessary newlineLioncash1-2/+0
2019-02-12gl_rasterizer_cache: Get rid of variable shadowingLioncash1-6/+14
Avoids shadowing the members of the struct itself, which results in a -Wshadow warning.
2019-02-12renderer_vulkan: Add declarations fileReinUsesLisp2-0/+52
This file is intended to be included instead of vulkan/vulkan.hpp. It includes declarations of unique handlers using a dynamic dispatcher instead of a static one (which would require linking to a Vulkan library).
2019-02-12gl_shader_decompiler: Re-implement TLDS lodReinUsesLisp2-22/+35
2019-02-12core_timing: Rename CoreTiming namespace to Core::TimingLioncash3-3/+3
Places all of the timing-related functionality under the existing Core namespace to keep things consistent, rather than having the timing utilities sitting in its own completely separate namespace.
2019-02-11Corrected F2I None mode to RoundEven.Fernando Sahmkow2-4/+4
2019-02-11Fix incorrect value for CC bit in IADDFernando Sahmkow1-2/+2
2019-02-10kepler_compute: Fixup assert and rename enginesReinUsesLisp6-52/+59
When I originally added the compute assert I used the wrong documentation. This addresses that. The dispatch register was tested with homebrew against hardware and is triggered by some games (e.g. Super Mario Odyssey). What exactly is missing to get a valid program bound by this engine requires more investigation.
2019-02-09Implement BGRA8 framebuffer formatgreggameplayer3-0/+4
2019-02-09Implement linear textures (#2089)Fernando Sahmkow2-5/+39
2019-02-08gl_rasterizer_cache: Fixup texture view parametersReinUsesLisp1-2/+2
These parameters were declared as constants and passed to glTextureView but then they were removed on a rebase. This addresses that mistake.
2019-02-07shader_ir: Remove F4 prefix to texture operationsReinUsesLisp3-26/+25
This was originally included because texture operations returned a vec4. These operations now return a single float and the F4 prefix doesn't mean anything.
2019-02-07shader_ir: Clean texture management codeReinUsesLisp3-133/+104
Previous code relied on GLSL parameter order (something that's always ill-formed on an IR design). This approach passes spatial coordinates through operation nodes and array and depth compare values in the texture metadata. It still contains an "extra" vector containing generic nodes for bias and component index (for example) which is still a bit ill-formed but it should be better than the previous approach.
2019-02-07gl_rasterizer_cache: Mark surface copy destinations as modified.bunnei2-4/+18
2019-02-07gl_rasterizer: Implement a more accurate fermi 2D copy.bunnei7-68/+188
- This is a blit, use the blit registers.
2019-02-07gl_shader_disk_cache: Check LZ4 size limitFrederic L1-0/+4
Co-Authored-By: ReinUsesLisp <reinuseslisp@airmail.cc>
2019-02-07gl_shader_disk_cache: Consider compressed size zero as an errorFrederic L1-2/+2
Co-Authored-By: ReinUsesLisp <reinuseslisp@airmail.cc>
2019-02-07gl_shader_disk_cache: Use unordered containersReinUsesLisp4-56/+64
2019-02-07gl_shader_cache: Fixup GLSL unique identifiersReinUsesLisp2-3/+3
2019-02-07gl_shader_cache: Link loading screen with disk shader cache loadReinUsesLisp5-9/+40
2019-02-07gl_shader_cache: Set GL_PROGRAM_SEPARABLE to dumped shadersReinUsesLisp1-0/+1
i965 (and probably all mesa drivers) require GL_PROGRAM_SEPARABLE when using glProgramBinary. This is probably required by the standard but it's ignored by permissive proprietary drivers.
2019-02-07gl_shader_disk_cache: Pass core system as argument and guard against games without title idsReinUsesLisp10-17/+57
2019-02-07gl_shader_disk_cache: Guard reads and writes against failureReinUsesLisp2-216/+339
2019-02-07gl_shader_disk_cache: Address miscellaneous feedbackReinUsesLisp5-43/+57
2019-02-07gl_shader_disk_cache: Pass return values returning instead of by parametersReinUsesLisp3-39/+37
2019-02-07gl_shader_disk_cache: Compress program binaries using LZ4ReinUsesLisp1-7/+28
2019-02-07gl_shader_disk_cache: Compress GLSL code using LZ4ReinUsesLisp2-6/+57
2019-02-07gl_shader_disk_cache: Save GLSL and entries into the precompiled fileReinUsesLisp9-135/+234
2019-02-07settings: Hide shader cache behind a settingReinUsesLisp1-0/+21
2019-02-07gl_shader_disk_cache: Invalidate shader cache changes with CMake hashReinUsesLisp1-7/+16
2019-02-07gl_shader_cache: Refactor to support disk shader cacheReinUsesLisp2-121/+388
2019-02-07gl_shader_disk_cache: Add transferable cache invalidationReinUsesLisp2-0/+8
2019-02-07gl_shader_disk_cache: Add precompiled loadReinUsesLisp2-0/+45
2019-02-07gl_shader_disk_cache: Add precompiled saveReinUsesLisp2-0/+57
2019-02-07gl_shader_disk_cache: Add transferable loadReinUsesLisp2-0/+56
2019-02-07gl_shader_disk_cache: Add transferable storesReinUsesLisp2-0/+194
2019-02-07gl_shader_disk_cache: Add ShaderDiskCacheOpenGL class and helpersReinUsesLisp2-0/+76
2019-02-07gl_shader_disk_cache: Add file and move BaseBindings declarationReinUsesLisp4-10/+58
2019-02-07gl_shader_decompiler: Remove name entriesReinUsesLisp2-28/+10
2019-02-07gl_shader_util: Add parameter to handle retrievable programsReinUsesLisp3-6/+10
2019-02-07rasterizer_interface: Add disk cache entry for the rasterizerReinUsesLisp5-0/+14
2019-02-07shader_decode: Implement LDG and basic cbuf trackingReinUsesLisp1-0/+33
2019-02-05video_core/texture: Fix BitField size for depth_minus_oneReinUsesLisp1-1/+1
2019-02-04Update src/video_core/engines/shader_bytecode.hMat M1-1/+1
Co-Authored-By: FernandoS27 <fsahmkow27@gmail.com>
2019-02-03Fix TXQ not using the component mask.Fernando Sahmkow2-6/+13
2019-02-03shader_ir/memory: Add ST_L 64 and 128 bits storesReinUsesLisp1-3/+11
2019-02-03shader/track: Search inside of conditional nodesReinUsesLisp1-0/+11
Some games conditionally use global memory instructions. This allows the heuristic to search inside conditional nodes for the source constant buffer.
2019-02-03shader_ir: Rename BasicBlock to NodeBlockReinUsesLisp30-122/+120
It's not always used as a basic block. Rename it for consistency.
2019-02-03shader_ir: Pass decoded nodes as a whole instead of per basic blocksReinUsesLisp27-57/+62
Some games call LDG at the top of a basic block, making the tracking heuristic fail. This commit lets the heuristic search the decoded nodes as a whole instead of per basic block. This may lead to some false positives but allows the heuristic to track cases it previously couldn't.
2019-02-03video_core: Assert on invalid GPU to CPU address queriesReinUsesLisp8-47/+67
2019-02-03maxwell_3d: Allow sampler handles with TSC id zeroReinUsesLisp1-10/+6
2019-02-03maxwell_3d: Allow texture handles with TIC id zeroReinUsesLisp3-21/+7
Also remove "enabled" field from Tegra::Texture::FullTextureInfo because it would become unused.
2019-02-03memory_manager: Check for reserved page statusReinUsesLisp1-1/+2
2019-02-03shader_ir/memory: Add LD_L 128 bits loadsReinUsesLisp1-7/+19
2019-02-03shader_bytecode: Rename BytesN enums to BitsNReinUsesLisp2-7/+7
2019-02-03shader_ir/memory: Add LD_L 64 bits loadsReinUsesLisp1-6/+17
2019-02-01rasterizer_interface: Remove unused AccelerateFill operationReinUsesLisp3-11/+0
2019-02-01video_core: Remove unused Fill surface typeReinUsesLisp2-6/+1
2019-01-30gl_rasterizer_cache: Fixup test clauseReinUsesLisp1-6/+5
2019-01-30gl_rasterizer_cache: Guard clause swizzle testingMat M1-1/+3
Co-Authored-By: ReinUsesLisp <reinuseslisp@airmail.cc>
2019-01-30gl_state: Remove texture target trackingReinUsesLisp2-5/+0
2019-01-30gl_rasterizer_cache: Move swizzling to textures instead of stateReinUsesLisp6-28/+35
2019-01-30gl_state: Use DSA and multi bind to update texture bindingsReinUsesLisp1-8/+22
2019-01-30gl_rasterizer: Use DSA for texturesReinUsesLisp5-185/+105
2019-01-30video_core/dma_pusher: Silence C4828 warningsLioncash1-1/+1
This was previously causing: warning C4828: The file contains a character starting at offset 0xa33 that is illegal in the current source character set (codepage 65001). warnings on Windows when compiling yuzu.
2019-01-30shader_ir: Unify constant buffer offset valuesReinUsesLisp17-25/+36
Constant buffer values on the shader IR were using different offsets depending on whether the access was direct or indirect. cbuf34 has a non-multiplied offset while cbuf36 has a multiplied one. On shader decoding this commit multiplies it by four on cbuf34 queries.
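As a rough illustration of the unification described above, the sketch below normalizes both encodings to one unit. The helper names are hypothetical, and the reading that the cbuf34 field counts 32-bit words while the cbuf36 field is already multiplied is an assumption drawn only from this message.

```cpp
#include <cstdint>

// Hypothetical helpers; only the "multiply cbuf34 by four" rule comes from the commit text.
constexpr std::uint32_t UnifiedOffsetFromCbuf34(std::uint32_t raw_offset) {
    return raw_offset * 4; // non-multiplied encoding: scale it at decode time
}

constexpr std::uint32_t UnifiedOffsetFromCbuf36(std::uint32_t raw_offset) {
    return raw_offset; // already multiplied: pass through unchanged
}

// After normalization both encodings refer to the same location.
static_assert(UnifiedOffsetFromCbuf34(3) == UnifiedOffsetFromCbuf36(12),
              "both encodings should agree after unification");
```

Doing the scaling once at decode time means later passes never need to know which encoding an offset came from.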
2019-01-30gl_shader_cache: Use explicit bindingsReinUsesLisp7-249/+194
2019-01-30gl_rasterizer: Implement global memory managementReinUsesLisp6-4/+140
2019-01-30shader_decode: Implement LDG and basic cbuf trackingReinUsesLisp7-10/+240
2019-01-30video_core/GPU Implemented the GPU PFIFO puller semaphore operations. (#1908)Kevin2-12/+242
* Implemented the puller semaphore operations. * Nit: Fix 2 style issues * Nit: Add Break to default case. * Fix style. * Update for comments. Added ReferenceCount method * Forgot to remove GpuSmaphoreAddress union. * Fix the clang-format issues. * More clang formatting. * two more white spaces for the Clang formatting. * Move puller members into the regs union * Updated to use Memory::WriteBlock instead of Memory::Write* * Fix clang style issues * White space clang error * Removing unused functions and other pr comment * Removing unused functions and other pr comment * More union magic for setting regs value. * union magic refcnt as well * Remove local var * Set up the regs and regs_assert_positions up properly * Fix clang error
2019-01-30gl_shader_cache: Fix texture view for cubemaps as cubemap arraysReinUsesLisp4-3/+28
Cubemaps are considered layered and to create a texture view the texture mustn't be a layered texture, resulting in cubemaps being bound as cubemap arrays. To fix this issue this commit introduces an extra surface parameter called "is_array" and uses this to query for texture view creation. Now that texture views for cubemaps are actually being created, this also fixes the number of layers created for the texture view (since they have to be 6 to create a texture view of cubemaps).
2019-01-30gl_rasterizer: Workaround invalid zeta clearsReinUsesLisp2-14/+19
Some games (like Xenoblade Chronicles 2) clear both depth and stencil buffers while there's a depth-only texture attached (e.g. D16 Unorm). This commit reads the zeta format of the bound surface on ConfigureFramebuffers and returns if depth and/or stencil attachments were set. This is ignored on DrawArrays but on Clear it's used to just clear those attachments, bypassing an OpenGL error.
2019-01-28shader/shader_ir: Amend three comment typosLioncash1-3/+3
Given we're in the area, these are three trivial typos that can be corrected.
2019-01-28shader/shader_ir: Amend constructor initializer ordering for AbufNodeLioncash1-2/+2
Orders the class members in the same order that they would actually be initialized in. Gets rid of two compiler warnings.
2019-01-28shader/decode: Avoid a pessimizing std::move within DecodeRange()Lioncash1-1/+1
std::moveing a local variable in a return statement has the potential to prevent copy elision from occurring, so this can just be converted into a regular return.
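The effect described above can be observed with a small counter. This is an illustrative sketch, not the DecodeRange() code: `Tracked`, `MakePlain` and `MakePessimized` are hypothetical names, and whether the plain return is actually elided depends on the compiler.

```cpp
#include <utility>

// Counts how often an instance is move-constructed.
struct Tracked {
    explicit Tracked(int* counter) : moves{counter} {}
    Tracked(const Tracked&) = default;
    Tracked(Tracked&& other) noexcept : moves{other.moves} {
        ++*moves;
    }
    int* moves;
};

Tracked MakePlain(int* counter) {
    Tracked local{counter};
    return local; // eligible for copy elision; otherwise moved implicitly
}

Tracked MakePessimized(int* counter) {
    Tracked local{counter};
    return std::move(local); // the return expression is no longer a plain local name,
                             // so elision is not permitted and a move must happen
}
```

Calling both functions with separate counters should show at least one move for `MakePessimized` and never more moves for `MakePlain` than for the `std::move` variant. Some compilers also flag the second function with a pessimizing-move warning.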
2019-01-26video_core: Silence implicit conversion warningReinUsesLisp1-3/+4
2019-01-24frontend: Refactor ScopeAcquireWindowContext out of renderer_opengl.bunnei3-28/+2
2019-01-22maxwell_3d: Set rt_separate_frag_data to 1 by defaultReinUsesLisp2-4/+6
Commercial games assume that this value is 1 but they never set it. On the other hand nouveau manually sets this register. On ConfigureFramebuffers we were asserting for what we are actually implementing (according to envytools).
2019-01-21Rename step 1 and step 2 to be a little more descriptiveJames Rowe1-2/+2
2019-01-20QT: Upgrade the Loading Bar to look much betterJames Rowe1-0/+9
2019-01-18gl_rasterizer: Silence unsafe mix warningReinUsesLisp1-1/+1
2019-01-16shader_ir: Fixup clang buildReinUsesLisp1-4/+6
2019-01-15gl_shader_decompiler: replace std::get<> with std::get_if<> for macOS compatibilityReinUsesLisp1-44/+58
2019-01-15gl_shader_decompiler: Inline textureGather componentReinUsesLisp1-15/+16
2019-01-15shader_decode: Fixup XMADReinUsesLisp1-1/+1
2019-01-15shader_ir: Pass to decoder functions basic block's codeReinUsesLisp27-82/+83
2019-01-15shader_decode: Improve zero flag implementationReinUsesLisp15-75/+79
2019-01-15shader_ir: Remove composite primitives and use temporals insteadReinUsesLisp4-241/+224
2019-01-15gl_shader_decompiler: Fixup AssignCompositeHalfReinUsesLisp1-1/+1
2019-01-15shader_decode: Use proper primitive namesReinUsesLisp4-25/+21
2019-01-15shader_decode: Use BitfieldExtract instead of shift + andReinUsesLisp8-48/+37
2019-01-15shader_ir: Remove Ipa primitiveReinUsesLisp3-13/+2
2019-01-15gl_shader_decompiler: Use rasterizer's UBO size limitReinUsesLisp1-1/+3
2019-01-15gl_shader_gen: Fixup code formattingReinUsesLisp2-18/+22
2019-01-15video_core: Rename glsl_decompiler to gl_shader_decompilerReinUsesLisp7-7/+7
2019-01-15shader_ir: Remove RZ and use Register::ZeroIndex insteadReinUsesLisp3-12/+16
2019-01-15shader_decode: Implement TEXS.F16ReinUsesLisp3-15/+57
2019-01-15shader_decode: Fixup R2PReinUsesLisp1-2/+3
2019-01-15glsl_decompiler: Fixup TLDSReinUsesLisp1-1/+0
2019-01-15glsl_decompiler: Fixup geometry shadersReinUsesLisp2-15/+17
2019-01-15shader_decode: Fixup WriteLogicOperation zero comparisonReinUsesLisp1-1/+1
2019-01-15glsl_decompiler: Fixup permissive member function declarationsReinUsesLisp1-133/+133
2019-01-15shader_decode: Fixup PSETReinUsesLisp1-2/+3
2019-01-15shader_decode: Fixup clang-formatReinUsesLisp2-2/+4
2019-01-15video_core: Implement IR based geometry shadersReinUsesLisp4-10/+102
2019-01-15shader_decode: Implement VMAD and VSETPReinUsesLisp5-2/+129
2019-01-15shader_decode: Implement HSET2ReinUsesLisp3-1/+50
2019-01-15shader_decode: Rework HSETP2ReinUsesLisp4-47/+57
2019-01-15shader_decode: Implement R2PReinUsesLisp1-1/+28
2019-01-15shader_decode: Implement CSETPReinUsesLisp1-14/+37
2019-01-15shader_decode: Implement PSETReinUsesLisp1-1/+16
2019-01-15shader_decode: Implement HFMA2ReinUsesLisp4-5/+60
2019-01-15glsl_decompiler: Remove HNegate inliningReinUsesLisp1-10/+0
2019-01-15shader_decode: Implement POPCReinUsesLisp4-1/+22
2019-01-15shader_decode: Implement TLDS (untested)ReinUsesLisp3-10/+92
2019-01-15shader_decode: Update TLD4 reflecting #1862 changesReinUsesLisp2-52/+52
2019-01-15shader_ir: Fixup TEX and TEXS and partially fix TLD4 decompilingReinUsesLisp3-60/+72
2019-01-15shader_decode: Fixup FSETReinUsesLisp1-2/+2
2019-01-15shader_decode: Implement IADD32IReinUsesLisp1-0/+11
2019-01-15shader_decode: Fixup clang-formatReinUsesLisp1-1/+1
2019-01-15video_core: Return safe values after an assert hitsReinUsesLisp8-8/+19
2019-01-15shader_decode: Implement FFMAReinUsesLisp1-1/+36
2019-01-15video_core: Address feedbackReinUsesLisp4-13/+16
2019-01-15shader_ir: Fixup file inclusions and clang-formatReinUsesLisp3-2/+2
2019-01-15shader_ir: Move comment node stringMat M1-2/+2
Co-Authored-By: ReinUsesLisp <reinuseslisp@airmail.cc>
2019-01-15shader_ir: Address feedback to avoid UB in bit castingReinUsesLisp1-2/+4
2019-01-15shader_decode: Fixup clang-formatReinUsesLisp2-3/+2
2019-01-15shader_decode: Implement LEAReinUsesLisp1-0/+55
2019-01-15shader_decode: Implement IADD3ReinUsesLisp1-0/+61
2019-01-15shader_decode: Implement LOP3ReinUsesLisp2-0/+62
2019-01-15shader_decode: Implement ST_LReinUsesLisp1-0/+17
2019-01-15shader_decode: Implement LD_LReinUsesLisp1-0/+18
2019-01-15shader_decode: Implement HSETP2ReinUsesLisp1-1/+37
2019-01-15shader_decode: Implement HADD2 and HMUL2ReinUsesLisp1-1/+48
2019-01-15shader_decode: Implement HADD2_IMM and HMUL2_IMMReinUsesLisp1-1/+28
2019-01-15shader_decode: Implement MOV_SYSReinUsesLisp1-0/+27
2019-01-15shader_decode: Implement IMNMXReinUsesLisp1-0/+16
2019-01-15shader_decode: Implement F2F_CReinUsesLisp1-2/+10
2019-01-15shader_decode: Implement I2IReinUsesLisp1-0/+26
2019-01-15shader_decode: Implement BRA internal flagReinUsesLisp1-4/+8
2019-01-15shader_decode: Implement ISCADDReinUsesLisp1-0/+15
2019-01-15shader_decode: Implement XMADReinUsesLisp1-1/+85
2019-01-15shader_decode: Implement PBK and BRKReinUsesLisp1-1/+22
2019-01-15shader_decode: Implement LOPReinUsesLisp1-0/+15
2019-01-15shader_decode: Implement SELReinUsesLisp1-0/+8
2019-01-15shader_decode: Implement IADDReinUsesLisp1-1/+28
2019-01-15shader_decode: Implement ISETPReinUsesLisp1-1/+30
2019-01-15shader_decode: Implement BFIReinUsesLisp1-1/+22
2019-01-15shader_decode: Implement ISETReinUsesLisp1-1/+27
2019-01-15shader_decode: Implement LD_CReinUsesLisp1-0/+31
2019-01-15shader_decode: Implement SHLReinUsesLisp1-0/+8
2019-01-15shader_decode: Implement SHRReinUsesLisp1-1/+26
2019-01-15shader_decode: Implement LOP32IReinUsesLisp2-1/+72
2019-01-15shader_decode: Implement BFEReinUsesLisp1-1/+25
2019-01-15shader_decode: Implement FSETReinUsesLisp1-1/+36
2019-01-15shader_decode: Implement F2IReinUsesLisp1-0/+37
2019-01-15shader_decode: Implement I2FReinUsesLisp1-0/+23
2019-01-15shader_decode: Implement F2FReinUsesLisp1-1/+37
2019-01-15shader_decode: Stub DEPBARReinUsesLisp1-0/+4
2019-01-15shader_decode: Implement SSY and SYNCReinUsesLisp1-0/+19
2019-01-15shader_decode: Implement PSETPReinUsesLisp1-1/+21
2019-01-15shader_decode: Implement TMMLReinUsesLisp1-3/+45
2019-01-15shader_decode: Implement TEX and TXQReinUsesLisp2-0/+223
2019-01-15shader_decode: Implement TEXS (F32)ReinUsesLisp2-0/+217
2019-01-15shader_decode: Implement FSETPReinUsesLisp1-1/+33
2019-01-15shader_decode: Partially implement BRAReinUsesLisp1-0/+12
2019-01-15shader_decode: Implement IPAReinUsesLisp1-0/+12
2019-01-15shader_decode: Implement EXITReinUsesLisp1-1/+32
2019-01-15shader_decode: Implement ST_AReinUsesLisp1-0/+30
2019-01-15shader_decode: Implement LD_AReinUsesLisp1-1/+39
2019-01-15shader_decode: Implement FADD32IReinUsesLisp1-0/+12
2019-01-15shader_decode: Implement FMUL32_IMMReinUsesLisp1-0/+10
2019-01-15shader_decode: Implement MOV32_IMMReinUsesLisp1-1/+9
2019-01-15shader_decode: Stub RRO_C, RRO_R and RRO_IMMReinUsesLisp1-0/+9
2019-01-15shader_decode: Implement FMNMX_C, FMNMX_R and FMNMX_IMMReinUsesLisp1-0/+18
2019-01-15shader_decode: Implement MUFUReinUsesLisp1-0/+29
2019-01-15shader_decode: Implement FADD_C, FADD_R and FADD_IMMReinUsesLisp1-0/+15
2019-01-15shader_decode: Implement FMUL_C, FMUL_R and FMUL_IMMReinUsesLisp1-0/+42
2019-01-15shader_decode: Implement MOV_C and MOV_RReinUsesLisp1-1/+23
2019-01-15video_core: Replace gl_shader_decompilerReinUsesLisp8-4185/+57
2019-01-15glsl_decompiler: ImplementationReinUsesLisp3-0/+1483
2019-01-15shader_ir: Add condition code helperReinUsesLisp2-0/+13
2019-01-15shader_ir: Add predicate combiner helperReinUsesLisp2-0/+15
2019-01-15shader_ir: Add comparison helpersReinUsesLisp2-0/+106
2019-01-15shader_ir: Add half float helpersReinUsesLisp2-0/+44
2019-01-15shader_ir: Add integer helpersReinUsesLisp2-0/+40
2019-01-15shader_ir: Add float helpersReinUsesLisp2-0/+24
2019-01-15shader_ir: Add settersReinUsesLisp2-0/+24
2019-01-15shader_ir: Add local memory gettersReinUsesLisp2-0/+7
2019-01-15shader_ir: Add internal flag gettersReinUsesLisp2-0/+10
2019-01-15shader_ir: Add attribute gettersReinUsesLisp2-0/+26
2019-01-15shader_ir: Add constant buffer gettersReinUsesLisp2-0/+25
2019-01-15shader_ir: Add register getterReinUsesLisp2-0/+9
2019-01-15shader_ir: Add immediate node constructorsReinUsesLisp2-1/+34
2019-01-15shader_ir: Initial implementationReinUsesLisp30-0/+1573
2019-01-15shader_bytecode: Fixup encodingReinUsesLisp1-1/+1
2019-01-15shader_header: Make local memory size getter constantReinUsesLisp1-1/+1
2019-01-09gl_rasterizer: Workaround Intel VAO DSA bugReinUsesLisp3-7/+16
There is a bug on Intel's blob driver where it fails to properly build a vertex array object if it's not bound even after creating it with glCreateVertexArrays. This workaround binds it after creating it to bypass the issue.
2019-01-08gl_global_cache: Add dummy global cache managerReinUsesLisp5-3/+96
2019-01-07gl_rasterizer: Skip framebuffer configuration if rendertargets have not been changedReinUsesLisp2-1/+31
2019-01-07gl_rasterizer_cache: Use dirty flags for the depth bufferReinUsesLisp4-3/+23
2019-01-07gl_rasterizer_cache: Use dirty flags for color buffersReinUsesLisp4-4/+24
2019-01-07gl_shader_cache: Use dirty flags for shadersReinUsesLisp5-2/+23
2019-01-06gl_stream_buffer: Use DSA for buffer managementReinUsesLisp3-17/+14
2019-01-06gl_rasterizer: Use DSA for vertex array objectsReinUsesLisp6-79/+53
2019-01-06gl_state: Drop uniform buffer state trackingReinUsesLisp3-10/+0
2019-01-05gl_rasterizer_cache: Use GL_STREAM_COPY for PBOsReinUsesLisp1-1/+1
Since the data follows the path CPU -> GPU -> GPU, copy is the most appropriate hint. Using GL_STREAM_DRAW generated a performance warning on Nvidia's stack. Changing this hint removed the warning.
2018-12-30gl_rasterizer_cache: Texture view if shader samples array but OGL is notReinUsesLisp3-14/+74
When a shader samples a texture array but that texture in OpenGL is created without layers, use a texture view to increase the texture hierarchy. For example, instead of binding a GL_TEXTURE_2D bind a GL_TEXTURE_2D_ARRAY view.
2018-12-28gpu: Remove PixelFormat G8R8U and G8R8S, as they do not seem to exist.bunnei4-79/+46
- Fixes UI rendering issues in The Legend of Zelda: Breath of the Wild.
2018-12-27Add missing uintBitsToFloat to SetRegisterToHalfFloatRodolfo Bogado1-2/+2
2018-12-26renderer_opengl: Correct forward declaration of FramebufferLayoutLioncash1-1/+1
This is actually a struct, not a class, which can lead to compilation warnings.
2018-12-26Apply CC test to the final value to be stored in the registerRodolfo Bogado1-9/+12
2018-12-26Fixed shader linking error due to TLDS (#1934)David1-1/+1
* Fixed shader linking error due to TLDS coord should be coords * Fix remaining coords
2018-12-26shader_bytecode: Fixup TEXS.F16 encodingReinUsesLisp1-1/+1
2018-12-22Include saturation in the evaluation of the control codeRodolfo Bogado1-3/+4
2018-12-22Handle RZ cases evaluating the expression instead of the register value.Rodolfo Bogado1-14/+22
2018-12-22complete emulation of ZeroFlagRodolfo Bogado1-100/+97
2018-12-19hopefully fix clang format issueDavid Marcec1-0/+1
2018-12-19Fixed uninitialized memory due to missing returns in canaryDavid Marcec10-3/+29
Functions which are supposed to crash on non canary builds usually don't return anything, which leads to uninitialized memory being used.
2018-12-18yuzu, video_core: Screenshot functionalityzhupengfei6-4/+95
Allows capturing screenshot at the current internal resolution (native for software renderer), but a setting is available to capture it in other resolutions. The screenshot is saved to a single PNG in the current layout.
2018-12-18Texture format fixes: Flag RGBA16UI as GL_RGBA_INTEGER format, and interpret R16U as Z16 when depth_compare is enabled.heapo1-1/+11
2018-12-18shader_bytecode: Fixup half float's operator B encodingReinUsesLisp1-1/+1
2018-12-17Implement postfactor multiplication/division for fmul instructionsheapo2-5/+21
2018-12-17Fix arrayed shadow sampler array slice/depth comparison ordering, as well as invalid GLSL LOD selection.heapo1-16/+14
2018-12-11gl_shader_cache: Dehardcode constant in CalculateProgramSize()Lioncash1-2/+2
This constant is related to the size of the instruction.
2018-12-11gl_shader_cache: Resolve truncation compiler warningLioncash1-1/+1
The previous code would cause a warning, as it was truncating size_t (64-bit) to a u32 (32-bit) implicitly.
2018-12-10gl_shader_decompiler: IPA FrontFacing: the right value when is the front face is 0xFFFFFFFF.Marcos Vitali1-1/+1
2018-12-09Implemented a shader unique identifier.Fernando Sahmkow4-0/+57
2018-12-09Add more info into textures' object labelsFernandoS272-2/+57
2018-12-07gl_shader_decompiler: TLDS/TLD4/TLD4S Reworked reflecting the source registers, bugs fixed and modularize.Marcos Vitali1-106/+134
2018-12-05gl_shader_decompiler: Implement TEXS.F16ReinUsesLisp2-13/+51
2018-12-05gl_shader_decompiler: Fixup inverted ifReinUsesLisp1-6/+5
2018-12-05Improve msvc codegen for hot-path array LUTsheapo1-275/+277
In some constexpr functions, msvc is building the LUT at runtime (pushing each element onto the stack) out of an abundance of caution. Moving the arrays to be file-scoped constexprs avoids this and turns the functions into simple look-ups as intended.
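A minimal sketch of the pattern described above, assuming C++14 or later. The table contents and the names `BYTES_PER_PIXEL_TABLE` and `GetBytesPerPixel` are made up for illustration and are not the tables touched by this commit.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// File-scoped table: built once at compile time instead of inside the function body.
// The values are hypothetical placeholders.
constexpr std::array<std::uint32_t, 4> BYTES_PER_PIXEL_TABLE = {4, 2, 1, 8};

// The function is now a plain indexed load, with a fallback for unknown indices.
constexpr std::uint32_t GetBytesPerPixel(std::size_t format) {
    return format < BYTES_PER_PIXEL_TABLE.size() ? BYTES_PER_PIXEL_TABLE[format] : 0;
}

static_assert(GetBytesPerPixel(3) == 8, "lookup is usable in constant expressions");
```

Keeping the array outside the function gives the compiler a single object to index, rather than a local that some compilers may rebuild on every call.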
2018-12-04Rewrote TEX/TEXS (TEX Scalar). (#1826)Marcos1-259/+177
* Rewrote TEX/TEXS (TEX Scalar). * Style fixes. * Styles issues.
2018-12-04Removed unused file.Subv1-142/+0
This is a leftover from #1792
2018-12-04GPU: Don't try to route PFIFO methods (0-0x40) to the other engines.Subv1-0/+6
2018-12-01Fix debug buildLioncash1-4/+2
A non-existent parameter was left in some formatting calls (the logging macro for which only does anything meaningful on debug builds)
2018-11-30gl_rasterizer_cache: Update AccurateCopySurface to flush complete source surface.bunnei1-1/+4
- Fixes issues with Breath of the Wild with use_accurate_gpu_emulation setting.
2018-11-29gl_rasterizer: Enable clip distances when set in register and in shaderReinUsesLisp5-13/+37
2018-11-29gl_rasterizer: Implement a framebuffer cacheReinUsesLisp2-40/+82
2018-11-29gl_shader_manager: Update pipeline when programs have changedReinUsesLisp1-4/+17
2018-11-29gl_rasterizer_cache: Remove BlitSurface and replace with more accurate copy.bunnei1-144/+1
- BlitSurface with different texture targets is inherently broken. - When target is the same, we can just use FastCopySurface. - Fixes rendering issues with Breath of the Wild.
2018-11-29gl_shader_decompiler: Remove texture temporal in TLD4ReinUsesLisp1-3/+1
2018-11-29gl_shader_decompiler: Flip negated if else statementReinUsesLisp1-3/+3
2018-11-29gl_shader_decompiler: Use GLSL scope on instructions unrelated to texturesReinUsesLisp1-35/+10
2018-11-29gl_shader_decompiler: Move texture code generation into lambdasReinUsesLisp1-97/+78
2018-11-29gl_shader_decompiler: Clean up texture instructionsReinUsesLisp1-87/+56
2018-11-29gl_shader_decompiler: Scope GLSL variables with a scoped objectReinUsesLisp1-32/+72
2018-11-29gl_rasterizer: Signal UNIMPLEMENTED when rt_separate_frag_data is not zeroReinUsesLisp1-1/+1
2018-11-29gl_rasterizer_cache: Use brackets for two-line single-expression blocksReinUsesLisp1-1/+2
2018-11-29gl_rasterizer: Remove unused struct declarationsReinUsesLisp1-14/+0
2018-11-29gl_rasterizer: Remove extension booleansReinUsesLisp2-16/+0
2018-11-28dma_pushbuffer: Optimize to avoid loop and copy on Push.bunnei2-5/+17
2018-11-28gpu: Move command list profiling to DmaPusher::DispatchCalls.bunnei2-5/+5
2018-11-27gl_shader_decompiler: Fixup clip distance indexReinUsesLisp1-1/+1
2018-11-27gl_rasterizer: Fixup for #1723.Markus Wick1-1/+1
On invalidating the streaming buffer, we need to reupload all vertex buffers. But we don't need to reconfigure the vertex format. This was a (silly) mistake in #1723. Thanks to Rodrigo for discovering the issue. Fun fact, as configuring the vertex format also invalidates the vertex buffer, this mistake had no effect on the behavior.
2018-11-27gpu: Rewrite GPU command list processing with DmaPusher class.bunnei17-105/+343
- More accurate impl., fixes Undertale (among other games).
2018-11-27remove viewport_transform_enabled as it seems to be inactive when valid transforms are used.Rodolfo Bogado1-12/+5
2018-11-27morton: Fixup compiler warningReinUsesLisp1-1/+2
2018-11-27Implement depth clampRodolfo Bogado5-10/+58
2018-11-27Add support for Clip Distance enabled registerRodolfo Bogado3-3/+26
2018-11-27GPU States: Implement Polygon Offset. This is used in SMO all the time. (#1784)Marcos5-5/+107
* GPU States: Implement Polygon Offset. This is used in SMO all the time. * Clang Format fixes. * Initialize polygon_offset in the constructor.
2018-11-26Implemented Tile Width SpacingFernandoS278-36/+55
2018-11-25Limit the amount of viewports tested for state changes only to the usable onesRodolfo Bogado1-2/+10
2018-11-25gl_shader_decompiler: Implement S2R's Y_DIRECTIONReinUsesLisp5-16/+26
2018-11-25morton: Style changesReinUsesLisp1-12/+12
2018-11-25video_core: Move morton functions to their own fileReinUsesLisp6-345/+391
2018-11-24Fix Texture OverlappingFernandoS271-43/+70
2018-11-24Implemented BRA CC conditional and FSET CC SettingFernandoS271-4/+14
2018-11-24Add support for viewport_transfom_enable registerRodolfo Bogado2-6/+22
2018-11-24Add support for clear_flags registerRodolfo Bogado5-28/+95
2018-11-24Fix TEXS Instruction encodingsFernandoS271-22/+48
2018-11-24Fix one encoding in TEX InstructionFernandoS271-3/+3
2018-11-24Corrected inputs indexing in TEX instructionFernandoS271-66/+85
2018-11-23memory_manager: Do not allow 0 to be a valid GPUVAddr.bunnei2-1/+9
- Fixes a bug with Undertale using 0 for a render target.
2018-11-23Added predicate comparison LessEqualWithNan (#1736)Hexagon122-5/+13
* Added predicate comparison LessEqualWithNan * oops * Clang fix
2018-11-23gl_shader_decompiler: Implement clip distancesReinUsesLisp3-21/+58
2018-11-22gl_shader_decompiler: Add a message for unimplemented cc generationReinUsesLisp1-23/+46
2018-11-22macro_interpreter: Implement AddWithCarry and SubtractWithBorrow.bunnei2-8/+25
- Used by Undertale.
2018-11-22maxwell_3d: Implement alternate blend equations.bunnei2-0/+12
- Used by Undertale.
2018-11-22gl_shader_decompiler: Rename internal flag stringsReinUsesLisp1-15/+20
2018-11-22gl_shader_decompiler: Rename control codes to condition codesReinUsesLisp2-67/+50
2018-11-22gl_shader_decompiler: Fix register overwriting on texture callsReinUsesLisp1-60/+78
2018-11-21Properly Implemented TXQ InstructionFernandoS271-2/+12
2018-11-21gl_shader_decompiler: Implement BFI_IMM_RReinUsesLisp2-0/+23
2018-11-21Removed pre 4.3 ARB extensionsFernandoS275-20/+13
2018-11-21Use default values for unknown framebuffer pixel formatFernandoS272-0/+8
2018-11-21gl_shader_decompiler: Implement R2P_IMMReinUsesLisp2-0/+42
2018-11-21gl_shader_decompiler: Remove UNREACHABLE when setting RZReinUsesLisp1-2/+1
2018-11-21gl_shader_decompiler: Use UNIMPLEMENTED instead of LOG+UNREACHABLE when applicableReinUsesLisp1-371/+258
2018-11-21maxwell_3d: Initialize rasterizer color mask registers as enabled.bunnei1-0/+9
- Fixes rendering regression with Sonic Mania.
2018-11-20shader_cache: Only lock covered instructions.Markus Wick4-8/+24
2018-11-20Implemented Fast Layered CopyFernandoS272-2/+30
2018-11-19Eliminated unnecessary memory allocation and copy (#1702)Frederic L3-9/+20
2018-11-19gl_rasterizer: Remove default clip distanceReinUsesLisp1-2/+0
2018-11-18drop support for non separate alpha as it seems to cause issues in some gamesRodolfo Bogado3-61/+35
2018-11-17fix sampler configuration, thanks to Marcos for his investigationRodolfo Bogado3-19/+57
2018-11-17small type fixRodolfo Bogado1-6/+6
2018-11-17small fix for alphaToOne bit locationRodolfo Bogado1-2/+2
2018-11-17fix for gcc compilationRodolfo Bogado1-60/+61
2018-11-17add AlphaToCoverage and AlphaToOneRodolfo Bogado5-1/+39
2018-11-17add support for fragment_color_clampRodolfo Bogado5-1/+24
2018-11-17add missing MirrorOnceBorder support where supportedRodolfo Bogado1-0/+6
2018-11-17set border color not depending on the wrap modeRodolfo Bogado1-9/+9
only enable color mask for the first framebuffer if independent blending is disabled
2018-11-17set default value for point size registerRodolfo Bogado2-5/+4
2018-11-17fix viewport and scissor behaviorRodolfo Bogado6-64/+89
2018-11-17gl_rasterizer: Skip VB upload if the state is clean.Markus Wick9-6/+60
2018-11-17textures/decoders: Replace magic numbersFrederic Laing1-37/+33
2018-11-15textures/decoders: Minor cleanupFrederic Laing1-16/+16
2018-11-15gl_rasterizer_cache: Minor cleanupFrederic Laing1-3/+3
2018-11-13video_core/renderer_base: Remove GL include from the renderer base class filesLioncash1-1/+0
Keeps the base class source files implementation-agnostic.
2018-11-13gl_rasterizer: Minor cleanupFrederic L1-4/+2
Minor code cleanup from unaddressed feedback in #1654
2018-11-13gl_state: Amend compilation warningsLioncash2-3/+4
Makes float -> integral conversions explicit via casts and also silences a sign conversion warning.
2018-11-13Implement ASTC_2D_10X8 & ASTC_2D_10X8_SRGB (#1666)greggameplayer4-71/+101
* Implement ASTC_2D_10X8 & ASTC_2D_10X8_SRGB ( needed by Mario+Rabbids Kingdom Battle ) * Small placement correction
2018-11-11Use core extensions when available to set max anisotropic filtering levelRodolfo Bogado1-2/+7
2018-11-11Improve state management by splitting some of the states into separate functions to avoid a full apply overheadRodolfo Bogado6-39/+40
2018-11-11Try to fix problems with stencil test in some games, relax translation to opengl enums to avoid crashing and only generate logs of the errors.Rodolfo Bogado4-37/+61
2018-11-11set sampler max lod, min lod, lod bias and max anisotropyRodolfo Bogado3-13/+33
2018-11-11Improved GPU Caches lookup SpeedFernandoS271-18/+17
2018-11-10gl_shader_decompiler: Guard out of bound geometry shader input readsReinUsesLisp4-15/+24
Geometry shaders follow a pattern that results in out-of-bounds reads. This pattern is: - VSETP to predicate - Use that predicate to conditionally set a register to a big number - Use the register to access geometry shader inputs. At the time of writing this commit I don't know the intent of this number. Some drivers complain about these out-of-bounds reads. To avoid this issue, input reads are guarded, limiting reads to the highest possible vertex input of the current topology (e.g. points to 1 and triangles to 3).
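The guard described in this commit message can be pictured as a clamp on the vertex index. This is a minimal sketch, not the decompiler's generated GLSL: `Topology`, `MaxVertexInput` and `GuardedVertexIndex` are hypothetical names, and only the limits for points (1) and triangles (3) come from the message; the limit of 2 for lines is an assumption.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical input topologies of a geometry shader.
enum class Topology { Points, Lines, Triangles };

// Number of input vertices available for a topology.
constexpr std::uint32_t MaxVertexInput(Topology topology) {
    switch (topology) {
    case Topology::Points:
        return 1;
    case Topology::Lines:
        return 2; // assumption, not stated in the commit message
    case Topology::Triangles:
        return 3;
    }
    return 1;
}

// Clamps a possibly huge index to the last valid input vertex.
constexpr std::uint32_t GuardedVertexIndex(std::uint32_t index, Topology topology) {
    return std::min(index, MaxVertexInput(topology) - 1);
}
```

With this clamp, the "big number" written by the predicated instruction resolves to the last valid vertex instead of reading past the input array.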
2018-11-08gl_rasterizer_cache: Remove unnecessary memory allocation and copy in CopySurfaceFrederic Laing1-10/+7
2018-11-08gl_rasterizer: Fix compiler warningsFrederic Laing1-2/+2
2018-11-08rasterizer_cache: Remove reliance on the System singletonLioncash9-10/+25
Rather than have a transparent dependency, we can make it explicit in the interface. This also gets rid of the need to put the core include in a header.
2018-11-08rasterizer_cache: Add missing virtual destructor to RasterizerCacheObjectLioncash3-0/+10
Ensures that destruction will always do the right thing in any context.
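Why the virtual destructor matters can be shown with a small stand-alone sketch. The class names below are hypothetical stand-ins, not the project's actual types; the counter exists only to make the effect observable.

```cpp
#include <memory>

// Counts runs of the derived destructor (for illustration only).
static int g_derived_destroyed = 0;

// Hypothetical stand-in for a cache object base class.
class CacheObject {
public:
    // Virtual, so deleting through a base pointer runs the derived destructor.
    virtual ~CacheObject() = default;
};

class CachedSurface final : public CacheObject {
public:
    ~CachedSurface() override {
        ++g_derived_destroyed;
    }
};

// Owns a derived object through a base-class pointer and lets it expire.
inline void DestroyThroughBase() {
    std::unique_ptr<CacheObject> object = std::make_unique<CachedSurface>();
}
```

Without `virtual` on the base destructor, deleting a `CachedSurface` through a `CacheObject` pointer would be undefined behaviour.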
2018-11-08gl_resource_manager: Amend clang-format discrepanciesLioncash1-4/+2
Fixes the buildbot.
2018-11-08Correct issue where texturelod could not be applied to 2darrayshadowFernandoS271-1/+5
2018-11-07Implement 3 coordinate array in TEXS instructionFernandoS271-6/+6
2018-11-06gl_rasterizer: Skip VAO binding if the state is clean.Markus Wick3-2/+21
2018-11-06gl_rasterizer: Split VAO and VB setup functions.Markus Wick2-5/+16
2018-11-06gl_rasterizer_cache: Add profiles for Copy and Blit.Markus Wick1-2/+6
They were missed, and Copy is very high in profile here. It doesn't block the GPU, but it stalls the driver thread. So with our bad GL instructions, this might block quite a while.
2018-11-06gl_resource_manager: Profile creation and deletion.Markus Wick1-0/+42
2018-11-06gl_stream_buffer: Profile orphaning of stream buffer.Markus Wick1-0/+5
This serializes on the driver thread, so it may block for a while. If it shows up in the benchmark, we will notice when it happens too often.
2018-11-06gl_resource_manager: Split implementations in .cpp file.Markus Wick5-114/+167
Those implementations are quite costly, so there is no need to inline them into the caller. Resource deletion is often a performance bug, so this also lets us set breakpoints on them.
2018-11-05Add support to color mask to avoid issues in blending caused by wrong values in the alpha channel in some render targets.Rodolfo Bogado5-25/+79
2018-11-05Implement multi-target viewports and blendingRodolfo Bogado6-128/+259
2018-11-02correct syntaxgreggameplayer1-4/+3
2018-11-02Fix ASTC Decompressor to support depth parameterFernandoS276-62/+128
2018-11-01memory_manager: Do not MapBufferEx over already in use memory.bunnei2-31/+52
- This fixes rendering when changing areas in Super Mario Odyssey.
2018-11-01Fix ASTC formatsFernandoS273-11/+20
2018-11-01Implemented ASTC 5x5FernandoS271-1/+5
2018-11-01Implement Cube ArraysFernandoS274-0/+20
2018-11-01maxwell_3d: Restructure macro upload to use a single macro code memory.bunnei4-27/+55
- Fixes an issue where macros could be skipped. - Fixes rendering of distant objects in Super Mario Odyssey.
2018-10-31Implement SurfaceTarget Texture2DArraygreggameplayer1-0/+1
( needed by Mario+Rabbids Kingdom Battle )
2018-10-31Improve OpenGL state handlingRodolfo Bogado3-105/+158
2018-10-30video_core: Move surface declarations out of gl_rasterizer_cacheReinUsesLisp6-898/+954
2018-10-30Assert Control Codes GenerationFernandoS272-1/+103
2018-10-30global: Use std::optional instead of boost::optional (#1578)Frederic L17-97/+107
* get rid of boost::optional * Remove optional references * Use std::reference_wrapper for optional references * Fix clang format * Fix clang format part 2 * Addressed feedback * Fix clang format and MacOS build
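`std::optional` cannot hold a reference directly, which is why the list above mentions `std::reference_wrapper`. A minimal sketch of that pattern follows; `Surface` and `FindSurface` are hypothetical names chosen for illustration.

```cpp
#include <functional>
#include <optional>
#include <string>
#include <vector>

// Hypothetical object that callers want to look up by name.
struct Surface {
    std::string name;
};

// Returns a rebindable reference to the matching surface, or std::nullopt.
inline std::optional<std::reference_wrapper<Surface>> FindSurface(
    std::vector<Surface>& surfaces, const std::string& name) {
    for (Surface& surface : surfaces) {
        if (surface.name == name) {
            return std::ref(surface);
        }
    }
    return std::nullopt;
}
```

Callers reach the underlying object with `result->get()`, so changes made through the wrapper are visible in the original container.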
2018-10-29video_core: Move OpenGL specific utils to its rendererReinUsesLisp6-30/+61
2018-10-29renderer_opengl: Correct bpp value for ASTC_2D_8X5_SRGBRodolfo Bogado1-1/+1
2018-10-29Assert Control Flow Instructions using Control CodesFernandoS272-3/+28
2018-10-29Fixed black textures, pixelation and we no longer require to auto-generate mipmapsFernandoS271-14/+2
2018-10-29Fixed mipmap block autosizing algorithmFernandoS273-13/+25
2018-10-29Fixed Invalid Image size and Mipmap calculationFernandoS271-4/+7
2018-10-29Fixed Block Resizing algorithm and Clang FormatFernandoS273-12/+19
2018-10-29Implement Mip FilterFernandoS274-10/+33
2018-10-29Zero out memory region of recreated surface before flushingFernandoS271-0/+2
2018-10-28Implement MipmapsFernandoS272-101/+211
2018-10-28Enable alpha channel for DXT1 texture formatMichael1-2/+2
2018-10-28Correct bpp value for ASTC_2D_8X5Tobias1-1/+1
2018-10-28Refactor precise usage and add FMNMX, MUFU, FMUL32 and FADD332FernandoS272-74/+37
2018-10-28Implement sRGB Support, including workarounds for nvidia driver issues and QT sRGB supportRodolfo Bogado8-40/+197
2018-10-28Improved Shader accuracy on Vertex and Geometry Shaders with FFMA, FMUL and FADDFernandoS272-6/+58
2018-10-27Implement Default Block Height for each formatFernandoS271-0/+62
2018-10-27gl_rasterizer_cache: Fix compiler warningFrederic Laing1-2/+2
2018-10-26gl_rasterizer: Implement primitive restart.bunnei5-1/+40
2018-10-26maxwell_3d: Add code for initializing register defaults.bunnei2-1/+21
2018-10-26gl_rasterizer: Implement depth range.bunnei4-13/+20
2018-10-24Implemented LD_L and ST_LFernandoS273-12/+112
2018-10-24Implement Shader Local MemoryFernandoS271-0/+37
2018-10-24decoders: Remove unused variable within SwizzledData()Lioncash1-1/+0
2018-10-24maxwell_3d: Remove unused variable within ProcessQueryGet()Lioncash1-1/+0
2018-10-23Implement PointSizeFernandoS273-5/+28
2018-10-23Fixed Layered Textures Loading and CubemapsFernandoS273-72/+109
2018-10-23gl_shader_decompiler: Implement VSETPReinUsesLisp2-0/+26
2018-10-23gl_shader_decompiler: Abstract VMAD into a video subsetReinUsesLisp2-75/+82
2018-10-23Added Saturation to FMUL32IFernandoS272-3/+8
2018-10-22Assert that multiple render targets are not set while alpha testingFernandoS273-3/+17
2018-10-22Use standard UBO and fix/stylize the codeFernandoS278-91/+51
2018-10-22Cache uniform locations and restructure the implementationFernandoS273-33/+29
2018-10-22Remove SyncAlphaTest and clang formatFernandoS274-8/+9
2018-10-22Added Alpha FuncFernandoS272-3/+43
2018-10-22Implemented Alpha TestingFernandoS276-3/+59
2018-10-22Fixed FSETP and FSETFernandoS272-30/+12
2018-10-22Fixed VAOs Float types only returning GL_FLOAT in cases that they had to return GL_HALF_FLOATFernandoS271-2/+14
2018-10-20engines/maxwell_*: Use nested namespace specifiers where applicableLioncash3-12/+6
These three source files are the only ones within the engines directory that don't use nested namespaces. We may as well change these over to keep things consistent.
2018-10-20maxwell_dma: Make variables const where applicable within HandleCopy()Lioncash1-3/+3
These are never modified, so we can make that assumption explicit.
2018-10-20maxwell_dma: Make FlushAndInvalidate's size parameter a u64Lioncash1-1/+1
This prevents truncation warnings at the lambda's usage sites.
2018-10-20maxwell_dma: Remove unused variables in HandleCopy()Lioncash1-3/+0
These pointer variables are never used, so we can get rid of them.
2018-10-20gl_shader_decompiler: Allow std::move to function in SetPredicateLioncash1-1/+1
If the variable being moved is const, then std::move will always perform a copy (since it can't actually move the data).
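The effect is easy to reproduce in isolation. In this sketch, `Tracker` is a hypothetical type that records which constructor produced it; it is not taken from the decompiler.

```cpp
#include <utility>

// Records which constructor produced the object.
struct Tracker {
    bool copied = false;
    bool moved = false;

    Tracker() = default;
    Tracker(const Tracker&) : copied(true) {}
    Tracker(Tracker&&) noexcept : moved(true) {}
};

// std::move on a const object yields `const Tracker&&`, which cannot bind to
// `Tracker&&`, so overload resolution falls back to the copy constructor.
inline bool MoveFromConstCopies() {
    const Tracker source{};
    Tracker destination(std::move(source));
    return destination.copied;
}

// With a non-const source the move constructor is selected.
inline bool MoveFromMutableMoves() {
    Tracker source;
    Tracker destination(std::move(source));
    return destination.moved;
}
```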
2018-10-20gl_shader_decompiler: Get rid of variable shadowing warningsLioncash1-2/+2
A variable with the same name was previously declared in an outer scope.
2018-10-20gl_shader_decompiler: Fix a few comment typosLioncash1-3/+4
2018-10-20gl_shader_decompiler: Move position varying declaration back to gl_shader_genReinUsesLisp3-13/+9
The intention of declaring them in gl_shader_decompiler was to be able to use blocks to implement geometry shaders. But that wasn't needed in the end and it caused issues when both vertex stages were being used, resulting in a redeclaration of "position".
2018-10-19GPU: Improved implementation of maxwell DMA (Subv).bunnei3-17/+66
2018-10-19decoders: Introduce functions for un/swizzling subrects.bunnei2-0/+49
2018-10-19GPU: Invalidate destination address of kepler_memory writes.bunnei3-3/+17
2018-10-19fermi_2d: Add support for more accurate surface copies.bunnei2-3/+12
2018-10-18gl_shader_decompiler: Implement PBK and BRKReinUsesLisp2-22/+43
2018-10-18Clang format and other fixesFernandoS271-16/+0
2018-10-18Implement Reinterpret Surface, to accurately blit 3D texturesFernandoS271-2/+4
2018-10-18Implement GetInRange in the Rasterizer CacheFernandoS271-0/+16
2018-10-18Implement 3D TexturesFernandoS274-1/+10
2018-10-18gl_rasterizer_cache: Remove unnecessary block_depth=1 on Flush.bunnei1-1/+0
2018-10-18gl_rasterizer_cache: Remove unnecessary temporary buffer with unswizzle.bunnei1-5/+2
2018-10-16gl_rasterizer_cache: Use AccurateCopySurface for use_accurate_gpu_emulation.bunnei2-2/+18
2018-10-16config: Rename use_accurate_framebuffers -> use_accurate_gpu_emulation.bunnei3-6/+6
- This will be used as a catch-all for slow-but-accurate GPU emulation paths.
2018-10-16rasterizer_cache: Refactor to support in-order flushing.bunnei6-63/+116
2018-10-16gl_rasterizer_cache: Refactor to only call GetRegionEnd on surface creation.bunnei2-16/+23
2018-10-16gl_rasterizer_cache: Only flush when use_accurate_framebuffers is enabled.bunnei2-2/+13
2018-10-16gl_rasterizer_cache: Separate guest and host surface size management.bunnei2-92/+94
2018-10-16gl_rasterizer_cache: Rename GetGLBytesPerPixel to GetBytesPerPixel.bunnei2-17/+18
- This does not really have anything to do with OpenGL.
2018-10-16gl_rasterizer_cache: Remove unused FlushSurface method.bunnei2-7/+0
2018-10-16gl_rasterizer: Implement flushing.bunnei1-1/+25
2018-10-16gl_rasterizer_cache: Remove usage of Memory::Read/Write functions.bunnei1-13/+8
- These cannot be used within the cache, as they change cache state.
2018-10-16gl_rasterizer_cache: Clamp cached surface size to mapped GPU region size.bunnei2-19/+37
2018-10-16memory_manager: Add a method for querying the end of a mapped GPU region.bunnei2-0/+11
2018-10-16rasterizer_cache: Reintroduce method for flushing.bunnei3-0/+23
2018-10-16gl_rasterizer_cache: Reintroduce code for handling swizzle and flush to guest RAM.bunnei2-28/+119
2018-10-15shader_bytecode: Add Control Code enum 0xfReinUsesLisp1-1/+1
Control Code 0xf means to unconditionally execute the instruction. This value is passed to most BRA, EXIT and SYNC instructions (among others) but this may not always be the case.
2018-10-15gl_shader_decompiler: Fixup style inconsistenciesReinUsesLisp1-5/+3
2018-10-15gl_rasterizer: Silence implicit cast warning in glBindBufferRangeReinUsesLisp1-1/+2
2018-10-15gl_shader_decompiler: Implement HSET2_RReinUsesLisp2-0/+62
2018-10-15gl_shader_decompiler: Implement HSETP2_RReinUsesLisp2-0/+65
2018-10-15gl_shader_decompiler: Implement HFMA2 instructionsReinUsesLisp2-0/+85
2018-10-15gl_shader_decompiler: Implement HADD2_IMM and HMUL2_IMMReinUsesLisp2-0/+73
2018-10-15gl_shader_decompiler: Implement non-immediate HADD2 and HMUL2 instructionsReinUsesLisp2-0/+75
2018-10-15gl_shader_decompiler: Setup base for half float unpacking and settingReinUsesLisp2-0/+98
2018-10-14Implement Arrays on Tex InstructionFernandoS271-14/+55
2018-10-14Fix TLDSFernandoS271-1/+5
2018-10-14Shorten the implementation of 3D swizzle to only 3 functionsFernandoS271-70/+27
2018-10-13Fix a Crash on Zelda BotW and Splatoon 2, and simplified LoadGLBufferFernandoS272-19/+2
2018-10-13Propagate depth and depth_block on modules using decodersFernandoS277-52/+64
2018-10-13Remove old Swizzle algorithms and use 3d SwizzleFernandoS271-93/+69
2018-10-13Implement Precise 3D SwizzleFernandoS271-3/+71
2018-10-13Implement Fast 3D SwizzleFernandoS271-2/+74
2018-10-13Added ASTC 5x4; 8x5Hexagon123-6/+32
2018-10-12Implemented helper function to correctly calculate a texture's sizeFernandoS272-0/+22
2018-10-11gl_shader_decompiler: Implement VMADReinUsesLisp2-0/+118
2018-10-10Add memory Layout to Render Targets and Depth BuffersFernandoS273-21/+33
2018-10-10Fixed block height settings for RenderTargets and Depth Buffers, and added block width and block depthFernandoS275-12/+63
2018-10-09gl_shader_decompiler: Remove unused variables in TMML's implementationLioncash1-7/+3
Given "y" isn't always used, but "x" is, we can rearrange this to avoid unused variable warnings by changing the names of op_a and op_b
2018-10-09Implement Scissor TestFernandoS271-4/+9
2018-10-09Assert Scissor testsFernandoS273-1/+31
2018-10-07gl_shader_decompiler: Move position varying location from 15 to 0 and apply an offsetReinUsesLisp1-6/+10
2018-10-07gl_shader_decompiler: Implement geometry shadersReinUsesLisp10-107/+522
2018-10-07video_core: Allow LabelGLObject to use extra info on any objectReinUsesLisp1-10/+14
2018-10-07gl_rasterizer: Fixup undefined behaviour in SetupDrawReinUsesLisp1-0/+1
2018-10-06Implemented Depth Compare and Shadow SamplersFernandoS276-65/+224
2018-10-06fermi_2d: Implement simple copies with AccelerateSurfaceCopy.bunnei3-24/+36
2018-10-06gl_rasterizer: Add rasterizer cache code to handle accelerated fermi copies.bunnei5-16/+60
2018-10-06gl_rasterizer_cache: Implement a simpler surface copy using glCopyImageSubData.bunnei1-0/+21
2018-10-04gl_rasterizer: Implement quads topologyReinUsesLisp8-46/+236
2018-10-03Implemented Texture Processing Modes in TEXS and TLDSFernandoS271-5/+42
2018-10-01gl_rasterizer: Fixup unassigned point sizesReinUsesLisp1-1/+4
2018-09-30gl_rasterizer_cache: Fixes to how we do render to cubemap.bunnei2-32/+5
- Fixes issues with Splatoon 2.
2018-09-30gl_rasterizer_cache: Add check for array rendering to cubemap texture.bunnei1-0/+8
2018-09-30gl_rasterizer_cache: Implement render to cubemap.bunnei3-119/+218
2018-09-30gl_shader_decompiler: TEXS: Implement TextureType::TextureCube.bunnei1-0/+8
2018-09-30gl_rasterizer_cache: Add support for SurfaceTarget::TextureCubemap.bunnei2-1/+36
2018-09-30gl_rasterizer_cache: Implement LoadGLBuffer for Texture2DArray.bunnei1-0/+8
2018-09-30gl_rasterizer_cache: Update BlitTextures to support non-Texture2D ColorTexture surfaces.bunnei1-23/+88
2018-09-30gl_rasterizer_cache: Track texture target and depth in the cache.bunnei1-2/+3
2018-09-30gl_rasterizer_cache: Workaround for Texture2D -> Texture2DArray scenario.bunnei3-6/+21
2018-09-30gl_rasterizer_cache: Keep track of surface 2D size separately from total size.bunnei2-32/+46
2018-09-30Fix trailing whitespaceraven021-1/+4
2018-09-28video_core: Implement point_size and add point state syncReinUsesLisp5-1/+27
2018-09-28gl_state: Pack sampler bindings into a single ARB_multi_bindReinUsesLisp5-8/+25
2018-09-26video_core: Add asserts for CS, TFB and alpha testingReinUsesLisp5-3/+92
Add asserts for compute shader dispatching, transform feedback being enabled and alpha testing. These have in common that they'll probably break rendering without logging.
2018-09-23Added glObjectLabels for renderdoc for textures and shader programs (#1384)David4-0/+48
* Added glObjectLabels for renderdoc for textures and shader programs * Changed hardcoded "Texture" name to reflect the texture type instead * Removed string initialize
2018-09-23correct BC6Hgreggameplayer1-2/+2
2018-09-22gl_state: Remove unused type aliasLioncash2-4/+1
This isn't used anywhere within the header, so we can remove it, along with the include that was previously necessary. This also uncovers an indirect include in the cpp file for the assertion macros.
2018-09-21shader_bytecode: Lay out the Ipa-related enums betterLioncash1-2/+12
This is more consistent with the surrounding enums.
2018-09-21shader_bytecode: Make operator== and operator!= of IpaMode const qualifiedLioncash1-6/+7
These don't affect the state of the struct and can be const member functions.
2018-09-21Reverse stride align restriction on FastSwizzle due to lost performanceFernandoS271-3/+2
2018-09-21Join both Swizzle methods within one interface functionFernandoS271-11/+19
2018-09-21Standardized Legacy Swizzle to look like FastSwizzle and use a Swizzling Table insteadFernandoS271-42/+38
2018-09-21Remove same output bpp restriction on FastSwizzleFernandoS271-4/+5
2018-09-21Improved Legacy Swizzler to be better documented and work betterFernandoS271-15/+21
2018-09-21gl_stream_buffer: Fix use of bitwise OR instead of logical OR in Map()Lioncash1-1/+1
This was very likely intended to be a logical OR based off the conditioning and testing of inversion in one case. Even if this was intentional, this is the kind of non-obvious thing one should be clarifying with a comment.
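One practical difference between the two operators is short-circuiting. The sketch below illustrates that general behaviour only; it does not reproduce the actual condition in `Map()`, and `Probe` is a hypothetical helper.

```cpp
// Counts how often the right-hand operand is evaluated.
struct Probe {
    int calls = 0;

    bool Check() {
        ++calls;
        return false;
    }
};

// Logical OR: the right operand is skipped when the left one is already true.
inline int RightOperandCallsLogical(bool lhs) {
    Probe probe;
    const bool result = lhs || probe.Check();
    (void)result;
    return probe.calls;
}

// Bitwise OR: both operands are always evaluated.
inline int RightOperandCallsBitwise(bool lhs) {
    Probe probe;
    const bool result = (static_cast<int>(lhs) | static_cast<int>(probe.Check())) != 0;
    (void)result;
    return probe.calls;
}
```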
2018-09-21RasterizerGL: Use the correct framebuffer when clearing via the CLEAR_BUFFERS register.Subv1-1/+1
Previously we were clearing the default backbuffer framebuffer. Found thanks to a Piglit test :)
2018-09-21Improved fast swizzle and removed restrictions to itFernandoS271-7/+12
2018-09-19gl_rasterizer: Fix StartAddress handling with indexed draw calls.Markus Wick1-6/+7
We uploaded the wrong data before. So the offset on the host GPU pointer may work for the first vertices, but the last ones run out of bounds. Let's just offset the upload instead.
2018-09-18Implemented Internal FlagsFernandoS271-13/+35
2018-09-18gl_shader_decompiler: Avoid truncation warnings within LD_A and ST_A codeLioncash1-4/+4
These are internally stored as u64 values, so using u32 here causes truncation warnings. Instead, we can just use u64 and preserve the bit width.
2018-09-17Implemented I2I.CC on the NEU control code, used by SMOFernandoS272-14/+18
2018-09-17Implemented CSETPFernandoS272-14/+49
2018-09-17Implemented Control CodesFernandoS272-0/+51
2018-09-17Added asserts for texture misc modes to texture instructionsFernandoS271-2/+45
2018-09-17Added texture misc modes to texture instructionsFernandoS271-1/+147
2018-09-17Add 1D sampler for TLDS - TexelFetch (Mario Rabbids)raven021-7/+12
2018-09-16Implement ASTC_2D_8X8 (Bayonetta 2)raven023-6/+20
2018-09-15Implement RenderTargetFormat::BGR5A1_UNORM (Pokken Tournament DX)raven022-0/+4
2018-09-15Shaders: Implemented multiple-word loads and stores to and from attribute memory.Subv2-7/+58
This seems to be an optimization performed by nouveau.
2018-09-15Port #4182 from Citra: "Prefix all size_t with std::"fearlessTobi20-133/+138
2018-09-14Optimized Texture SwizzlingFernandoS271-2/+49
2018-09-14gl_shader_decompiler: Get rid of variable shadowing within LEA instructionsLioncash1-2/+0
These variables are already defined within an outer scope.
2018-09-13Use ARB_multi_bind for uniform buffers (#1287)ReinUsesLisp2-3/+23
* gl_rasterizer: use ARB_multi_bind for uniform buffers * address feedback
2018-09-13gl_rasterizer_cache: B5G6R5U should use GL_RGB8 as an internal format.bunnei1-1/+1
- Fixes a regression with Sonic Mania with ARB_texture_storage.
2018-09-12GPU: Basic implementation of the Kepler Inline Memory engine (p2mf).Subv6-0/+146
This engine writes data from a FIFO register into the configured address.
2018-09-12Implemented Texture Processing ModesFernandoS272-1/+43
2018-09-12gl_rasterizer_cache: Always blit on recreate, regardless of format.bunnei1-6/+10
- Fixes several rendering issues with Super Mario Odyssey.
2018-09-12gl_shader_cache: Remove cache_width/cache_height.bunnei2-12/+2
- This was once an optimization, but we no longer need it with the cache reserve. - This is also inaccurate.
2018-09-11gl_rasterizer: Use ARB_texture_storage.Markus Wick1-11/+8
It allows us to use texture views and it reduces the overhead within the GPU driver. It prevents us from reallocating the texture, but we don't do that anyway. In the end, it is the new way to allocate textures, so there is no need to use the old way.
2018-09-11Implemented LEA and PSETFernandoS271-0/+91
2018-09-11Implemented encodings for LEA and PSETFernandoS271-0/+64
2018-09-11Replace old FragmentHeader for the new HeaderFernandoS272-31/+18
2018-09-11Implemented (Partially) Shader HeaderFernandoS273-2/+102
2018-09-11Fixed renderdoc input/output textures not working due to render targetsDavid Marcec2-2/+9
2018-09-10video_core: Refactor command_processor.Markus Wick2-44/+42
Inline the WriteReg helper as it is called ~20k times per frame.
2018-09-10video_core: Move command buffer loop.Markus Wick3-46/+72
This moves the hot loop into video_core. This refactoring shall reduce the CPU overhead of calling ProcessCommandList.
2018-09-10rasterizer: Drop unused handler.Markus Wick4-8/+0
This virtual function is called in a very hot spot, and it does nothing. If this kind of feature is required, please be more specific and add callbacks in the switch statement within Maxwell3D::WriteReg. There is no point in having another switch statement within the rasterizer.
2018-09-10gl_rasterizer_cache: Only use depth for applicable texture formats.bunnei1-6/+22
- Fixes an issue with Octopath Traveler leaving stale data here.
2018-09-10gl_rasterizer: Implement clear for non-zero render targets.bunnei2-50/+66
- Several misc. changes to ConfigureFramebuffers in support of this.
2018-09-10gl_rasterizer_cache: Implement RenderTargetFormat::BGRA8_SRGB.bunnei3-0/+4
- Used by Octopath Traveler (with multiple render targets).
2018-09-10gl_rasterizer: Implement multiple color attachments.bunnei5-132/+95
2018-09-10Implemented TMMLFernandoS272-5/+67
2018-09-09Implemented TXQ dimension query type, used by SMO.FernandoS272-1/+36
2018-09-09video_core: fixed arithmetic overflow warnings & improved code stylePatrick Elsässer5-89/+101
- Fixed all warnings, for renderer_opengl items, which were indicating a possible incorrect behavior from integral promotion rules and types larger than those in which arithmetic is typically performed. - Added const for variables where possible and meaningful. - Added constexpr where possible.
2018-09-09Port Citra #4047 & #4052: add change background color supporttech4me3-0/+8
2018-09-09Change name of TEXQ to TXQ, in order to match NVIDIA's namingFernandoS271-2/+2
2018-09-08GPU/DMA: Partially implemented the 'enable_2d' bit in the DMA engine.Subv1-2/+10
When not set, this tells the GPU to only use the X size when performing a DMA copy. This is only implemented for linear->linear and tiled->tiled copies. Conversion copies still retain the assert. This bit is unset by some games for various purposes, and by nouveau when copying the vertex buffers.
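The rule stated above (only the X size counts when the bit is unset) can be sketched as follows. `CopyParams` and `CopyElementCount` are hypothetical names, and counting in elements rather than bytes is an assumption made for the illustration.

```cpp
#include <cstdint>

// Hypothetical subset of the parameters of a DMA copy.
struct CopyParams {
    std::uint32_t x_count;
    std::uint32_t y_count;
    bool enable_2d;
};

// Number of elements a linear copy transfers: the full X * Y rectangle when
// enable_2d is set, otherwise only the X extent.
constexpr std::uint64_t CopyElementCount(const CopyParams& params) {
    return params.enable_2d ? std::uint64_t{params.x_count} * params.y_count
                            : std::uint64_t{params.x_count};
}
```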
2018-09-08gl_rasterizer: Use baseInstance instead of moving the buffer points.bunnei1-21/+25
This hopefully helps our cache avoid redundantly uploading the vertex buffer. # Conflicts: # src/video_core/renderer_opengl/gl_rasterizer.cpp
2018-09-08video_core: Arithmetic overflow warning fix for gl_rasterizer (#1262)Patrick Elsässer1-12/+14
* video_core: Arithmetic overflow fix for gl_rasterizer - Fixed warnings, which were indicating incorrect behavior from integral promotion rules and types larger than those in which arithmetic is typically performed. - Added const for variables where possible and meaningful. * Changed the casts from C to C++ style Changed the C-style casts to C++ casts as proposed. Took also care about signed / unsigned behaviour.
2018-09-08gl_rasterizer_cache: Improve accuracy of RecreateSurface for non-2D textures.bunnei2-27/+45
2018-09-08maxwell_3d: Remove assert that no longer applies.bunnei1-4/+0
2018-09-08gl_rasterizer_cache: Partially implement several non-2D texture types.bunnei1-30/+111
2018-09-08gl_shader_decompiler: Partially implement several non-2D texture types (Subv).bunnei2-32/+143
2018-09-08gl_rasterizer: Implement texture wrap mode p.bunnei2-2/+8
2018-09-08gl_rasterizer_cache: Track texture depth.bunnei3-4/+15
2018-09-08gl_rasterizer_cache: Remove impl. of FlushGLBuffer.bunnei1-34/+1
- Will not work for non-2d textures, and was not used anyways.
2018-09-08gl_rasterizer_cache: Keep track of texture type per surface.bunnei3-32/+84
2018-09-08gl_rasterizer_cache: Remove unused DownloadGLTexture.bunnei2-51/+0
2018-09-08gl_state: Keep track of texture target.bunnei5-26/+28
2018-09-06gl_rasterizer: Call state.Apply only once on SetupShaders.bunnei1-4/+2
2018-09-06gl_shader_decompiler: Implement saturate mode for IPA.bunnei1-1/+5
2018-09-06gl_buffer_cache: Default initialize member variablesLioncash1-3/+3
Ensures that the cache always has a deterministic initial state.
2018-09-06gl_buffer_cache: Make GetHandle() a const member functionLioncash2-2/+2
GetHandle() internally calls GetHandle() on the stream_buffer instance, which is a const member function, so this can be made const as well.
2018-09-06gl_buffer_cache: Remove unnecessary includesLioncash2-2/+4
2018-09-06gl_buffer_cache: Make constructor explicitLioncash1-1/+1
Implicit conversions during construction aren't desirable here.
2018-09-06video_core/CMakeLists: Add missing gl_buffer_cache.hLioncash1-0/+1
Without this, the header file won't show up by default within IDEs such as Visual Studio.
2018-09-06gl_shader_gen: Initialize position.Markus Wick1-0/+1
IMO the old code is fine, but nvidia raises shader compiler warnings. Trivial fix though...
2018-09-06Implemented IPA ProperlyFernandoS272-47/+98
2018-09-05gl_rasterizer: Skip TODO log.Markus Wick1-1/+1
This is called ~3k times per frame in SMO ingame. My laptop spends ~3ms per frame on allocating and freeing this string. Let's just stop printing this kind of redundant information.
2018-09-05gl_rasterizer: Implement a VAO cache.Markus Wick3-53/+60
This patch caches VAO objects instead of re-emitting all pointers per draw call. Configuring these pointers is known as a fast task, but it yields too many GL calls. So for better performance, just bind the VAO instead of 16 pointers.
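The caching idea can be sketched without any GL calls: key a map on the full vertex format and hand back a previously created handle when the format repeats. Everything below is hypothetical (the `VertexAttribute` fields, the array size of 16 taken from the "16 pointers" remark, and the integer handle standing in for a real VAO); it is not the rasterizer's actual code.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <map>
#include <tuple>

// Hypothetical description of one vertex attribute pointer.
struct VertexAttribute {
    std::uint32_t buffer;
    std::uint32_t offset;
    std::uint32_t type;

    bool operator<(const VertexAttribute& rhs) const {
        return std::tie(buffer, offset, type) < std::tie(rhs.buffer, rhs.offset, rhs.type);
    }
};

// Assumption: up to 16 attribute pointers describe a vertex format.
using VertexFormat = std::array<VertexAttribute, 16>;

class VaoCache {
public:
    // Returns the handle for a format, creating one only on first use.
    std::uint32_t GetOrCreate(const VertexFormat& format) {
        const auto [it, inserted] = cache.try_emplace(format, next_handle);
        if (inserted) {
            ++next_handle; // stand-in for creating and configuring a real VAO
        }
        return it->second;
    }

    std::size_t Size() const {
        return cache.size();
    }

private:
    std::map<VertexFormat, std::uint32_t> cache;
    std::uint32_t next_handle = 1;
};
```

On a cache hit the caller only needs to bind the returned handle, which is the single call the commit message contrasts with setting 16 pointers.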
2018-09-05renderer_opengl: Implement a buffer cache.Markus Wick5-86/+182
The idea of this cache is to avoid redundant uploads. So we are going to cache the uploaded buffers within the stream_buffer and just reuse the old pointers. The next step is to implement a VBO cache on GPU memory, but for now, I want to check the overhead of the cache management. Fetching the buffer over PCI-E should be quite fast.
2018-09-04gl_shader_cache: Use an u32 for the binding point cache.Markus Wick4-15/+23
The std::string generation with its malloc and free requirement was a noticeable overhead. Also switch to a u32 key and an ordered_map to avoid the std::hash call. As those maps usually have a size of two elements, the lookup time shall not matter.
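A rough sketch of the idea: pack the lookup key into an integer instead of building a string, and keep the results in an ordered map. The key layout (`MakeKey`), the class name and the callback are assumptions for illustration, not the shader cache's real interface.

```cpp
#include <cstdint>
#include <map>

// Hypothetical key layout: stage in the upper 16 bits, index in the lower 16.
constexpr std::uint32_t MakeKey(std::uint32_t stage, std::uint32_t index) {
    return (stage << 16) | (index & 0xFFFFu);
}

class BindingPointCache {
public:
    // Returns the cached location, invoking `query` only on the first lookup.
    template <typename Query>
    int Get(std::uint32_t key, Query&& query) {
        const auto it = locations.find(key);
        if (it != locations.end()) {
            return it->second;
        }
        const int location = query();
        locations.emplace(key, location);
        return location;
    }

private:
    // Ordered map: no hashing and no string allocation per lookup.
    std::map<std::uint32_t, int> locations;
};
```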
2018-09-04command_processor: Use std::array for bound_engines.Markus Wick2-4/+4
subchannel is a 3 bit field. So there must not be more than 8 bound engines. And using a hashmap for up to 8 values is a bit overpowered.
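Since a 3-bit field has exactly 2^3 = 8 values, a fixed array indexed by subchannel is enough. The sketch below uses hypothetical names and placeholder engine values; masking the index is an extra safety measure added for the example.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Placeholder engine identifiers (illustrative values only).
enum class EngineID : std::uint32_t { None = 0, EngineA = 1, EngineB = 2 };

class EngineTable {
public:
    // A 3-bit subchannel field can address 2^3 = 8 entries.
    static constexpr std::size_t NumSubchannels = 8;

    void Bind(std::uint32_t subchannel, EngineID engine) {
        engines[subchannel & (NumSubchannels - 1)] = engine;
    }

    EngineID Get(std::uint32_t subchannel) const {
        return engines[subchannel & (NumSubchannels - 1)];
    }

private:
    // Value-initialized, so every slot starts as EngineID::None.
    std::array<EngineID, NumSubchannels> engines{};
};
```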
2018-09-04Update microprofile scopes.Markus Wick1-3/+11
Blame the subsystems which deserve the blame :) The updated list is not complete, just the ones I've spotted on random sampling the stack trace.
2018-09-02gl_shader_decompiler: Use used_shaders member variable directly within GenerateDeclarations()Lioncash1-1/+1
Using the getter function intended for external code here makes an unnecessary copy of the already-accessible used_shaders vector.
2018-09-01Removed saturate assertDavid Marcec2-2/+0
Unneeded as we already implement it
2018-09-01Removed saturate assertDavid Marcec2-2/+0
Saturate already implemented
2018-09-01Changed tab5980_0 default from 0 -> 1David Marcec1-2/+2
2018-09-01Added FMUL assertsDavid Marcec2-0/+15
2018-09-01Added FFMA assertsDavid Marcec2-0/+11
2018-09-01Added assert for TEXS nodepDavid Marcec2-0/+3
2018-09-01Added better asserts to IPA, Renamed IPA modes to match mesaDavid Marcec2-6/+13
IpaMode is changed to IpaInterpMode IpaMode is supposed to be 2 bits not 3 Added IpaSampleMode Added Saturate Renamed modes based on https://github.com/mesa3d/mesa/blob/d27c7918916cdc8092959124955f887592e37d72/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp#L2530
2018-09-01maxwell_3d: Use CoreTiming for query timestampZach Hilman1-2/+3
2018-08-31core/core: Replace includes with forward declarations where applicableLioncash3-4/+4
The follow-up to e2457418dae19b889b2ad85255bb95d4cd0e4bff, which replaces most of the includes in the core header with forward declarations. This makes it so that if any of the headers the core header was previously including change, then no one will need to rebuild the bulk of the core, due to core.h being quite a prevalent inclusion. This should make turnaround for changes much faster for developers.
2018-08-31gl_rasterizer_cache: Use accurate framebuffer setting for accurate copies.bunnei2-73/+54
2018-08-31gl_rasterizer_cache: Also use reserve cache for RecreateSurface.bunnei2-24/+18
2018-08-31rasterizer_cache: Use boost::interval_map for a more accurate cache.bunnei1-33/+45
2018-08-31gl_renderer: Cache textures, framebuffers, and shaders based on CPU address.bunnei8-100/+53
2018-08-31gl_rasterizer: Fix issues with the rasterizer cache.bunnei4-46/+57
- Use a single cached page map. - Fix calculation of ending page.
2018-08-31Implement BC6H_UF16 & BC6H_SF16 (#1092)greggameplayer3-31/+55
* Implement BC6H_UF16 & BC6H_SF16 Required by ARMS * correct coding style * correct coding style part 2
2018-08-31core: Make the main System class use the PImpl idiomLioncash1-3/+4
core.h is kind of a massive header in terms of what it includes within itself. It includes VFS utilities, kernel headers, file_sys header, ARM-related headers, etc. This means that changing anything in the headers included by core.h essentially requires you to rebuild almost all of core. Instead, we can modify the System class to use the PImpl idiom, which allows us to move all of those headers to the cpp file and forward declare the bulk of the types that would otherwise be included, reducing compile times. This change specifically only performs the PImpl portion.
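For readers unfamiliar with the idiom, here is a minimal single-file sketch of PImpl. The members (`Load`, `IsLoaded`, `loaded_path`) are hypothetical and far simpler than the real System class; the comments mark where the header/source split would fall.

```cpp
#include <memory>
#include <string>

// --- would live in the header ---
class System {
public:
    System();
    ~System(); // defined in the source file, where Impl is a complete type

    void Load(const std::string& path);
    bool IsLoaded() const;

private:
    struct Impl; // forward declaration only, so no heavy includes are needed
    std::unique_ptr<Impl> impl;
};

// --- would live in the source file ---
struct System::Impl {
    std::string loaded_path;
};

System::System() : impl(std::make_unique<Impl>()) {}
System::~System() = default;

void System::Load(const std::string& path) {
    impl->loaded_path = path;
}

bool System::IsLoaded() const {
    return !impl->loaded_path.empty();
}
```

Because users of the header only see a pointer to an incomplete `Impl`, changing the members of `Impl` does not force them to recompile.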
2018-08-31Report correct shader size.Markus Wick1-1/+1
Seems like this was an oversight with regard to 1fd979f50a9f4c21fa8cafba7268d959e3076924 It changed GLShader::ProgramCode to a std::vector, so sizeof is wrong.
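The pitfall is that `sizeof` applied to a `std::vector` measures the container object itself, not the elements it owns. A small sketch, assuming the program code is a vector of 64-bit words (the alias below is a local stand-in, not the project's definition):

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Local stand-in for a shader program stored as 64-bit instruction words.
using ProgramCode = std::vector<std::uint64_t>;

// Size in bytes of the stored instructions. sizeof(code) would instead give
// the size of the vector object, regardless of how many elements it holds.
inline std::size_t ProgramSizeInBytes(const ProgramCode& code) {
    return code.size() * sizeof(ProgramCode::value_type);
}
```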
2018-08-31Added predicate comparison GreaterEqualWithNanHexagon122-3/+4
2018-08-31gl_shader_decompiler: Implement POPC (#1203)Laku2-0/+19
* Implement POPC * implement invert
2018-08-29Shaders: Implemented IADD3tech4me2-1/+84
2018-08-29gl_shader_decompiler: Improve IPA for Pass mode with Position attribute.bunnei2-1/+39
2018-08-28gl_shader_cache: Remove unused program_code vector in GetShaderAddress()Lioncash1-2/+1
Given std::vector is a type with a non-trivial destructor, this variable cannot be optimized away by the compiler, even if unused. Because of that, something that was intended to be fairly lightweight, was actually allocating 32KB and deallocating it at the end of the function.
2018-08-28gpu: Make memory_manager privateLioncash4-16/+30
Makes the class interface consistent and provides accessors for obtaining a reference to the memory manager instance. Given we also return references, this makes our more flimsy uses of const apparent, given const doesn't propagate through pointers in the way one would typically expect. This makes our mutable state more apparent in some places.
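A rough sketch of the accessor pattern this describes, using hypothetical `Gpu` and `MemoryManager` stand-ins (the member and function names are assumptions). Returning references from a const and a non-const overload makes the constness of the caller's access visible, which a raw or smart pointer member would not do on its own.

```cpp
#include <memory>

// Hypothetical stand-ins; not the real GPU or memory manager classes.
struct MemoryManager {
    int mapped_pages = 0;
};

class Gpu {
public:
    Gpu() : memory_manager{std::make_unique<MemoryManager>()} {}

    // Non-const access hands out a mutable reference.
    MemoryManager& GetMemoryManager() {
        return *memory_manager;
    }

    // Const access hands out a const reference, so code holding a
    // const Gpu can read but not mutate the memory manager.
    const MemoryManager& GetMemoryManager() const {
        return *memory_manager;
    }

private:
    // Kept private: a const Gpu would otherwise still allow mutation
    // through this pointer, since const does not propagate to the pointee.
    std::unique_ptr<MemoryManager> memory_manager;
};
```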
2018-08-28gl_rasterizer: Remove unused variablesLioncash1-2/+0
2018-08-28renderer_opengl: Implement a new shader cache.bunnei9-285/+250
2018-08-28gl_rasterizer_cache: Update to use RasterizerCache base class.bunnei3-132/+20
2018-08-28video_core: Add RasterizerCache class for common cache management code.bunnei2-0/+117
2018-08-25debug_utils: Remove unused includesLioncash2-23/+0
Quite a bit of these aren't necessary directly within the debug_utils header and can be removed or included where actually necessary.
2018-08-25debug_utils: Make BreakpointObserver class' constructor explicitLioncash1-1/+1
Avoids implicit conversions.
2018-08-25debug_utils: Initialize active_breakpoint member of DebugContextLioncash1-2/+2
Ensures that all class members are initialized.
2018-08-25maxwell3d: Move FinishedPrimitiveBatch event after AcceleratedDrawBatch()Lioncash1-4/+4
The start and finish events should likely not be right after one another like this, otherwise the batch will appear to complete immediately
2018-08-24fix SEL_IMM bitstringLaku1-1/+1
2018-08-24gl_rasterizer: Correct assertion condition in SyncLogicOpState()Lioncash1-1/+2
Previously the assert would always be hit, since it was the equivalent of: array == nullptr, which is never true.
2018-08-23Shaders: Added decodings for IADD3 instructionstech4me1-0/+6
2018-08-23gl_rasterizer_cache: Blit when possible on RecreateSurface.bunnei1-5/+12
2018-08-23gl_rasterizer_cache: Reserve surfaces that have already been created for later use.bunnei2-3/+61
2018-08-23gl_rasterizer_cache: Remove assert for RecreateSurface type.bunnei1-1/+0
2018-08-23gl_rasterizer_cache: Implement compressed texture copies.bunnei1-8/+18
2018-08-23gl_rasterizer: Implement stencil test.bunnei3-4/+58
- Used by Splatoon 2.
2018-08-23gl_rasterizer: Implement partial color clear and stencil clear.bunnei1-12/+42
2018-08-23maxwell_3d: Update to include additional stencil registers.bunnei1-20/+50
2018-08-23gl_state: Update to handle stencil front/back face separately.bunnei2-33/+38
2018-08-22gl_shader_gen: Make ShaderSetup's constructor explicitLioncash1-1/+1
Prevents implicit conversions.
2018-08-22gl_shader_gen: Use a std::vector to represent program code instead of std::arrayLioncash2-11/+16
While convenient as a std::array, it's also quite a large set of data as well (32KB). It being an array also means data cannot be std::moved. Any situation where the code is being set or relocated means that a full copy of that 32KB data must be done. If we use a std::vector we do need to allocate on the heap, however, it does allow us to std::move the data we have within the std::vector into another std::vector instance, eliminating the need to always copy the program data (as std::move in this case would just transfer the pointers and bare necessities over to the new vector instance).
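The difference can be sketched as follows, assuming a `ProgramCode` alias of 64-bit words and a hypothetical `ShaderHolder` type (neither is taken from the actual sources). Moving the vector hands over the existing heap buffer instead of copying 32KB.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// Assumed element type for illustration.
using ProgramCode = std::vector<std::uint64_t>;

// Hypothetical owner type that takes its program code by value and moves it in.
struct ShaderHolder {
    explicit ShaderHolder(ProgramCode code_) : code{std::move(code_)} {}
    ProgramCode code;
};

// Returns true when the holder ends up owning the very same heap buffer
// that the source vector allocated, i.e. no element copy took place.
bool MoveReusesAllocation() {
    ProgramCode source(4096); // 4096 * 8 bytes = 32KB of program data
    const std::uint64_t* const buffer = source.data();
    ShaderHolder holder{std::move(source)};
    return holder.code.data() == buffer;
}
```

A std::array member in the same position would have to copy every element, since its storage is part of the object itself.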
2018-08-22more fixesLaku1-6/+7
2018-08-22fixesLaku1-6/+12
2018-08-22renderer_opengl: Namespace OpenGL codeLioncash21-23/+70
Namespaces all OpenGL code under the OpenGL namespace. Prevents polluting the global namespace and allows clear distinction between other renderers' code in the future.
2018-08-22remove debug loggingLaku1-2/+0
2018-08-22implement lop3Laku2-0/+55
2018-08-22maxwell_to_gl: Implement PrimitiveTopology::LinesOatmealDome1-0/+2
Used by Splatoon 2's debug menu.
2018-08-22Revert "Shader: Use the right sampler type in the TEX, TEXS and TLDS instructions."bunnei2-153/+31
- This reverts commit 3ef4b3d4b445960576f10d1ba6521580d03e3da8. - This commit had broken a lot of games. We really should do a full implementation of this in one change.
2018-08-21shader_bytecode: Parenthesize conditional expression within GetTextureType()Lioncash1-1/+1
Resolves a -Wlogical-op-parentheses warning.
2018-08-21renderer_opengl: Use LOG_DEBUG for GL_DEBUG_SEVERITY_NOTIFICATION and GL_DEBUG_SEVERITY_LOW logsLioncash1-1/+1
LOG_TRACE is only enabled on debug builds which can be quite slow when trying to debug graphics issues. Instead we can log the messages to the debug log, which is available on both release and debug builds.
2018-08-21gl_stream_buffer: Add missing header guardLioncash1-0/+2
Prevents potential compilation errors from occurring due to multiple inclusions.
2018-08-21Shaders: Implement depth writing in fragment shaders.Subv1-1/+6
We'll write <last color output reg + 2> to gl_FragDepth.
2018-08-21shader_bytecode: Replace some UNIMPLEMENTED logs.bunnei1-2/+6
2018-08-21gl_shader_decompiler: Implement Texture3D for TEXS.bunnei1-0/+7
2018-08-21gl_shader_decompiler: Implement TextureCube for TEX.bunnei1-0/+8
2018-08-21Shaders: Fixed the coords in TEX with Texture2D.Subv1-1/+1
The X and Y coordinates should be in gpr8 and gpr8+1, respectively. This fixes the cutscene rendering in Sonic Mania.
2018-08-21Shaders: Log and crash when using an unimplemented texture type in a texture sampling instruction.Subv1-5/+14
2018-08-21GPU: Implemented the logic op functionality of the GPU.Subv3-0/+61
This will ASSERT if blending is enabled at the same time as logic ops.
2018-08-21GLState: Allow enabling/disabling GL_COLOR_LOGIC_OP independently from blending.Subv2-6/+19
2018-08-21rasterizer_interface: Remove ScreenInfo from AccelerateDraw()'s signatureLioncash5-17/+14
This is an OpenGL renderer-specific data type. Given that, this type shouldn't be used within the base interface for the rasterizer. Instead, we can pass this information to the rasterizer via reference.
2018-08-21GPU: Added registers for the logicop functionality.Subv1-1/+28
2018-08-21renderer_base: Make creation of the rasterizer, the responsibility of the renderers themselvesLioncash4-14/+12
Given we use a base-class type within the renderer for the rasterizer (RasterizerInterface), we want to allow renderers to perform more complex initialization if they need to do such a thing. This makes it important to preserve type information. Given the OpenGL renderer is quite simple settings-wise, this is just a simple shuffling of the initialization code. For something like Vulkan however this might involve doing something like:

    // Initialize and call rasterizer-specific function that requires
    // the full type of the instance created.
    auto raster = std::make_unique<VulkanRasterizer>(some, params);
    raster->CallSomeVulkanRasterizerSpecificFunction();

    // Assign to base class variable
    rasterizer = std::move(raster);
2018-08-21Port #3353 from CitrafearlessTobi1-1/+1
2018-08-21Shaders: Write all the enabled color outputs when a fragment shader exits.Subv2-6/+45
We were only writing to the first render target before. Note that this is only the GLSL side of the implementation, supporting multiple render targets requires more changes in the OpenGL renderer. Dual Source blending is not implemented and stuff that uses it might not work at all.
2018-08-20Rasterizer: Reinterpret the raw texture bytes instead of blitting (and thus doing format conversion) to a new texture when a game requests an old texture address with a different format.Subv1-3/+49
2018-08-20Rasterizer: Don't attempt to copy over the old texture's data when doing a format reinterpretation if we're only going to clear the framebuffer.Subv4-13/+21
2018-08-20Implemented RGBA8_UINTDavid Marcec4-45/+58
Needed by kirby
2018-08-20Shaders/TEXS: Fixed the component mask in the TEXS instruction.Subv1-18/+18
Previously we could end up with a TEXS that didn't write any outputs, this was wrong.
2018-08-19Shaders/TEXS: Fixed the component mask in the TEXS instruction.Subv1-6/+11
Previously we could end up with a TEXS that didn't write any outputs, this was wrong.
2018-08-19Shader: Implemented the TLD4 and TLD4S opcodes using GLSL's textureGather.Subv1-0/+51
It is unknown how TLD4S determines the sampler type, more research is needed.
2018-08-19Shader: Use the right sampler type in the TEX, TEXS and TLDS instructions.Subv2-29/+127
Different sampler types have their parameters in different registers.
2018-08-19Shader: Added bitfields for the texture type of the various sampling instructions.Subv1-1/+65
2018-08-19Shaders: Added decodings for TLD4 and TLD4SSubv1-3/+7
2018-08-19Shaders: Added decodings for the LDG and STG instructions.Subv1-0/+4
2018-08-19Shaders: Implemented the gl_FrontFacing input attribute (attr 63).Subv2-0/+7
2018-08-18Shader: Remove an unneeded assert, the negate bit is implemented for conversion instructions.Subv1-2/+0
2018-08-18GLRasterizer: Implemented instanced vertex arrays.Subv2-4/+30
Before each draw call, for every enabled vertex array configured as instanced, we take the current instance id and divide it by its configured divisor, then we multiply that by the corresponding stride and increment the start address by the resulting amount. This way we can simulate the vertex array being incremented once per instance without actually using OpenGL's instancing functions.
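The address adjustment described above can be sketched as a small helper. `InstancedStartAddress` and its parameter names are hypothetical, and the sketch assumes a non-zero divisor for arrays configured as instanced.

```cpp
#include <cstdint>

// Computes the start address to use for an instanced vertex array on the
// current draw: the instance id is divided by the divisor (integer division),
// multiplied by the stride, and added to the base address.
// Assumes divisor != 0; names are illustrative only.
std::uint64_t InstancedStartAddress(std::uint64_t base_address, std::uint32_t instance,
                                    std::uint32_t divisor, std::uint32_t stride) {
    const std::uint64_t step = instance / divisor;
    return base_address + step * stride;
}
```

For example, with a divisor of 2 and a stride of 16, instances 4 and 5 both read from `base_address + 32`, which matches advancing the array once every two instances.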
2018-08-18Shader: Implemented the predicate and mode arguments of LOP.Subv2-11/+39
The mode can be used to set the predicate to true depending on the result of the logic operation. In some cases, this means discarding the result (writing it to register 0xFF (Zero)). This is used by Super Mario Odyssey.
2018-08-18Added WrapMode MirrorOnceClampToEdgeDavid Marcec1-0/+2
Used by Splatoon 2
2018-08-18Shaders: Implemented a stack for the SSY/SYNC instructions.Subv1-3/+36
The SSY instruction pushes an address into the stack, and the SYNC instruction pops it. The current stack depth is 20, we should figure out if this is enough or not.
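The push/pop behaviour can be modelled on the CPU side with a fixed-depth stack. This is only a sketch of the concept in C++ (the decompiler itself emits shader code); `FlowStack` and its members are hypothetical names, and only the depth of 20 comes from the message above.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <optional>

// Fixed-depth stack modelling SSY (push) and SYNC (pop).
class FlowStack {
public:
    static constexpr std::size_t MaxDepth = 20;

    // SSY: remember the address to jump to later. Fails when the stack is full.
    bool Push(std::uint32_t address) {
        if (top == MaxDepth) {
            return false;
        }
        entries[top++] = address;
        return true;
    }

    // SYNC: retrieve the most recently pushed address, if there is one.
    std::optional<std::uint32_t> Pop() {
        if (top == 0) {
            return std::nullopt;
        }
        return entries[--top];
    }

private:
    std::array<std::uint32_t, MaxDepth> entries{};
    std::size_t top = 0;
};
```

Reporting overflow through the return value of `Push` is one way to find out in practice whether a depth of 20 is sufficient.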
2018-08-18Shaders: Corrected the 'abs' and 'neg' bit usage in the float arithmetic instructions.Subv2-16/+38
We should definitely audit our shader generator for more errors like this.
2018-08-18Added predcondition GreaterThanWithNanDavid Marcec2-5/+8
2018-08-17gl_rasterizer_cache: Remove asserts for supported blits.bunnei1-2/+0
2018-08-17renderer_opengl: Treat OpenGL errors as critical.bunnei1-1/+1
2018-08-16gl_rasterizer_cache: Treat Depth formats differently from DepthStencil.bunnei2-16/+26
2018-08-15Shader/Conversion: Implemented the negate bit in F2F and I2I instructions.Subv1-4/+12
2018-08-15Shader/I2F: Implemented the negate I2F_C instruction variant.Subv1-7/+23
2018-08-15Shader/F2I: Implemented the negate bit in the I2F instructionSubv1-0/+4
2018-08-15Shader/F2I: Implemented the F2I_C instruction variant.Subv1-2/+10
2018-08-15Shader/F2I: Implemented the negate bit in the F2I instruction.Subv1-0/+4
2018-08-15gl_rasterizer_cache: Cleanup some PixelFormat names and logging.bunnei2-41/+71
2018-08-15Rasterizer: Implemented instanced rendering.Subv7-5/+28
We keep track of the current instance and update an uniform in the shaders to let them know which instance they are. Instanced vertex arrays are not yet implemented.
2018-08-15gl_rasterizer_cache: Add RGBA16U to PixelFormatFromTextureFormat.bunnei1-1/+9
- Used by Breath of the Wild.
2018-08-15Implement Z16_UNORM in PixelFormatFromTextureFormat functiongreggameplayer1-0/+2
Required by Zelda Breath Of The Wild
2018-08-15gl_shader_decompiler: Several fixes for indirect constant buffer loads.bunnei1-13/+22
2018-08-15gl_rasterizer: Fix upload size for constant buffers.bunnei1-3/+3
2018-08-15maxwell_to_gl: Properly handle UnsignedInt/SignedInt sizes.bunnei1-5/+20
2018-08-15gl_rasterizer_cache: Implement G8R8S format.bunnei2-34/+49
- Used by Super Mario Odyssey.
2018-08-14Fix BC7Ugreggameplayer1-1/+1
2018-08-14renderer_opengl: Implement RenderTargetFormat::RGBA16_UNORM.bunnei4-37/+48
- Used by Breath of the Wild.
2018-08-13Implement RG32UI and R32UIDavid Marcec4-7/+45
Needed for xenoblade
2018-08-13maxwell_to_gl: Implement VertexAttribute::Size::Size_8.bunnei1-0/+1
- Used by Breath of the Wild.
2018-08-13renderer_opengl: Implement RenderTargetFormat::RGBA16_UINT.bunnei4-34/+45
- Used by Breath of the Wild.
2018-08-13maxwell_to_gl: Implement PrimitiveTopology::LineStrip.bunnei1-0/+2
- Used by Breath of the Wild.
2018-08-13renderer_opengl: Implement RenderTargetFormat::RG8_UNORM.bunnei4-26/+61
- Used by Breath of the Wild.
2018-08-13gl_shader_decompiler: Implement XMAD instruction.bunnei2-4/+120
2018-08-12gl_rasterizer: Use a shared helper to upload from CPU memory.Markus Wick2-28/+33
2018-08-12gl_state: Don't track constant buffer mappings.Markus Wick3-41/+3
2018-08-12gl_rasterizer: Use the stream buffer for constant buffers.Markus Wick4-29/+32
2018-08-12gl_rasterizer: Use the streaming buffer itself for the constant buffer.Markus Wick2-33/+15
Don't emit copies, especially not for data that is only used once. They just end up as huge GPU overhead.
2018-08-12gl_rasterizer: Use a helper for aligning the buffer.Markus Wick2-15/+22
2018-08-12Update the stream_buffer helper from Citra.Markus Wick4-184/+98
Please see https://github.com/citra-emu/citra/pull/3666 for more details.
2018-08-12gl_shader_decompiler: Fix SetOutputAttributeToRegister empty check.bunnei1-2/+2
2018-08-12gl_shader_decompiler: Fix GLSL compiler error with KIL instruction.bunnei1-0/+8
2018-08-12GPU/Maxwell3D: Implemented an alternative set of blend factors.Subv2-0/+40
These are used by nouveau and some games like SMO.
2018-08-12Implement R8_UINT RenderTargetFormat & PixelFormat (#1014)greggameplayer4-55/+74
- Used by Go Vacation
2018-08-12RasterizerGL: Ignore invalid/unset vertex attributes.Subv2-1/+11
This should make the es2gears example not crash anymore.
2018-08-12gl_rasterizer: Silence implicit truncation warning in SetupShaders()Lioncash1-1/+1
Previously this would warn of truncating a std::size_t to a u32. This is safe because we'll obviously never have more than UINT32_MAX amount of uniform buffers.
2018-08-12core: Namespace EmuWindowLioncash8-15/+26
Gets the class out of the global namespace.
2018-08-12gl_shader_decompiler: Improve handling of unknown input/output attributes.bunnei2-10/+11
2018-08-12gl_rasterizer: Implement render target format RG8_SNORM.bunnei4-8/+18
- Used by Super Mario Odyssey.
2018-08-12gl_rasterizer: Implement render target format RGBA8_SNORM.bunnei4-64/+83
- Used by Super Mario Odyssey.
2018-08-11GPU/Shader: Don't predicate instructions that don't have a predicate field (SSY).Subv2-2/+13
2018-08-11GPU/Shaders: Implemented SSY and SYNC as a way to modify control flow during shader execution.Subv1-6/+25
SSY sets the target label to jump to when the SYNC instruction is executed.
2018-08-11Implement R16S & R16UI & R16I RenderTargetFormats & PixelFormats and more (R16_UNORM needed by Fate Extella) (#848)greggameplayer4-19/+92
* Implement R16S & R16UI & R16I RenderTargetFormats & PixelFormats Do a separate function in order to get Bytes Per Pixel of DepthFormat Apply the new function in gpu.h delete unneeded white space * correct merging error
2018-08-11video_core: Get rid of global g_toggle_framelimit_enabled variableLioncash6-25/+42
Instead, we make a struct for renderer settings and allow the renderer to update all of these settings, getting rid of the need for global-scoped variables. This also uncovered a few indirect inclusions for certain headers, which this commit also fixes.
2018-08-11renderer_base: Remove unused kFramebuffer enumerationLioncash1-3/+0
This is entirely unused and can be removed.
2018-08-11video_core: Remove unused Renderer enumerationLioncash1-2/+0
Currently we only have an OpenGL renderer, so this is unused in code (and occupies the Renderer identifier in the VideoCore namespace).
2018-08-10maxwell_to_gl: Implement VertexAttribute::Size::Size_8_8.bunnei1-0/+1
- Used by Super Mario Odyssey.
2018-08-10maxwell_to_gl: Implement VertexAttribute::Size::Size_32_32_32.bunnei1-0/+2
- Used by Super Mario Odyssey.
2018-08-10Revert "gl_state: Temporarily disable culling and depth test."bunnei1-3/+1
2018-08-10gl_rasterizer_cache: Remove unused viewport parameter of GetFramebufferSurfaces()Lioncash3-8/+6
2018-08-10video_core: Use variable template variants of type_traits interfaces where applicableLioncash2-4/+2
2018-08-10textures: Refactor out for Texture/Depth FormatFromPixelFormat.bunnei4-179/+27
2018-08-10gl_rasterizer_cache: Add bounds checking for gl_buffer copies.bunnei1-10/+12
2018-08-10Implement SNORM for BC5/DXN2 (#998)Khangaroo2-38/+55
* Implement BC5/DXN2 (#996) - Used by Kirby Star Allies. * Implement BC5/DXN2 SNORM UNORM for Kirby Star Allies SNORM for Super Mario Odyssey
2018-08-09gl_shader_decompiler: Reserve element memory beforehand in BuildRegisterList()Lioncash1-0/+2
Avoids potentially performing multiple reallocations when we know the total amount of memory we need beforehand.
2018-08-09gl_rasterizer_cache: Avoid iterator invalidation issues within InvalidateRegion()Lioncash1-2/+4
A range-based for loop can't be used when the container being iterated is also being erased from.
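A sketch of the safe pattern, assuming a hypothetical cache keyed by start address with the region size as the value (the real cache stores surfaces and uses different types). The iterator returned by `erase()` is used to continue, so the loop never touches an invalidated iterator.

```cpp
#include <cstdint>
#include <map>

// Hypothetical cache layout for illustration: start address -> size in bytes.
using RegionMap = std::map<std::uint64_t, std::uint64_t>;

// Removes every cached region overlapping [addr, addr + size) and returns
// how many entries were erased.
int InvalidateOverlapping(RegionMap& regions, std::uint64_t addr, std::uint64_t size) {
    int removed = 0;
    for (auto it = regions.begin(); it != regions.end();) {
        const std::uint64_t start = it->first;
        const std::uint64_t end = start + it->second;
        if (start < addr + size && addr < end) {
            // erase() returns the iterator following the removed element.
            it = regions.erase(it);
            ++removed;
        } else {
            ++it;
        }
    }
    return removed;
}
```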
2018-08-09Implement BC5/DXN2 (#996)Khangaroo3-33/+45
- Used by Kirby Star Allies.
2018-08-09gl_rasterizer_cache: Invert conditional in LoadGLBuffer()Lioncash1-5/+5
It's generally easier to follow code using conditionals that operate in terms of the true case followed by the false case (no chance of overlooking the exclamation mark).
2018-08-09gl_rasterizer_cache: Use std::vector::assign in LoadGLBuffer() for the non-tiled caseLioncash1-4/+6
resize() causes the vector to expand and zero out the added members to the vector, however we can avoid this zeroing by using assign(). Given we have the pointer to the data we want to copy, we can calculate the end pointer and directly copy the range of data without the need to perform the resize() beforehand.
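The two approaches can be compared with a short sketch. `CopyLinear` is a hypothetical helper name; the point is only that `assign()` with a pointer range sizes the vector and copies in one step, without first zero-filling new elements the way `resize()` followed by a copy would.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Replaces the contents of dst with the bytes in [src, src + size).
// assign() constructs the elements directly from the source range.
void CopyLinear(std::vector<std::uint8_t>& dst, const std::uint8_t* src, std::size_t size) {
    dst.assign(src, src + size);
}
```

Any previous contents of the destination are discarded, so the helper is also safe to call on a vector that is being reused between loads.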
2018-08-09maxwell_to_gl: Implement VertexAttribute::Size::Size_16_16_16_16.bunnei1-0/+1
- Used by Super Mario Odyssey (in game).
2018-08-09maxwell_to_gl: Implement PrimitiveTopology::Points.bunnei1-0/+2
- Used by Super Mario Odyssey (in game).
2018-08-09gl_shader_decompiler: Declare predicates on use.bunnei1-4/+5
- Used by Super Mario Odyssey (when going in game).
2018-08-09maxwell_3d: Ignore macros that have not been uploaded yet.bunnei1-4/+9
- Used by Super Mario Odyssey (in game).
2018-08-09gl_rasterizer_cache: Make pointer const in LoadGLBuffer()Lioncash1-1/+1
This is only ever read from, so we can make the data it's pointing to const.
2018-08-09gl_rasterizer: Do not render when no render target is configured.bunnei1-0/+5
- Used by Super Mario Odyssey.
2018-08-08gpu: Add R11G11B10_FLOAT to RenderTargetBytesPerPixel.bunnei1-0/+1
- Used by Super Mario Odyssey.
2018-08-08gl_shader_decompiler: Stub input attribute Unknown_63.bunnei2-0/+9
2018-08-08maxwell_3d: Use correct const buffer size and check bounds.bunnei4-3/+12
- Fixes mem corruption with Super Mario Odyssey and Pokkén Tournament DX.
2018-08-08renderer_opengl: Use trace log in a few places.bunnei2-2/+2
2018-08-08maxwell_to_gl: Implement VertexAttribute::Size::Size_8_8.bunnei1-0/+1
2018-08-08gl_rasterizer_cached: Implement RenderTargetFormat::B5G6R5_UNORM.bunnei2-0/+4
- Used by Super Mario Odyssey.
2018-08-08gl_shader_decompiler: Let OpenGL interpret floats.bunnei2-11/+6
- Accuracy is lost in translation to string, e.g. with NaN. - Needed for Super Mario Odyssey.
2018-08-08Fixed the sRGB pixel format (#963)Hexagon121-1/+2
* Changed the sRGB pixel format return * Add a message about SRGBA -> RGBA conversion
2018-08-07Lowered down the logging for methodsHexagon121-4/+4
2018-08-06maxwell_3d: Remove outdated assert.bunnei1-2/+0
2018-08-06gl_rasterizer_cache: Avoid superfluous surface copies.bunnei2-4/+21
2018-08-05gl_shader_decompiler: Fix TEXS mask and dest.bunnei1-2/+5
2018-08-05added braces for conditionsDavid Marcec1-2/+3
2018-08-05fix the attrib format for intsDavid Marcec1-2/+7
2018-08-04gl_shader_manager: Invert conditional in SetShaderUniformBlockBinding()Lioncash1-7/+9
This lets us indent the majority of the code and places the error case first.
2018-08-04gl_shader_manager: Amend sign differences in an assertion comparison in SetShaderUniformBlockBinding()Lioncash1-3/+2
Ensures both operands have the same sign in the comparison. While we're at it, we can get rid of the redundant casting of ub_size to an int. This type will always be trivial and alias a built-in type (not doing so would break backwards compatibility at a standard level).
2018-08-04renderer_base: Make Rasterizer() return the rasterizer by referenceLioncash2-4/+8
All calling code assumes that the rasterizer will be in a valid state, which is a totally fine assumption. The only way the rasterizer wouldn't be is if initialization is done incorrectly or fails, which is checked against in System::Init().
2018-08-04video_core: Eliminate the g_renderer global variableLioncash10-47/+43
We move the initialization of the renderer to the core class, while keeping the creation of it and any other specifics in video_core. This way we can ensure that the renderer is initialized and doesn't give unfettered access to the renderer. This also makes dependencies on types more explicit. For example, the GPU class doesn't need to depend on the existence of a renderer, it only needs to care about whether or not it has a rasterizer, but since it was accessing the global variable, it was also making the renderer a part of its dependency chain. By adjusting the interface, we can get rid of this dependency.
2018-08-03video_core: Remove unimplemented Start() function prototypeLioncash1-3/+0
Given this has no definition, we can just remove it entirely.
2018-08-03gl_shader_decompiler: Remove unused variable in GenerateDeclarations()Lioncash1-2/+0
This variable was being incremented, but we were never actually using it.
2018-08-03gl_shader_manager: Make ProgramManager's GetCurrentProgramStage() a const member functionLioncash1-1/+1
This function doesn't modify class state, so it can be made const.
2018-08-02Implement RGB32F PixelFormat (#886) (used by Go Vacation)greggameplayer3-9/+23
2018-08-02gl_state: Make texture_units a std::arrayLioncash1-2/+3
Gets rid of the use of a raw C array.
2018-08-02gl_shader_manager: Take ShaderSetup instances by const reference in UseProgrammableVertexShader() and UseProgrammableFragmentShader()Lioncash1-2/+2
Avoids performing unnecessary copies of 65560 byte sized ShaderSetup instances, considering it's only used as part of lookup and not modified. Given the parameters were already const, it's likely taking these parameters by reference was intended but the ampersand was forgotten.
2018-08-02video_core: Make global EmuWindow instance part of the base renderer classLioncash8-51/+41
Makes the global a member of the RendererBase class. We also change this to be a reference. Passing any form of null pointer to these functions is incorrect entirely, especially given the code itself assumes that the pointer would always be in a valid state. This also makes it easier to follow the lifecycle of instances being used, as we explicitly interact the renderer with the rasterizer, rather than it just operating on a global pointer.
2018-08-01Implement R32_FLOAT RenderTargetFormatUnknown3-0/+5
2018-07-31MacroInterpreter: Avoid left shifting negative values.Subv2-2/+6
The branch target is signed, so multiply by 4 instead of left shifting by 2
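As a small illustration of the reasoning: left shifting a negative signed value was undefined behavior before C++20, while multiplying by 4 is well defined for the same inputs. `ComputeBranchTarget` and its parameters are hypothetical names, and the sketch assumes the branch offset is expressed in 4-byte words relative to the current position.

```cpp
#include <cstdint>

// Computes a byte address from a program counter and a signed offset given
// in 4-byte words. The multiplication is done in a wider signed type so that
// negative offsets (backwards branches) are handled without shifting.
std::uint32_t ComputeBranchTarget(std::uint32_t pc, std::int32_t offset_words) {
    const std::int64_t target =
        static_cast<std::int64_t>(pc) + static_cast<std::int64_t>(offset_words) * 4;
    return static_cast<std::uint32_t>(target);
}
```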
2018-07-26GPU: Allow using R16F as a render target format.Subv2-1/+4
2018-07-26Implement R16_G16Unknown4-19/+100
correct trailing white spaces Delete tabs correct placement Add RG16F & RG16UI & RG16I & RG16S PixelFormats Return correct data according to changes done previously correct PixelFormat declaration correct coding style error correct coding style error part 2 correct RG16S Declaration error correct alignment
2018-07-25GPU: Use the right texture format for sRGBA framebuffers.Subv2-9/+17
2018-07-25GPU: Allow the use of Z24S8 as a texture format.Subv1-0/+4
2018-07-25GPU: Implemented the Z32_S8_X24 depth buffer format.Subv4-1/+16
2018-07-25GPU: Allow using Z32 as a texture format.Subv1-0/+4
2018-07-25GPU: Allow the usage of R8 as a render target format.Subv2-0/+4
2018-07-24GPU: Remove the assert that required the CODE_ADDRESS to be 0.Subv1-8/+0
Games usually just leave it at 0 but nouveau sets it to something else. This already works fine, the assert is useless.
2018-07-24GPU: Implemented the R16 and R16F texture formats.Subv3-5/+32
2018-07-24gl_rasterizer: Replace magic number with GL_INVALID_INDEX in SetupConstBuffers()Lioncash1-3/+5
This is just the named constant that OpenGL provides, so we can use that instead of using a literal -1
2018-07-24gl_rasterizer: Use std::string_view instead of std::string when checking for extensionsLioncash1-1/+3
We can avoid heap allocations here by just using a std::string_view instead of performing unnecessary copying of the string data.
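A sketch of the idea without any OpenGL calls: the extension names are assumed to already be available as C strings (in the renderer they would come from the driver), and `HasExtension` is a hypothetical helper. Wrapping each name in a std::string_view allows comparison without allocating a std::string per entry.

```cpp
#include <string_view>
#include <vector>

// Returns true if `wanted` appears in the list of extension names.
// std::string_view only stores a pointer and a length, so no string data is copied.
bool HasExtension(const std::vector<const char*>& extensions, std::string_view wanted) {
    for (const char* extension : extensions) {
        if (std::string_view{extension} == wanted) {
            return true;
        }
    }
    return false;
}
```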
2018-07-24gl_rasterizer: Use in-class member initializers where applicableLioncash2-12/+5
We can just assign to the members directly in these cases.
2018-07-24video_core/memory_manager: Replace a loop with std::array's fill() function in PageSlot()Lioncash1-3/+1
We already have a function that does what this code was doing, so let's use that instead.
2018-07-24video_core/memory_manager: Avoid repeated unnecessary page slot lookupsLioncash1-11/+21
We don't need to keep calling the same function over and over again in a loop, especially when the behavior is slightly non-trivial. We can just keep a reference to the looked up location and do all the checking and assignments based off it instead.
2018-07-24gl_rasterizer: Implement texture border color.bunnei3-11/+11
2018-07-24maxwell_to_gl: Implement Texture::WrapMode::Border.bunnei1-0/+2
2018-07-24GPU: Implement texture format R32F.Subv3-6/+19
2018-07-24maxwell_to_gl: Implement VertexAttribute::Type::UnsignedInt.bunnei1-0/+3
2018-07-24gl_shader_decompiler: Correct return value of WriteTexsInstruction()Lioncash1-2/+2
This should be returning void, not a std::string
2018-07-24gl_shader_decompiler: Implement shader instruction TLDS.bunnei1-29/+43
2018-07-24gl_rasterizer_cache: Implement RenderTargetFormat RG32_FLOAT.bunnei5-7/+25
2018-07-24gl_rasterizer_cache: Implement RenderTargetFormat RGBA32_FLOAT.bunnei2-10/+34
2018-07-24gl_rasterizer_cache: Implement RenderTargetFormat BGRA8_UNORM.bunnei4-8/+22
2018-07-24gl_rasterizer_cache: Add missing log statements.bunnei1-0/+2
2018-07-24gl_shader_decompiler: Print instruction value in shader comments.bunnei1-1/+2
2018-07-24gl_shader_decompiler: Check if SetRegister result is ZeroIndex.bunnei1-0/+6
2018-07-23gl_shader_decompiler: Simplify GetCommonDeclarations()Lioncash1-5/+5
2018-07-22gl_shader_decompiler: Remove redundant Subroutine construction in AddSubroutine()Lioncash1-4/+8
We don't need to toss away the Subroutine instance after the find() call and reconstruct another instance with the same data right after it. Particularly given Subroutine contains a std::set.
2018-07-22shader_bytecode: Implement other TEXS masks.bunnei1-5/+9
2018-07-22gl_shader_decompiler: Remove unused state tracking and minor cleanup.bunnei1-78/+15
2018-07-22gl_shader_decompiler: Implement SEL instruction.bunnei2-0/+20
2018-07-22gl_rasterizer_cache: Blit surfaces on recreation instead of flush and load.bunnei2-2/+86
2018-07-22gl_rasterizer_cache: Use GPUVAddr as cache key, not parameter set.bunnei3-57/+46
2018-07-22gl_rasterizer_cache: Use zeta_width and zeta_height registers for depth buffer.bunnei2-11/+11
2018-07-22gl_rasterizer: Use zeta_enable register to enable depth buffer.bunnei1-2/+2
2018-07-22maxwell_3d: Add depth buffer enable, width, and height registers.bunnei1-2/+14
2018-07-21gl_shader_manager: Replace unimplemented function prototypeLioncash2-3/+3
This was just a linker error waiting to happen.
2018-07-21gpu: Rename Get3DEngine() to Maxwell3D()Lioncash3-11/+14
This makes it match its const qualified equivalent.
2018-07-21video_core: Use nested namespaces where applicableLioncash11-48/+24
Compresses a few namespace specifiers to be more compact.
2018-07-20gl_state: Make references const where applicable in Apply()Lioncash1-2/+3
2018-07-20gl_state: Get rid of mismatched sign conversionsLioncash1-14/+17
While we're at it, amend the loop variable type to be the same width as that returned by the .size() call.
2018-07-20maxwell_3d: Remove unused variable within GetStageTextures()Lioncash1-2/+0
2018-07-20gl_shader_decompiler: Eliminate variable and declaration shadowingLioncash1-6/+4
Ensures that no identifiers are being hidden, which also reduces compiler warnings.
2018-07-20gl_shader_decompiler: Remove unnecessary const from return valuesLioncash1-2/+2
This adds nothing from a behavioral point of view, and can inhibit the move constructor/RVO
2018-07-19gl_state: Temporarily disable culling and depth test.bunnei1-1/+3
2018-07-19decoders: Fix calc of swizzle image_width_in_gobs.bunnei1-1/+4
2018-07-19core: Don't construct instance of Core::System, just to access its live instanceLioncash3-15/+15
This would result in a lot of allocations and related object construction, just to toss it all away immediately after the call. These are definitely not intentional, and it was intended that all of these should have been accessing the static function GetInstance() through the name itself, not constructed instances.
2018-07-18astc: Initialize vector size directly in DecompressLioncash1-2/+1
There's no need to perform a separate resize.
2018-07-18astc: Mark functions as internally linked where applicableLioncash1-17/+20
2018-07-18astc: const-correctness changes where applicableLioncash1-14/+13
A few member functions didn't actually modify class state, so these can be amended as necessary.
2018-07-18astc: Delete Bits' copy constructor and assignment operatorLioncash1-8/+6
This also potentially avoids warnings, considering the copy assignment operator is supposed to have a return value.
2018-07-18astc: In-class initialize member variables where appropriateLioncash1-39/+22
2018-07-18vi: Partially implement buffer crop parameters.bunnei3-4/+20
2018-07-17GPU: Added register definitions for the stencil parameters.Subv1-2/+25
2018-07-15gl_rasterizer_cache: Implement texture format G8R8.bunnei3-9/+40
2018-07-15gl_rasterizer_cache: Fix incorrect offset in ConvertS8Z24ToZ24S8.bunnei1-1/+2
2018-07-15gl_rasterizer_cache: Implement depth format Z16_UNORM.bunnei3-1/+15
2018-07-14OpenGL: Use MakeCurrent/DoneCurrent for multithreaded rendering.bunnei3-1/+27
2018-07-14GPU: Always enable the depth write when clearing the depth buffer.Subv1-3/+8
The GPU ignores that register when clearing, but OpenGL obeys the glDepthMask parameter, so we set the depth mask to GL_TRUE when clearing the depth buffer. It will be restored to the correct value automatically on the next draw call.
2018-07-13gl_rasterizer: Fix check for if a shader stage is enabled.bunnei3-35/+11
2018-07-13gl_shader_gen: Implement dual vertex shader mode.bunnei5-55/+139
- When VertexA shader stage is enabled, we combine with VertexB program to make a single Vertex Shader stage.
2018-07-13gl_shader_decompiler: Implement PredCondition::LessThanWithNan.bunnei2-5/+7
2018-07-13gl_shader_decompiler: Use FlowCondition field in EXIT instruction.bunnei2-8/+34
2018-07-12GPU: Implement the FADD32I shader instruction.Subv2-0/+32
2018-07-12GPU: Corrected the decoding of FFMA for immediate operands.Subv1-1/+1
2018-07-08gl_rasterizer: Flip triangles when regs.viewport_transform[0].scale_y is negative.bunnei1-1/+4
- Fixes a regression with Binding of Isaac.
2018-07-07GPU: Implemented the BC7U texture format.Subv3-7/+21
Note: Our version of glad exports GL_COMPRESSED_RGBA_BPTC_UNORM as GL_COMPRESSED_RGBA_BPTC_UNORM_ARB, maybe it's time we update it.
2018-07-05GPU: Allow using the old NV04 values for the depth test function.Subv2-9/+29
These seem to be just as valid as the GL token values. Thanks @ReinUsesLisp. This restores graphical output to Disgaea 5.
2018-07-04GPU: Implemented the IMNMX shader instruction.Subv2-3/+31
It's similar to the FMNMX instruction but it works on integers.
2018-07-04GPU: Implemented the F2F 'round' rounding mode.Subv1-0/+3
It's implemented via the GLSL 'roundEven()' function.
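For readers more used to C++ than GLSL, a rough host-side analogue of `roundEven()` can be sketched as below. `RoundEven` is a hypothetical helper, and the sketch assumes the floating-point environment is left in its default round-to-nearest mode.

```cpp
#include <cassert>
#include <cmath>

// Rounds to the nearest integer, with ties going to the even neighbour.
// std::nearbyint honours the current rounding mode, which is
// round-to-nearest-even unless the program changes it.
inline float RoundEven(float value) {
    return std::nearbyint(value);
}
```

Under that assumption, 2.5 rounds down to 2 and 3.5 rounds up to 4, unlike a round-half-away-from-zero function.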
2018-07-04GPU: Stub the shader SYNC and DEPBAR instructions.Subv2-0/+12
It is unknown at this moment if we actually need to do something with these instructions or if the GLSL compiler takes care of that for us.
2018-07-04GPU: Implement the Size_16_16 and Size_10_10_10_2 vertex attribute types.Subv1-0/+8
Both signed and unsigned variants.
2018-07-04GPU: Ignore textures that the GLSL compiler deemed unused when binding textures to the shaders.Subv1-1/+4
2018-07-04GPU: Corrected the decoding for the TEX shader instruction.Subv1-1/+1
2018-07-04GPU: Implemented the PSETP shader instruction.Subv2-0/+43
It's similar to the isetp and fsetp instructions but it works on predicates instead.
2018-07-04GPU: Implemented the 32 bit float depth buffer format.Subv3-2/+15
2018-07-04GPU: Flip the triangle front face winding if the GPU is configured to not flip the triangles.Subv2-3/+29
OpenGL's default behavior is already correct when the GPU is configured to flip the triangles. This fixes 1-2 Switch's splash screen.
2018-07-04GPU: Only configure the used framebuffers during clear.Subv4-17/+48
Don't try to configure the color buffer if it is not being cleared; it may not be completely valid at this point.
2018-07-03GPU: Factor out the framebuffer configuration code for both Clear and Draw commands.Subv2-72/+39
2018-07-03GPU: Support clears that don't clear the color buffer.Subv2-6/+17
2018-07-03GPU: Bind and clear the render target when the CLEAR_BUFFERS register is written to.Subv4-0/+86
2018-07-03GPU: Added registers for the CLEAR_BUFFERS and CLEAR_COLOR methods.Subv1-2/+27
2018-07-03gl_rasterizer_cache: Implement PixelFormat S8Z24.bunnei3-11/+83
2018-07-03gl_rasterizer: Only set cull mode and front face if enabled.bunnei1-2/+5
2018-07-03GPU: Use only the least significant 3 bits when reading the depth test func.Subv1-9/+9
Some games set the full GL define value here (including nouveau), but others just seem to set those last 3 bits.
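The masking works because the OpenGL comparison tokens run from GL_NEVER (0x0200) to GL_ALWAYS (0x0207), so their low three bits already identify the comparison. A small sketch, with `DepthFuncIndex` and the `kGl*` constants as hypothetical names:

```cpp
#include <cassert>
#include <cstdint>

// Values of the corresponding OpenGL tokens.
constexpr std::uint32_t kGlNever = 0x0200;
constexpr std::uint32_t kGlLess = 0x0201;
constexpr std::uint32_t kGlAlways = 0x0207;

// Keeps only the three least significant bits of the register value, so a
// full GL token and a bare 0..7 index decode to the same comparison.
constexpr std::uint32_t DepthFuncIndex(std::uint32_t raw) {
    return raw & 0x7u;
}

static_assert(DepthFuncIndex(kGlNever) == 0, "GL_NEVER maps to index 0");
```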
2018-07-03GPU: Don't try to parse the depth test function if the depth test is disabled.Subv1-0/+4
2018-07-03Update clang formatJames Rowe7-21/+20
2018-07-03Rename logging macro back to LOG_*James Rowe13-70/+70
2018-07-03GPU: Set up the culling configuration on each draw.Subv1-6/+8
2018-07-03GPU: Implemented MUFU suboperation 8, sqrt.Subv2-0/+5
2018-07-02GPU: Set up the depth test state on every draw.Subv2-0/+14
2018-07-02MaxwellToGL: Added conversion functions for depth test and cull mode.Subv1-0/+50
2018-07-02GPU: Added registers for depth test and cull mode.Subv1-3/+51
2018-07-02GPU: Implemented the Z24S8 depth format and load the depth framebuffer.Subv7-24/+124
2018-07-02GPU: Implement offsetted rendering when using non-indexed drawing.Subv1-1/+1
2018-07-02GPU: Fixed the index offset rendering, and implemented the base vertex functionality.Subv1-6/+8
This fixes Stardew Valley.
2018-07-02GPU: Added register definitions for the vertex buffer base element.Subv1-1/+6
2018-07-02GPU: Directly copy the pixels when performing a same-layout DMA.Subv1-1/+5
2018-07-02GPU: Ignore disabled textures and textures with an invalid address.Subv2-1/+10
2018-07-02GPU: Allow GpuToCpuAddress to return boost::none for unmapped addresses.Subv1-2/+2
2018-06-30GPU: Corrected the size of the MUFU subop field, and removed incorrect "min" operation.Subv2-6/+1
2018-06-30GPU: Implemented the RGBA32_UINT rendertarget format.Subv4-9/+28
2018-06-30GLCache: Specify the component type along the texture type in the format tuple.Subv1-17/+21
2018-06-30gl_shader_decompiler: Implement predicate NotEqualWithNan.bunnei2-17/+24
2018-06-29gl_rasterizer_cache: Only dereference color_surface/depth_surface if valid.bunnei1-2/+6
2018-06-27gl_shader_decompiler: Add a return path for unknown instructions.bunnei1-0/+1
2018-06-27gl_rasterizer_cache: Implement caching for texture and framebuffer surfaces.bunnei3-16/+168
gl_rasterizer_cache: Improved cache management based on Citra's implementation. gl_surface_cache: Add some docstrings.
2018-06-27gl_rasterizer_cache: Various fixes for ASTC handling.bunnei2-35/+39
2018-06-27gl_rasterizer_cache: Use SurfaceParams as a key for surface caching.bunnei2-43/+72
2018-06-27maxwell_3d: Add a struct for RenderTargetConfig.bunnei1-17/+19
2018-06-27gl_rasterizer: Implement AccelerateDisplay to forward textures to framebuffers.bunnei6-8/+62
2018-06-27gl_rasterizer_cache: Cache size_in_bytes as a const per surface.bunnei2-9/+13
2018-06-27gl_rasterizer_cache: Refactor to make SurfaceParams members const.bunnei2-52/+37
2018-06-27gl_rasterizer_cache: Remove Citra's rasterizer cache, always load/flush surfaces.bunnei4-1494/+210
2018-06-27gl_rasterizer: Workaround for when exceeding max UBO size.bunnei2-1/+7
2018-06-26gl_state: Fix state management for texture swizzle.bunnei5-12/+20
2018-06-26gl_state: Remove unused state management from 3DS.bunnei2-94/+0
2018-06-26gl_rasterizer_cache: Fix inverted B5G6R5 format.bunnei1-1/+1
2018-06-25Fix crash at exitmailwl1-2/+4
2018-06-20Build: Fixed some MSVC warnings in various parts of the code.Subv7-12/+13
2018-06-19GPU: Perform negation after absolute value in the float shader instructions.Subv1-7/+14
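The order matters when both modifiers are set: negating after the absolute value yields -|x|, while the reverse order would always yield |x|. A minimal sketch, with `ApplyModifiers` as a hypothetical helper rather than the decompiler's actual code:

```cpp
#include <cassert>
#include <cmath>

// Applies the operand modifiers in the order the commit title describes:
// absolute value first, then negation.
inline float ApplyModifiers(float value, bool use_abs, bool negate) {
    if (use_abs) {
        value = std::fabs(value);
    }
    if (negate) {
        value = -value;
    }
    return value;
}
```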
2018-06-19GPU: Don't mark uniform buffers and registers as used for instructions which don't have them.Subv2-14/+18
Like the MOV32I and FMUL32I instructions. This fixes a potential crash when using these instructions.
2018-06-18gl_rasterizer: Get loose on independent blending.Jules Blok1-1/+1
2018-06-18gl_rasterizer: Implement texture format ASTC_2D_4X4.bunnei6-1/+1709
2018-06-18gl_rasterizer_cache: Loosen things up a bit.bunnei1-26/+8
2018-06-17gl_shader_decompiler: Implement LOP instructions.bunnei2-6/+42
2018-06-17gl_shader_decompiler: Refactor LOP32I instruction a bit in support of LOP.bunnei2-57/+42
2018-06-16gl_shader_decompiler: Implement integer size conversions for I2I/I2F/F2I.bunnei2-14/+43
2018-06-16gl_shader_gen: Set position.w to 1.bunnei1-0/+4
2018-06-16gl_shader_decompiler: Implement LOP32I LogicOperation PassB.bunnei1-6/+12
2018-06-12GPU: Implemented the iadd32i shader instruction.Subv2-2/+31
2018-06-12GPU: Partially implemented the Maxwell DMA engine.Subv7-1/+237
Only tiled->linear and linear->tiled copies that aren't offsetted are supported for now. Queries are not supported. Swizzled copies are not supported.
2018-06-12gl_shader_decompiler: Implement saturate for float instructions.bunnei2-39/+32
2018-06-10GPU: Convert the gl_InstanceID and gl_VertexID variables to floats when reading from them.Subv1-1/+1
This corrects the invalid position values in some games when doing attribute-less rendering.
2018-06-10Rasterizer: Use UBOs instead of SSBOs for uploading const buffers.Subv4-18/+39
This should help a bit with GPU performance once we're GPU-bound.
2018-06-09GPU: Implement the iset family of shader instructions.Subv2-2/+46
2018-06-09GPU: Added decodings for the ISET family of instructions.Subv1-0/+7
2018-06-09gl_shader_decompiler: Implement SHR instruction.bunnei2-0/+17
2018-06-09GPU: Stub the SSY shader instruction.Subv2-0/+7
This instruction tells the GPU where the flow reconverges in a non-uniform control flow scenario; we can ignore this when generating GLSL code.
2018-06-09gl_shader_decompiler: Implement IADD instruction.bunnei2-11/+37
2018-06-09gl_shader_decompiler: Add missing asserts for saturate_a instructions.bunnei2-8/+18
2018-06-09GPU: Synchronize the blend state on every draw call.Subv2-16/+20
Only independent blending on render target 0 is implemented for now. This fixes the elongated squids in Splatoon 2's boot screen.
2018-06-09GPU: Added registers for normal and independent blending.Subv2-31/+27
2018-06-08GLCache: Align compressed texture sizes to their compression ratio, and then align that compressed size to the block height for tiled textures.Subv1-2/+7
This fixes issues with retrieving non-block-aligned tiled compressed textures from the cache.
2018-06-08Rasterizer: Flush the written region when writing shader uniform data before copying it to the uniform buffers.Subv1-0/+3
This fixes the flip_viewport uniform having invalid values when drawing.
2018-06-07GLRenderer: Write the shader stage configuration UBO data *before* copying it to the GPU.Subv1-3/+4
This should fix the bug with the vs_config UBO being uninitialized during shader execution.
2018-06-07gl_shader_decompiler: Implement BFE_IMM instruction.bunnei2-7/+44
2018-06-07GLCache: Use the full uncompressed size when blitting from one texture to another.Subv1-3/+6
This avoids the problem of only copying a tiny piece of the textures when they are compressed.
2018-06-07GLCache: Simplify the logic to copy from one texture to another in BlitTextures.Subv1-53/+3
We now use glCopyImageSubData, which should avoid errors when trying to attach a compressed texture as a framebuffer's color attachment and then blitting to it. Maybe in the future we can change this to glCopyTextureSubImage, which only requires GL_ARB_direct_state_access.
2018-06-07gl_shader_decompiler: F2F: Implement rounding modes.bunnei2-10/+35
2018-06-07gl_shader_decompiler: Remove some attribute stuff that has nothing to do with TEX/TEXS.bunnei1-8/+4
2018-06-07shader_bytecode: Add instruction decodings for BFE, IMNMX, and XMAD.bunnei1-0/+20
2018-06-07gl_shader_decompiler: Implement ISETP_IMM instruction.bunnei1-8/+9
2018-06-07GPU: Support changing the texture swizzles for Maxwell textures.Subv3-0/+45
2018-06-07GLState: Support changing the GL_TEXTURE_SWIZZLE parameter of each texture unit.Subv3-0/+20
2018-06-07gl_shader_decompiler: Implement LD_C instruction.bunnei2-0/+43
2018-06-07gl_shader_gen: Add uniform handling for indirect const buffer access.bunnei3-4/+40
2018-06-06gl_shader_decompiler: Refactor uniform handling to allow different decodings.bunnei2-26/+29
2018-06-06GPU: Implement sampling multiple textures in the generated glsl shaders.Subv9-69/+172
All tested games that use a single texture show no regression. Only Texture2D textures are supported right now. Each shader gets its own "tex_fs/vs/gs" sampler array to maintain independent textures between shader stages; the textures themselves are reused if possible.
2018-06-06gl_shader_decompiler: Fix un/signed mismatch with SHL.bunnei1-1/+1
2018-06-06maxwell_to_gl: Implement WrapMode Mirror.bunnei1-0/+2
2018-06-06GPU: Allow the usage of RGBA16_FLOAT in the texture copy engine.Subv1-0/+2
2018-06-06GPU: Implemented the R11FG11FB10F texture and rendertarget formats.Subv4-11/+30
2018-06-06GPU: Fixed the compression factor for RGBA16F textures.Subv1-1/+1
They're not compressed.
2018-06-06GPU: Allow the usage of RGBA32_FLOAT in the texture copy engine.Subv2-0/+3
2018-06-05GPU: Corrected the branch targets for the shader bra instruction.Subv1-4/+5
2018-06-05GPU: Implemented the F2I_R shader instruction.Subv2-7/+64
2018-06-05gl_shader_decompiler: Fix typo with ISCADD instruction.bunnei1-1/+1
2018-06-05gl_shader_decompiler: Implement SHL instruction.bunnei2-14/+47
2018-06-05gl_shader_decompiler: Implement PredCondition::NotEqual.bunnei1-3/+3
2018-06-05GPU: Implement the ISCADD shader instructions.Subv2-0/+40
2018-06-05GPU: Added decodings for the ISCADD instructions.Subv1-0/+7
2018-06-05GPU: Implement predicated exit instructions in the shader programs.Subv1-4/+6
2018-06-05GPU: Take into account predicated exits when performing shader control flow analysis.Subv1-1/+10
2018-06-04GPU: Use the bf bit in FSET to determine whether to write 0xFFFFFFFF or 1.0f.Subv2-2/+7
2018-06-04GPU: Corrected the I2F_R implementation.Subv1-2/+12
2018-06-04GPU: Calculate the correct viewport dimensions based on the scale and translate registers.Subv2-14/+30
This is how nouveau calculates the viewport width and height. For some reason some games set 0xFFFF in the VIEWPORT_HORIZ and VIEWPORT_VERT registers, maybe those are a misnomer and actually refer to something else?
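One way to read this: a viewport transform maps normalized device coordinates in [-1, 1] to window coordinates as `scale * ndc + translate`, so the covered range is `[translate - scale, translate + scale]`. The sketch below works from that general formula only; `ViewportRect` and `ViewportFromTransform` are hypothetical names, and the exact expression used by nouveau or by this commit is not shown in the log. It also assumes positive scale values.

```cpp
#include <cassert>

struct ViewportRect {
    float x;
    float y;
    float width;
    float height;
};

// Derives the viewport rectangle from the scale and translate terms of the
// transform. Assumes positive scales; a negative scale would indicate a flip.
inline ViewportRect ViewportFromTransform(float scale_x, float scale_y,
                                          float translate_x, float translate_y) {
    return {translate_x - scale_x, translate_y - scale_y,
            2.0f * scale_x, 2.0f * scale_y};
}
```

For a 1280x720 target, a scale of (640, 360) with a translate of (640, 360) gives a rectangle starting at the origin with the full width and height.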
2018-06-04GPU: Implemented the LOP32I instruction.Subv2-1/+58
2018-06-04GLCache: Corrected a mismatch between storing compressed sizes and verifying the uncompressed alignment in GetSurface.Subv1-1/+2
2018-06-04GPU: Use explicit types when retrieving the uniform values for fsetp/fset and isetp instead of the type of an invalid output register.Subv1-9/+18
2018-06-04GPU: Implemented the ISETP_R and ISETP_C shader instructions.Subv2-0/+48
2018-06-04GPU: Partially implemented the shader BRA instruction.Subv2-1/+43
2018-06-04GPU: Added decoding for the BRA instruction.Subv1-0/+2
2018-06-04GPU: Partial implementation of long GPU queries.Subv1-9/+24
Long queries write a 128-bit result value to memory, which consists of a 64 bit query value and a 64 bit timestamp. In this implementation, only select=Zero of the Crop unit is implemented, this writes the query sequence as a 64 bit value, and a 0u64 value for the timestamp, since we emulate an infinitely fast GPU. This specific type was hwtested, but more rigorous tests should be performed in the future for the other types.
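The layout described above can be sketched as a 16-byte structure. `LongQueryResult` and `WriteLongQuery` are hypothetical names, and the field order (value first, timestamp second) is an assumption based on the order in which the description lists them.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// 128-bit result: a 64-bit query value followed by a 64-bit timestamp.
struct LongQueryResult {
    std::uint64_t value;     // the query sequence in this implementation
    std::uint64_t timestamp; // always 0, modelling an infinitely fast GPU
};
static_assert(sizeof(LongQueryResult) == 16, "long query results are 128 bits");

// Writes the result to a destination buffer standing in for guest memory.
inline void WriteLongQuery(std::uint8_t* dest, std::uint64_t sequence) {
    const LongQueryResult result{sequence, 0};
    std::memcpy(dest, &result, sizeof(result));
}
```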
2018-06-03gl_shader_decompiler: Implement TEXS component mask.bunnei2-9/+26
2018-06-03gl_shader_decompiler: Implement RRO as a register move.bunnei2-9/+18
2018-06-02GPU: Implemented the DXN1 (BC4) texture format.Subv3-3/+16
2018-06-01gl_shader_decompiler: Implement TEX instruction.bunnei2-1/+36
2018-06-01gl_shader_decompiler: Support multi-destination for TEXS.bunnei2-2/+23
2018-05-31gl_rasterizer_cache: Assert that component type is UNorm or format is RGBA16F.bunnei1-1/+2
2018-05-31gl_rasterizer_cache: Implement PixelFormat RGBA16F.bunnei3-6/+22
2018-05-30Shaders: Implemented reading the gl_InstanceID and gl_VertexID variables in the vertex shader.Subv2-1/+11
2018-05-30gl_shader_decompiler: F2F_R instruction: Implement abs.bunnei1-1/+7
2018-05-30gl_shader_decompiler: Partially implement F2F_R instruction.bunnei2-4/+9
2018-05-30GPU: Implemented the R8 texture format (0x1D)Subv3-5/+18
2018-05-30gl_rasterizer_cache: Invert order of tex format RGB565.bunnei1-1/+1
2018-05-29add all the known TextureFormat (#474)greggameplayer1-2/+71
2018-05-27GPU: Implemented the A1B5G5R5 texture format (0x14)Subv4-5/+21
2018-05-26gl_shader_decompiler: Implement GetPredicateComparison GreaterEqual.bunnei1-4/+3
2018-05-26shader_bytecode: Implement other variants of FMNMX.bunnei2-4/+10
2018-05-25Shader: Implemented compound predicates in fset.Subv1-28/+12
You can specify a predicate in the fset instruction: Result = ((Value1 Comp Value2) OP P0) ? 1.0 : 0.0;
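The formula above can be written out as a small C++ sketch. The `Compare` and `PredOp` enumerators are illustrative assumptions; the log does not list which comparisons or combining operations the instruction actually supports.

```cpp
#include <cassert>

enum class Compare { LessThan, Equal, GreaterThan };
enum class PredOp { And, Or, Xor };

inline bool Evaluate(Compare comp, float a, float b) {
    switch (comp) {
    case Compare::LessThan:
        return a < b;
    case Compare::Equal:
        return a == b;
    case Compare::GreaterThan:
        return a > b;
    }
    return false;
}

inline bool Combine(PredOp op, bool lhs, bool rhs) {
    switch (op) {
    case PredOp::And:
        return lhs && rhs;
    case PredOp::Or:
        return lhs || rhs;
    case PredOp::Xor:
        return lhs != rhs;
    }
    return false;
}

// Result = ((Value1 Comp Value2) OP P0) ? 1.0 : 0.0
inline float Fset(Compare comp, PredOp op, float value1, float value2, bool p0) {
    return Combine(op, Evaluate(comp, value1, value2), p0) ? 1.0f : 0.0f;
}
```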
2018-05-25GPU: Allow command lists to rebind a channel to another engine in the middle of the command list.Subv1-1/+0
2018-05-25Shader: Implemented compound predicates in fsetp.Subv1-19/+55
You can specify three predicates in an fsetp instruction: P1 = (Value1 Comp Value2) OP P0; P2 = !(Value1 Comp Value2) OP P0;
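A sketch of the two outputs, taking the already evaluated comparison as a boolean input. `Fsetp` and `PredOp` are hypothetical names, and the set of combining operations is an assumption.

```cpp
#include <cassert>
#include <utility>

enum class PredOp { And, Or, Xor };

inline bool Combine(PredOp op, bool lhs, bool rhs) {
    switch (op) {
    case PredOp::And:
        return lhs && rhs;
    case PredOp::Or:
        return lhs || rhs;
    case PredOp::Xor:
        return lhs != rhs;
    }
    return false;
}

// Returns {P1, P2}:
//   P1 = (Value1 Comp Value2) OP P0
//   P2 = !(Value1 Comp Value2) OP P0
inline std::pair<bool, bool> Fsetp(bool comparison, PredOp op, bool p0) {
    return {Combine(op, comparison, p0), Combine(op, !comparison, p0)};
}
```

With `OP` set to AND and `P0` true, the two outputs are simply the comparison and its negation.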
2018-05-21Shaders: Implemented the FMNMX shader instruction.Subv2-6/+26
2018-05-20GPU: Implemented nvhost-as-gpu's UnmapBuffer ioctl.Subv2-0/+20
It removes a mapping previously created with the MapBufferEx ioctl.
2018-05-19ShadersDecompiler: Added decoding for the PSETP instruction.Subv1-0/+3
2018-05-19GLRenderer: Remove unused hw_vao_enabled_attributes variable.Subv2-4/+0
2018-05-19GLRenderer: Remove unused vertex buffer and increase the size of the stream buffer to 128 MB.Subv2-9/+3
The stream buffer is where all the vertex data is copied; some games require this to be much bigger than the 4 MB we used to have.
2018-05-19GLRenderer: Log the shader source code when program linking fails.Subv1-0/+27
2018-05-02general: Make formatting of logged hex values more straightforwardLioncash1-1/+1
This makes the formatting expectations more obvious (e.g. any zero padding specified is padding that's entirely dedicated to the value being printed, not any pretty-printing that also gets tacked on).
2018-04-29maxwell_3d: Reset vertex counts after drawing.bunnei1-0/+10
2018-04-29gl_shader_decompiler: Implement MOV_R.bunnei1-1/+2
2018-04-29maxwell_to_gl: Implement type SignedNorm, Size_8_8_8_8.bunnei1-0/+12
2018-04-29shader_bytecode: Add decoding for FMNMX instruction.bunnei1-0/+2
2018-04-29Shaders: Implemented predicate condition 3 (LessEqual) in the fset and fsetp instructions.Subv1-0/+7
2018-04-29gl_shader_decompiler: Implement MOV_C.bunnei1-0/+5
2018-04-29fermi_2d: Fix surface copy block height.bunnei2-2/+7
2018-04-29gl_shader_decompiler: Partially implement I2I_R, and I2F_R.bunnei2-8/+34
2018-04-29gl_shader_decompiler: More cleanups, etc. with how we handle register types.bunnei1-44/+120
2018-04-29GLSLRegister: Simplify register declarations, etc.bunnei1-63/+31
2018-04-29shader_bytecode: Add decodings for i2i instructions.bunnei1-3/+20
2018-04-29gl_shader_decompiler: Implement MOV32_IMM instruction.bunnei2-2/+7
2018-04-27renderer_opengl: Replace usages of LOG_GENERIC with fmt-capable equivalentsLioncash1-6/+7
2018-04-27gl_shader_decompiler: Add GLSLRegisterManager class to track register state.bunnei1-154/+262
2018-04-27general: Convert assertion macros over to be fmt-compatibleLioncash4-7/+7
2018-04-26gl_shader_decompiler: Boilerplate for handling integer instructions.bunnei2-6/+111
2018-04-26gl_shader_decompiler: Move color output to EXIT instruction.bunnei1-6/+12
2018-04-25GPU: Partially implemented the Fermi2D surface copy operation.Subv2-0/+59
The hardware allows for some rather complicated operations to be performed on the data during the copy, this is not implemented. Only same-format same-size raw copies are implemented for now.
2018-04-25Shaders: Added bit decodings for the I2I instruction.Subv1-0/+6
2018-04-25Shaders: Implemented the FSET instruction.Subv1-0/+53
This instruction is similar to the FSETP instruction, but instead of setting a predicate it sets the destination register to 1.0 if the condition holds, and 0 otherwise.
2018-04-25GPU: Make the Textures::CopySwizzledData function accessible from the outside of the file.Subv2-3/+6
2018-04-25GPU: Added a function to retrieve the bytes per pixel of the render target formats.Subv2-0/+15
2018-04-25GPU: Added surface copy registers to Fermi2DSubv1-1/+57
2018-04-25GPU: Added boilerplate code for the Fermi2D engineSubv3-3/+34
2018-04-25GPU: Reduce the number of registers of Maxwell3D to 0xE00.Subv2-5/+5
The rest are just macro shim registers.
2018-04-25GPU: Move the Maxwell3D macro uploading code to the inside of the Maxwell3D processor.Subv4-40/+23
It doesn't belong in the PFIFO handler.
2018-04-25GPU: Corrected the upper bound of the PFIFO method ids in the command processor.Subv1-1/+1
2018-04-25video-core: Move logging macros over to new fmt-capable onesLioncash5-18/+20
2018-04-25Shaders: Added decodings for the FSET instructions.Subv2-9/+30
2018-04-25renderer_opengl: Use correct byte order for framebuffer pixel format ABGR8.bunnei1-2/+1
2018-04-25gl_rasterizer_cache: Use CHAR_BIT for bpp conversions instead of 8.bunnei2-4/+4
2018-04-25gl_rasterizer_cache: Use GPU PAGE_BITS/SIZE, not CPU.bunnei1-5/+5
2018-04-25gl_rasterizer_cache: Use new logger.bunnei1-4/+4
2018-04-25gl_rasterizer_cache: Add a function for finding framebuffer GPU address.bunnei3-0/+31
2018-04-25gl_rasterizer_cache: Handle compressed texture sizes.bunnei2-24/+65
2018-04-25gl_rasterizer_cache: Update to be based on GPU addresses, not CPU addresses.bunnei8-50/+72
2018-04-24memory_manager: Implement CpuToGpuAddress.bunnei2-0/+27
2018-04-24memory_manager: Make GpuToCpuAddress return an optional.bunnei6-24/+33
2018-04-24memory_manager: Use GPUVAddr, not PAddr, for GPU addresses.bunnei6-58/+55
2018-04-24renderer_opengl: Silence a -Wdangling-else warning in DrawScreenTriangles()Lioncash1-1/+2
2018-04-24GPU: Added asserts to our code for handling the QUERY_GET GPU command.Subv2-2/+53
This is based on research from nouveau. Many things are currently unknown and will require hwtests in the future. This commit also stubs QueryMode::Write2 to do the same as Write. Nouveau code treats them interchangeably, it is currently unknown what the difference is.
2018-04-23GPU: Support multiple enabled vertex arrays.Subv3-43/+89
The vertex arrays will be copied to the stream buffer one after the other, and the attributes will be set using the ARB_vertex_attrib_binding extension. yuzu now thus requires OpenGL 4.3 or the ARB_vertex_attrib_binding extension.
2018-04-23GPU: Make the GPU virtual memory manager use 16 page bits and 10 page table bits.Subv2-34/+25
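With 16 page bits a page is 64 KiB, and 10 page table bits give 1024 entries per table. The sketch below shows one plausible way to split an address under those constants. The constant and function names are hypothetical, and it is an assumption that the 10 table bits sit directly above the page offset; the log does not say how the remaining upper bits are used.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint64_t kPageBits = 16;
constexpr std::uint64_t kPageSize = 1ULL << kPageBits; // 64 KiB
constexpr std::uint64_t kPageTableBits = 10;
constexpr std::uint64_t kPageTableSize = 1ULL << kPageTableBits; // 1024 entries

// Offset of the address within its page (low 16 bits).
constexpr std::uint64_t PageOffset(std::uint64_t addr) {
    return addr & (kPageSize - 1);
}

// Index into a page table (assumed to be the next 10 bits).
constexpr std::uint64_t PageTableIndex(std::uint64_t addr) {
    return (addr >> kPageBits) & (kPageTableSize - 1);
}
```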
Also removed some dead code and added memory map consistency asserts.
2018-04-23GPU: Implement the RGB10_A2 RenderTarget format, it will use the same format as the A2BGR10 texture format.Subv1-0/+2
2018-04-22GPU: Implement the A2BGR10 texture format.Subv4-6/+18
2018-04-21gl_shader_decompiler: Skip RRO instruction.bunnei1-0/+4
2018-04-21gl_shader_decompiler: Cleanup error logging.bunnei1-14/+6
2018-04-21shader_bytecode: Add several more instruction decodings.bunnei1-5/+52
2018-04-21shader_bytecode: Decode instructions based on bit strings.bunnei2-205/+201
2018-04-21ShaderGen: Implemented the KIL instruction, which is equivalent to 'discard'.Subv1-1/+7
2018-04-21ShaderGen: Implemented predicated instruction execution.Subv2-1/+40
Each predicated instruction will be wrapped in an `if (predicate) { instruction_body; }` in the GLSL, where `predicate` is one of the predicate boolean variables previously set by fsetp.
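The wrapping step can be sketched as plain string generation. `WrapInPredicate` is a hypothetical helper, not the decompiler's actual function.

```cpp
#include <cassert>
#include <string>

// Wraps an already generated GLSL statement in a predicate check, in the
// form `if (predicate) { instruction_body; }`.
inline std::string WrapInPredicate(const std::string& predicate, const std::string& body) {
    return "if (" + predicate + ") { " + body + " }";
}
```

An instruction that compiles to `r1 = r2;` and is guarded by a predicate variable `p0` then becomes `if (p0) { r1 = r2; }`.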
2018-04-21ShaderGen: Implemented the fsetp instruction.Subv2-3/+112
Predicate variables are now added to the generated shader code in the form of 'pX' where X is the predicate id. These predicate variables are initialized to false on shader startup and are set via the fsetp instructions. TODO: Not all the comparison types are implemented, and only the single-predicate version is implemented.
2018-04-21opengl: Remove unnecessary header inclusionsLioncash4-11/+0
2018-04-21gl_resource_manager: Add missing noexcept specifiers to move constructors and assignment operatorsLioncash1-20/+19
Standard library containers may use std::move_if_noexcept to perform move operations. If a move cannot be performed under these circumstances, then a copy is attempted. Given we only intend for these types to be move-only this can be somewhat problematic. By defining these to be noexcept we prevent cases where copies may be attempted.
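A minimal sketch of a move-only resource wrapper with `noexcept` move operations. `Handle` is a hypothetical type, not one of the classes in gl_resource_manager. Because the moves are `noexcept`, `std::vector` can relocate elements by moving them when it grows.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>
#include <vector>

class Handle {
public:
    Handle() = default;
    explicit Handle(unsigned value) : id{value} {}

    // Move-only: copies are disabled outright.
    Handle(const Handle&) = delete;
    Handle& operator=(const Handle&) = delete;

    // noexcept lets containers pick the move path via std::move_if_noexcept.
    Handle(Handle&& other) noexcept : id{std::exchange(other.id, 0u)} {}
    Handle& operator=(Handle&& other) noexcept {
        if (this != &other) {
            id = std::exchange(other.id, 0u);
        }
        return *this;
    }

    unsigned Get() const { return id; }

private:
    unsigned id = 0;
};

static_assert(std::is_nothrow_move_constructible<Handle>::value, "moves must not throw");
static_assert(!std::is_copy_constructible<Handle>::value, "type is move-only");
```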
2018-04-21gl_rasterizer_cache: Make MatchFlags an enum classLioncash1-4/+9
Prevents implicit conversions and scope pollution.
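A sketch of a scoped flag enum. The enumerator names here are made up for illustration and are not the actual `MatchFlags` values; the operators have to be written explicitly because an `enum class` does not convert to an integer on its own.

```cpp
#include <cassert>
#include <type_traits>

enum class MatchFlags : unsigned {
    None = 0,
    Exact = 1u << 0, // hypothetical flag
    Copy = 1u << 1,  // hypothetical flag
};

constexpr MatchFlags operator|(MatchFlags lhs, MatchFlags rhs) {
    return static_cast<MatchFlags>(static_cast<unsigned>(lhs) | static_cast<unsigned>(rhs));
}

constexpr bool HasFlag(MatchFlags value, MatchFlags flag) {
    return (static_cast<unsigned>(value) & static_cast<unsigned>(flag)) != 0;
}

// The scoped enum no longer converts implicitly to int.
static_assert(!std::is_convertible<MatchFlags, int>::value, "no implicit conversion");
```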
2018-04-20ShaderGen: Register id 255 is special and is hardcoded to return 0 (SR_ZERO).Subv2-0/+5
2018-04-20ShaderGen: Ignore the 'sched' instruction when generating shaders.Subv1-0/+16
The 'sched' instruction has a very convoluted encoding, but fortunately it seems to only appear on a fixed interval (once every 4 instructions).
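Skipping an instruction that recurs at a fixed interval needs no decoding, only a position check. The sketch below assumes the scheduling word is the first entry of every group of four, which the log does not state; `kSchedPeriod` and `IsSchedInstruction` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kSchedPeriod = 4;

// True if the instruction at this index (counted from the start of the
// program) falls on the fixed scheduling slot and should be skipped.
constexpr bool IsSchedInstruction(std::size_t index) {
    return index % kSchedPeriod == 0;
}
```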
2018-04-20math_util: Remove the Clamp() functionLioncash2-16/+17
C++17 adds clamp() to the standard library, so we can remove ours in favor of it.
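For reference, `std::clamp` lives in `<algorithm>` and takes the value followed by the lower and upper bounds. `ClampToByte` is a hypothetical wrapper used only to show the call.

```cpp
#include <algorithm>
#include <cassert>

// Limits a value to the range [0, 255] using the C++17 standard function.
inline int ClampToByte(int value) {
    return std::clamp(value, 0, 255);
}
```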
2018-04-20common_funcs: Remove ARRAY_SIZE macroLioncash1-2/+2
C++17 has non-member size() which we can just call where necessary.
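A short sketch of the replacement: `std::size` works on built-in arrays and on containers, and unlike a `sizeof`-based macro it fails to compile when handed a pointer. `kTable` and `TableLength` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>

constexpr int kTable[] = {1, 2, 3, 4};

// std::size replaces an ARRAY_SIZE-style macro.
static_assert(std::size(kTable) == 4, "element count is known at compile time");

inline std::size_t TableLength() {
    return std::size(kTable);
}
```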
2018-04-20renderer_opengl: Add missing header guardsLioncash2-0/+4
2018-04-20glsl_shader_decompiler: Use std::string_view instead of std::string for AddLine()Lioncash1-1/+2
This function doesn't need to take ownership of the string data being given to it, considering all we do is append the characters to the internal string instance. Instead, use a string view to simply reference the string data without any potential heap allocation. Now anything that is a raw const char* won't need to be converted to a std::string before appending.
2018-04-20glsl_shader_decompiler: Add AddNewLine() function to ShaderWriterLioncash1-6/+12
Avoids constructing a std::string just to append a newline character
2018-04-20glsl_shader_decompiler: Add char overload for ShaderWriter's AddLine()Lioncash1-4/+11
Avoids constructing a std::string just to append a character.
2018-04-20glsl_shader_decompiler: Append indentation without constructing a separate std::stringLioncash1-1/+5
The interface of std::string already lets us append N copies of a character to an existing string.
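Taken together, the four ShaderWriter changes above might look roughly like the sketch below. This is a simplified reconstruction from the commit descriptions, not the actual class; the member names and the indentation width of four spaces are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

class ShaderWriter {
public:
    // A string_view references the caller's data, so string literals are
    // appended without first being converted to a std::string.
    void AddLine(std::string_view text) {
        AppendIndentation();
        shader_source += text;
        AddNewLine();
    }

    // Overload for a single character such as '{' or '}'.
    void AddLine(char character) {
        AppendIndentation();
        shader_source += character;
        AddNewLine();
    }

    void AddNewLine() {
        shader_source += '\n';
    }

    const std::string& GetResult() const {
        return shader_source;
    }

    int scope = 0;

private:
    // std::string::append(count, ch) adds N copies of a character in place.
    void AppendIndentation() {
        shader_source.append(static_cast<std::size_t>(scope) * 4, ' ');
    }

    std::string shader_source;
};
```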
2018-04-19ShaderGen: Implemented the fmul32i shader instruction.Subv2-9/+30
2018-04-19ShaderGen: Fixed a case where the TEXS instruction would use the same registers for the input and the output.Subv1-2/+9
It will now save the coords before writing the outputs in a subscope.
2018-04-19GPU: Add support for the DXT23 and DXT45 compressed texture formats.Subv3-28/+35
2018-04-19GPU: Implemented the B5G6R5 format.Subv4-8/+28
2018-04-18gl_shader_gen: Support vertical/horizontal viewport flipping. (#347)bunnei4-5/+29
2018-04-18GLCache: Added boilerplate code to support configurable texture component types.Subv3-9/+69
For now only the UNORM type is supported.
2018-04-18GLCache: Unify texture and framebuffer formats when converting to OpenGL.Subv2-26/+13
2018-04-18GPU: Texture format 8 and framebuffer format 0xD5 are actually ABGR8.Subv2-10/+10
2018-04-18GPU: Pitch textures are now supported, don't assert when encountering them.Subv1-2/+3
2018-04-18GLCache: Take into account the texture's block height when caching and unswizzling.Subv3-43/+43
2018-04-18GLCache: Added a function to convert cached PixelFormats back to texture formats.Subv1-0/+12
TODO: The way we handle cached formats must change, framebuffer and texture formats are too different to keep them in the same place.
2018-04-18GPU: Allow using a configurable block height when unswizzling textures.Subv4-7/+23
2018-04-18GPU/TIC: Added the pitch and block height fields to the TIC structure.Subv1-1/+16
2018-04-18gl_rasterizer_cache: Add missing LOG statements.bunnei1-0/+3
2018-04-18texture: Add missing formats.bunnei1-1/+3
2018-04-18gpu: Add several framebuffer formats to RenderTargetFormat.bunnei1-0/+3
2018-04-18maxwell3d: Allow Texture2DNoMipmap as Texture2D.bunnei1-1/+2
2018-04-18shader_bytecode: Make ctor's constexpr and explicit.bunnei1-7/+7
2018-04-18renderer_opengl: Implement BlendEquation and BlendFunc.bunnei6-7/+140
2018-04-17gl_shader_decompiler: Fix warnings with MarkAsUsed.bunnei1-1/+2
2018-04-17gl_shader_decompiler: Cleanup logging, updating to NGLOG_*.bunnei1-24/+22
2018-04-17gl_shader_decompiler: Implement several MUFU subops and abs_d.bunnei1-7/+21
2018-04-17gl_shader_decompiler: Fix swizzle in GetRegister.bunnei1-1/+1
2018-04-17gl_shader_decompiler: Implement FMUL/FADD/FFMA immediate instructions.bunnei2-12/+53
2018-04-17gl_shader_decompiler: Allow vertex position to be used in fragment shader.bunnei2-16/+18
2018-04-17gl_shader_decompiler: Implement IPA instruction.bunnei1-0/+11
2018-04-17gl_shader_decompiler: Add support for TEXS instruction.bunnei2-12/+43
2018-04-17gl_shader_decompiler: Use fragment output color for GPR 0-3.bunnei1-0/+5
2018-04-17gl_shader_decompiler: Partially implement MUFU.bunnei1-2/+11
2018-04-17MaxwellToGL: Implemented tex wrap mode 1 (Wrap, GL_REPEAT).Subv1-0/+2
2018-04-17MaxwellToGL: Added a TODO and partial implementation of maxwell wrap mode 4 (Clamp, GL_CLAMP).Subv1-0/+5
This clamp mode was removed from OpenGL as of 3.1, we can emulate it by using GL_CLAMP_TO_BORDER to get the border color of the texture, and then manually sampling the edge to mix them in the fragment shader.
2018-04-17gl_rendering: Use NGLOG* for changed code.bunnei2-10/+11
2018-04-17gl_rasterizer: Implement indexed vertex mode.bunnei5-23/+92
2018-04-15GPU: Use the same buffer names in the generated GLSL and the buffer uploading code.Subv4-17/+24
2018-04-15GPU: Don't use explicit binding points when uploading the constbuffers to opengl.Subv3-7/+47
The bindpoints will now be dynamically calculated based on the number of buffers used by the previous shader stage.
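One reading of this is a running offset: each stage starts binding where the previous stages stopped. The sketch below follows that interpretation only; `FirstBindPoints` is a hypothetical helper and the real code may compute the values differently.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Given how many const buffers each shader stage uses, returns the first
// binding point assigned to each stage.
template <std::size_t N>
std::array<unsigned, N> FirstBindPoints(const std::array<unsigned, N>& buffers_per_stage) {
    std::array<unsigned, N> first{};
    unsigned next = 0;
    for (std::size_t stage = 0; stage < N; ++stage) {
        first[stage] = next;
        next += buffers_per_stage[stage];
    }
    return first;
}
```

If the first stage uses three buffers and the second none, the third stage starts at binding point 3.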
2018-04-15GPU: Don't use GetPointer when uploading the constbuffer data to the GPU.Subv1-3/+4
2018-04-15GPU: Use the buffer hints from the shader decompiler to upload only the necessary const buffers for each shader stage.Subv3-31/+41
2018-04-15shaders: Expose hints about used const buffers.bunnei5-31/+146
2018-04-15GPU: Upload the entirety of each constbuffer for each shader stage as SSBOs.Subv4-14/+48
We're going to need the shader generator to give us a mapping of the actual used const buffers to properly bind them to the shader.
2018-04-15GPU: Allow configuring ssbos in the opengl state manager.Subv4-0/+30
2018-04-15GPU: Added a function to determine whether a shader stage is enabled or not.Subv3-3/+27
2018-04-15shaders: Add NumTextureSamplers const, remove unused #pragma.bunnei4-4/+5
2018-04-14shaders: Address PR review feedback.bunnei2-7/+9
2018-04-14gl_shader_decompiler: Cleanup log statements.bunnei1-15/+15
2018-04-14shaders: Fix GCC and clang build issues.bunnei3-5/+5
2018-04-14gl_shader_decompiler: Implement negate, abs, etc. and lots of cleanup.bunnei2-40/+96
2018-04-14shader_bytecode: Add FSETP and KIL to GetInfo.bunnei1-0/+3
2018-04-14shader_bytecode: Add SubOp decoding.bunnei1-0/+10
2018-04-14gl_shader_decompiler: Add shader stage hint.bunnei2-5/+12
2018-04-14renderer_opengl: Fix Morton copy byteswap, etc.bunnei2-6/+6
2018-04-14gl_shader_manager: Implement SetShaderSamplerBindings.bunnei1-0/+8
2018-04-14gl_rasterizer: Generate shaders and upload uniforms.bunnei2-32/+77
2018-04-14gl_shader_decompiler: Basic impl. for very simple vertex shaders.bunnei2-16/+311
- Tested with Puyo Puyo Tetris and Cave Story+
2018-04-14gl_shader_manager: Cleanup and consolidate uniform handling.bunnei2-26/+24
2018-04-14maxwell_3d: Make memory_manager public.bunnei1-2/+1
2018-04-14maxwell_3d: Fix shader_config decodings.bunnei1-6/+3
2018-04-14gl_rasterizer: Use shader program manager, remove test shader.bunnei2-196/+31
2018-04-14renderer_opengl: Add gl_shader_manager class.bunnei3-0/+209
2018-04-14maxwell_to_gl: Add a few types, etc.bunnei1-0/+10
2018-04-14gl_shader_gen: Add hashable setup/config structs.bunnei2-29/+50
2018-04-14gl_shader_util: Add missing includes.bunnei1-0/+2
2018-04-14renderer_opengl: Use OGLProgram instead of OGLShader.bunnei6-6/+6
2018-04-14gl_shader_util: Grab latest upstream.bunnei2-149/+74
2018-04-14gl_resource_manager: Grab latest upstream.bunnei1-30/+86
2018-04-14gl_shader_decompiler: Add skeleton code from Citra for shader analysis.bunnei2-44/+142
2018-04-14shader_bytecode: Add initial module for shader decoding.bunnei2-0/+298
2018-04-07Fix clang format issuesJames Rowe1-1/+1
2018-04-07GPU: Assert when finding a texture with a format type other than UNORM.Subv2-4/+16
2018-04-07GL: Set up the textures used for each draw call.Subv2-2/+39
Each Maxwell shader stage can have an arbitrary number of textures, but we're limited to a certain number in OpenGL. We try to use only the minimum number of host textures by not keeping a 1:1 relation between guest texture ids and host texture ids, i.e., guest texture id 8 can be host texture id 0 if it's the only texture used in the guest shader program. This mapping will have to be passed to the shader decompiler so it can rewrite the texture accesses.
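The compaction described above can be sketched as a map that hands out host ids in order of first use. `BuildHostTextureMap` is a hypothetical helper; the actual data structure is not shown in the log.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Assigns consecutive host texture ids (0, 1, 2, ...) to the guest texture
// ids a shader uses, in order of first appearance.
inline std::map<unsigned, unsigned> BuildHostTextureMap(const std::vector<unsigned>& guest_ids) {
    std::map<unsigned, unsigned> mapping;
    for (const unsigned guest_id : guest_ids) {
        // emplace leaves an existing entry untouched, so repeats keep their id.
        mapping.emplace(guest_id, static_cast<unsigned>(mapping.size()));
    }
    return mapping;
}
```

A shader that only samples guest texture 8 therefore ends up using host texture 0.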
2018-04-07GL: Bind the textures to the shaders used for drawing.Subv1-2/+11
2018-04-07GLCache: Specialize the MortonCopy function for the DXT1 texture format.Subv1-1/+15
It will now use the UnswizzleTexture function instead of the MortonCopyPixels128, which doesn't seem to work for textures.
2018-04-07GLCache: Implemented GetTextureSurface.Subv1-3/+28
2018-04-07GLCache: Support uploading compressed textures to the GPU.Subv1-5/+17
Compressed texture formats like DXT1, DXT2, DXT3, etc will use this to ease the load on the CPU.
2018-04-07GL: Remove remaining references to 3DS-specific pixel formatsSubv1-83/+22
2018-04-07RasterizerCache: Remove 3DS-specific pixel formats.Subv2-71/+32
We're only left with RGB8 and DXT1 for now. More will be added as they are needed.
2018-04-07GL: Create the sampler objects when starting up the GL rasterizer.Subv1-0/+6
2018-04-07GL: Ported the SamplerInfo struct from citra.Subv2-1/+59
2018-04-07GL: Rename PicaTexture to MaxwellTexture.Subv2-2/+2
2018-04-07GL: Added functions to convert Maxwell tex filters and wrap modes to OpenGL.Subv1-0/+23
2018-04-07Textures: Added a helper function to know if a texture is blocklinear or pitch.Subv1-0/+5
2018-04-04rasterizer_interface.h: Update from citra to yuzuN00byKing1-3/+3
2018-04-04gl_rasterizer_cache.cpp: Update from citra to yuzuN00byKing1-1/+1
2018-04-04gl_rasterizer_cache.h: Update from citra to yuzuN00byKing1-3/+3
2018-04-04renderer_opengl.h: Update from citra to yuzuN00byKing1-2/+2
2018-04-01GPU: Use the MacroInterpreter class to execute the GPU macros instead of HLEing them.Subv2-121/+13
2018-04-01GPU: Implemented a gpu macro interpreter.Subv5-0/+431
The Ryujinx macro interpreter and envydis were used as reference. Macros are programs that are uploaded by the games during boot and can later be called by writing to their method id in a GPU command buffer.
2018-03-27renderer_opengl: Use better naming for DrawScreens and DrawSingleScreen.bunnei2-8/+8
2018-03-27gl_rasterizer: Move code to bind framebuffer surfaces before draw to its own function.bunnei2-22/+31
2018-03-27gl_rasterizer: Add a SyncViewport method.bunnei3-18/+30
2018-03-27gl_rasterizer: Move PrimitiveTopology check to MaxwellToGL.bunnei2-11/+12
2018-03-27graphics_surface: Fix merge conflicts.bunnei1-0/+1
2018-03-27gl_rasterizer: Use ReadBlock instead of GetPointer for SetupVertexArray.bunnei1-1/+1
2018-03-27gl_rasterizer: Normalize vertex array data as appropriate.bunnei2-1/+5
2018-03-27maxwell_to_gl: Fix string formatting in log statements.bunnei1-2/+2
2018-03-27rasterizer: Rename DrawTriangles to DrawArrays.bunnei3-5/+5
2018-03-27gl_rasterizer: Use passthrough shader for SetupVertexShader.bunnei1-1/+2
2018-03-27renderer_opengl: Logging, etc. cleanup.bunnei6-33/+34
2018-03-27renderer_opengl: Remove framebuffer RasterizerFlushVirtualRegion hack.bunnei1-5/+0
2018-03-27gl_rasterizer_cache: Implement UpdatePagesCachedCount.bunnei2-8/+37
2018-03-27gl_rasterizer: Implement SetupVertexArray.bunnei1-20/+38
2018-03-27gl_rasterizer_cache: Fix an ASSERT_MSG.bunnei1-1/+1
2018-03-27maxwell_to_gl: Add module and function for decoding VertexType.bunnei2-0/+41
2018-03-27maxwell_3d: Use names that match envytools for VertexType.bunnei1-8/+8
2018-03-27maxwell_3d: Add VertexAttribute struct and cleanup.bunnei1-121/+160
2018-03-27gl_rasterizer: Use 32 texture units instead of 3.bunnei3-2/+3
2018-03-27gl_rasterizer: Implement DrawTriangles.bunnei1-1/+194
2018-03-27Maxwell3D: Call AccelerateDrawBatch on DrawArrays.bunnei1-1/+8
2018-03-27gl_rasterizer: Implement AnalyzeVertexArray.bunnei2-1/+56
2018-03-27gl_rasterizer_cache: MortonCopy Switch-style.bunnei1-72/+32
2018-03-27gl_rasterizer_cache: Implement GetFramebufferSurfaces.bunnei2-4/+104
2018-03-27maxwell: Add RenderTargetFormat enum.bunnei2-4/+5
2018-03-27renderer_opengl: Only draw the screen if a framebuffer is specified.bunnei1-6/+7
2018-03-26GPU: Load the sampler info (TSC) when retrieving active textures.Subv2-21/+67
2018-03-26GPU: Added the TSC structure. It contains information about the sampler.Subv1-0/+50
2018-03-26GPU: Added more fields to the TIC structure.Subv1-4/+30
2018-03-25GPU: Make the debug_context variable a member of the frontend instead of a global.Subv3-15/+13
2018-03-24GPU: Added a function to retrieve the active textures for a shader stage.Subv2-50/+59
TODO: A shader may not use all of these textures at the same time; shader analysis should be performed to determine which textures are actually sampled.
2018-03-24Frontend: Updated the surface view debug widget to work with Maxwell surfaces.Subv2-0/+15
2018-03-24GPU: Implement the Incoming/FinishedPrimitiveBatch debug breakpoints.Subv1-0/+7
2018-03-24GPU: Implement the MaxwellCommandLoaded/Processed debug breakpoints.Subv1-0/+10
2018-03-24Frontend: Ported the GPU breakpoints and surface viewer widgets from citra.Subv5-0/+242
2018-03-24GPU: Added a method to unswizzle a texture without decoding it.Subv4-5/+95
Allow unswizzling of DXT1 textures.
2018-03-24GPU: Preliminary work for texture decoding.Subv5-0/+139
2018-03-24GPU: Added viewport registers to Maxwell3D's reg structure.Subv1-1/+18
2018-03-24gl_rasterizer: Fake render in green, because it's cooler.bunnei1-1/+1
2018-03-24gl_rasterizer: Log warning instead of sync'ing unimplemented funcs.bunnei1-7/+1
2018-03-23gl_rasterizer_cache: Add missing include for vm_manager.bunnei1-0/+1
2018-03-23renderer_opengl: Only invalidate the framebuffer region, not flush.bunnei1-4/+3
2018-03-23renderer_opengl: Fixes for properly flushing & rendering the framebuffer.bunnei1-6/+12
2018-03-23RasterizerCacheOpenGL: FlushAll should flush full memory region.bunnei1-1/+1
2018-03-23rasterizer: Flush and invalidate regions should be 64-bit.bunnei3-9/+9
2018-03-23renderer_opengl: Add framebuffer_transform_flags member variable.bunnei1-2/+2
2018-03-23renderer_opengl: Better handling of framebuffer transform flags.bunnei2-3/+20
2018-03-23renderer_opengl: Use accelerated framebuffer load with LoadFBToScreenInfo.bunnei1-31/+25
2018-03-23gl_rasterizer: Implement AccelerateDisplay method from Citra.bunnei2-2/+44
2018-03-23LoadGLBuffer: Use bytes_per_pixel, not bits.bunnei1-1/+2
2018-03-23gl_rasterizer_cache: LoadGLBuffer should do a morton copy.bunnei1-16/+5
2018-03-23video_core: Move MortonCopyPixels128 to utils header.bunnei2-111/+113
2018-03-23video_core: Remove usage of PAddr and replace with VAddr.bunnei5-39/+39
2018-03-23video_core: Move FramebufferInfo to FramebufferConfig in GPU.bunnei7-66/+74
2018-03-23gl_rasterizer: Replace a bunch of UNIMPLEMENTED with ASSERT.bunnei2-20/+20
2018-03-23gl_rasterizer: Add a simple passthrough shader in lieu of shader generation.bunnei2-5/+68
2018-03-23gpu: Expose Maxwell3D engine.bunnei1-0/+4
2018-03-23maxwell_3d: Add some format decodings and string helper functions.bunnei1-3/+107
2018-03-23renderer: Create rasterizer and cleanup.bunnei4-4/+16
2018-03-21GPU: Added vertex attribute format registers.Subv1-1/+14
2018-03-21GPU: Added registers for the number of vertices to render.Subv1-2/+13
2018-03-20renderer_gl: Port boilerplate rasterizer code over from Citra.bunnei5-1/+495
2018-03-20gl_shader_util: Sync latest version with Citra.bunnei3-46/+116
2018-03-20renderer_gl: Port over gl_shader_gen module from Citra.bunnei3-0/+88
2018-03-20renderer_gl: Port over gl_shader_decompiler module from Citra.bunnei3-0/+87
2018-03-20renderer_gl: Port over gl_rasterizer_cache module from Citra.bunnei3-0/+1714
2018-03-20gl_resource_manager: Sync latest version with Citra.bunnei1-8/+77
2018-03-20renderer_gl: Port over gl_stream_buffer module from Citra.bunnei3-0/+218
2018-03-20gl_state: Sync latest version with Citra.bunnei2-47/+111
2018-03-19GPU: Added Z buffer registers to Maxwell3D's reg structure.Subv1-1/+17
2018-03-19GPU: Added the render target (RT) registers to Maxwell3D's reg structure.Subv1-1/+32
2018-03-19Clang FixesN00byKing1-1/+2
2018-03-19Clean Warnings (?)N00byKing1-1/+1
2018-03-19GPU: Added the TSC registers to the Maxwell3D register structure.Subv1-1/+15
2018-03-19GPU: Added the TIC registers to the Maxwell3D register structure.Subv1-1/+16
2018-03-19GPU: Implement macro 0xE1A BindTextureInfoBuffer in HLE.Subv2-1/+29
This macro simply sets the current CB_ADDRESS to the texture buffer address for the input shader stage.
2018-03-18GPU: Implement the BindStorageBuffer macro method in HLE.Subv2-1/+36
This macro binds the SSBO Info Buffer as the current ConstBuffer. This buffer is usually bound to c0 during shader execution. Games seem to use this macro instead of directly writing the address for some reason.
2018-03-18GPU: Handle writes to the CB_DATA method.Subv2-0/+39
Writing to this method will cause the written value to be stored in the currently-set ConstBuffer plus CB_POS. This method is usually used to upload uniforms or other shader-visible data.
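A rough sketch of the write path described above, under stated assumptions: guest memory is modeled as a flat byte vector, `ConstBufferState` and `WriteCBData` are hypothetical names, and advancing the position by one word after each write is an assumption rather than something the message states.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-ins for the const buffer registers named in the message.
struct ConstBufferState {
    std::uint64_t cb_address = 0; // start of the currently-set ConstBuffer
    std::uint32_t cb_pos = 0;     // byte offset of the next write
};

// Stores `value` at cb_address + cb_pos inside a flat byte array that stands
// in for guest memory. Advancing cb_pos by one word is an assumption.
void WriteCBData(std::vector<std::uint8_t>& memory, ConstBufferState& cb,
                 std::uint32_t value) {
    const std::uint64_t target = cb.cb_address + cb.cb_pos;
    if (target + sizeof(value) > memory.size()) {
        return; // out of range in this toy model; ignore the write
    }
    std::memcpy(memory.data() + target, &value, sizeof(value));
    cb.cb_pos += static_cast<std::uint32_t>(sizeof(value));
}
```

Uploading several uniforms would then amount to calling `WriteCBData` once per 32-bit word.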
2018-03-18GPU: Move the GPU's class constructor and destructors to a cpp file.Subv3-10/+30
This should reduce recompile times when editing the Maxwell3D register structure.
2018-03-18GPU: Store uploaded GPU macros and keep track of the number of method parameters.Subv4-27/+74
2018-03-18GPU: Macros are specific to the Maxwell3D engine, so handle them internally.Subv8-63/+55
2018-03-18GPU: Renamed ShaderType to ShaderStage as that is less confusing.Subv2-19/+19
2018-03-18GPU: Store shader constbuffer bindings in the GPU state.Subv2-5/+61
2018-03-18GPU: Corrected some register offsets and removed superfluous macro registers.Subv1-9/+3
2018-03-18GPU: Make the SetShader macro call do the same as the real macro's code.Subv2-3/+44
It'll now set the CB_SIZE, CB_ADDRESS and CB_BIND registers when it's called. Presumably this SetShader function is binding the constant shader uniforms to buffer 1 (c1[]).
2018-03-17GPU: Corrected the parameter documentation for the SetShader macro call.Subv2-11/+12
Register 0xE24 is actually a macro that sets some shader parameters in the register structure. Macros are uploaded to the GPU at startup and have their own ISA; we'll probably write an interpreter for this in the future.
2018-03-17GPU: Handle the SetShader method call (0xE24) and store the shader config.Subv2-4/+38
2018-03-17GPU: Added the vertex array registers.Subv1-2/+33
2018-03-17GPU: Process command mode 5 (IncreaseOnce) differently from other commands.Subv9-8/+97
Accumulate all arguments before calling the desired method. Note: Maybe we should do the same for the NonIncreasing mode?
2018-03-17GPU: Assert that we get a 0 CODE_ADDRESS register in the 3D engine.Subv1-0/+8
Shader address calculation depends on this value to some extent; we do not currently know what it being 0 entails.
2018-03-17GPU: Added Maxwell registers for Shader Program control.Subv1-2/+55
2018-03-05GPU: Intercept writes to the VERTEX_END_GL register.Subv2-1/+18
This is the register that gets written after a game calls DrawArrays(). We should collect all GPU state and draw using our graphics API here.
2018-02-14maxwell_3d: Make constructor explicitLioncash1-1/+1
2018-02-12GPU: Partially implemented the QUERY_* registers in the Maxwell3D engine.Subv3-3/+95
Only QueryMode::Write is supported at the moment.
2018-02-12Make a GPU class in VideoCore to contain the GPU state.Subv12-44/+252
Also moved the GPU MemoryManager class to video_core since it makes more sense for it to be there.
2018-02-12GPU: Added a command processor to decode the GPU pushbuffers and forward the commands to their respective engines.Subv9-0/+280
2018-02-12renderer_opengl: Support framebuffer flip vertical.bunnei3-5/+13
2018-01-27memory: Replace all memory hooking with Special regionsMerryMage1-1/+1
2018-01-21Format: Run the new clang format on everythingJames Rowe4-4/+4
2018-01-18CMakeLists: Derive the source directory grouping from targets themselvesLioncash1-19/+15
Removes the need to store to separate SRC and HEADER variables, and then construct the target in most cases.
2018-01-16clang-formatMerryMage1-1/+2
2018-01-15renderer_gl: Clear screen to black before rendering framebuffer.bunnei2-5/+8
2018-01-15renderer: Render previous frame when no new one is available.bunnei3-16/+18
2018-01-13Fix build on macOS and linuxMerryMage1-0/+1
2018-01-13Remove gpu debugger and get yuzu qt to compileJames Rowe2-5/+0
2018-01-13Remove references to PICA and rasterizers in video_coreJames Rowe64-14952/+3
2018-01-12renderer_opengl: Fix LOG_TRACE in LoadFBToScreenInfo.bunnei1-1/+1
2018-01-11renderer_opengl: Support rendering Switch framebuffer.bunnei3-138/+83
2018-01-11render_base: Add a struct describing framebuffer metadata.bunnei1-0/+26
2018-01-11renderer_opengl: Add MortonCopyPixels function for Switch framebuffer.bunnei1-0/+111
2018-01-11renderer_opengl: Update DrawScreens for Switch.bunnei2-23/+11
2018-01-01core/video_core: Fix a bunch of u64 -> u32 warnings.bunnei4-8/+8
2017-10-15hle: Initial implementation of NX service framework and IPC.bunnei1-1/+1
2017-10-04Extracted the attribute setup and draw commands into their own functionsHuw Pascoe1-217/+222
2017-09-30Fixed type conversion ambiguityHuw Pascoe2-3/+3
2017-09-27Disable unary operator- on Math::Vec2/Vec3/Vec4 for unsigned types.Subv1-1/+1
It is unlikely we will ever use this without first doing a Cast to a signed type. Fixes 9 "unary minus operator applied to unsigned type, result still unsigned" warnings on MSVC2017.3
2017-09-25Optimized Float<M,E> multiplicationHuw Pascoe1-11/+7
Before:
    ucomiss xmm1, xmm1
    jp .L9
    pxor xmm2, xmm2
    mov edx, 1
    ucomiss xmm0, xmm2
    setp al
    cmovne eax, edx
    test al, al
    jne .L9
.L3:
    movaps xmm0, xmm2
    ret
.L9:
    ucomiss xmm0, xmm0
    jp .L10
    pxor xmm2, xmm2
    mov edx, 1
    ucomiss xmm1, xmm2
    setp al
    cmovne eax, edx
    test al, al
    je .L3
After:
    movaps xmm2, xmm1
    mulss xmm2, xmm0
    ucomiss xmm2, xmm2
    jnp .L3
    ucomiss xmm1, xmm0
    jnp .L11
.L3:
    movaps xmm0, xmm2
    ret
.L11:
    pxor xmm2, xmm2
    jmp .L3
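Read as source code, the "After" listing appears to correspond to something like the sketch below. It works on plain `float` rather than the `Float<M,E>` wrapper, and `SanitizedMul` is a hypothetical name: multiply first, and only when the product is NaN check whether either input was NaN.

```cpp
#include <cmath>

// Multiplies two floats. A NaN product whose inputs are both non-NaN
// (for example 0 * infinity) is replaced by zero; a NaN input still
// propagates as NaN.
float SanitizedMul(float a, float b) {
    const float result = a * b;
    if (std::isnan(result) && !std::isnan(a) && !std::isnan(b)) {
        return 0.0f;
    }
    return result;
}
```

The common case (a non-NaN product) takes a single multiply and one comparison, which is consistent with the shorter instruction sequence.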
2017-09-24Optimized MortonHuw Pascoe1-10/+4
2017-09-23Remove pipeline.gpu_mode and fix minor issuesJames Rowe1-12/+2
2017-09-17Improved performance of FromAttributeBufferHuw Pascoe1-1/+2
Ternary operator is optimized by the compiler whereas std::min() is meant to return a value. I've noticed a 5%-10% emulation speed increase.
2017-09-17Fixed framebuffer warningHuw Pascoe1-7/+18
2017-09-11GPU: Add draw for immediate and batch modesJames Rowe1-2/+17
PR #1461 introduced a regression where some games would change configuration even while in the poorly named "drawing" mode, which broke the heuristic citra was using to determine when to draw the batch. This change adds back in a draw call for batching, and also adds in a draw call in immediate mode each time it adds a triangle.
2017-09-03pica/lighting: only apply Fresnel factor for the last lightwwylele2-7/+9
2017-08-31video_core: report telemetry for gas modewwylele1-0/+6
2017-08-26Warnings: Fixed a few missing-return warnings in video_core.Subv3-6/+10
2017-08-25SwRasterizer/Clipper: flip the sign convention to match PICA and OpenGLwwylele1-9/+9
2017-08-25gl_rasterizer: implement custom clip planewwylele3-34/+83
2017-08-24SwRasterizer: implement custom clip planewwylele2-4/+25
2017-08-22gl_rasterizer/lighting: more accurate CP formulawwylele1-2/+2
2017-08-22SwRasterizer/Lighting: implement LUT input CPwwylele1-0/+11
2017-08-22SwRasterizer/Lighting: implement bump mappingwwylele3-8/+27
2017-08-21swrasterizer: remove invalid TODOwwylele1-4/+2
This function is called in clipping, before the perspective divide, and is not used in later rasterization. Thus it doesn't need perspective correction.
2017-08-21swrasterizer/clipper: remove tested TODOwwylele1-4/+0
hwtested. Current implementation is the correct behavior
2017-08-21gl_shader_gen: simplify and clarify the depth transformation between vertex shader and fragment shaderwwylele1-2/+5
2017-08-21gl_rasterizer: add clipping plane z<=0 defined in PICAwwylele4-0/+21
2017-08-19pica/command_processor: build geometry pipeline and run geometry shaderwwylele6-28/+383
The geometry pipeline manages data transfer between VS, GS and the primitive assembler. It has four known modes:
- no GS mode: sends VS output directly to the primitive assembler (what citra currently does)
- GS mode 0: sends VS output to GS input registers, and sends GS output to the primitive assembler
- GS mode 1: sends VS output to GS uniform registers, and sends GS output to the primitive assembler. It also takes an index from the index buffer at the beginning of each primitive to determine the primitive size.
- GS mode 2: similar to mode 1, but doesn't take the index and uses a fixed primitive size.

hwtest shows that immediate mode also supports GS (at least for mode 0), so the geometry pipeline gets refactored into its own class to support both drawing modes.

In the immediate mode, some games don't set the pipeline registers to a valid value until the first attribute input, so a geometry pipeline reset flag is set in the `pipeline.vs_default_attributes_setup.index` trigger, and the actual pipeline reconfiguration is triggered on the first attribute input.

In the normal drawing mode with an index buffer, the vertex cache is slightly modified to support the geometry pipeline. Instead of OutputVertex, it now holds AttributeBuffer, which is the input to the geometry pipeline. The AttributeBuffer->OutputVertex conversion is done inside the pipeline vertex handler. The actual hardware vertex cache is believed to be implemented in a similar way (because this is the only way that makes sense).

Both the geometry pipeline and the GS unit rely on state preservation across draw calls, so they are put into the global state. In the future, the other three vertex shader units should also be placed in the global state, and a scheduler should be implemented on top of the four units. Note that the current gs_unit already allows running VS on it in the future.
2017-08-19pica/shader/jit: implement SETEMIT and EMITwwylele2-2/+49
2017-08-19pica/primitive_assembly: Handle winding for GS primitivewwylele2-3/+19
hwtest shows that, although GS always emits a group of three vertices as one primitive, it still respects the topology type, as if the three vertices were input into the primitive assembler independently and sequentially. It is also shown that the winding flag in SETEMIT only takes effect for the Shader topology type, which is believed to be the actual difference between List and Shader (hence the TODO was removed). However, only the Shader topology type is observed in official games when GS is in use, so the other modes seem to be just unintended usage.
2017-08-19correct constnesswwylele2-2/+4
2017-08-19pica/shader/interpreter: implement SETEMIT and EMITwwylele1-0/+16
2017-08-19pica/shader: extend UnitState for GSwwylele2-0/+84
Among the four shader units in pica, a special unit can be configured to run both VS and GS programs. GSUnitState represents this unit, which extends UnitState (which represents the other three normal units) with extra state for primitive emitting. It uses lots of raw pointers to represent internal structure in order to keep it a standard-layout type for the JIT to access. This unit doesn't handle triangle winding (inverting) itself; instead, it calls a WindingSetter handler. This will be explained in the following commits.
2017-08-11gl_shader_gen: don't call SampleTexture when bump map is not usedwwylele1-4/+5
2017-08-11SwRasterizer/Lighting: implement spot lightwwylele1-3/+19
2017-08-11SwRasterizer/Lighting: implement geometric factorwwylele1-4/+16
2017-08-10SwRasterizer/Lighting: use make_tuple instead of constructorwwylele1-1/+1
implicit tuple constructor is a c++17 thing, which is not supported by some not-so-old libraries. Play safe for now
2017-08-10pica/regs: layout geometry shader configuration regswwylele2-2/+39
All the register meanings are derived from ctrulib (3dbrew is outdated for most of them)
2017-08-07pica: upload shared shader code to both unitwwylele2-26/+45
2017-08-03SwRasterizer/Lighting: shorten file namewwylele4-4/+4
2017-08-02SwRasterizer/Lighting: move to its own filewwylele4-240/+271
2017-08-02SwRasterizer/Lighting: reduce confusionwwylele1-1/+1
2017-08-02SwRasterizer/Lighting: move quaternion normalization to the callerwwylele1-3/+3
2017-07-27pica/shader_interpreter: fix off-by-one in LOOPwwylele1-1/+1
2017-07-18telemetry: Log performance, configuration, and system data.bunnei2-6/+16
2017-07-11SwRasterizer/Lighting: dist atten lut input need to be clampwwylele1-1/+1
2017-07-11SwRasterizer/Lighting: unify float suffixwwylele1-11/+13
2017-07-11SwRasterizer/Lighting: get rid of nested returnwwylele1-10/+11
2017-07-11SwRasterizer/Lighting: refactor GetLutValue into a function.wwylele1-83/+27
merging similar pattern. Also makes the code more similar to the gl one
2017-07-11SwRasterizer: only interpolate quat and view when lighting is enabledwwylele1-14/+14
2017-07-11SwRasterizer/Lighting: pass lighting state as parameterwwylele1-13/+13
2017-07-11SwRasterizer/Lighting: Move the clamp highlight calculation to the end of the per-light loop body.Subv1-17/+17
2017-07-11SwRasterizer/Lighting: Move the lighting enable check outside the ComputeFragmentsColors function.Subv1-7/+6
2017-07-11SwRasterizer/Lighting: Do not use global registers state in ComputeFragmentsColors.Subv1-3/+3
2017-07-11SwRasterizer/Lighting: Do not use global state in LookupLightingLut.Subv2-13/+22
2017-07-11SwRasterizer/Lighting: Fixed a bug where the distance attenuation bias was being set to the dist atten scale.Subv1-3/+2
2017-07-11SwRasterizer: Fixed a few conversion warnings and moved per-light values into the per-light loop.Subv1-5/+6
2017-07-11SwRasterizer: Run clang-formatSubv1-45/+83
2017-07-11SwRasterizer: Flip the vertex quaternions before clipping (if necessary).Subv2-20/+15
2017-07-11SwRasterizer: Corrected the light LUT lookups.Subv1-6/+7
2017-07-11SwRasterizer: Corrected the light LUT lookups.Subv1-33/+43
2017-07-11SwRasterizer: Fixed the lighting lut lookup function.Subv1-2/+4
2017-07-11SwRasterizer: Calculate fresnel for fragment lighting.Subv1-1/+25
2017-07-11SwRasterizer: Calculate specular_1 for fragment lighting.Subv1-3/+59
2017-07-11SwRasterizer: Calculate specular_0 for fragment lighting.Subv1-13/+94
2017-07-11SwRasterizer: Implement primary fragment color.Subv1-4/+113
2017-07-01gl_rasterizer: use texture buffer for proctex LUTwwylele5-70/+80
2017-06-22gl_rasterizer: use texture buffer for fog LUTwwylele7-29/+32
2017-06-22gl_rasterizer: create the texture before applying the statewwylele1-2/+2
This is a rebasing error from #2792. It doesn't affect much though, because a later Apply() call fixes/hides it.
2017-06-21gl_state: reset 1d textureswwylele1-0/+14
2017-06-21gl_rasterizer: fix glGetUniformLocation typewwylele1-8/+8
2017-06-21gl_rasterizer: manage texture ids in one placewwylele3-31/+55
2017-06-21gl_rasterizer/lighting: fix LUT interpolationwwylele7-116/+102
2017-06-18gl_rasterizer/lighting: use the formula from the paper for geometric factorwwylele1-8/+8
2017-06-17Stop using reserved operator names (and/or/xor) with XbyakYuri Kunde Schlesner1-13/+13
Also has the Dynarmic upgrade with the same change
2017-06-15gl_rasterizer/lighting: implement geometric factorwwylele3-1/+20
2017-06-11gl_rasterizer/lighting: Implement tangent mappingwwylele1-7/+12
2017-06-11gl_rasterizer/lighting: implement lut input 5 (CP)wwylele2-3/+26
2017-06-10gl_rasterizer_cache: depth write is disabled if allow_depth_stencil_write is falsewwylele1-4/+5
2017-06-10OpenGL: Update comment on AreQuaternionsOpposite with new informationYuri Kunde Schlesner1-8/+11
While debugging the software renderer implementation, it was noticed that this is actually exactly what the hardware does, upgrading the status of this "hack" to being a proper implementation. And there was much rejoicing.
2017-06-04pica/rasterizer: implement/stub texture wrap mode 4-7wwylele4-12/+48
2017-05-30gl_rasterizer: implement spot lightwwylele1-6/+24
2017-05-30gl_rasterizer: sync spot light statuswwylele4-2/+61
2017-05-30pica: prepare registers for spotlightwwylele1-20/+43
2017-05-29swrasterizer: implement TextureCubewwylele1-2/+51
2017-05-29pica: add registers for texture cubewwylele1-1/+26
2017-05-28CMake: Create INTERFACE targets for microprofile and nihstroYuri Kunde Schlesner1-1/+1
2017-05-28CMake: Use IMPORTED target for libpngYuri Kunde Schlesner1-3/+2
2017-05-28CMake: Correct inter-module dependencies and library visibilityYuri Kunde Schlesner1-5/+7
Modules didn't correctly define their dependencies before, which relied on the frontends implicitly including every module for linking to succeed. Also changed every target_link_libraries call to specify visibility of dependencies to avoid leaking definitions to dependents when not necessary.
2017-05-28Move screen size constants from video_core to coreYuri Kunde Schlesner2-27/+8
video_core didn't even properly use them, and they were the source of many otherwise-unnecessary dependencies from core to video_core.
2017-05-28OpenGL: Remove unused RendererOpenGL fieldsYuri Kunde Schlesner2-11/+2
2017-05-27OpenGL: Improve accuracy of quaternion interpolationYuri Kunde Schlesner1-3/+5
Current order of operations (rotate then normalize) seems to produce a lot more distortion than normalizing and then rotating. This makes Citra results match pretty closely with hardware, and indicates that hardware may also be using lerp instead of slerp to interpolate the quaternions.
2017-05-27gl_shader: refactor texture sampler into its own functionwwylele1-40/+39
2017-05-21swrasterizer: add missing tc0_w and fragment lighting attribute processingwwylele2-5/+8
2017-05-20gl_rasterizer: implement procedural texturewwylele6-7/+600
2017-05-20pica/swrasterizer: implement procedural texturewwylele8-4/+438
2017-05-17pica: use correct register value for shader bool_uniformswwylele1-2/+2
The variable value is not masked. The masked and combined register value should be used instead.
2017-05-16pica: correct bit field length for some registerswwylele4-17/+25
2017-05-12Pica: Write GS registersJannik Vogel1-0/+52
This adds the handlers for the geometry shader register writes which will call the functions from the previous commit to update registers for the GS.
2017-05-12Pica: Write shader registers in functionsJannik Vogel1-57/+103
The commit after this one adds GS register writes, so this moves the VS handlers into functions so they can be re-used and extended more easily.
2017-05-11Pica: Set program code / swizzle data limit to 4096Jannik Vogel5-13/+16
One of the later commits will enable writing to GS regs. It turns out that on startup, most games will write 4096 GS program words. The current limit of 1024 would hence result in 3072 (4096 - 1024) error messages:
```
HW.GPU <Error> video_core/shader/shader.cpp:WriteProgramCode:229: Invalid GS program offset 1024
```
New constants have been introduced to represent these limits. The swizzle data size has also been raised. This matches the given field sizes of [GPUREG_SH_OPDESCS_INDEX](https://3dbrew.org/wiki/GPU/Internal_Registers#GPUREG_SH_OPDESCS_INDEX) and [GPUREG_SH_CODETRANSFER_INDEX](https://www.3dbrew.org/wiki/GPU/Internal_Registers#GPUREG_SH_CODETRANSFER_INDEX) (12 bit = [0; 4095]).
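The arithmetic behind the new limit can be sketched as below. The constant and function names are hypothetical (the message does not name the constants it introduces); only the value 4096 and the 12-bit index width come from the message.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical names; the values follow from the 12-bit index fields.
constexpr std::size_t kIndexBits = 12;
constexpr std::size_t kMaxProgramCodeLength = std::size_t{1} << kIndexBits;
constexpr std::size_t kMaxSwizzleDataLength = std::size_t{1} << kIndexBits;

// Returns true when a write at `offset` fits inside the program code array.
constexpr bool IsValidProgramOffset(std::uint32_t offset) {
    return offset < kMaxProgramCodeLength;
}

// A 12-bit index addresses offsets 0 through 4095, i.e. 4096 words.
static_assert(kMaxProgramCodeLength == 4096, "12-bit index covers [0; 4095]");
static_assert(kMaxSwizzleDataLength == 4096, "12-bit index covers [0; 4095]");
```

Under the old limit of 1024, every offset from 1024 through 4095 would have been rejected, which accounts for the 3072 error messages.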
2017-05-05pica: shader_dirty if texture2 coord changedwwylele5-7/+12
2017-05-03pica: use correct coordinates for texture 2wwylele4-5/+22
2017-04-20gl_shader_gen: remove TODO about Lerp behaviour verification. The implementation is verified against hardwarewwylele1-2/+0
2017-04-19rasterizer: implement combiner operation 7 (Dot3_RGBA)wwylele4-20/+39
2017-04-17OpenGL: Pass Pica regs via parameterYuri Kunde Schlesner3-7/+5
2017-04-17OpenGL: Move PicaShaderConfig to gl_shader_gen.hYuri Kunde Schlesner4-202/+206
Also move the implementation of CurrentConfig to the cpp file.
2017-04-17OpenGL: Move Attributes enum to a more appropriate fileYuri Kunde Schlesner3-12/+11
2017-04-08Pica/Regs: Correct bit width for blend-equationsJannik Vogel1-2/+2
2017-03-01Input: remove unused stuff & clean upwwylele1-0/+1
1. removed zl, zr and c-stick from HID::PadState. They are handled by IR, not HID
2. removed button handling in EmuWindow
3. removed key_map
4. cleanup #include
2017-02-27Doxygen: Amend minor issues (#2593)Mat M3-3/+5
Corrects a few issues with regards to Doxygen documentation, for example:
- Incorrect parameter referencing.
- Missing @param tags.
- Typos in @param tags.

and a few minor other issues.
2017-02-27Core: Re-write frame limiterYuri Kunde Schlesner1-3/+3
Now based on std::chrono, and also works in terms of emulated time instead of frames, so we can in the future frame-limit even when the display is disabled, etc. The frame limiter can also be enabled along with v-sync now, which should be useful for those with displays running at more than 60 Hz.
2017-02-27Core: Make PerfStats internally lockedYuri Kunde Schlesner1-8/+2
More ergonomic to use and will be required for upcoming changes.
2017-02-27Remove built-in (non-Microprofile) profilerYuri Kunde Schlesner1-8/+0
2017-02-27Add performance statistics to status barYuri Kunde Schlesner1-0/+9
2017-02-18OpenGL: Check if uniform block exists before updating it (#2581)Jannik Vogel1-29/+30
2017-02-15video_core: remove #pragma once in cpp file (#2570)Weiyi Wang2-4/+0
2017-02-13SWRasterizer: Move more framebuffer functions to fileYuri Kunde Schlesner3-100/+105
2017-02-13SWRasterizer: Move texturing functions to their own fileYuri Kunde Schlesner4-210/+259
2017-02-13SWRasterizer: Convert large no-capture lambdas to standalone functionsYuri Kunde Schlesner1-315/+310
2017-02-13SWRasterizer: Move framebuffer operation functions to their own fileYuri Kunde Schlesner4-236/+285
2017-02-13VideoCore: Move software rasterizer files to sub-directoryYuri Kunde Schlesner8-12/+12
2017-02-12video_core/shader: Document sanitized MUL operationYuri Kunde Schlesner1-0/+8
2017-02-11video_core: Fix benign out-of-bounds indexing of array (#2553)Yuri Kunde Schlesner1-2/+1
The resulting pointer wasn't written to unless the index was verified as valid, but that's still UB and triggered debug checks in MSVC. Reported by garrettboast on IRC
2017-02-09VideoCore: Split u64 Pica reg unions into 2 separate u32 unionsYuri Kunde Schlesner1-36/+42
This eliminates UB when aliasing it with the array of u32 regs, and is compatible with non-LE architectures.
2017-02-09VideoCore: Force enum sizes to u32 in LightingRegsYuri Kunde Schlesner1-4/+4
All enums that are used with BitField must have their type forced to u32 to ensure correctness.
2017-02-09OpenGL: Remove unused duplicate of IsPassThroughTevStageYuri Kunde Schlesner1-12/+0
This copy was left behind when the shader generation code was moved to a separate file.
2017-02-09VideoCore: Split regs.h inclusionsYuri Kunde Schlesner13-24/+45
2017-02-09Pica/Regs: Use binary search to look up reg namesYuri Kunde Schlesner2-15/+10
This gets rid of the static unordered_map. Also changes the return type to const char*, avoiding unnecessary allocations (the result was only used by calling .c_str() on it.)
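A minimal sketch of a sorted-table lookup of this kind, using `std::lower_bound`. The table entries, the `RegName` struct and `GetRegisterName` are illustrative placeholders, not the real Pica register list or the function changed by this commit.

```cpp
#include <algorithm>
#include <cstdint>
#include <iterator>

struct RegName {
    std::uint16_t id;
    const char* name;
};

// Illustrative entries only; the table must stay sorted by id for the
// binary search to be valid.
constexpr RegName kRegNames[] = {
    {0x010, "reg_a"},
    {0x040, "reg_b"},
    {0x1A0, "reg_c"},
};

// Returns the register name, or nullptr when the id is not in the table.
// Returning const char* avoids building a std::string for each lookup.
const char* GetRegisterName(std::uint16_t id) {
    const auto* it = std::lower_bound(
        std::begin(kRegNames), std::end(kRegNames), id,
        [](const RegName& entry, std::uint16_t key) { return entry.id < key; });
    if (it == std::end(kRegNames) || it->id != id) {
        return nullptr;
    }
    return it->name;
}
```

Because the table is a constant array, no map has to be constructed at startup, and each lookup costs a logarithmic number of comparisons.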
2017-02-09VideoCore: Use union to index into Regs structYuri Kunde Schlesner2-46/+28
Also remove some unused members.
2017-02-05Use std::array<u8,2> instead of u8[2] to fix MSVC buildLectem1-1/+1
2017-02-04VideoCore: Move Regs to its own fileYuri Kunde Schlesner22-658/+679
2017-02-04VideoCore: Split shader regs from Regs structYuri Kunde Schlesner9-102/+116
2017-02-04VideoCore: Split geometry pipeline regs from Regs structYuri Kunde Schlesner9-264/+292
2017-02-04VideoCore: Split lighting regs from Regs structYuri Kunde Schlesner6-312/+341
2017-02-04VideoCore: Split framebuffer regs from Regs structYuri Kunde Schlesner10-445/+491
2017-02-04VideoCore: Split texturing regs from Regs structYuri Kunde Schlesner15-494/+532
2017-02-04VideoCore: Split rasterizer regs from Regs structYuri Kunde Schlesner13-187/+218
2017-02-04Pica/Texture: Move part of ETC1 decoding to new file and cleanupsYuri Kunde Schlesner4-110/+159
2017-02-04Pica/Texture: Simplify/cleanup texture tile addressingYuri Kunde Schlesner4-37/+111
2017-02-04VideoCore: Move LookupTexture out of debug_utils.hYuri Kunde Schlesner7-301/+340
2017-02-03ShaderJIT: add 16 dummy bytes at the bottom of the stackwwylele1-2/+5
2017-01-31Common/x64: remove legacy emitter and abi (#2504)Weiyi Wang1-1/+0
These are not used any more since we moved shader JIT to xbyak.
2017-01-31shader_jit_x64_compiler: esi and edi should be persistent (#2500)Merry1-0/+2
2017-01-30VideoCore: Make PrimitiveAssembler const-correctYuri Kunde Schlesner2-3/+4
2017-01-30VideoCore: Extract swrast-specific data from OutputVertexYuri Kunde Schlesner5-58/+64
2017-01-30VideoCore/Shader: Clean up OutputVertex::FromAttributeBufferYuri Kunde Schlesner2-10/+16
This also fixes a long-standing but nevertheless harmless memory corruption bug, where the padding of the OutputVertex struct would get corrupted by unused attributes.
2017-01-30VideoCore: Split shader output writing from semantic loadingYuri Kunde Schlesner3-24/+24
2017-01-30VideoCore: Consistently use shader configuration to load attributesYuri Kunde Schlesner6-44/+23
2017-01-30VideoCore: Use correct register for immediate mode attribute countYuri Kunde Schlesner2-7/+13
2017-01-30VideoCore: Rename some types to more accurate namesYuri Kunde Schlesner8-18/+18
2017-01-30VideoCore: Change misleading register namesYuri Kunde Schlesner4-8/+9
A few registers had names such as "count" or "number" when they actually contained the maximum (that is, count - 1). This can easily lead to hard-to-notice off-by-one errors.
2017-01-30video_core: gl_rasterizer_cache.cpp removed unused type aliasKloen1-1/+0
2017-01-30video_core: gl_rasterizer.cpp removed unused type aliasKloen1-2/+0
2017-01-29video_core: silence unused-local-typedef boost related warning on GCCKloen1-0/+7
2017-01-26VideoCore/Shader: Move entry_point to SetupBatchYuri Kunde Schlesner6-26/+27
2017-01-26VideoCore/Shader: Move per-batch ShaderEngine state into ShaderSetupYuri Kunde Schlesner6-44/+40
2017-01-26Shader: Remove OutputRegisters structYuri Kunde Schlesner4-22/+17
2017-01-26Shader: Initialize conditional_code in interpreterYuri Kunde Schlesner2-3/+3
This doesn't belong in LoadInputVertex because it also happens for non-VS invocations. Since it's not used by the JIT it seems adequate to initialize it in the interpreter, which is the only thing that cares about it.
2017-01-26Shader: Don't read ShaderSetup from global stateYuri Kunde Schlesner1-3/+3
2017-01-26shader_jit_x64: Don't read program from global stateYuri Kunde Schlesner3-22/+22
2017-01-26VideoCore/Shader: Move ProduceDebugInfo to InterpreterEngineYuri Kunde Schlesner4-19/+10
2017-01-26VideoCore/Shader: Split interpreter and JIT into separate ShaderEnginesYuri Kunde Schlesner8-97/+153
2017-01-26VideoCore/Shader: Rename shader_jit_x64{ => _compiler}.{cpp,h}Yuri Kunde Schlesner4-4/+4
2017-01-26VideoCore/Shader: Split shader uniform state and shader engineYuri Kunde Schlesner4-21/+54
Currently there's only a single dummy implementation, which will be split in a following commit.
2017-01-26VideoCore/Shader: Add constness to methodsYuri Kunde Schlesner2-4/+4
2017-01-26VideoCore/Shader: Use only entry_point as ShaderSetup paramYuri Kunde Schlesner3-11/+13
This removes all implicit dependency of ShaderState on global PICA state.
2017-01-26VideoCore/Shader: Use self instead of g_state.vs in ShaderSetupYuri Kunde Schlesner2-11/+8
2017-01-26VideoCore/Shader: Extract input vertex loading code into functionYuri Kunde Schlesner3-22/+26
2017-01-23video_core: fix shader.cpp signed / unsigned warningKloen1-2/+2
2017-01-23video_core: gl_rasterizer float to int warningKloen1-1/+2
2017-01-23video_core: fix gl_rasterizer warning on MSVCKloen1-1/+1
2017-01-07config: Add option for specifying screen resolution scale factor.bunnei3-5/+10
2017-01-04Fix some warnings (#2399)Jonathan Hao1-2/+0
2016-12-25Minor cleanup in GLSL codeJannik Vogel1-3/+2
2016-12-25Offset lighting LUT samples correctlyJannik Vogel1-7/+7
2016-12-23core: Move emu_window and key_map into coreMerryMage2-2/+2
* Removes circular dependences (common should not depend on core)
2016-12-19Use GL_TRUE when setting color_maskAlbin Bernhardsson1-4/+4
2016-12-16VideoCore/Shader: Extract DebugData out from UnitStateYuri Kunde Schlesner8-103/+99
2016-12-16Remove unnecessary castYuri Kunde Schlesner1-3/+1
2016-12-16VideoCore/Shader: Extract evaluate_condition lambda to function scopeYuri Kunde Schlesner1-26/+24
2016-12-16VideoCore/Shader: Extract call lambda up a scope and remove unused paramYuri Kunde Schlesner1-21/+17
2016-12-16VideoCore/Shader: Remove dynamic control flow in (Get)UniformOffsetYuri Kunde Schlesner2-18/+11
2016-12-16VideoCore/Shader: Move DebugData to a separate fileYuri Kunde Schlesner4-172/+189
2016-12-15shader_jit_x64: Use LOOPCOUNT_REG as a 64-bit reg when indexingYuri Kunde Schlesner1-1/+1
2016-12-15VideoCore: Make profiling scope more representativeYuri Kunde Schlesner2-0/+15
2016-12-15VideoCore: Inline IsPicaTracingYuri Kunde Schlesner3-16/+15
Speeds up ALBW main menu slightly (~3%)
2016-12-15VideoCore: Eliminate an unnecessary copy in the drawcall loopYuri Kunde Schlesner3-5/+3
2016-12-15shader_jit_x64: Use Reg32 for LOOP* registers, eliminating castsYuri Kunde Schlesner1-16/+16
2016-12-15VideoCore: Convert x64 shader JIT to use Xbyak for assemblyYuri Kunde Schlesner3-223/+228
2016-12-11Add all services to the Service namespaceLioncash2-6/+7
Previously there was a split where some of the services were in the Service namespace and others were not.
2016-12-07OpenGL: Drop framebuffer completeness check.Markus Wick5-47/+8
This OpenGL call synchronizes the worker thread of the nvidia blob. It can be verified on linux with the __GL_THREADED_OPTIMIZATIONS=1 environment variable. Those errors should not happen on tested drivers. It was used as a workaround for https://bugs.freedesktop.org/show_bug.cgi?id=94148
2016-12-06Implement Frame rate limiter (#2223)emmauss2-0/+2
* implement frame limiter * fixes
2016-12-05ASSERT that shader was linked successfullyJannik Vogel1-0/+2
2016-12-05Report shader uniform block size in case of mismatchJannik Vogel1-1/+3
2016-12-05Print broken shader code to logJannik Vogel1-3/+9
2016-12-04OpenGL: Non-zero stride only makes sense for linear buffersYuri Kunde Schlesner3-7/+11
2016-12-04OpenGL: Ensure framebuffer binding is restored if completion check failsYuri Kunde Schlesner1-10/+7
2016-12-04OpenGL: Fix DisplayTransfer accel when input width != output widthYuri Kunde Schlesner1-1/+10
Fixes #2246, #2261
2016-12-04shader_jit: Fix non-SSE4.1 path where FLR would not truncateJannik Vogel1-1/+1
2016-12-03clang-format: Fix coding styleYuri Kunde Schlesner1-1/+1
2016-12-02shader_jit: Load LOOPCOUNT_REG and LOOPINC 4 bit left-shiftedJannik Vogel1-6/+9
2016-11-30ClangFormat: Fixed the clang-format errorsSubv2-6/+10
2016-11-29Build: Fixed a few warnings.Subv2-7/+7
2016-11-27GPU: Remove the broken frame_skip option.Emmanuel Gil Peyrot1-4/+0
Fixes #1960.
2016-11-27RasterizerGL: Use GL_TRUE and 0xFF in the stencil and depth masks instead of simply true and -1Subv2-4/+4
2016-11-27Rasterizer/Memfill: Set the correct stencil write mask when clearing the stencil buffer.Subv1-1/+1
2016-11-24Cache Vertices instead of Output registers (#2165)jphalimi1-6/+7
This patch brings +3% performance improvement on average. It removes ToVertex() as an important hotspot of the emulator.
2016-11-22Fix format error from #2195wwylele1-1/+1
2016-11-20GPU/CiTrace: Avoid calling GetTextures() when not necessary.Subv1-6/+5
2016-11-19Minor formatting changeJames Rowe1-1/+1
2016-11-05Add default hotkey to swap primary screens.James Rowe1-3/+2
Also minor style changes
2016-11-05Support additional screen layouts.James Rowe1-6/+12
Allows users to choose a single screen layout or a large screen layout. Adds a configuration option to change the prominent screen.
2016-10-20Fix typosRicardo de Almeida Gonzaga1-1/+1
2016-09-30VideoCore: Shader interpreter cleanupsYuri Kunde Schlesner1-32/+42
2016-09-30VideoCore: Fix out-of-bounds read in ShaderSetup::ProduceDebugInfoYuri Kunde Schlesner1-3/+1
As far as I can tell, memset was replaced by a fill without correcting the parameter type, causing an out-of-bounds array read in the Vec4 constructor.
2016-09-30OpenGL: Take cached viewport sub-rect into account for scissorYuri Kunde Schlesner3-29/+25
Fixes #1938
2016-09-29rasterizer: separate TextureCopy from DisplayTransferwwylele3-6/+12
2016-09-21Remove special rules for Windows.h and library includesYuri Kunde Schlesner1-1/+1
2016-09-21Use negative priorities to avoid special-casing the self-includeYuri Kunde Schlesner18-18/+18
2016-09-21Remove empty newlines in #include blocks.Emmanuel Gil Peyrot35-105/+17
This makes clang-format useful on those. Also add a bunch of forgotten transitive includes, which otherwise prevented compilation.
2016-09-19Manually tweak source formatting and then re-run clang-formatYuri Kunde Schlesner23-125/+119
2016-09-18Sources: Run clang-format on everything.Emmanuel Gil Peyrot42-2532/+2943
2016-09-16VideoCore: Fix dangling lambda context in shader interpreterYuri Kunde Schlesner1-1/+1
The static meant that after the first execution, these lambda contexts would be pointing to a random location on the stack. Fixes a random crash when using the interpreter.
2016-08-30OpenGL: Avoid error on unsupported lighting LUTJannik Vogel1-0/+1
2016-08-30config: Add a setting for graphics V-Sync.bunnei1-0/+1
2016-06-28OpenGL: Add scaled resolution support to scissorYuri Kunde Schlesner4-3/+16
2016-06-28PICA: Scissor fixes and cleanupsYuri Kunde Schlesner5-45/+39
2016-06-28PICA: Implement scissor testSubv5-3/+105
2016-06-25Remove superfluous std::move in return std::move(local_var)scurest1-1/+1
2016-06-07OpenGL: Implement fogJannik Vogel5-7/+124
2016-06-07Rasterizer: Implement fogJannik Vogel1-21/+52
2016-06-07Pica: Add fog stateJannik Vogel3-14/+69
2016-06-07OpenGL: Avoid undefined behaviour for UNIFORM_BLOCK_DATA_SIZEJannik Vogel2-6/+8
2016-06-01gsp::gpu: Reset g_thread_id in UnregisterInterruptRelayQueuemailwl1-1/+1
2016-05-23OpenGL: Set shader_dirty on lighting changesJannik Vogel1-0/+23
2016-05-23Pica: Name LightSrc.config registerJannik Vogel2-17/+15
2016-05-23Pica: Name lighting.config0 and .config1 registersJannik Vogel2-18/+18
2016-05-23OpenGL: Use uniforms for dist_atten_bias and dist_atten_scaleJannik Vogel3-8/+84
2016-05-21Refactor Tev stage dumperJannik Vogel2-115/+114
2016-05-21Extend Tev stage dumperJannik Vogel1-14/+38
2016-05-16Retrieve shader result from new OutputRegisters-typeJannik Vogel4-64/+81
2016-05-14OpenGL: Only update depth uniforms if the depth changedJannik Vogel2-9/+22
2016-05-14OpenGL: value-initialize variables which cause uninitialised access otherwiseJannik Vogel1-2/+2
2016-05-13Use new shader-jit signature for interpreterJannik Vogel3-8/+8
2016-05-13Refactor access to state in shader-jitJannik Vogel4-24/+42
2016-05-12OpenGL: Support blend equationJannik Vogel4-0/+31
2016-05-12Move program_counter and call_stack from UnitState to interpreterJannik Vogel3-45/+42
2016-05-12Move default_attributes into Pica stateJannik Vogel4-4/+4
2016-05-11Turn ShaderSetup into structJannik Vogel4-57/+58
2016-05-11OpenGL: Implement texture type 3Jannik Vogel4-35/+67
2016-05-11Rasterizer: Implement texture type 3Jannik Vogel1-2/+27
2016-05-11Pica: Add tc0.w to OutputVertexJannik Vogel1-1/+2
2016-05-11Pica: Add texture type to stateJannik Vogel1-0/+10
2016-05-10gl_rasterizer: Fix compilation for debug buildsLioncash1-1/+1
2016-05-10OpenGL: Implement W-Buffers and fix depth-mappingJannik Vogel3-4/+23
2016-05-10Pica: Implement W-Buffer in SW rasterizerJannik Vogel4-11/+43
2016-05-09vertex_loader: Correct forward declaration of InputVertexLioncash1-1/+1
It's actually a struct, not a class.
2016-05-09vertex_loader: Provide an assertion for ensuring the loader has been setupLioncash2-0/+7
Also adds an assert to ensure that Setup is not called more than once during a VertexLoader's lifetime.
2016-05-09vertex_loader: Add constructors to facilitate immediate and two-step initializationLioncash2-2/+6
2016-05-09vertex_loader: initialize_num_total_attributes.Lioncash1-1/+1
Keeps the public API sane.
2016-05-09vertex_loader: Use std::array instead of raw C arraysLioncash1-6/+7
2016-05-09vertex_loader: Correct header orderingLioncash1-1/+1
2016-05-07fixup simple type conversions where possibleAlexander Laties4-7/+8
2016-05-06Frontends, VideoCore: Move glad initialisation to the frontendEmmanuel Gil Peyrot1-6/+0
On SDL2 this allows it to use SDL_GL_GetProcAddress() instead of the default function loader, and fixes a crash when using apitrace with an EGL context. On Qt we will need to migrate from QGLWidget to QOpenGLWidget and QOpenGLContext before we can use gladLoadGLLoader() instead of gladLoadGL(), since the former doesn’t expose a function loader.
2016-05-04Pica: Rename VertexLoaded breakpoint to VertexShaderInvocationJannik Vogel2-7/+5
2016-05-03Pica: Use a union for PicaShaderConfigJannik Vogel3-125/+139
2016-05-03Pica: Add TevStageConfigRaw to PicaShaderConfig (MSVC workaround)Jannik Vogel2-2/+23
2016-05-03Pica: Make PicaShaderConfig trivially_copyable and clear it before useJannik Vogel1-21/+28
2016-05-03OpenGL: Don't copy const_color (Reverts #1745)Jannik Vogel1-2/+3
2016-05-03Pica: Replace logic in shader.cpp with loopJannik Vogel1-34/+4
2016-05-01OpenGL: Copy TevStageConfig using a loop. Fixes bug: const_color not copiedJannik Vogel1-30/+11
2016-04-30OpenGL: border_color was never set. Fixed. (#1740)Jannik Vogel1-0/+1
2016-04-30VideoCore: Run include-what-you-use and fix most includes.Emmanuel Gil Peyrot34-79/+212
2016-04-30Remove TGA dumperJannik Vogel3-62/+0
2016-04-29Common: Remove section measurement from profiler (#1731)Yuri Kunde Schlesner4-11/+0
This has been entirely superseded by MicroProfile. The rest of the code can go when a simpler frametime/FPS meter is added to the GUI.
2016-04-29Move and rename the MemoryAccesses class to MemoryAccessTracker.Henrik Rydgard4-32/+35
2016-04-28Debugger fixHenrik Rydgard1-2/+2
2016-04-28Optimize the vertex loader, nearly doubling its speed.Henrik Rydgard2-32/+54
2016-04-28Don't keep base_address in the loader, it doesn't belong there (with it, the loader can't be cached).Henrik Rydgard3-11/+10
2016-04-28Move "&" to their proper place, add missing includes and make some properly relative.Henrik Rydgard2-8/+11
2016-04-28Refactor: Extract VertexLoader from command_processor.cpp.Henrik Rydgard5-125/+185
Preparation for a similar concept to Dolphin or PPSSPP. These can be JIT-ed and cached.
2016-04-28Remove late accesses to attribute_configHenrik Rydgard1-5/+7
2016-04-24shader: Shader size is long uint, not uint.Sam Spilsbury1-1/+1
2016-04-24shader: Handle non-CALL opcodes with a breakSam Spilsbury1-0/+2
2016-04-24shader: Format string must be provided inline and not as a variableSam Spilsbury1-1/+1
2016-04-24Replace std::map with std::array for graphics event breakpoints, and allow the compiler to inline. Saves 1%+ in vertex heavy situations.Henrik Rydgard2-7/+14
2016-04-23pica: Handle default lighting caseSam Spilsbury1-1/+6
2016-04-22HWRasterizer: reorder declarations to match defstfarley1-9/+9
2016-04-22HWRasterizer: sync specular uniform for new shaderstfarley1-0/+2
2016-04-21HWRasterizer: Texture forwardingtfarley13-759/+1371
2016-04-21Config: Add scaled resolution optiontfarley2-0/+2
2016-04-17Rasterizer: Allow all blend factors for alpha blend-funcJannik Vogel1-57/+42
2016-04-15debug_utils: use std::make_unique for initializing PicaTraceLioncash1-1/+1
2016-04-14shader_jit_x64: Rename RuntimeAssert to Compile_Assert.bunnei2-5/+5
2016-04-14shader_jit_x64.cpp: Rename JitCompiler to JitShader.bunnei3-92/+92
2016-04-14shader_jit_x64: Free memory that's no longer needed after compilation.bunnei1-0/+6
2016-04-14shader_jit_x64: Use a sorted vector instead of a set for keeping track of return addresses.bunnei2-5/+8
2016-04-14shader_jit_x64: Use CALL/RET instead of JMP for subroutines.bunnei1-17/+7
2016-04-14shader_jit_x64: Separate initialization and code generation for readability.bunnei1-9/+8
2016-04-14shader_jit_x64: Get rid of unnecessary last_program_counter variable.bunnei2-6/+2
2016-04-14shader_jit_x64: Execute certain asserts at runtime.bunnei2-5/+19
- This is because we compile the full shader code space, and therefore it's common to compile malformed instructions.
2016-04-14shader: Remove unused 'state' argument from 'Setup' function.bunnei3-5/+4
2016-04-14shader_jit_x64: Specify shader main offset at runtime.bunnei3-10/+6
2016-04-14shader_jit_x64: Allocate each program independently and persist for emu session.bunnei3-38/+28
2016-04-14shader_jit_x64: Rewrite flow control to support arbitrary CALL and JMP instructions.bunnei2-35/+119
2016-04-14shader_jit_x64: Fix strict memory aliasing issues.bunnei1-1/+3
2016-04-14file_util: Don't expose IOFile internals through the APILioncash1-1/+16
2016-04-10Pica: Remove geometry dumper (PICA_DUMP_GEOMETRY)Jannik Vogel4-71/+0
2016-04-10OpenGL: Implement color combiner Operation::Dot3_RGBJannik Vogel1-0/+3
2016-04-08OpenGL: Respect buffer-write allow registersJannik Vogel1-6/+28
2016-04-08OpenGL: Split buffer-write mask sync into separate functionsJannik Vogel2-8/+39
2016-04-08Rasterizer: Respect buffer-write allow registersJannik Vogel2-4/+16
2016-04-08OpenGL: Keep stencil-test and framebuffer.depth_format in syncJannik Vogel1-0/+1
2016-04-05Common: Remove Common::make_unique, use std::make_uniqueMerryMage5-11/+7
2016-04-03OpenGL: Fix a double framebuffer completeness check.Emmanuel Gil Peyrot1-4/+6
2016-04-03OpenGL: Check for framebuffer completenessJannik Vogel1-0/+3
2016-04-01Avoid warnings by casting to size_t for ARRAY_SIZE() comparisonsJannik Vogel1-6/+6
2016-03-24Pica: Improve accuracy of immediate-mode supportYuri Kunde Schlesner5-29/+56
This partially fixes Etrian Odyssey IV.
2016-03-24OpenGL: Don't attempt to draw empty triangle batchesYuri Kunde Schlesner1-0/+3
Our code did not handle this well, causing random crashes in some situations.
2016-03-17video_core: Don't cast away constLioncash3-18/+19
2016-03-17shader_interpreter: use std::inner_product for the dot productLioncash1-5/+3
Same thing, less code.
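As a rough illustration of the change described above, a dot product over the first n components can be written with std::inner_product. The function name and the use of std::array<float, 4> are assumptions for this sketch; the interpreter's own vector type is not shown in this log.

```cpp
#include <array>
#include <cassert>
#include <numeric>

// Dot product of the first `num_components` lanes of two 4-component vectors.
// std::inner_product multiplies element pairs and accumulates them into the
// initial value (0.0f here), which replaces a hand-written loop.
float DotProduct(const std::array<float, 4>& a, const std::array<float, 4>& b,
                 int num_components) {
    return std::inner_product(a.begin(), a.begin() + num_components, b.begin(), 0.0f);
}
```

Passing 0.0f rather than 0 as the initial value matters: the accumulator takes the type of that argument, so an integer literal would truncate every partial sum.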
2016-03-17core/video_core: Make NumIds functions constexprLioncash1-1/+1
2016-03-17core/video_core: Don't cast away const in subscript operatorsLioncash1-3/+3
Not to say these subscript operators aren't totally ugly as is.
2016-03-17PICA: Alignment happens locally in vertexJannik Vogel1-6/+6
2016-03-15PICA: Fix MAD/MADI encodingJannik Vogel2-29/+33
2016-03-14PICA: Fix viewport offsetJannik Vogel1-2/+2
2016-03-14Respect vs output mapJannik Vogel2-7/+19
2016-03-13PICA: Align vertex attributesJannik Vogel1-1/+5
2016-03-12shader_jit_x64: Clear cache after code space fills up.bunnei3-2/+19
2016-03-12shader_jit_x64: Make assert outputs more useful & cleanup formatting.bunnei1-4/+7
2016-03-12shader: Update log message to use proper log class.bunnei1-1/+1
2016-03-09Common: Get rid of alignment macrosLioncash1-4/+4
The gl rasterizer already uses alignas, so we may as well move everything over.
2016-03-09renderer_base: In-class initialize variablesLioncash1-5/+2
2016-03-09render_base: Clarify/normalize getter functionsLioncash1-2/+2
2016-03-09renderer_base: Don't directly expose the rasterizer unique_ptrLioncash3-8/+11
There's no reason to allow direct access to the unique_ptr instance. Only its contained pointer.
2016-03-08Improve error report from Init() functionsLittleWhite5-8/+18
Add error popup when citra initialization failed
2016-03-06Pica: Write depth value even when depth test is disabledYuri Kunde Schlesner2-10/+12
This has been confirmed on hardware. Fixes Etrian Odyssey IV.
2016-03-03Add immediate mode vertex submissionDwayne Slater17-60/+172
2016-02-26renderer_opengl: Initialise fragment shader LUT texturesMerryMage1-0/+4
2016-02-21Fix out of bounds array access when loading a component >= 12Dwayne Slater1-1/+4
2016-02-21Add support for padding vertex attributesDwayne Slater1-6/+13
2016-02-12BitField: Make trivially copyable and remove assignment operatorMerryMage2-6/+6
2016-02-05pica: Cleanup lighting register definitions and documentation.bunnei2-48/+51
2016-02-05gl_rasterizer: Use alignas(16) instead of explicit padding.bunnei1-13/+6
2016-02-05renderer_opengl: Use GLvec3/GLvec4 aliases for commonly used types.bunnei4-14/+18
2016-02-05gl_rasterizer: Fix issue with interpolation of opposite quaternions.bunnei2-4/+32
2016-02-05pica_types: Fix typo in docstring.bunnei1-1/+1
2016-02-05pica_types: Replace float24/20/16 with a template class.bunnei5-116/+82
2016-02-05command_processor: Add an assertion to ensure LUTs are not written past their boundaries.bunnei1-0/+3
2016-02-05gl_rasterizer: Remove unnecessary casts.bunnei1-6/+6
2016-02-05gl_rasterizer: Fix PicaShaderConfig on GCC.bunnei1-29/+27
2016-02-05gl_rasterizer: Initial implementation of bump mapping.bunnei3-5/+42
2016-02-05gl_shader_gen: Fix bug in LUT range (should be within range [0, 255] not [0, 256]).bunnei1-3/+3
2016-02-05gl_shader_gen: Implement lighting red, green, and blue reflection.bunnei3-21/+77
2016-02-05gl_shader_gen: View should be normalized.bunnei1-2/+2
2016-02-05gl_shader_gen: Implement fragment lighting fresnel effect.bunnei3-9/+38
2016-02-05gl_shader_gen: Implement fragment lighting specular 1 component.bunnei3-11/+41
2016-02-05gl_shader_gen: Add support for D0 LUT scaling.bunnei3-3/+71
2016-02-05gl_shader_gen: Refactor lighting config to match Pica register naming.bunnei3-42/+50
- Also implement D0 LUT enable.
2016-02-05pica: Cleanup and add some comments to lighting registers.bunnei2-19/+19
2016-02-05gl_rasterizer: Minor naming refactor on Pica register naming.bunnei2-20/+23
2016-02-05gl_shader_gen: Reorganize and cleanup lighting code.bunnei1-100/+107
- No functional difference.
2016-02-05gl_shader_gen: Fix directional lights.bunnei1-1/+1
2016-02-05gl_shader_gen: Fix bug with lighting where clamp highlights was only applied to last light.bunnei1-6/+6
2016-02-05gl_shader_gen: View vector needs to be normalized when computing half angle vector.bunnei1-3/+4
2016-02-05renderer_opengl: Use textures for fragment shader LUTs instead of UBOs.bunnei5-27/+64
- Gets us LUT interpolation for free. - Some older Intel GPU drivers did not support the big UBOs needed to store the LUTs.
2016-02-05renderer_opengl: Initial implementation of basic specular lighting.bunnei4-13/+165
2016-02-05renderer_opengl: Implement HW fragment lighting distance attenuation.bunnei2-17/+38
2016-02-05renderer_opengl: Implement HW fragment lighting LUTs within our default UBO.bunnei4-16/+67
2016-02-05renderer_opengl: Implement diffuse component of HW fragment lighting.bunnei6-15/+270
2016-02-05pica: Implement decoding of basic fragment lighting components.bunnei5-15/+120
- Diffuse - Distance attenuation - float16/float20 types - Vertex Shader 'view' output
2016-02-05pica: Implement fragment lighting LUTs.bunnei2-0/+34
2016-02-05pica: Add decodings for distance attenuation and LUT registers.bunnei1-1/+104
2016-02-05pica: Add pica_types module and move float24 definition.bunnei3-112/+127
2016-02-03hwrasterizer: Use proper cached fb addr/sizetfarley2-42/+34
2016-02-03OpenGL: Downgrade GL_DEBUG_SEVERITY_NOTIFICATION to Debug logging levelYuri Kunde Schlesner1-2/+0
The nVidia driver is *extremely* spammy on this category, sending a message on every buffer or texture upload, slowing down the emulator and making the log useless.
2016-01-25Debugger: Use 3dbrew names for GPU registersYuri Kunde Schlesner1-57/+465
This list was imported from the 3dbrew wiki page and is pretty much complete.
2016-01-25Shader: Implement "invert condition" feature of IFU instructionYuri Kunde Schlesner2-2/+5
If the bit 0 of the JMPU instruction is set, then the jump condition will be inverted. That is, a jump will happen when the boolean is false instead of when it is true.
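The inverted-condition rule can be sketched as a small predicate. This is only an illustration: the function name and the `flags` parameter are hypothetical, and which instruction field carries bit 0 is not stated in the message above.

```cpp
#include <cassert>
#include <cstdint>

// Decides whether a JMPU-style jump is taken.
// `uniform_value` is the boolean uniform being tested.
// `flags` is a hypothetical field whose bit 0 requests the inverted condition.
bool ShouldJump(bool uniform_value, std::uint32_t flags) {
    const bool invert = (flags & 1u) != 0;
    // Normal: jump when the boolean is true. Inverted: jump when it is false.
    return uniform_value != invert;
}
```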
2016-01-24Shader JIT: Fix off-by-one error when compiling JMPsYuri Kunde Schlesner2-6/+6
There was a mistake in the JMP code which meant that one instruction at the destination would be skipped when the jump was taken. This commit also changes the meaning of the culprit parameter to make it less confusing and avoid similar mistakes in the future.
2016-01-21hwrasterizer: Use depth offsettfarley3-2/+24
2016-01-17command_processor: Get rid of variable shadowingLioncash1-2/+1
2015-12-30video_core: Make the renderer global a unique_ptrLioncash2-6/+10
2015-12-30swrasterizer: Add missing override specifierLioncash1-1/+1
2015-12-21VideoCore: Sync state after changing rasterizersYuri Kunde Schlesner1-0/+1
This fixes various bugs that appear in the HW rasterizer after switching between it and the SW one during emulation.
2015-12-08VideoCore: Unify interface to OpenGL and SW rasterizersYuri Kunde Schlesner13-67/+105
This removes explicit checks sprinkled all over the codebase to instead just have the SW rasterizer expose an implementation with no-ops for most operations.
2015-12-07VideoCore: Rename HWRasterizer methods to be less confusingYuri Kunde Schlesner4-12/+12
2015-12-07OpenGL: Rename cache functions to better match what they actually doYuri Kunde Schlesner3-12/+11
2015-12-06GPU/PrimitiveAssembler: Fixed drawing triangle fans.Subv1-5/+4
It was skipping the second vertex assignment and using uninitialized garbage when assembling the corresponding triangle.
2015-12-05OpenGL: Flip framebuffers during transfer rather than when renderingYuri Kunde Schlesner2-12/+11
2015-12-05OpenGL: Add support for glFrontFace in the state trackerYuri Kunde Schlesner2-0/+6
2015-12-01PICA: Properly emulate 1-stage delay in the combiner bufferYuri Kunde Schlesner2-12/+19
This was discovered and verified by @fincs. The tev combiner buffer actually lags behind by one stage, meaning stage 1 reads the initial color, stage 2 reads stage 0's output, and so on. Fixes character portraits in Fire Emblem: Awakening and world textures in Zelda: ALBW. Closes #1140.
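The one-stage lag can be modelled with two variables: the value the current stage reads, and a pending value that becomes visible one stage later. The sketch below makes a simplifying assumption that every stage writes its output to the buffer, and it uses plain integers in place of colors; the function and variable names are hypothetical.

```cpp
#include <cassert>
#include <vector>

// Returns, for each stage, the value that stage would read from the
// combiner buffer. Assumes (for illustration) that every stage updates
// the buffer with its own output.
std::vector<int> BufferSeenByStages(int initial_color,
                                    const std::vector<int>& stage_outputs) {
    std::vector<int> seen;
    int buffer = initial_color;       // what the current stage reads
    int next_buffer = initial_color;  // pending write, visible one stage later
    for (int output : stage_outputs) {
        seen.push_back(buffer);
        buffer = next_buffer;  // the previous stage's write becomes visible
        next_buffer = output;  // this stage's write is delayed by one stage
    }
    return seen;
}
```

With an initial color of 100 and stage outputs 10, 11, 12, 13, stages 0 and 1 both read 100, stage 2 reads 10 (stage 0's output), and stage 3 reads 11.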
2015-11-26renderer_opengl: Fix uniform issues introduced with kemenaran/avoid-explicit-uniform-location.bunnei2-6/+8
2015-11-25Use regular uniform locationPierre de La Morinerie3-15/+5
The support for GL_ARB_explicit_uniform_location is not that good (53% according to http://feedback.wildfiregames.com/report/opengl/feature/GL_ARB_explicit_uniform_location). This fixes the shader compilation on Intel HD 4000 (#1222).
2015-11-19FragShader: Use an UBO instead of several individual uniformsSubv6-13/+67
2015-11-10GPU/Loaders: Log an error when a loader tries to load from a component beyond the available ones (12).Subv1-0/+2
Related to #1170
2015-10-24OpenGL: Log GL_KHR_debug messages we receiveEmmanuel Gil Peyrot1-0/+57
This allows the driver to communicate errors, warnings and improvement suggestions about our usage of the API.
2015-10-22gl_shader_gen: Use explicit locations for vertex shader attributes.bunnei2-15/+9
2015-10-22gl_shader_gen: Optimize code for AppendAlphaTestCondition.bunnei1-16/+11
- Also add a comment to AppendColorCombiner.
2015-10-22gl_rasterizer: Define enum types for each vertex texcoord attribute.bunnei3-12/+14
2015-10-22gl_shader_gen: Various cleanups to shader generation.bunnei3-48/+52
2015-10-22gl_rasterizer: Use MMH3 hash for shader cache key.bunnei4-83/+63
- Includes a check to confirm no hash collisions.
2015-10-22gl_shader_gen: Require explicit uniform locations.bunnei3-56/+34
- Fixes uniform issue on AMD.
2015-10-22gl_shader_gen: Rename 'o' to 'attr' in vertex/fragment shaders.bunnei1-11/+11
2015-10-22gl_shader_gen: AppendAlphaModifier default should be 0.0, not vec4(0.0).bunnei1-1/+1
2015-10-22gl_shader_gen: Fix bug where TEV stage outputs should be clamped.bunnei1-3/+3
2015-10-22gl_rasterizer: Add documentation to ShaderCacheKey.bunnei1-0/+16
2015-10-22gl_shader_gen: Add additional function documentation.bunnei2-0/+18
2015-10-22gl_shader_util: Cleanup header file + add docstring.bunnei1-1/+7
2015-10-22gl_shader_gen: Various cleanups + moved TEV stage generation to its own function.bunnei1-161/+170
2015-10-22renderer_opengl: Refactor shader generation/caching to be more organized + various cleanups.bunnei10-788/+509
2015-10-22gl_rasterizer: Move logic for creating ShaderCacheKey to a static function.bunnei3-22/+50
2015-10-22gl_shader_util: Use vec3 constants for AppendColorCombiner.bunnei1-6/+6
2015-10-22gl_rasterizer: Fix typo in uploading TEV const color uniforms.bunnei1-5/+5
2015-10-22gl_shader_util: Fix precision bug with alpha testing.bunnei2-9/+9
- Alpha testing is not done with float32 precision, this makes the HW renderer match the SW renderer.
2015-10-22Initial implementation of fragment shader generation with caching.Subv7-261/+568
2015-10-09CitraQt, SkyEye, Loader, VideoCore: Remove newlines in LOG_* calls.Emmanuel Gil Peyrot2-7/+7
The LOG_* function itself already appends one.
2015-10-07Silence -Wsign-compare warnings.Rohit Nirmal1-3/+3
2015-09-29fix some xcode 7.0 warningsMartin Lindhe3-2/+4
2015-09-16general: Silence some warnings when using clangLioncash3-7/+7
2015-09-11video_core: Reorganize headersLioncash19-62/+56
2015-09-11video_core: Remove unnecessary includes from headersLioncash5-13/+3
2015-09-10renderer_opengl: Remove unimplemented function declarationLioncash1-3/+0
2015-09-10video_core: Remove unused variablesLioncash3-4/+0
2015-09-10gl_rasterizer: Replace push_back calls with emplace_back in AddTriangleLioncash1-3/+3
2015-09-07Shader JIT: Use SCALE constant from emitteraroulin1-4/+4
2015-09-07Shader: Fix size_t to int casts of register offsetsaroulin2-15/+21
2015-09-03OpenGL: Use Sampler Objects to decouple sampler config from texturesYuri Kunde Schlesner4-21/+76
Fixes #978
2015-09-03OpenGL: Remove ugly and endian-unsafe color pointer castsYuri Kunde Schlesner4-9/+13
2015-09-03OpenGL: Add support for Sampler Objects to state trackerYuri Kunde Schlesner3-4/+42
2015-09-02video_core: Fix format specifiers warningsaroulin2-2/+3
2015-09-01x64: Proper stack alignment in shader JIT function callsaroulin2-28/+18
Import Dolphin stack handling and register saving routines. Also removes the x86 parts from abi files.
2015-08-31Pica: Added the primitive_restart register (0x25f) to the registers map.Subv2-1/+5
2015-08-31Pica: Add the vertex_offset register to the Pica registers map.Subv2-0/+2
2015-08-31Shader JIT: Fix SGE/SGEI NaN behavioraroulin1-3/+3
SGE was incorrectly emulated w.r.t. NaN behavior as the CMPSS SSE instruction was used with NLT
2015-08-30GPU: Implemented register 0x22A.Subv2-2/+8
This is the equivalent of the "first" parameter in glDrawArrays, it tells the GPU the vertex index at which to start rendering. Register 0x22A doesn't affect indexed rendering.
2015-08-30Replace the previous OpenGL loader with a glad-generated 3.3 oneYuri Kunde Schlesner11-2812/+12
The main advantage of switching to glad from glLoadGen is that, apart from being actively maintained, it supports a customizable entrypoint loader function, which makes it possible to also support OpenGL ES.
2015-08-28gl_rasterizer_cache: Detect and ignore unnecessary texture flushes.bunnei3-8/+18
2015-08-27Shader JIT: Fix float to integer rounding in MOVAaroulin1-2/+2
MOVA converts new address register values from floats to integers using truncation
2015-08-27Shader JIT: ifdef out reference to ifdef'd out shader_maparchshift1-0/+2
shader_map was only defined on x86 architectures, but was cleared on shutdown with no ifdef protection. Ifdef this out so non-x86 architectures can be built.
2015-08-25Integrate the MicroProfile profiling libraryYuri Kunde Schlesner5-0/+25
This brings goodies such as a configurable user interface and multi-threaded timeline view.
2015-08-24HWRenderer: Added a workaround for the Intel Windows driver bug that causes glTexSubImage2D to not change the stencil buffer.Subv1-2/+9
Reported here https://communities.intel.com/message/324464
2015-08-24fixup! Shaders: Fix multiplications between 0.0 and infYuri Kunde Schlesner1-4/+4
2015-08-24Shader JIT: Tiny micro-optimization in DPHYuri Kunde Schlesner1-4/+4
2015-08-24Shaders: Fix multiplications between 0.0 and infYuri Kunde Schlesner3-40/+58
The PICA200 semantics for multiplication are such that when multiplying inf by exactly 0.0, the result is 0.0, instead of NaN, as defined by IEEE. This is relied upon by games. Fixes #1024 (missing OoT interface items)
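One way to express this rule in scalar code, as a sketch rather than the code from the commit: an IEEE multiply of two non-NaN operands yields NaN only for 0 * inf, so that case can be detected after the fact and replaced with zero. The function name is hypothetical, and returning positive zero regardless of operand signs is a simplification.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Multiplication where 0 * inf gives 0 instead of NaN.
float MulZeroTimesInfIsZero(float a, float b) {
    const float result = a * b;
    // With non-NaN inputs, a NaN product can only come from 0 * inf.
    if (std::isnan(result) && !std::isnan(a) && !std::isnan(b)) {
        return 0.0f;
    }
    // Ordinary products, and NaN inputs, pass through unchanged.
    return result;
}
```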
2015-08-24Shaders: Explicitly conform to PICA semantics in MAX/MINYuri Kunde Schlesner2-2/+10
2015-08-24Shader JIT: Add name to second scratch register (XMM4)Yuri Kunde Schlesner1-3/+5
2015-08-24shader_jit: Replace two MDisp usages with MatRLioncash1-2/+2
2015-08-24Shader JIT: Fix CMP NaN behavior to match hardwareYuri Kunde Schlesner1-8/+23
2015-08-23HWRenderer: Only reload the framebuffer from gpu memory if the hw renderer is in use during a breakpoint.Subv1-2/+6
2015-08-23Shader: Use std::sqrt for float instead of sqrtaroulin1-1/+1
2015-08-23Shader: RCP and RSQ computes only the 1st componentaroulin2-10/+10
2015-08-22Shader: implement DPH/DPHI in JITaroulin2-2/+36
2015-08-22Shader: implement DPH/DPHI in interpreteraroulin1-1/+8
Tests revealed that the component with w=1 is SRC1 and not SRC2; this is now fixed on 3dbrew.
2015-08-21HWRasterizer: Implemented stencil ops 6 and 7.Subv1-1/+3
2015-08-21SWRasterizer: Implemented stencil ops 6 and 7.Subv2-6/+14
IncrementWrap and DecrementWrap, verified with hwtests.
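A small sketch of what the two wrapping operations compute, assuming an 8-bit stencil value. The function bodies are illustrative and are not taken from the rasterizer.

```cpp
#include <cstdint>

// Assumption: stencil values are 8 bits wide.
// Incrementing 255 wraps around to 0 instead of saturating.
inline std::uint8_t IncrementWrap(std::uint8_t stencil) {
    return static_cast<std::uint8_t>(stencil + 1);
}

// Decrementing 0 wraps around to 255 instead of saturating.
inline std::uint8_t DecrementWrap(std::uint8_t stencil) {
    return static_cast<std::uint8_t>(stencil - 1);
}
```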
2015-08-21HWRasterizer: Implemented stencil op 1 (GL_ZERO)Subv1-1/+1
2015-08-21SWRasterizer: Implemented stencil action 1 (GL_ZERO).Subv2-1/+4
Verified with hwtests.
2015-08-21SWRasterizer: Removed a todo. Verified with hwtests.Subv1-1/+0
2015-08-21SWRenderer: The stencil depth_pass action is executed even if depth testing is disabled.Subv1-7/+5
The HW renderer already did this.
2015-08-21Rasterizer: Abstract duplicated stencil code into a lambda.Subv1-6/+9
2015-08-20GLRasterizer: Implemented stencil testing in the hw renderer.Subv4-2/+44
2015-08-20GPU/Rasterizer: Corrected the stencil implementation.Subv2-18/+39
Verified the behavior with hardware tests.
2015-08-19Shader: implement SGE, SGEI and SLT in JITaroulin2-15/+36
2015-08-19Shader: implement SGE, SGEI in interpreteraroulin1-0/+14
2015-08-19Shader: Save caller-saved registers in JIT before a CALLaroulin2-0/+33
2015-08-17Shader: implement EX2 and LG2 in JITaroulin2-2/+22
2015-08-16Fix Linux GCC 4.9 build (complaining about undeclared memset)LittleWhite1-1/+2
2015-08-16Shader: implement EX2 and LG2 in interpreteraroulin1-0/+36
2015-08-16Build fix for Debug configurations.Tony Wasserka1-1/+1
2015-08-16Introduce a shader tracer to allow inspection of input/output values for each processed instruction.Tony Wasserka8-41/+326
2015-08-16Pica/DebugUtils: Include uniform information into shader dumps.Tony Wasserka2-11/+51
2015-08-16citra-qt: Improve shader debugger.Tony Wasserka4-13/+28
Now supports dumping the current shader and recognizes a larger number of output semantics.
2015-08-16videocore: Added RG8 texture supportPatrick Martin2-1/+8
2015-08-16Shader: Use a POD struct for registers.bunnei5-40/+43
2015-08-16Rename ARCHITECTURE_X64 definition to ARCHITECTURE_x86_64.bunnei2-7/+6
2015-08-16Common: Cleanup CPU capability detection code.bunnei1-5/+5
2015-08-16Common: Move cpu_detect to x64 directory.bunnei1-2/+1
2015-08-16x64: Refactor to remove fake interfaces and general cleanups.bunnei6-150/+26
2015-08-16JIT: Support negative address offsets.bunnei1-26/+25
2015-08-16Shader: Initial implementation of x86_x64 JIT compiler for Pica vertex shaders.bunnei10-3/+940
- Config: Add an option for selecting to use shader JIT or interpreter.
- Qt: Add a menu option for enabling/disabling the shader JIT.
2015-08-15Common: Added MurmurHash3 hash function for general-purpose use.bunnei1-1/+1
2015-08-15Shader: Define a common interface for running vertex shader programs.bunnei7-186/+289
2015-08-15Shader: Move shader code to its own subdirectory, "shader".bunnei9-12/+12
2015-08-15GPU: Refactor "VertexShader" namespace to "Shader".bunnei13-50/+48
- Also renames "vertex_shader.*" to "shader_interpreter.*"
2015-08-11ARM Core, Video Core, CitraQt, Citrace: Use CommonTypes types instead of the standard u?int*_t types.Emmanuel Gil Peyrot1-1/+2
2015-08-06OpenGL: Fix state tracking in situations with reused object handlesYuri Kunde Schlesner4-0/+45
If an OpenGL object is created, bound to a binding using the state tracker, and then destroyed, a newly created object can be assigned the same numeric handle by OpenGL. However, even though it is a new object, and thus needs to be bound to the binding again, the state tracker compared the current and previous handles and concluded that no change needed to be made, leading to failure to bind objects in certain cases. This manifested as broken text in VVVVVV, which this commit fixes along with similar texturing problems in other games.
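The failure mode can be modelled without any GL calls. The sketch below shows one possible way to handle it, by clearing the cached binding when an object is destroyed. `BindingCache` and `NotifyDeleted` are hypothetical names, and the commit message does not say that the fix works this way.

```cpp
#include <cstdint>

using Handle = std::uint32_t; // stands in for a GLuint

// Hypothetical, GL-free model of a single cached binding point.
struct BindingCache {
    Handle cached = 0;    // handle the tracker believes is currently bound
    int driver_binds = 0; // number of binds that would reach the driver

    void Bind(Handle handle) {
        if (handle == cached) {
            return; // considered redundant, so it is skipped
        }
        cached = handle;
        ++driver_binds; // a real tracker would issue the GL bind here
    }

    // Called when an object is destroyed, so that a new object which
    // reuses the same numeric handle is bound again instead of skipped.
    void NotifyDeleted(Handle handle) {
        if (cached == handle) {
            cached = 0;
        }
    }
};
```

Without the `NotifyDeleted` call, the second `Bind` of a reused handle would be treated as redundant and never reach the driver.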
2015-08-06OpenGL: Remove redundant texture.enable_2d field from OpenGLStateYuri Kunde Schlesner4-26/+3
All uses of this field where it's false can just set the texture id to 0 instead.
2015-08-05Videocore: Implement simple vertex cachingYuri Kunde Schlesner1-62/+89
This gives a ~2/3 reduction in the number of vertices that need to be processed through the vertex loaders and the vertex shader, yielding a good speedup.
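As an illustration of the idea only: a tiny direct-mapped cache keyed by vertex index. The cache size, the `Vertex` type and all names are assumptions of this sketch and say nothing about the actual implementation.

```cpp
#include <array>
#include <cstdint>
#include <vector>

struct Vertex {
    float x; // placeholder for real shader output
};

// Hypothetical direct-mapped cache: one slot per (index mod 16).
class VertexCache {
public:
    // Returns the cached vertex for `index`, running `shade` only on a miss.
    template <typename Shade>
    Vertex Get(std::uint32_t index, Shade&& shade) {
        Entry& entry = entries[index % entries.size()];
        if (!entry.valid || entry.index != index) {
            entry.vertex = shade(index);
            entry.index = index;
            entry.valid = true;
        }
        return entry.vertex;
    }

private:
    struct Entry {
        std::uint32_t index = 0;
        Vertex vertex{};
        bool valid = false;
    };
    std::array<Entry, 16> entries{};
};

// Counts how often the (stand-in) vertex shader runs for an index list.
inline int CountShaderRuns(const std::vector<std::uint32_t>& indices) {
    VertexCache cache;
    int runs = 0;
    for (std::uint32_t index : indices) {
        cache.Get(index, [&runs](std::uint32_t i) {
            ++runs;
            return Vertex{static_cast<float>(i)};
        });
    }
    return runs;
}
```

Indexed meshes reuse most vertices across neighbouring triangles, which is where the saving comes from.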
2015-07-28OpenGL: Add a profiler category measuring framebuffer readbackYuri Kunde Schlesner1-0/+7
2015-07-26citra-qt/debug_utils: Use lock_guard everywhereLectem1-6/+5
unique_lock were being used as lock_guards. Also replaced manual lock/unlock by lock_guard for harmonization.
2015-07-26citra-qt/command list: Add mask columnLectem3-25/+24
2015-07-26OpenGL: Make OpenGL object resource wrappers fully inlineYuri Kunde Schlesner3-143/+79
The functions are so simple that having them separate only bloats the code and hinders optimization.
2015-07-26Videocore: Don't reinitialize register name map on every queryYuri Kunde Schlesner2-65/+72
This greatly speeds up the command list debug widget.
2015-07-26Videocore: Simplify variables in vertex shader interpreterYuri Kunde Schlesner1-24/+21
Simplifies the code and gives a tiny speed-up.
2015-07-26Videocore: Replace std::stack in shader interpreter with static_vectorYuri Kunde Schlesner1-6/+6
Shaves off 1/3rd of the vertex shader time in Fire Emblem
2015-07-26VideoCore: #ifdef out some debugging routinesYuri Kunde Schlesner5-13/+18
Some disabled debugging functionality was being called from rendering routines in VideoCore. Although disabled, many of them still allocated memory or did some extra work that was enough to show up in a profiler. Gives a slight (~2ms) speedup.
2015-07-25Address error that remained in last mergeYuri Kunde Schlesner1-1/+1
2015-07-23VideoCore: Fix values of unset components in input attribute arraysYuri Kunde Schlesner1-42/+38
If an input attribute array had a field with less than 4 components, the remaining components were left unset if not specified by a default vertex attribute. If neither mechanism would set a component, it would assume a garbage value. It has been verified that the hardware behavior is instead to set the missing components from the fixed default of (0 0 0 1). The default vertex attribute values aren't used at all if a vertex array is specified for that attribute. Fixes UI graphics on Fire Emblem: Awakening, a small texturing glitch when selecting a character in Cubic Ninja, as well as eliminating the unset-W hack which was required for Ocarina of Time to not have garbled triangles. This change has been tested against hardware.
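A minimal sketch of the fill rule, assuming attributes are expanded to four floats. `ExpandAttribute` is a hypothetical helper, not a function from the code base.

```cpp
#include <array>
#include <cstddef>

// Hypothetical helper: expands an attribute with `count` loaded
// components to four, taking missing ones from the fixed default
// (0, 0, 0, 1) described in the commit message.
inline std::array<float, 4> ExpandAttribute(const float* loaded, std::size_t count) {
    std::array<float, 4> result = {0.0f, 0.0f, 0.0f, 1.0f};
    for (std::size_t i = 0; i < count && i < result.size(); ++i) {
        result[i] = loaded[i];
    }
    return result;
}
```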
2015-07-23VideoCore: Saturate vertex colors before interpolatingYuri Kunde Schlesner1-0/+6
During testing, it was discovered that hardware does not interpolate colors output by the vertex shader as-is. Rather, it drops the sign and saturates the value to 1.0. This is done before interpolation, such that (e.g.) interpolating outputs 1.5 and -0.5 is equivalent to the shader having output the values 1.0 and 0.5 instead, with the interpolated value never crossing 0.0. This change has been tested against hardware.
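Per component, the described behavior amounts to the following. `SaturateColor` is a hypothetical name used only for this sketch.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical helper: drops the sign, then clamps to 1.0, as the
// commit message describes for vertex colors before interpolation.
inline float SaturateColor(float component) {
    return std::min(std::fabs(component), 1.0f);
}
```

With the values from the message, 1.5 maps to 1.0 and -0.5 maps to 0.5.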
2015-07-23Qt/GPU Breakpoints: Added three more breakpoint types:Subv2-4/+7
* IncomingDisplayTransfer: Triggered just before a display transfer is performed.
* GSPCommandProcessed: Triggered right after a GSP command is processed.
* BufferSwapped: Triggered when the frames flip
2015-07-23Rasterizer/GL: Set the border color when binding a texture.Subv1-2/+9
2015-07-22GL Renderer: Remove erroneous glEnable(GL_TEXTURE_2D) callsYuri Kunde Schlesner1-8/+5
In OpenGL 3, texturing is always enabled, and this call is invalid. While it produced no effect in the rest of the execution, it wouldn't have the intended effect of disabling texturing for that unit. Instead bind a null texture to the unit.
2015-07-21GPU: Added registers for min and mag texture filters and implemented them in the hw renderer.Subv4-3/+37
2015-07-20Pica: Correct switched S/T texture wrapping registersYuri Kunde Schlesner1-2/+2
This was found and hwtested by Lectem
2015-07-20Pica: Fix DP3 instruction, which wasn't assigning to the w componentYuri Kunde Schlesner1-1/+1
2015-07-19GLRasterizer: Don't try to get a pointer to the depth buffer if it doesn't exist.Subv1-3/+7
2015-07-19Rasterizer/Textures: Fixed a bug where the I4 format would get twice the real stride.Subv1-0/+1
Also added its name to the texture viewer widget
2015-07-19Vertex Shader : Undo castingzawata1-1/+1
2015-07-19Video_Core : Type fixeszawata2-2/+2
2015-07-19Video_Core: Finally fix pesky warningzawata1-1/+1
2015-07-19Video_Core : Change Tabs to Spaceszawata1-0/+15
This really should be universalized, I keep getting errors creating commits because lines I've edited use tabs instead of spaces (and yes I did read the contributing guide and I know they are supposed to be spaces)
2015-07-19Video_Core : Fix Conversion Warningszawata3-18/+3
2015-07-15Pica/Shader: Add geometry shader definitions.Tony Wasserka5-149/+162
2015-07-15Pica/CommandProcessor: Move default attribute setup to the proper position.Tony Wasserka1-40/+40
2015-07-15Pica/Clipper: Output proper number of triangles in debugging logs.Tony Wasserka1-1/+1
2015-07-14VideoCore: Implement the DOT3_RGB combinerLectem2-1/+13
2015-07-13Pica: Implement stencil testing.Tony Wasserka2-12/+173
2015-07-13Clean up command_processor.cpp.Tony Wasserka1-22/+27
2015-07-13Add CiTrace recording support.Tony Wasserka3-2/+63
This is exposed in the GUI as a new "CiTrace Recording" widget. Playback is implemented by a standalone 3DS homebrew application (which only runs reliably within Citra currently; on an actual 3DS it will often crash still).
2015-07-09Added GL_CLAMP_TO_BORDER supportLectem3-13/+28
2015-06-28Core: Cleanup hw includes.Emmanuel Gil Peyrot5-4/+13
2015-06-28Core, VideoCore: Replace or fix exit() calls.Emmanuel Gil Peyrot1-6/+9
2015-06-28CitraQt: Cleanup includes.Emmanuel Gil Peyrot3-5/+10
2015-06-28Common: Cleanup emu_window includes.Emmanuel Gil Peyrot3-10/+8
2015-06-28Common: Cleanup key_map includes.Emmanuel Gil Peyrot2-3/+9
2015-06-27VideoCore: Fix floating point warningzawata1-1/+1
2015-06-16VideoCore: Log the GL driver’s vendor and renderer.Emmanuel Gil Peyrot1-0/+2
2015-06-14video_core: add extra braces around initializerYuri Kunde Schlesner1-3/+3
Trivial change and fixes several warnings in the clang build.
2015-06-09Renderer formatting editstfarley2-26/+29
2015-06-09Render-to-texture flush, interval math fixtfarley1-1/+13
2015-06-09Liberal texture unbind (clout menu)tfarley2-4/+40
2015-06-09Depth format fix (crush3d intro/black screens)tfarley1-46/+46
2015-06-09Implemented glColorMasktfarley3-0/+24
2015-05-31Pica: Use zero for the SecondaryFragmentColor source.bunnei3-11/+21
- This is a workaround until we support fragment lighting.
2015-05-31rasterizer: Remove unnecessary 'using' for BlendEquation.bunnei1-2/+1
2015-05-31Pica: Implement LogicOp function.bunnei7-8/+135
2015-05-31rasterizer: Implement AddSigned combiner function for alpha channel.bunnei1-0/+7
2015-05-31vertex_shader: Use address offset on src2 in inverted mode.bunnei1-3/+3
2015-05-31Pica: Implement command buffer execution registers.bunnei2-44/+76
2015-05-31vertex_shader: Implement SLT/SLTI instructions.bunnei1-4/+10
2015-05-31vertex_shader: Implement MIN instruction.bunnei1-0/+9
2015-05-30Move video_core/color.h to common/color.harchshift5-218/+4
2015-05-30Move video_core/math.h to common/vector_math.harchshift7-648/+6
The file only contained vector manipulation code, and such widely-useable code doesn't belong in video_core.
2015-05-29Remove every trailing whitespace from the project (but externals).Emmanuel Gil Peyrot11-25/+25
2015-05-23gl_state: Remove unnecessary const specifier on ApplyLioncash2-2/+2
2015-05-23video_core/utils: Remove unused variables in GetMortonOffsetLioncash1-3/+0
2015-05-23Pica: Create 'State' structure and move state memory there.bunnei12-428/+451
2015-05-23gl_state: Fix a condition typo in ApplyLioncash1-1/+1
2015-05-23OpenGL renderertfarley21-44/+2196
2015-05-17GPU/DefaultAttributes: Clear up a comment in command_processorSubv1-2/+2
2015-05-17GPU/DefaultAttributes: Let the attribute data from the loaders overwrite the default attributes, if set.Subv1-21/+23
closes #735
2015-05-15Memmap: Re-organize memory function in two filesYuri Kunde Schlesner4-5/+3
memory.cpp/h contains definitions related to accessing memory and configuring the address space. mem_map.cpp/h contains higher-level definitions related to configuring the address space according to the kernel and allocating memory.
2015-05-14pica: Add the ULL specifier in IsDefaultAttributeLioncash1-1/+1
This is necessary otherwise there are warnings about a 32-bit result being casted to a 64-bit value.
2015-05-12GPU: Add more fine grained profiling for vertex shader and rasterizationYuri Kunde Schlesner2-0/+10
2015-05-11Implement I4 texture formatarchshift2-1/+12
@neobrain, could you confirm that this is correct? It's been tested with various different games and fixes different textures, including in Animal Crossing, Kirby Triple Deluxe, and SMB3D.
2015-05-10rasterizer: Implemented combiner output scaling.bunnei2-2/+16
2015-05-10rasterizer: Implemented AddSigned combiner op.bunnei1-0/+10
2015-05-10rasterizer: Fixed a depth testing bug.bunnei2-6/+19
2015-05-10rasterizer: Implement combiner buffer input.bunnei2-4/+53
2015-05-10rasterizer: Return zero'd vectors on error conditions.bunnei1-3/+3
2015-05-10vertex_shader: Implement FLR instruction.bunnei1-0/+9
2015-05-10vertex_shader: Implement MADI instruction.bunnei1-4/+7
nihstro: Update submodule to latest upstream/master to support MADI instruction decoding.
2015-05-09Memory: Add GetPhysicalPointer helper functionYuri Kunde Schlesner3-11/+11
2015-05-09Memory: Support more regions in the VAddr-PAddr translation functionsYuri Kunde Schlesner3-18/+7
Also adds better documentation and removes the one-off reimplementation of the function in pica.h.
2015-05-09Memory: Re-organize and rename memory area address constantsYuri Kunde Schlesner1-1/+1
2015-05-07Common: Remove common.hYuri Kunde Schlesner6-3/+8
2015-05-07GPU: Implemented default vertex shader attributes.Subv4-68/+137
Fixes some games crashing.
2015-04-29VideoCore: Remove a superfluous auto variable declaration in debug_utils.Emmanuel Gil Peyrot1-1/+1
2015-04-10Silence some -Wsign-compare warnings.Rohit Nirmal1-2/+2
2015-04-05Changed occurrences of colour to color for consistencyGareth Higgins2-4/+4
2015-04-04Allow the user to set the background clear color during emulationarchshift1-1/+2
The background color can be seen at the sides of the bottom screen or when the window is wider than normal.
2015-03-16VideoCore: Add static_cast around expressions where the compiler doesn’t deduce the right type.Emmanuel Gil Peyrot2-4/+4
2015-03-12Pica/VertexShader: Fix a bug caused due to incorrect assumptions of consecutive output register tables.Tony Wasserka1-20/+24
We now create a temporary buffer for output registers and copy all of them to the actual output vertex structure after the shader has run. This is technically not necessary, but it's easier to vectorize in the future.
2015-03-10GPU: Added the stencil test structure to the Pica Regs struct.Subv3-50/+65
2015-03-10GPU: Implemented more depth buffer formats.Subv3-9/+115
This fixes the horizontal lines in Picross E, Cubic Ninja, Cave Story 3D and possibly others
2015-03-09Added LCD registers, and implementation for color filling in OGL code.archshift2-11/+48
2015-03-09Pica/PrimitiveAssembly: Fix triangle strips and fans being generated with incorrect winding order.Tony Wasserka1-6/+3
2015-03-08Update nihstro submodule to the initial release version.archshift1-37/+38
Includes more opcodes to implement in the future.
2015-03-07Set framebuffer layout from EmuWindow.bunnei3-43/+9
2015-03-07GPU/Textures: Fixed ETC texture decoding.Subv1-1/+1
2015-03-04GPU: Added RGB565/RGB8 framebuffer support and various cleanups.bunnei5-85/+155
- Centralizes color format encode/decode functions.
- Fixes endianness issues.
- Implements remaining framebuffer formats in the debugger.
2015-03-02Add profiling infrastructure and widgetYuri Kunde Schlesner2-0/+18
2015-02-28Added RGBA5551 compatibility in the rasterizerarchshift3-2/+41
This allows Virtual Console games to display properly.
2015-02-27GPU: Implemented bits 3 and 1 from the display transfer flags.Subv3-54/+91
Bit 3 is used to specify a raw copy, where no processing is done to the data; it seems to behave exactly like a DMA. Bit 1 is used to specify whether to convert from a tiled format to a linear format or vice versa.
2015-02-26Video core: Fix A4 texture decodingYuri Kunde Schlesner1-2/+2
It was trying to take the LSB from `coarse_x`, which would always be 0 and thus would always return the same texel from each byte. To add insult to injury, the conditional was actually the wrong way around too. Fixes blocky text in OoT.
2015-02-26Video core: Fix pixelation/blockiness in textures.Yuri Kunde Schlesner1-3/+3
This was caused during morton decoding by me not masking the bits of each coordinate before merging them, so the bits from x could set bits in y if it was >255.
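The masking step can be illustrated with a small Morton interleave for one 8x8 tile. This is a sketch under assumptions: the tile size, the bit order and the name `MortonInterleave8x8` are illustrative, and the real decoder may merge bits differently.

```cpp
#include <cstdint>

// Hypothetical helper: interleaves the low 3 bits of x and y into a
// 6-bit index laid out as y2 x2 y1 x1 y0 x0.
inline std::uint32_t MortonInterleave8x8(std::uint32_t x, std::uint32_t y) {
    // Mask each coordinate first so higher bits cannot leak into the
    // positions that belong to the other coordinate.
    x &= 7;
    y &= 7;
    std::uint32_t result = 0;
    for (int bit = 0; bit < 3; ++bit) {
        result |= ((x >> bit) & 1u) << (2 * bit);
        result |= ((y >> bit) & 1u) << (2 * bit + 1);
    }
    return result;
}
```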
2015-02-25Rasterizer: Add support for RGBA4 framebuffer format.bunnei1-0/+21
2015-02-22Rasterize with the correct color component order.bunnei1-11/+24
- Fixes a regression with #594.
2015-02-21Pica/VertexShader: Fixed LOOP with more than one iteration.Subv1-1/+4
Previously it wouldn't jump back to the start of the loop code once it reached the end of the block. Fixes the texture problems in a lot of games.
2015-02-20Remove duplication of INSERT_PADDING_WORDS between pica.h and gpu.harchshift1-11/+0
2015-02-19Rasterizer: Fixed a warning in GetWrappedTexCoord.Subv1-4/+4
Redeclaring the variable inside the switch was causing weird behavior.
2015-02-18Pica/Rasterizer: Replace exit() calls with UNIMPLEMENTED().Tony Wasserka1-5/+5
2015-02-18Pica/Rasterizer: Make some local lambdas static.Tony Wasserka1-8/+8
2015-02-18Pica/BlendUnit: Implement separate color/alpha blend equations.Tony Wasserka2-65/+59
2015-02-18Pica/TextureEnvironment: Add a note.Tony Wasserka1-0/+4
2015-02-18Pica/TextureEnvironment: Treat texture combiner source 1 as the PrimaryColor.Tony Wasserka2-0/+4
Not really sure where the difference is, but some applications seem to use this 1:1 the same way...
2015-02-18Pica/TextureEnvironment: Add support for the MAD-like texture combiners and clean up texture environment logic.Tony Wasserka2-0/+28
2015-02-18Pica/OutputMerger: Fix flipped framebuffers.Tony Wasserka1-0/+10
2015-02-18Pica/TextureUnit: Implement mirrored repeating texture wrapping.Tony Wasserka2-3/+12
2015-02-18Pica: Fix a bug in the register definitions, relating to texture wrapping.Tony Wasserka2-2/+2
2015-02-18Pica/OutputMerger: Implement color format checking.Tony Wasserka2-4/+13
2015-02-18Pica/Rasterizer: Rasterize actual pixel centers instead of pixel corners.Tony Wasserka1-2/+3
2015-02-18Pica/Rasterizer: Fix garbage pixels at triangle borders.Tony Wasserka1-1/+3
2015-02-18Pica/Rasterizer: Clean up and fix backface culling.Tony Wasserka1-11/+27
2015-02-18Pica: Cleanup clipping code and change screenspace z to range from -1..0.Tony Wasserka2-53/+42
The change in depth range seems to reflect better to what applications are expecting, and makes for cleaner code overall (hence is more likely to reflect hardware behavior).
2015-02-18Pica/VertexShader: Implement the LOOP instruction.Tony Wasserka1-14/+36
2015-02-18Pica/CommandProcessor: Properly implement shader load destination offset registers.Tony Wasserka2-20/+10
2015-02-18Pica/CommandProcessor: Work around initialized vertex attributes some more.Tony Wasserka1-2/+8
2015-02-17core/video_core: Use in-place construction where possibleLioncash2-4/+4
2015-02-16VideoCore: Fix a typo in Vec4 MakeVec(T, Vec3<T>), where the second argument was Vec2<T> instead.Emmanuel Gil Peyrot1-1/+1
2015-02-15video_core: Implement the remaining framebuffer formats in the OpenGL renderer.Emmanuel Gil Peyrot2-12/+67
2015-02-12Build: Fixed some warningsSubv2-3/+3
2015-02-11Fix Min and Max blend equationsDarius Goad1-6/+8
2015-02-11Asserts: break/crash program, fit to style guide; log.h->assert.harchshift8-23/+18
Involves making asserts use printf instead of the log functions (log functions are asynchronous and, as such, the log won't be printed in time). As a result, the log type argument was removed (printf obviously can't use it, and it's made obsolete by the file and line printing). Also removed some GEKKO cruft.
2015-02-10Add more blend equations from 3dbrewDarius Goad2-2/+49
2015-02-05Rasterizer: Implement the other color and alpha modifiers.bunnei2-58/+69
2015-02-05VideoCore: Added same-component swizzlers to math utility functions.bunnei1-16/+35
2015-01-31Pica: Implement blend factors.bunnei2-10/+67
2015-01-28Pica: Implement color/alpha channel enable.bunnei2-1/+12
2015-01-27Rasterizer: Implemented alpha testing.bunnei2-7/+52
2015-01-26GPU: Implement the remaining depth testing functions.bunnei2-3/+28
2015-01-14GSP: Update framebuffer info on all interruptsYuri Kunde Schlesner1-3/+1
Hardware testing determined that the GSP processes shared memory framebuffer update info even when no memory transfer or filling GX commands are used. They are now updated on every interrupt, which isn't confirmed correct but matches hardware behaviour more closely. This also reverts the hack introduced in #404. It made a few games behave better, but I believe it's incorrect and also breaks other games.
2015-01-13Pica/Rasterizer: Add ETC1 texture decompression support.Tony Wasserka2-14/+142
2015-01-13Pica/VertexShader: Implement JMPC/JMPU/CALLC/CALLU.Tony Wasserka1-23/+52
2015-01-13Pica/VertexShader: Implement the MAD instruction.Tony Wasserka1-0/+69
2015-01-08GSP: Toggle active framebuffer each framebunnei1-1/+4
2014-12-31Pica/Rasterizer: Remove some redundant casts.Tony Wasserka1-3/+3
2014-12-31Pica/Rasterizer: Make orient2d a free function and rename it to SignedArea.Tony Wasserka1-31/+38
2014-12-31Pica: Cleanup color conversion.Tony Wasserka2-18/+46
2014-12-31VideoCore: Remove some unused functions.Tony Wasserka1-26/+0
2014-12-31Pica/Rasterizer: Fix a bug related to multitexturing and texture wrapping.Tony Wasserka1-2/+2
2014-12-31Pica/Rasterizer: Clean up long code lines.Tony Wasserka1-4/+8
2014-12-31Pica/VertexShader: Coding style fixes.Tony Wasserka1-16/+8
2014-12-31Pica/CommandProcessor: Cleanups.Tony Wasserka1-3/+4
2014-12-31Pica/CommandProcessor: Workaround games not setting the input position's w component.Tony Wasserka1-0/+14
2014-12-31Pica/Rasterizer: Implement backface culling.Tony Wasserka2-10/+36
2014-12-31Pica/Rasterizer: Textures seem to be laid out flipped vertically.Tony Wasserka1-1/+1
Not sure if this is a correct fix. Probably should instead change the decoding logic itself.
2014-12-31Pica/DebugUtils: Fix a bug in RGBA4 texture decoding.Tony Wasserka1-2/+2
2014-12-31Pica/Rasterizer: Implement alpha blending.Tony Wasserka1-0/+84
2014-12-31Pica/Rasterizer: Implement depth testing.Tony Wasserka2-6/+34
2014-12-31Pica/Rasterizer: Further enhance Tev support.Tony Wasserka1-4/+19
2014-12-31Pica: Add output merger definitions.Tony Wasserka1-1/+56
2014-12-31Pica: Fix A4, IA4 and IA8 texture formats.Tony Wasserka1-13/+7
Both IA4 and IA8 had their component order mixed up. Additionally, IA4 used the wrong number of nibbles per texel. A4 skipped every second texel.
2014-12-31Pica/CommandProcessor: Add support for integer uniforms.Tony Wasserka4-1/+30
2014-12-29Rasterizer: Pre-divide vertex attributes by WYuri Kunde Schlesner3-8/+32
Execute the division-by-W for perspective-correct interpolation of values in the clipper, moving them out of the rasterization inner loop.
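A sketch of why the division can be hoisted, shown for interpolation between two vertices along an edge. All names are hypothetical; the rasterizer itself works with triangles and barycentric weights.

```cpp
// Values computed once per vertex (e.g. in the clipper).
struct PreDivided {
    float attr_over_w; // attribute / w
    float inv_w;       // 1 / w
};

// Hypothetical helper: performs the per-vertex division up front.
inline PreDivided PreDivide(float attr, float w) {
    return {attr / w, 1.0f / w};
}

// Per-pixel work: two linear interpolations in screen space and one
// division. `t` is the screen-space position between a (0) and b (1).
inline float InterpolatePerspective(PreDivided a, PreDivided b, float t) {
    float numerator = a.attr_over_w + t * (b.attr_over_w - a.attr_over_w);
    float denominator = a.inv_w + t * (b.inv_w - a.inv_w);
    return numerator / denominator;
}
```

When both vertices have the same w, this reduces to ordinary linear interpolation.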
2014-12-29GPU: Bitwise texture swizzlingYuri Kunde Schlesner1-27/+24
Replace the loop-based texture address swizzling code by a bit-twiddling implementation, providing a very small speed up. Also simplify addressing code.
2014-12-29Rasterizer: Common sub-expression eliminationYuri Kunde Schlesner1-14/+17
Move the computation of some values out of loops so that they're not constantly recalculated even when they don't change.
2014-12-29Clipper: Compact buffers on each clipping passYuri Kunde Schlesner1-28/+27
Use a new buffer management scheme in the clipper that allows using a bounded minimal amount of buffer space. Even though it copies more data, it is still slightly faster, likely due to using less cache.
2014-12-29Clipper: Avoid dynamic allocationsYuri Kunde Schlesner1-10/+7
The triangle clipper was allocating its temporary input, output and work buffers using a std::vector. Since this is a hot path, it's desirable to use stack allocation instead.
2014-12-29Vertex Shader: Zero OutputVertex to avoid denormalsYuri Kunde Schlesner1-0/+4
Unused OutputVertex attributes were being left un-initialized. The leftover garbage sometimes decoded as floating-point denormalized values, causing fallbacks to microcode and massive slowdowns in the rest of the rasterization pipeline even though the results were unused. By zeroing the structure we ensure these attributes only contain harmless zeros.
2014-12-29GPU: Implement frameskip and remove forced framebuffer swap hack.bunnei1-0/+5
2014-12-21Fix visual studio ambiguous symbol errorApology111-4/+4
2014-12-21More warning cleanupsChin2-7/+7
2014-12-21License changepurpasmart9623-23/+23
2014-12-20Pica/VertexShader: Promote a log message to critical status.Tony Wasserka1-1/+1
2014-12-20Pica/VertexShader: Small optimization.Tony Wasserka1-7/+7
2014-12-20Pica/VertexShader: Be robust against invalid inputs.Tony Wasserka1-2/+9
More specifically, this also fixes crashes by Citra trying to load a src2 register even if the current instruction does not use that.
2014-12-20Pica/VertexShader: Clarify a comment.Tony Wasserka1-1/+3
2014-12-20Pica/DebugUtils: Further cleanups to LookupTexture.Tony Wasserka1-7/+7
2014-12-20Pica/DebugUtils: Fix two warnings.Tony Wasserka1-2/+2
2014-12-20Pica/DebugUtils: Better document LookupTexture.Tony Wasserka2-7/+16
2014-12-20Pica/Rasterizer: Get rid of C-style casts.Tony Wasserka1-4/+4
2014-12-20Pica/DebugUtils: Make a number of variables static.Tony Wasserka1-13/+13
Makes for cleaner and faster code.
2014-12-20Pica/VertexShader: Cleanup flow control logic and implement CMP/IFU instructions.Tony Wasserka1-50/+56
2014-12-20Pica/VertexShader: Run instruction handlers according to the effective opcode.Tony Wasserka1-1/+1
This allows for proper emulation of the different CMP/LRP/MAD instructions.
2014-12-20Pica/VertexShader: Implement MAX instructions.Tony Wasserka1-0/+9
2014-12-20Pica: Add support for boolean uniforms.Tony Wasserka4-2/+21
2014-12-20Pica/VertexShader: Add support for MOVA, CMP and IFC.Tony Wasserka2-7/+138
2014-12-20Pica/VertexShader: Move code around a bit.Tony Wasserka1-42/+58
2014-12-20Pica/VertexShader: Some cleanups using std::array.Tony Wasserka2-5/+19
2014-12-20Pica/VertexShader: Support negating src2.Tony Wasserka2-3/+9
2014-12-20Pica/DebugUtils: Replace duplicated SHBIN structures in favor of nihstro's ones.Tony Wasserka1-61/+8
2014-12-20Pica/VertexShader: Remove (now) duplicated shader bytecode definitions in favor of nihstro's ones.Tony Wasserka2-222/+30
2014-12-20Pica/DebugUtils: Add an event triggered after loading a vertex.Tony Wasserka2-0/+4
2014-12-20Pica/PrimitiveAssembly: Implement triangle strips.Tony Wasserka2-8/+16
2014-12-20Pica/CommandProcessor: Add a safety check for invalid (?) GPU configurations.Tony Wasserka1-0/+7
2014-12-20Pica/CommandProcessor: Fix vertex decoding if multiple memory areas are accessed for different attributes.Tony Wasserka1-7/+8
2014-12-20Add support for a ridiculous number of texture formats.Tony Wasserka2-7/+80
2014-12-20Pica: Unify ugly address translation hacks.Tony Wasserka5-16/+25
2014-12-20Pica: Further improve Tev emulation.Tony Wasserka3-12/+51
2014-12-20Pica: Merge texture lookup logic for DebugUtils and Rasterizer.Tony Wasserka3-55/+41
This effectively adds support for a lot of texture formats in the rasterizer.
2014-12-20Pica: Implement texture wrapping.Tony Wasserka2-2/+31
2014-12-20Pica/DebugUtils: Add support for RGBA8, RGBA5551, RGBA4 and A8 texture formats.Tony Wasserka2-3/+48
2014-12-20Pica: Initial support for multitexturing.Tony Wasserka3-24/+83
2014-12-20Clean up some warningsChin1-2/+2
2014-12-19Properly erase/remove an observerchinhodado1-1/+1
2014-12-13Convert old logging calls to new logging macrosYuri Kunde Schlesner10-38/+50
2014-12-12MemMap: Renamed "GSP" heap to "linear", as this is not specific to GSP.bunnei1-2/+2
- Linear simply indicates that the mapped physical address is always MappedVAddr+0x0C000000, thus this memory can be used for hardware devices' DMA (such as the GPU).
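The fixed offset makes translation a single addition. In this sketch the constant is the value quoted in the message above; the names are hypothetical and no range checking is done.

```cpp
#include <cstdint>

// Offset quoted in the commit message.
constexpr std::uint32_t kLinearHeapOffset = 0x0C000000;

// Hypothetical helper: assumes `vaddr` already lies inside the linear heap.
constexpr std::uint32_t LinearVAddrToPAddr(std::uint32_t vaddr) {
    return vaddr + kLinearHeapOffset;
}
```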
2014-12-10GSP: Trigger GPU interrupts at more accurate locations.bunnei2-1/+15
2014-12-10GPU: Fixed bug in command list size decoding.bunnei1-1/+2
2014-12-09Pica: Re-enable command names on MSVC.Tony Wasserka1-5/+0
The affected code is no longer limited by compiler support on that platform.
2014-12-09More coding style fixes.Tony Wasserka1-6/+12
2014-12-09Some code cleanup.Tony Wasserka1-3/+1
2014-12-09citra_qt: Add enhanced texture debugging widgets.Tony Wasserka3-1/+30
Double-clicking a texture parameter command in the pica command lists will spawn these as a new tab in the pica command list dock area.
2014-12-09citra-qt: Add texture viewer to Pica command list.Tony Wasserka2-21/+45
The texture viewer is enabled when selecting a write command to one of the texture config registers.
2014-12-09Pica/DebugUtils: Add breakpoint functionality.Tony Wasserka3-0/+189
2014-12-09Build fix for something which shouldn't have compiled successfully to begin with.Tony Wasserka1-1/+1
2014-12-07Integrate Boost into build system and perform a trivial cleanup in vertex_shader.cpp.Tony Wasserka1-6/+10
2014-12-03Change NULLs to nullptrs.Rohit Nirmal2-7/+7
2014-12-01Silence a few -Wsign-compare warnings.Rohit Nirmal3-6/+6
2014-11-30Fixed viewport error caused by roundingvaguilar1-2/+2
2014-11-19Remove tabs in all files except in skyeye imports and in generated GL codeEmmanuel Gil Peyrot4-14/+14
2014-11-19Remove trailing spaces in every file but the ones imported from SkyEye, AOSP or generatedEmmanuel Gil Peyrot2-3/+3
2014-11-18OpenGL Renderer: Cleanup viewport extent calculation.Tony Wasserka2-44/+29
2014-11-18Fixup EmuWindow interface and implementations thereof.Tony Wasserka1-3/+3
2014-11-18Viewport scaling and display density independenceKevin Hartman2-1/+50
The view is scaled to be as large as possible, without changing the aspect, within the bounds of the window. On "retina" displays, or other displays where window units != pixels, the view should no longer draw incorrectly.
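A minimal sketch of the aspect-preserving fit, assuming all sizes are positive and given in pixels. `Viewport` and `FitViewport` are hypothetical names; converting window units to pixels is left out.

```cpp
struct Viewport {
    int x, y, width, height;
};

// Hypothetical helper: returns the largest centered rectangle with the
// content's aspect ratio that fits inside the window.
inline Viewport FitViewport(int window_w, int window_h, int content_w, int content_h) {
    Viewport vp{};
    // Compare aspect ratios by cross-multiplication to stay in integers.
    const long long lhs = static_cast<long long>(window_w) * content_h;
    const long long rhs = static_cast<long long>(window_h) * content_w;
    if (lhs > rhs) {
        // Window is relatively wider than the content: height is the limit.
        vp.height = window_h;
        vp.width = static_cast<int>(static_cast<long long>(window_h) * content_w / content_h);
    } else {
        // Window is relatively taller (or equal): width is the limit.
        vp.width = window_w;
        vp.height = static_cast<int>(static_cast<long long>(window_w) * content_h / content_w);
    }
    vp.x = (window_w - vp.width) / 2;
    vp.y = (window_h - vp.height) / 2;
    return vp;
}
```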
2014-11-16vertex_shader: Fix control reaches end of function warningLioncash1-1/+1
2014-11-14Fix two format strings.Lioncash1-2/+2
2014-10-30Fix some warningsSean2-3/+3
2014-10-29Renamed source files of services to match port namesGareth Poole1-1/+1
2014-10-26Add `override` keyword through the code.Yuri Kunde Schlesner1-4/+4
This was automated using `clang-modernize`.
2014-10-21Only check OpenGL shader log if size is >1.Yuri Kunde Schlesner1-9/+6
This prevents a crash when the buffer size returned by the driver is 0, in which case no space is allocated to store even the NULL byte and glGetShaderInfoLog errors out. Thanks to @Relys for the bug report.
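The guard can be sketched without a GL context by passing the fetch step in as a callback. `ReadInfoLog` and `fetch` are hypothetical: in real code the length would come from `glGetShaderiv` with `GL_INFO_LOG_LENGTH`, and the callback would wrap `glGetShaderInfoLog`.

```cpp
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical helper: `reported_length` is the log size including the
// terminating null byte; `fetch(size, buffer)` fills the buffer.
inline std::string ReadInfoLog(int reported_length,
                               const std::function<void(int, char*)>& fetch) {
    if (reported_length <= 1) {
        return {}; // empty log: nothing to allocate or fetch
    }
    std::string log(static_cast<std::size_t>(reported_length), '\0');
    fetch(reported_length, &log[0]);
    log.resize(static_cast<std::size_t>(reported_length) - 1); // drop terminator
    return log;
}
```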
2014-10-12Rework OpenGL renderer.Yuri Kunde Schlesner4-233/+193
The OpenGL renderer has been revised, with the following changes:
- Initialization and rendering have been refactored to reduce the number of redundant objects used.
- Framebuffer rotation is now done directly, using texture mapping.
- Vertex coordinates are now given in pixels, and the projection matrix isn't hardcoded anymore.
2014-10-12OpenGL renderer: Shuffle initialization code around and rename functions.Yuri Kunde Schlesner2-25/+18
2014-10-12Remove virtual inheritance from RendererOpenGLYuri Kunde Schlesner2-3/+3
Also make destructor virtual so that instances are properly destructed.
2014-10-08Fix warnings in video_coreLioncash7-23/+23
2014-09-17Common: Rename the File namespace to FileUtil, to match the filename and prevent collisions.Emmanuel Gil Peyrot1-1/+1
2014-09-14Core: Fix warnings in gpu.cppLioncash1-1/+1
2014-09-12Added support for multiple input device types for KeyMap and connected Qt.Kevin Hartman1-0/+1
2014-09-09Moved common_types::Rect from common to Common namespacearchshift2-3/+3
2014-09-07renderer_opengl.cpp: improved alignment for readabilityarchshift1-16/+16
2014-09-07Dead code removal: video_core.cpp, load_symbol_map.cpparchshift1-7/+0
2014-09-07utils: cleaned up DumpTGA, removing redundanciesarchshift2-21/+13
2014-09-01Remove hand-crafted Visual Studio solution.Yuri Kunde Schlesner2-217/+0
2014-09-01CMake cleanupYuri Kunde Schlesner1-13/+26
Several cleanups to the build system:
- Do better factoring of common libs between platforms.
- Add support for building on Windows.
- Remove Qt4 support.
- Re-sort file lists and add missing headers.
2014-09-01Replace GLEW with a glLoadGen loader.Yuri Kunde Schlesner10-13/+2819
This should fix the GL loading errors that occur in some drivers due to the use of deprecated functions by GLEW. Side benefits are more accurate auto-completion (deprecated functions and symbols don't exist) and faster pointer loading (fewer entry points to load). In addition it removes an external library dependency, simplifying the build system a bit and eliminating one set of binary libraries for Windows.
2014-08-28Downgrade GLSL version to 1.50 (compatible with GL 3.2)Yuri Kunde Schlesner3-10/+15
2014-08-26VideoCore: Fixes rendering issues on Qt and corrects framebuffer output size.bunnei4-8/+15
2014-08-26Rewrite of OpenGL renderer, including OS X supportKevin Hartman8-211/+340
Screen contents are now displayed using textured quads. This can be updated to expose an FBO once an OpenGL backend for Pica rendering is being worked on. That FBO's texture can then be applied to the quads. Previously, FBO blitting was used in order to display screen contents, which did not work on OS X. The new textured quad approach is less of a compatibility risk.
2014-08-25Pica/Rasterizer: Clarify a TODO.Tony Wasserka1-1/+3
2014-08-25Pica/VertexShader: Fix a bug in the call stack handling.Tony Wasserka1-2/+3
2014-08-25Math: Warning fixes.Tony Wasserka1-14/+23
2014-08-25Pica: Consolidate the primitive assembly code in PrimitiveAssembly and GeometryDumper.Tony Wasserka5-46/+74
2014-08-25Pica/Rasterizer: Add texturing support.Tony Wasserka3-18/+69
2014-08-25Pica/DebugUtils: Add convenient tev setup printer.Tony Wasserka3-0/+101
2014-08-25Pica/Rasterizer: Add initial implementation of texture combiners.Tony Wasserka2-2/+225
2014-08-25Pica: Add support for dumping textures.Tony Wasserka3-1/+177
2014-08-25Pica/Math: Improved the design of the Vec2/Vec3/Vec4 classes and simplified rasterizer code accordingly.Tony Wasserka3-98/+133
- Swizzlers now return const objects so that things like "first_vec4.xyz() = some_vec3" now will fail to compile (ideally we should support some vector holding references to make this actually work).
- The methods "InsertBeforeX/Y/Z" and "Append" have been replaced by more versions of MakeVec, which now also supports building new vectors from vectors.
- Vector library now follows C++ type promotion rules (hence, the result of Vec2<u8> with another Vec2<u8> is now a Vec2<int>).
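The type promotion point can be illustrated with a small sketch. This is not the vector class from the commit: the `Vec2` below is a hypothetical stand-in with only an `operator+`, written on the assumption that the result's element type is deduced from the built-in operator applied to the operands' element types.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical two-component vector for this sketch only.
template <typename T>
struct Vec2 {
    T x, y;
};

// The element type of the result is whatever the built-in `+` yields for the
// two element types, so the usual C++ integral promotions apply:
// u8 + u8 is computed as int, hence Vec2<u8> + Vec2<u8> is a Vec2<int>.
template <typename T, typename U>
auto operator+(const Vec2<T>& a, const Vec2<U>& b) -> Vec2<decltype(a.x + b.x)> {
    return {a.x + b.x, a.y + b.y};
}
```

One consequence of deducing the result type this way is that sums of small integer components do not wrap around: adding components 200 and 100 held as 8-bit values produces 300 in the resulting `Vec2<int>`.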
2014-08-25Pica/VertexShader: Fix a bug in the bitfield definitions and add the "negate" field for swizzlers.Tony Wasserka2-14/+92
2014-08-25Pica/citra-qt: Replace command list view and command list debugging code with something more sophisticated.Tony Wasserka4-63/+78
2014-08-25Pica/CommandProcessor: Implement parameter masking.Tony Wasserka2-6/+25
2014-08-25Pica: Add debug utilities for dumping shaders.Tony Wasserka4-1/+227
2014-08-25Pica: Add debug utility functions for dumping geometry data.Tony Wasserka6-4/+123
2014-08-24Fix the threading for GL Context in Qt5.Sacha1-1/+0
Connect the emu_thread start/finish to a moveContext slot.
2014-08-13float24: Remove private default constructorarchshift1-2/+0
Fixes building with clang.
2014-08-12Use glewExperimental on Linux in order to fix GLFW-modearchshift1-3/+2
2014-08-12Pica: Add basic rasterizer.Tony Wasserka7-2/+260
2014-08-12Pica: Add triangle clipper.Tony Wasserka7-8/+230
2014-08-12Pica: Add primitive assembly stage.Tony Wasserka7-2/+95
2014-08-12Pica: Add vertex shader implementation.Tony Wasserka7-10/+722
2014-08-12Pica: Implement vertex loading.Tony Wasserka2-8/+102
2014-08-12Pica: Add register definition for vertex loading and rendering.Tony Wasserka1-33/+128
2014-08-12Pica: Add command processor.Tony Wasserka7-5/+107
2014-08-12Pica: Add float24 structure.Tony Wasserka1-0/+75
24-bit floating point values are used internally for calculations on the GPU; however, the current code still emulates them with 32-bit floats. In the future we might want to perform the calculations accurately with the correct bit width, but for now we just wrap the calculations in this class.
2014-08-12Video core: Add utility class for vector operations.Tony Wasserka4-1/+582
I wrote most of this for ppsspp, so I hold full copyright over it. In addition to the original release in ppsspp, this provides functionality to easily extend e.g. two-dimensional vectors to three-dimensional vectors.
2014-08-12Pica/GPU: Change hardware registers to use physical addresses rather than virtual ones.Tony Wasserka2-8/+8
This cleans up the mess that address reading/writing had become and makes the code a *lot* more sensible. This adds a physical<->virtual address converter to mem_map.h. For further accuracy, we will want to properly extend this to support a wider range of address regions. For now, though, this makes simple homebrew applications work well.
2014-08-12Remove the fancy RegisterSet class introduced in 4c2bff61e.Tony Wasserka2-100/+146
While it was some nice and fancy template usage, it ultimately had many practical issues: expressions became very long under regular usage, and common code completion tools could not handle the structures. Instead, we now use a more conventional approach which is much cleaner to use.
2014-08-06GSP: Removed dumb GX prefixes to functions/structs in GSP namespace.bunnei1-6/+6
- Various other cleanups.
2014-07-23Use uniform formatting when printing hexadecimal numbers.Tony Wasserka1-1/+1
2014-07-23GSP: Clean up GX command processing a lot and treat command id as a u8 rather than a u32.Tony Wasserka1-3/+2
Anonymous structs are not standard C++, hence don't use them.
2014-07-23RegisterSet: Simplify code by using structs for register definition instead of unions.Tony Wasserka1-9/+9
2014-07-23GPU: Make use of RegisterSet.Tony Wasserka1-26/+28
2014-07-23Renderer: Fix component order in bottom framebuffer.Tony Wasserka2-5/+4
2014-07-23Renderer: Respect the active_fb GPU register.Tony Wasserka1-2/+9
2014-07-23Renderer: Add a few TODOs.Tony Wasserka1-3/+10
2014-07-22GPU debugger: Don't keep track of debugging data if no debugger views are active.Tony Wasserka1-0/+6
2014-06-12GPU debugger: Const correctness and build fix.Tony Wasserka1-3/+3
2014-06-12Preprocessor: #if's out OSX-specific GL changes on other platformsarchshift1-0/+3
2014-06-12Pica: Use some template magic to define register structures efficiently.Tony Wasserka1-25/+102
2014-06-12Further refine GPU command list debugging.Tony Wasserka2-0/+17
2014-06-12Refine command list debugging functionality and its qt interface.Tony Wasserka2-8/+17
2014-06-12citra-qt: Add command list view.Tony Wasserka1-2/+2
2014-06-12GPU debugger: Add functionality to inspect command lists.Tony Wasserka1-1/+53
2014-06-12video core: added PICA definitions file.Tony Wasserka3-0/+37
2014-06-12Rename LCD to GPU.Tony Wasserka1-3/+3
2014-06-12Add initial graphics debugger interface.Tony Wasserka3-3/+102
2014-05-20common_types: Changed BasicRect back to Rect, in the common namespacearchshift2-3/+3
Only Rect is in the namespace for now; the rest of common should be added in the future
2014-05-20Improved clarity and whitespacearchshift2-3/+4
Changed QGL version to 3,2 in order to be less restrictive, yet it should still change up to 4,1 on OSX on Qt5.
2014-05-20CMakeLists: rename HEADS, improved commentsarchshift1-2/+2
Changes for clarity of comments, removed redundant compiler flags.
2014-05-19Indent fixesarchshift1-31/+31
2014-05-08Update FlipFramebufferSethpaien1-7/+6
Less calculations + fix
2014-05-01Fixed indentsarchshift2-37/+35
2014-05-01Reverse debugging changesarchshift1-2/+0
2014-05-01Unintended change reversalarchshift1-36/+36
2014-05-01TGA dumps work, courtesy of @bunneiarchshift2-36/+38
2014-05-01OpenGL 3+ on OSX with GLFWarchshift1-0/+2
2014-04-29IT'S ALIVE!archshift1-1/+6
2014-04-28Xcode complains that the class name is redundant.archshift1-1/+1