path: root/src/video_core/gpu.h
Columns: date, commit message, author, files changed, lines (-removed/+added)
2023-06-28Memory Tracking: Optimize tracking to only use atomic writes when contested with the host GPUFernando Sahmkow1-0/+4
2023-05-07GPU: Add Reactive flushingFernando Sahmkow1-0/+4
2022-11-24GPU: Implement additional render target formats.Fernando Sahmkow1-7/+7
2022-11-24Fermi2D: Implement Bilinear software filtering and address feedback.Fernando Sahmkow1-2/+2
2022-10-07Update 3D regsKelebek11-9/+41
2022-10-06VideoCore: Refactor fencing system.Fernando Sahmkow1-2/+2
2022-10-06NVDRV: Further refactors and eliminate old code.Fernando Sahmkow1-9/+0
2022-10-06NVDRV: Refactor Host1xFernando Sahmkow1-6/+0
2022-10-06VideoCore: Refactor syncing.Fernando Sahmkow1-3/+16
2022-10-06Texture cache: Fix the remaining issues with memory management and unmapping.Fernando Sahmkow1-0/+2
2022-10-06VideoCore: implement channels on gpu caches.Fernando Sahmkow1-42/+13
2022-10-06NvHost: Remake Ctrl Implementation.Fernando Sahmkow1-1/+1
2022-04-23general: Convert source file copyright comments over to SPDXMorph1-3/+2
This formats all copyright comments according to SPDX formatting guidelines. Additionally, this resolves the remaining GPLv2 only licensed files by relicensing them to GPLv2.0-or-later.
2022-03-25hle: vi: Integrate new NVFlinger and HosBinderDriverServer service.bunnei1-0/+1
2022-01-25gpu: Tidy up forward declarationsLioncash1-10/+0
Over time a few forward declarations became unnecessary, so we can remove these to tidy up the header a little bit.
2022-01-25gpu: Remove obsoleted CDMAPusher() accessorsLioncash1-6/+0
These were obsoleted in 2c47f8aa1886522898b5b3a73185b5662be3e9f3 but were accidentally overlooked.
2022-01-04gpu: Add shut down method to synchronize threads before destructionameerj1-0/+3
2022-01-04Revert "Merge pull request #7668 from ameerj/fence-stop-token"ameerj1-2/+1
This reverts commit e7733544779f2706d108682dd027d44e7fa5ff4b, reversing changes made to abbbdc2bc027ed7af236625ae8427a46df63f7e7.
2022-01-03gpu: Use std::stop_token in WaitFence for VSync threadameerj1-1/+2
Fixes a hang that may occur when stopping emulation and the VSync thread is blocked on the syncpoint condition variable.
2021-12-02Support multiple videos playingFeng Chen1-2/+2
2021-11-17video_core: Add S8_UINT stencil formatMorph1-0/+1
2021-10-03nvhost_ctrl: Refactor usage of gpu.LockSync()ameerj1-12/+0
This seems to only be used to protect a later gpu function call. So we can move the lock into that call instead.
2021-10-03gpu: Migrate implementation to the cpp fileameerj1-190/+27
2021-09-16gpu: Use std::jthread for async gpu threadameerj1-3/+0
2021-05-29video_core: gpu: WaitFence: Do not block threads during shutdown.bunnei1-0/+2
- Fixes a hang on shutdown when NVFlinger thread is waiting on a syncpoint that will never occur.
- Commonly observed when stopping emulation in Super Mario Odyssey.
2021-05-16perf_stats: Rework FPS counter to be more accurateameerj1-0/+2
The FPS counter was based on metrics in the nvdisp swapbuffers call. This metric would be accurate if the gpu thread/renderer were synchronous with the nvdisp service, but that's no longer the case.
This commit moves the frame counting responsibility onto the concrete renderers after their frame draw calls, resulting in more meaningful metrics.
The displayed FPS is now made up of the average framerate between the previous and most recent update, in order to avoid distracting FPS counter updates when framerate is oscillating between close values.
The status bar update interval was also changed from 2 seconds to 500ms.
2021-04-25nvhost_vic: Fix device closureameerj1-2/+2
Implements the OnClose method of the nvhost_vic device, and removes the remnants of an older implementation. Also cleans up some of the surrounding code.
2021-04-07video_core/gpu_thread: Implement a ShutDown method.Markus Wick1-2/+2
This was implicitly done by `is_powered_on = false`, however the explicit method allows us to block until the GPU is actually gone. This should fix a race condition while removing the other subsystems while the GPU is still active.
2021-03-30nvdrv: Cleanup CDMA Processor on device closureChloe Marcec1-0/+3
Brings us a step closer to unifying all channels to share a common interface.
2021-02-13gpu: Report renderer errors with exceptionsReinUsesLisp1-0/+1
Instead of using a two step initialization to report errors, initialize the GPU renderer and rasterizer on the constructor and report errors through std::runtime_error.
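A minimal sketch of the pattern described here, assuming hypothetical class names (`RendererSketch`, `GpuSketch`) and a made-up `device_available` flag standing in for real backend initialization: the constructor either yields a fully usable object or throws `std::runtime_error`, so callers need no separate `Init()` step or status check.

```cpp
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical renderer; the flag stands in for real device probing.
class RendererSketch {
public:
    explicit RendererSketch(bool device_available) {
        if (!device_available) {
            throw std::runtime_error("no suitable graphics device");
        }
    }
};

// Hypothetical GPU wrapper: owning a renderer means it was built successfully.
class GpuSketch {
public:
    explicit GpuSketch(bool device_available)
        : renderer{std::make_unique<RendererSketch>(device_available)} {}

private:
    std::unique_ptr<RendererSketch> renderer;
};

// Returns an empty string on success, or the error text on failure.
std::string TryCreateGpu(bool device_available) {
    try {
        [[maybe_unused]] GpuSketch gpu{device_available};
        return {};
    } catch (const std::runtime_error& e) {
        return e.what();
    }
}
```

With this shape there is no half-initialized state to guard against: if construction fails, no object exists.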
2021-01-15common/common_funcs: Rename INSERT_UNION_PADDING_{BYTES,WORDS} to _NOINITReinUsesLisp1-4/+4
INSERT_PADDING_BYTES_NOINIT is more descriptive of the underlying behavior.
2020-12-29video_core: gpu: Implement synchronous mode using threaded GPU.bunnei1-2/+2
2020-12-29video_core: gpu: Refactor out synchronous/asynchronous GPU implementations.bunnei1-36/+19
- We must always use a GPU thread now, even with synchronous GPU.
2020-12-04video_core: Resolve more variable shadowing scenariosLioncash1-6/+6
Resolves variable shadowing scenarios up to the end of the OpenGL code to make it nicer to review. The rest will be resolved in a following commit.
2020-11-17gpu: Make use of [[nodiscard]] where applicableLioncash1-31/+35
2020-11-05General: Fix clang buildLioncash1-1/+1
Allows building on clang to work again
2020-11-01video_core: gpu: Implement WaitFence and IncrementSyncPoint.bunnei1-4/+21
2020-10-27video_core: NVDEC Implementationameerj1-3/+20
This commit aims to implement the NVDEC (Nvidia Decoder) functionality, with video frame decoding being handled by the FFmpeg library.
The process begins with Ioctl commands being sent to the NVDEC and VIC (Video Image Composer) emulated devices. These allocate the necessary GPU buffers for the frame data, along with providing information on the incoming video data. A Submit command then signals the GPU to process and decode the frame data.
To decode the frame, the respective codec's header must be manually composed from the information provided by NVDEC, then sent with the raw frame data to the FFmpeg library.
Currently, H264 and VP9 are supported, with VP9 having some minor artifacting issues related mainly to the reference frame composition in its uncompressed header.
Async GPU is not properly implemented at the moment.
Co-Authored-By: David <25727384+ogniK5377@users.noreply.github.com>
2020-09-06video_core: Remove all Core::System references in rendererReinUsesLisp1-2/+1
Now that the GPU is initialized when video backends are initialized, it's no longer needed to query components once the game is running: it can be done when yuzu is booting. This allows us to pass components between constructors and in the process remove all Core::System references in the video backend.
2020-08-22video_core: Initialize renderer with a GPUReinUsesLisp1-6/+7
Add an extra step in GPU initialization to be able to initialize render backends with a valid GPU instance.
2020-07-26video_core/gpu: Correct the size of the puller registersBilly Laws1-2/+2
The puller register array is made up of u32s, but the `NUM_REGS` value is the size in bytes, so switch it to avoid making the struct unnecessarily large. Also fix a small typo in a comment.
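The size mismatch can be illustrated with a small sketch. The byte size `0x100` and the struct names are assumptions chosen for the example, not the values in the real header; the point is only that using a byte count as the element count of a `u32` array quadruples the storage.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using u32 = std::uint32_t;

// Assumed for illustration: the register file spans 0x100 bytes.
constexpr std::size_t REGS_SIZE_BYTES = 0x100;
// Element count derived from the byte size.
constexpr std::size_t NUM_REGS = REGS_SIZE_BYTES / sizeof(u32);

// Correct: NUM_REGS elements of 4 bytes each cover the register file exactly.
struct PullerRegsSketch {
    std::array<u32, NUM_REGS> reg_array{};
};

// Mistake being fixed: a byte count used as the element count.
struct OversizedRegsSketch {
    std::array<u32, REGS_SIZE_BYTES> reg_array{};
};

static_assert(sizeof(PullerRegsSketch) == REGS_SIZE_BYTES);
static_assert(sizeof(OversizedRegsSketch) == 4 * REGS_SIZE_BYTES);
```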
2020-07-17async shadersDavid Marcec1-0/+11
2020-07-13video_core: Rearrange pixel format namesReinUsesLisp1-43/+43
Normalizes pixel format names to match Vulkan names. Previous to this commit pixel formats had no convention, leading to confusion and potential bugs.
2020-07-13video_core: Implement RGBA32_SINT render targetReinUsesLisp1-0/+1
2020-07-13video_core: Implement RGBA32_SINT render targetReinUsesLisp1-0/+1
2020-07-13video_core: Implement RGBA16_SINT render targetReinUsesLisp1-0/+1
2020-07-13video_core: Implement RGBA8_SINT render targetReinUsesLisp1-0/+1
2020-07-13video_core: Implement RG32_SINT render targetReinUsesLisp1-0/+1
2020-07-13video_core: Implement RG8_SINT render target and fix RG8_UINTReinUsesLisp1-0/+1
2020-07-13video_core: Implement R8_SINT render targetReinUsesLisp1-0/+1
2020-07-13video_core: Implement R8_SNORM render targetReinUsesLisp1-0/+1
2020-06-27General: Correct rebase, sync gpu and context management.Fernando Sahmkow1-0/+6
2020-04-30texture: Implement R8G8UIMorph1-0/+1
- Used by The Walking Dead: The Final Season
2020-04-23Clang Format.Fernando Sahmkow1-2/+4
2020-04-23DMAPusher: Propagate multimethod writes into the engines.Fernando Sahmkow1-1/+7
2020-04-22Address Feedback.Fernando Sahmkow1-3/+9
2020-04-22GPU: Implement Flush Requests for Async mode.Fernando Sahmkow1-0/+21
2020-04-22OpenGL: Implement Fencing backend.Fernando Sahmkow1-1/+1
2020-04-22GPU: Delay Fences.Fernando Sahmkow1-0/+1
2020-04-22GPU: Refactor synchronization on Async GPUFernando Sahmkow1-0/+1
2020-04-06GPU: Setup Flush/Invalidate to use VAddr instead of CacheAddrFernando Sahmkow1-3/+3
2020-03-25Frontend/GPU: Refactor context managementJames Rowe1-3/+15
Changes the GraphicsContext to be managed by the GPU core. This eliminates the need for the frontends to fool around with tricky MakeCurrent/DoneCurrent calls that are dependent on the settings (such as async gpu option).
This also refactors out the need to use QWidget::fromWindowContainer as that caused issues with focus and input handling. Now we use a regular QWidget and just access the native windowHandle() directly.
Another change is removing the debug tool setting in FrameMailbox. Instead of trying to block the frontend until a new frame is ready, the core will now take over presentation and draw directly to the window if the renderer detects that it's hooked by NSight or RenderDoc.
Lastly, since it was in the way, I removed ScopeAcquireWindowContext and replaced it with a simple subclass in GraphicsContext that achieves the same result.
2020-03-13video_core: Implement RGBA16_SNORMReinUsesLisp1-0/+1
Implement RGBA16_SNORM with the current API. Nothing special here.
2020-02-25video_core/surface: Add R32_SINT render target formatReinUsesLisp1-0/+1
2020-02-25video_core/gpu: Remove unused functionsReinUsesLisp1-6/+0
2020-02-10GPU: Implement GPU Clock correctly.Fernando Sahmkow1-0/+2
2019-12-30video_core: Block in WaitFence.Markus Wick1-1/+4
This function is called rarely and blocks quite often for a long time. So don't waste power and let the CPU sleep. This might also increase the performance as the other cores might be allowed to clock higher.
2019-11-04common_func: Use std::array for INSERT_PADDING_* macros.bunnei1-4/+4
- Zero initialization here is useful for determinism.
2019-10-05Core: Wait for GPU to be idle before shutting down.Fernando Sahmkow1-0/+3
2019-10-05GPU_Async: Correct fences, display events and more.Fernando Sahmkow1-0/+3
This commit uses guest fences on the vSync event instead of the artificial fake fence we had. It also keeps signaling display events while the game is loading, as the OS is supposed to send buffers to vSync during that time.
2019-09-22video_core: Implement RGBX16F PixelFormatFearlessTobi1-0/+1
2019-08-30video_core: Silence miscellaneous warnings (#2820)Rodrigo Locatti1-1/+1
* texture_cache/surface_params: Remove unused local variable
* rasterizer_interface: Add missing documentation commentary
* maxwell_dma: Remove unused rasterizer reference
* video_core/gpu: Sort member declaration order to silence -Wreorder warning
* fermi_2d: Remove unused MemoryManager reference
* video_core: Silence unused variable warnings
* buffer_cache: Silence -Wreorder warnings
* kepler_memory: Remove unused MemoryManager reference
* gl_texture_cache: Add missing override
* buffer_cache: Add missing include
* shader/decode: Remove unused variables
2019-08-21Video_Core: Implement a new Buffer CacheFernando Sahmkow1-0/+4
2019-08-21renderer_opengl: Implement RGB565 framebuffer formatReinUsesLisp1-0/+1
2019-08-21renderer_opengl: Use VideoCore pixel formatReinUsesLisp1-5/+0
2019-08-21gpu: Change optional<reference_wrapper<T>> to T* for FramebufferConfigReinUsesLisp1-2/+1
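A brief sketch of what this kind of signature change looks like, using a hypothetical `FramebufferConfigSketch` struct and helper functions that are not part of the real header. Both forms express "an optional, non-owning reference"; the raw pointer version simply uses `nullptr` for the empty case.

```cpp
#include <cstdint>
#include <functional>
#include <optional>

// Hypothetical, reduced stand-in for a framebuffer description.
struct FramebufferConfigSketch {
    std::uint32_t width = 0;
    std::uint32_t height = 0;
};

// Before: optional reference expressed through std::reference_wrapper.
std::uint32_t PixelCountOld(
    std::optional<std::reference_wrapper<const FramebufferConfigSketch>> framebuffer) {
    if (!framebuffer) {
        return 0;
    }
    return framebuffer->get().width * framebuffer->get().height;
}

// After: a plain non-owning pointer, where nullptr means "no framebuffer".
std::uint32_t PixelCount(const FramebufferConfigSketch* framebuffer) {
    if (framebuffer == nullptr) {
        return 0;
    }
    return framebuffer->width * framebuffer->height;
}
```

The pointer form drops two headers and the `.get()` indirection while keeping the same ownership semantics.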
2019-07-26GPU: Flush commands on every dma pusher step.Fernando Sahmkow1-0/+2
This commit ensures that the host gpu is constantly fed with commands to work with, while the guest gpu keeps producing the rest of the commands. This reduces syncing time between host and guest gpu.
2019-07-18GPU: Add missing puller methods.Fernando Sahmkow1-1/+8
This adds some missing puller methods. We don't assert on them, as they are no-ops for us.
2019-07-15gl_rasterizer: Implement compute shadersReinUsesLisp1-0/+6
2019-07-05NVServices: Styling, define constructors as explicit and correctionsFernando Sahmkow1-11/+7
2019-07-05NVServices: Make NVEvents Automatic according to documentation.Fernando Sahmkow1-1/+1
2019-07-05GPU: Correct Interrupts to interrupt on syncpt/value instead of event, mirroring hardwareFernando Sahmkow1-10/+4
2019-07-05nv_host_ctrl: Make Sync GPU variant always return synced result.Fernando Sahmkow1-1/+7
2019-07-05Gpu: use an std mutex instead of a spin_lock to guard syncpointsFernando Sahmkow1-4/+4
2019-07-05Gpu: Mark areas as protected.Fernando Sahmkow1-0/+11
2019-07-05nv_services: Stub CtrlEventSignalFernando Sahmkow1-1/+3
2019-07-05Gpu: Implement Hardware Interrupt Manager and manage GPU interruptsFernando Sahmkow1-3/+2
2019-07-05video_core: Implement GPU side SyncpointsFernando Sahmkow1-0/+24
2019-04-12video_core/gpu: Create threads separately from initializationLioncash1-0/+5
Like with CPU emulation, we generally don't want to fire off the threads immediately after the relevant classes are initialized, we want to do this after all necessary data is done loading first. This splits the thread creation into its own interface member function to allow controlling when these threads in particular get created.
2019-03-27video_core/gpu: Amend typo in GPU member variable nameLioncash1-3/+3
smaphore -> semaphore
2019-03-21gpu: Rewrite virtual memory manager using PageTable.bunnei1-3/+3
2019-03-15gpu: Use host address for caching instead of guest address.bunnei1-3/+8
2019-03-07video_core/gpu: Make GPU's destructor virtualLioncash1-1/+1
Because of the recent separation of GPU functionality into sync/async variants, we need to mark the destructor virtual to provide proper destruction behavior, given we use the base class within the System class. Prior to this, it was undefined behavior whether or not the destructor in the derived classes would ever execute.
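The effect of the virtual destructor can be shown with a small sketch; the class names are hypothetical and only mirror the base/derived split described above. Deleting a derived object through a base pointer is well defined only when the base destructor is virtual.

```cpp
#include <memory>

// Hypothetical base class, held by the owner through a base pointer.
class GpuBaseSketch {
public:
    // Virtual, so deleting through GpuBaseSketch* runs the derived destructor.
    virtual ~GpuBaseSketch() = default;
};

// Hypothetical derived variant that records when its destructor runs.
class GpuAsyncSketch final : public GpuBaseSketch {
public:
    explicit GpuAsyncSketch(bool& destroyed_flag) : destroyed{destroyed_flag} {}
    ~GpuAsyncSketch() override { destroyed = true; }

private:
    bool& destroyed;
};

// Returns true if the derived destructor ran when the base pointer was released.
bool DerivedDestructorRuns() {
    bool destroyed = false;
    {
        std::unique_ptr<GpuBaseSketch> gpu = std::make_unique<GpuAsyncSketch>(destroyed);
    }
    return destroyed;
}
```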
2019-03-07gpu: Refactor a/synchronous implementations into their own classes.bunnei1-15/+11
2019-03-07gpu: Move command processing to another thread.bunnei1-4/+18
2019-03-07gpu: Refactor command and swap buffers interface for asynch.bunnei1-3/+12
2019-03-07gpu: Refactor to take RendererBase instead of RasterizerInterface.bunnei1-15/+19
2019-02-27common/math_util: Move contents into the Common namespaceLioncash1-1/+1
These types are within the common library, so they should be within the Common namespace.
2019-02-16video_core: Remove usages of System::GetInstance() within the enginesLioncash1-2/+5
Avoids the use of the global accessor in favor of explicitly making the system a dependency within the interface.
2019-02-10kepler_compute: Fixup assert and rename enginesReinUsesLisp1-3/+3
When I originally added the compute assert I used the wrong documentation. This addresses that. The dispatch register was tested with homebrew against hardware and is triggered by some games (e.g. Super Mario Odyssey). What exactly is missing to get a valid program bound by this engine requires more investigation.
2019-02-09Implement BGRA8 framebuffer formatgreggameplayer1-0/+1
2019-01-30video_core/GPU Implemented the GPU PFIFO puller semaphore operations. (#1908)Kevin1-0/+71
* Implemented the puller semaphore operations.
* Nit: Fix 2 style issues
* Nit: Add Break to default case.
* Fix style.
* Update for comments. Added ReferenceCount method
* Forgot to remove GpuSmaphoreAddress union.
* Fix the clang-format issues.
* More clang formatting.
* two more white spaces for the Clang formatting.
* Move puller members into the regs union
* Updated to use Memory::WriteBlock instead of Memory::Write*
* Fix clang style issues
* White space clang error
* Removing unused functions and other pr comment
* Removing unused functions and other pr comment
* More union magic for setting regs value.
* union magic refcnt as well
* Remove local var
* Set up the regs and regs_assert_positions up properly
* Fix clang error
2018-11-27gpu: Rewrite GPU command list processing with DmaPusher class.bunnei1-2/+25
- More accurate impl., fixes Undertale (among other games).
2018-09-15Implement RenderTargetFormat::BGR5A1_UNORM (Pokken Tournament DX)raven021-0/+1
2018-09-12GPU: Basic implementation of the Kepler Inline Memory engine (p2mf).Subv1-0/+3
This engine writes data from a FIFO register into the configured address.
2018-09-10video_core: Refactor command_processor.Markus Wick1-3/+0
Inline the WriteReg helper as it is called ~20k times per frame.
2018-09-10video_core: Move command buffer loop.Markus Wick1-1/+3
This moves the hot loop into video_core. This refactoring shall reduce the CPU overhead of calling ProcessCommandList.
2018-09-10gl_rasterizer_cache: Implement RenderTargetFormat::BGRA8_SRGB.bunnei1-0/+1
- Used by Octopath Traveler (with multiple render targets).
2018-09-04command_processor: Use std::array for bound_engines.Markus Wick1-2/+2
subchannel is a 3 bit field. So there must not be more than 8 bound engines. And using a hashmap for up to 8 values is a bit overpowered.
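The reasoning above can be sketched as a fixed-size table indexed by the subchannel. The engine IDs and class name below are hypothetical; the only fact taken from the commit message is that a 3-bit subchannel allows at most 8 bindings, so a `std::array` of 8 entries suffices.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical engine identifiers; None marks an unbound subchannel.
enum class EngineIDSketch : std::uint32_t { None = 0, Maxwell3D = 1, Fermi2D = 2 };

// A 3-bit subchannel field can take 2^3 = 8 distinct values.
constexpr std::size_t NUM_SUBCHANNELS = 1u << 3;

class EngineBindingsSketch {
public:
    void Bind(std::uint32_t subchannel, EngineIDSketch engine) {
        // Mask to 3 bits so the index always stays inside the array.
        bound_engines[subchannel & (NUM_SUBCHANNELS - 1)] = engine;
    }

    EngineIDSketch Lookup(std::uint32_t subchannel) const {
        return bound_engines[subchannel & (NUM_SUBCHANNELS - 1)];
    }

private:
    // Value-initialized, so every slot starts as EngineIDSketch::None.
    std::array<EngineIDSketch, NUM_SUBCHANNELS> bound_engines{};
};
```

Compared with a hash map, lookups become a single indexed load with no hashing or allocation, which matters on a path that runs for every command.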
2018-08-28gpu: Make memory_manager privateLioncash1-3/+9
Makes the class interface consistent and provides accessors for obtaining a reference to the memory manager instance. Given we also return references, this makes our more flimsy uses of const apparent, given const doesn't propagate through pointers in the way one would typically expect. This makes our mutable state more apparent in some places.
2018-08-20Implemented RGBA8_UINTDavid Marcec1-0/+1
Needed by Kirby
2018-08-14renderer_opengl: Implement RenderTargetFormat::RGBA16_UNORM.bunnei1-0/+1
- Used by Breath of the Wild.
2018-08-13Implement RG32UI and R32UIDavid Marcec1-0/+2
Needed for Xenoblade
2018-08-13renderer_opengl: Implement RenderTargetFormat::RGBA16_UINT.bunnei1-0/+1
- Used by Breath of the Wild.
2018-08-13renderer_opengl: Implement RenderTargetFormat::RG8_UNORM.bunnei1-0/+1
- Used by Breath of the Wild.
2018-08-12Implement R8_UINT RenderTargetFormat & PixelFormat (#1014)greggameplayer1-0/+1
- Used by Go Vacation
2018-08-12gl_rasterizer: Implement render target format RG8_SNORM.bunnei1-0/+1
- Used by Super Mario Odyssey.
2018-08-12gl_rasterizer: Implement render target format RGBA8_SNORM.bunnei1-0/+1
- Used by Super Mario Odyssey.
2018-08-11Implement R16S & R16UI & R16I RenderTargetFormats & PixelFormats and more (R16_UNORM needed by Fate Extella) (#848)greggameplayer1-0/+7
* Implement R16S & R16UI & R16I RenderTargetFormats & PixelFormats
  Do a separate function in order to get Bytes Per Pixel of DepthFormat
  Apply the new function in gpu.h
  delete unneeded white space
* correct merging error
2018-08-11video_core: Get rid of global g_toggle_framelimit_enabled variableLioncash1-9/+1
Instead, we make a struct for renderer settings and allow the renderer to update all of these settings, getting rid of the need for global-scoped variables. This also uncovered a few indirect inclusions for certain headers, which this commit also fixes.
2018-08-08gl_rasterizer_cached: Implement RenderTargetFormat::B5G6R5_UNORM.bunnei1-0/+1
- Used by Super Mario Odyssey.
2018-08-04video_core: Eliminate the g_renderer global variableLioncash1-1/+5
We move the initialization of the renderer to the core class, while keeping the creation of it and any other specifics in video_core. This way we can ensure that the renderer is initialized and doesn't give unfettered access to the renderer. This also makes dependencies on types more explicit. For example, the GPU class doesn't need to depend on the existence of a renderer, it only needs to care about whether or not it has a rasterizer, but since it was accessing the global variable, it was also making the renderer a part of its dependency chain. By adjusting the interface, we can get rid of this dependency.
2018-08-01Implement R32_FLOAT RenderTargetFormatUnknown1-0/+1
2018-07-26GPU: Allow using R16F as a render target format.Subv1-0/+1
2018-07-26Implement R16_G16Unknown1-0/+5
correct trailing white spaces
Delete tabs
correct placement
Add RG16F & RG16UI & RG16I & RG16S PixelFormats
Return correct data according to changes done previously
correct PixelFormat declaration
correct coding style error
correct coding style error part 2
correct RG16S Declaration error
correct alignment
2018-07-25GPU: Implemented the Z32_S8_X24 depth buffer format.Subv1-0/+1
2018-07-25GPU: Allow the usage of R8 as a render target format.Subv1-0/+1
2018-07-24gl_rasterizer_cache: Implement RenderTargetFormat RG32_FLOAT.bunnei1-0/+1
2018-07-24gl_rasterizer_cache: Implement RenderTargetFormat BGRA8_UNORM.bunnei1-0/+1
2018-07-21gpu: Rename Get3DEngine() to Maxwell3D()Lioncash1-5/+4
This makes it match its const qualified equivalent.
2018-07-18vi: Partially implement buffer crop parameters.bunnei1-0/+1
2018-07-02GPU: Implemented the Z24S8 depth format and load the depth framebuffer.Subv1-0/+9
2018-06-30GPU: Implemented the RGBA32_UINT rendertarget format.Subv1-0/+1
2018-06-12GPU: Partially implemented the Maxwell DMA engine.Subv1-0/+3
Only tiled->linear and linear->tiled copies that have no offset are supported for now. Queries are not supported. Swizzled copies are not supported.
2018-06-06GPU: Implemented the R11FG11FB10F texture and rendertarget formats.Subv1-0/+1
2018-06-06GPU: Allow the usage of RGBA32_FLOAT in the texture copy engine.Subv1-0/+1
2018-04-25GPU: Added a function to retrieve the bytes per pixel of the render target formats.Subv1-0/+3
2018-04-25GPU: Move the Maxwell3D macro uploading code to the inside of the Maxwell3D processor.Subv1-7/+0
It doesn't belong in the PFIFO handler.
2018-04-18gpu: Add several framebuffer formats to RenderTargetFormat.bunnei1-0/+3
2018-03-27graphics_surface: Fix merge conflicts.bunnei1-0/+1
2018-03-27maxwell: Add RenderTargetFormat enum.bunnei1-1/+1
2018-03-24Frontend: Updated the surface view debug widget to work with Maxwell surfaces.Subv1-0/+4
2018-03-24Frontend: Ported the GPU breakpoints and surface viewer widgets from citra.Subv1-0/+5
2018-03-23renderer_opengl: Better handling of framebuffer transform flags.bunnei1-1/+4
2018-03-23video_core: Move FramebufferInfo to FramebufferConfig in GPU.bunnei1-0/+29
2018-03-23gpu: Expose Maxwell3D engine.bunnei1-0/+4
2018-03-18GPU: Move the GPU's class constructor and destructors to a cpp file.Subv1-10/+8
This should reduce recompile times when editing the Maxwell3D register structure.
2018-03-18GPU: Store uploaded GPU macros and keep track of the number of method parameters.Subv1-1/+9
2018-03-18GPU: Macros are specific to the Maxwell3D engine, so handle them internally.Subv1-3/+0
2018-03-17GPU: Process command mode 5 (IncreaseOnce) differently from other commands.Subv1-0/+3
Accumulate all arguments before calling the desired method. Note: Maybe we should do the same for the NonIncreasing mode?
2018-02-12GPU: Partially implemented the QUERY_* registers in the Maxwell3D engine.Subv1-1/+1
Only QueryMode::Write is supported at the moment.
2018-02-12Make a GPU class in VideoCore to contain the GPU state.Subv1-0/+55
Also moved the GPU MemoryManager class to video_core since it makes more sense for it to be there.