Forked from silvesthu/GPUOptimizationForGameDev.md
Created September 3, 2024 11:46
Revisions
silvesthu revised this gist
May 19, 2024 . 1 changed file with 2 additions and 1 deletion.
@@ -246,6 +246,7 @@
- [GPU Specs Database by techpowerup](https://www.techpowerup.com/gpu-specs/)
- [GPU database by Matthäus G. Chajdas](https://db.thegpu.guru/)
- [GPUInfo by Sascha Willems](https://www.gpuinfo.org/) For Vulkan, OpenGL, OpenGL ES
- [D3d12infoDB by Dmytro Bulatov](https://d3d12infodb.boolka.dev/index.html) Database based on D3d12info in Tools section below
- (JP) [GPU Spec Database by HYPERでんち](https://dench.flatlib.jp/start#hardware)

### Tools
@@ -273,4 +274,4 @@
- [PerfTest](https://github.com/sebbbi/perftest) A simple GPU shader memory operation performance test tool. Results on a wide range of GPUs are already available
- [D3d12info by Adam Sawicki](https://github.com/sawickiap/D3d12info) Get GPU information through DXGI and Direct3D 12 (D3D12) + AMD AGS, NVAPI, WinAPI, and some other sources

Thanks JoseEmilio-ARM for ARM part.
silvesthu revised this gist
May 12, 2024 . 1 changed file with 18 additions and 8 deletions.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters. Learn more about bidirectional Unicode charactersOriginal file line number Diff line number Diff line change @@ -21,6 +21,7 @@ - [Blog](https://therealmjp.github.io/) - 2018 - [Breaking Down Barriers](https://therealmjp.github.io/posts/breaking-down-barriers-part-1-whats-a-barrier/) - 2021 - [The Shader Permutation Problem](https://therealmjp.github.io/posts/shader-permutations-part1/) - 2022 - [GPU Memory Pools in D3D12](https://therealmjp.github.io/posts/gpu-memory-pool/) - Louis Bavoil [@louisbavoil](https://twitter.com/louisbavoil) - [D3D11 Vendor Hacks](https://docs.google.com/spreadsheets/d/1J_HIRVlYK8iI4u6AJrCeb66L5W36UDkd9ExSCku9s_o/edit#gid=0) - 2018 - [The Peak-Performance-Percentage Analysis Method for Optimizing Any GPU Workload](https://devblogs.nvidia.com/the-peak-performance-analysis-method-for-optimizing-any-gpu-workload/) @@ -34,7 +35,7 @@ - 2018 - [Understanding GPU context rolls](https://gpuopen.com/understanding-gpu-context-rolls/) - Michal Drobot [@michaldrobot](https://twitter.com/michaldrobot) - [Blog](https://michaldrobot.com/) - 2014 - [Low Level Optimizations for GCN – Digital Dragons 2014](https://michaldrobot.com/2014/05/12/low-level-optimizations-for-gcn-digital-dragons-2014-slides/) - [Video](https://www.youtube.com/watch?v=Bmy3Tt3Ottc) - 2014 - [GCN Execution Patterns in Full Screen Passes](https://michaldrobot.com/2014/04/01/gcn-execution-patterns-in-full-screen-passes/) - 2014 - [ShaderFastLibs](https://github.com/michaldrobot/ShaderFastLibs) - Kostas Anagnostou [@KostasAAA](https://twitter.com/KostasAAA) @@ -151,13 +152,6 @@ - 2020 - ["RDNA 2" Instruction Set Architecture](https://developer.amd.com/wp-content/resources/RDNA2_Shader_ISA_November2020.pdf) - 2020 - [RDNA2 Performance Guide](https://gpuopen.com/performance/) - 2022 
- ["RDNA3" Instruction Set Architecture](https://www.amd.com/content/dam/amd/en/documents/radeon-tech-docs/instruction-set-architectures/rdna3-shader-instruction-set-architecture-feb-2023_0.pdf) - OpenCL - 2013 - [AMD Accelerated Parallel Processing OpenCL Programming Guide](https://web.archive.org/web/20220426031939/http://developer.amd.com/wordpress/media/2013/07/AMD_Accelerated_Parallel_Processing_OpenCL_Programming_Guide-rev-2.7.pdf) - Radeon GPU Analyzer / Radeon Raytracing Analyzer @@ -166,6 +160,14 @@ - 2019 - [USING RADEON™ GPU ANALYZER WITH DIRECT3D®12 COMPUTE](https://gpuopen.com/learn/radeon-gpu-analyzer-2-2-direct3d12-compute/) - 2022 - [Visualizing VGPR Pressure with Radeon™ GPU Analyzer 2.6](https://gpuopen.com/learn/visualizing-vgpr-pressure-with-rga-2-6/) - 2022 - [Improving raytracing performance with the Radeon™ Raytracing Analyzer (RRA)](https://gpuopen.com/learn/improving-rt-perf-with-rra/) - Driver Stack - [User Mode Driver for Vulkan (AMDVLK) by AMD](https://github.com/GPUOpen-Drivers/AMDVLK) - [Vulkan API Layer (XGL)](https://github.com/GPUOpen-Drivers/xgl) - [LLVM-Based Pipeline Compiler (LLPC)](https://github.com/GPUOpen-Drivers/llpc#llvm-based-pipeline-compiler-llpc) - [Platform Abstraction Library (PAL)](https://github.com/GPUOpen-Drivers/pal) - [User Mode Driver for Vulkan (RADV) by Mesa](https://docs.mesa3d.org/drivers/radv.html) - [Kernel Mode Driver by Linux, libdrm_amdgpu](https://github.com/torvalds/linux/tree/master/drivers/gpu/drm/amd) - [Micro engine scheduler (MES) firmware](https://gpuopen.com/download/documentation/micro_engine_scheduler.pdf) - Nvidia - [Developer Blog](https://developer.nvidia.com/blog) and Talks - 2012 - [GPU Performance Analysis and Optimization](http://on-demand.gputechconf.com/gtc/2012/presentations/S0514-GTC2012-GPU-Performance-Analysis.pdf) @@ -197,6 +199,11 @@ - 2023 - [Tuning CUDA Applications for NVIDIA Ada GPU Architecture](https://docs.nvidia.com/cuda/ada-tuning-guide/index.html) - CUDA - 2014 - 
[CUDA Pro Tip: Optimized Filtering with Warp-Aggregated Atomics](https://devblogs.nvidia.com/cuda-pro-tip-optimized-filtering-warp-aggregated-atomics/) - Driver Stack - [User Mode Driver for Vulkan (NVK) by Mesa](https://docs.mesa3d.org/drivers/nvk.html) - [Kernel Mode Driver by Linux (Nouveau)](https://github.com/torvalds/linux/tree/master/drivers/gpu/drm/nouveau) - [Kernel Mode Driver by Nvidia](https://github.com/NVIDIA/open-gpu-kernel-modules) - [Documentation of NVIDIA chip/hardware interfaces (open-gpu-doc)](https://github.com/nvidia/open-gpu-doc) - Intel - [Gamedev](https://software.intel.com/en-us/gamedev) - [Intel® Processor Graphics: Architecture & Programming](https://doc.lagout.org/electronics/Intel-Graphics-Architecture-ISA-and-microarchitecture.pdf) @@ -207,6 +214,9 @@ - [Vulkan Samples](https://github.com/KhronosGroup/Vulkan-Samples) - 2019 - [Optimising a AAA Vulkan Title on Desktop](https://www.youtube.com/watch?v=Hc_i6X3qU08) - 2020 - [Vulkan Ray Tracing Best Practices for Hybrid Rendering](https://www.khronos.org/blog/vulkan-ray-tracing-best-practices-for-hybrid-rendering) - Apple - 2023 - [Explore GPU advancements in M3 and A17 Pro](https://developer.apple.com/videos/play/tech-talks/111375) - 2023 - [Learn performance best practices for Metal shaders](https://developer.apple.com/videos/play/tech-talks/111373) - Arm - [Introducing the Arm architecture](https://developer.arm.com/documentation/102404/0201) - [Arm GPU Best Practices Developer Guide](https://developer.arm.com/documentation/101897/0301) -
silvesthu revised this gist
May 11, 2024 . 1 changed file with 8 additions and 5 deletions.
@@ -117,7 +117,7 @@
- AMD
- [GPU Open](https://gpuopen.com/) and Talks
- [Events Presentations](https://gpuopen.com/events/)
- [AMD GPU architecture programming documentation (Instruction Set Architecture)](https://gpuopen.com/amd-gpu-architecture-programming-documentation/)
- 2014 - [Vertex Shader Tricks](https://www.slideshare.net/DevCentralAMD/vertex-shader-tricks-bill-bilodeau)
- 2016 - [Leveraging asynchronous queues for concurrent execution](https://gpuopen.com/concurrent-execution-asynchronous-queues/)
- 2016 - [AMD GCN Assembly: Cross-Lane Operations](https://gpuopen.com/learn/amd-gcn-assembly-cross-lane-operations/)
@@ -151,10 +151,13 @@
- 2020 - ["RDNA 2" Instruction Set Architecture](https://developer.amd.com/wp-content/resources/RDNA2_Shader_ISA_November2020.pdf)
- 2020 - [RDNA2 Performance Guide](https://gpuopen.com/performance/)
- 2022 - ["RDNA3" Instruction Set Architecture](https://www.amd.com/content/dam/amd/en/documents/radeon-tech-docs/instruction-set-architectures/rdna3-shader-instruction-set-architecture-feb-2023_0.pdf)
- From application (CPU) to hardware (GPU)
- [AMD Open Source Driver for Vulkan](https://github.com/GPUOpen-Drivers/AMDVLK) User Mode Driver (Vulkan), amdvlk64
- [Vulkan API Layer (XGL)](https://github.com/GPUOpen-Drivers/xgl)
- [LLVM-Based Pipeline Compiler (LLPC)](https://github.com/GPUOpen-Drivers/llpc#llvm-based-pipeline-compiler-llpc)
- [Platform Abstraction Library (PAL)](https://github.com/GPUOpen-Drivers/pal)
- [linux/drivers/gpu/drm/amd/](https://github.com/torvalds/linux/tree/master/drivers/gpu/drm/amd) Kernel Mode Driver (Linux), libdrm_amdgpu
- [Micro engine scheduler (MES) firmware](https://gpuopen.com/download/documentation/micro_engine_scheduler.pdf)
- OpenCL
- 2013 - [AMD Accelerated Parallel Processing OpenCL Programming Guide](https://web.archive.org/web/20220426031939/http://developer.amd.com/wordpress/media/2013/07/AMD_Accelerated_Parallel_Processing_OpenCL_Programming_Guide-rev-2.7.pdf)
- Radeon GPU Analyzer / Radeon Raytracing Analyzer
silvesthu revised this gist
Jan 19, 2024 . 1 changed file with 15 additions and 5 deletions.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters. Learn more about bidirectional Unicode charactersOriginal file line number Diff line number Diff line change @@ -22,13 +22,13 @@ - 2018 - [Breaking Down Barriers](https://therealmjp.github.io/posts/breaking-down-barriers-part-1-whats-a-barrier/) - 2021 - [The Shader Permutation Problem](https://therealmjp.github.io/posts/shader-permutations-part1/) - Louis Bavoil [@louisbavoil](https://twitter.com/louisbavoil) - [D3D11 Vendor Hacks](https://docs.google.com/spreadsheets/d/1J_HIRVlYK8iI4u6AJrCeb66L5W36UDkd9ExSCku9s_o/edit#gid=0) - 2018 - [The Peak-Performance-Percentage Analysis Method for Optimizing Any GPU Workload](https://devblogs.nvidia.com/the-peak-performance-analysis-method-for-optimizing-any-gpu-workload/) - 2018 - [Fixing the Hyperdrive: Maximizing Rendering Performance on NVIDIA GPUs (Presented by NVIDIA)](https://www.gdcvault.com/play/1024810/Fixing-the-Hyperdrive-Maximizing-Rendering) - 2019 - [Optimizing DX12/DXR GPU Workloads using Nsight Graphics: GPU Trace and the Peak-Performance-Percentage (P3) Method (Presented by NVIDIA)](https://www.gdcvault.com/browse/gdc-19/play/1026202) - 2020 - [Optimizing Compute Shaders for L2 Locality using Thread-Group ID Swizzling](https://developer.nvidia.com/blog/optimizing-compute-shaders-for-l2-locality-using-thread-group-id-swizzling/) - 2021 - Dana Elifaz - [The Next Level of Optimization Advice with Nsight Graphics: GPU Trace](https://www.nvidia.com/en-us/on-demand/session/gtcspring21-e32683/) - 2022 - [(GDC Paywall) Optimizing Ray Tracing GPU Workloads using Nsight Graphics: GPU Trace and Nsight Systems](https://www.gdcvault.com/search.php#&category=free&firstfocus=&keyword=Optimizing+Ray%2BTracing%2BGPU%2BWorkloads%2Busing%2BNsight%2BGraphics) - Rys Sommefeldt [@ryszu](https://twitter.com/ryszu) - 
[Blog](https://rys.sommefeldt.com/) - 2018 - [Understanding GPU context rolls](https://gpuopen.com/understanding-gpu-context-rolls/) @@ -56,7 +56,6 @@ - Maurizio Cerrato [@speedwago](https://twitter.com/speedwago) - 2019 - [GPU Architectures](https://drive.google.com/file/d/12ahbqGXNfY3V-1Gj5cvne2AH4BFWZHGD/view) - Sebastian Aaltonen [@SebAaltonen](https://twitter.com/SebAaltonen) - 2017 - [Optimizing GPU occupancy and resource usage with large thread groups](https://gpuopen.com/learn/optimizing-gpu-occupancy-resource-usage-large-thread-groups/) - 2018 - [DD2018: Sebastian Aaltonen - GPU based clay simulation and ray tracing tech in Claybook](https://www.youtube.com/watch?v=Xpf7Ua3UqOA) - 2018 - [This is how I managed to port Claybook from consoles to ~4x slower handheld](https://threadreaderapp.com/thread/1076765876148490240.html) @@ -70,7 +69,7 @@ - [Blog](https://fgiesen.wordpress.com/) - 2010 - [Finish your derivations, please](https://fgiesen.wordpress.com/2010/10/21/finish-your-derivations-please/) - 2011 - [A trip through the Graphics Pipeline 2011](https://fgiesen.wordpress.com/2011/07/09/a-trip-through-the-graphics-pipeline-2011-index/) - Timothy Lottes [@NOTimothyLottes](https://twitter.com/NOTimothyLottes) - 2016 - [Understanding Memory Coalescing on GCN](https://gpuopen.com/gcn-memory-coalescing/) - 2017 - [ADVANCED SHADER PROGRAMMING ON GCN](https://gpuopen.com/wp-content/uploads/2017/03/GDC2017-Advanced-Shader-Programming-On-GCN.pdf) - 2018 - [Engine Optimization Hot Lap](https://slideplayer.com/slide/17173687/) @@ -106,6 +105,12 @@ - Jendrik Illner [@jendrikillner](https://twitter.com/jendrikillner) - [Blog](https://www.jendrikillner.com/) - [Graphics Programming Weekly Article Database](https://www.jendrikillner.com/article_database/) Not specifically on optimization. Have a search bar. 
- Hans-Kristian [@Themaister](https://twitter.com/Themaister) - [Blog](https://themaister.net/blog/) - 2024 - [Modernizing Granite’s mesh rendering](https://themaister.net/blog/2024/01/17/modernizing-granites-mesh-rendering/) - Graham Wihlidal [@gwihlidal](https://twitter.com/gwihlidal) - [Blog](https://www.wihlidal.com/blog/) - 2016 - [Optimizing the Graphics Pipeline With Compute](https://www.gdcvault.com/play/1023109/Optimizing-the-Graphics-Pipeline-With) ### By Organization @@ -206,7 +211,6 @@ - [Accelerating 2D Applications](https://developer.arm.com/solutions/graphics/developer-guides/accelerating-2d-applications) - GDC - [Advanced Graphics Summit](https://gdcvault.com/search.php#&category=free&firstfocus=&keyword=advanced+graphics%2Bsummit) Not specifically on optimization - (JP) CEDEC - 2016 - [GPU最適化入門](https://www.slideshare.net/ssuser2e676d/gpu-65502505) - (Book) [マンガとイラストでわかる! GPU最適化入門](https://www.amazon.co.jp/%E3%83%9E%E3%83%B3%E3%82%AC%E3%81%A8%E3%82%A4%E3%83%A9%E3%82%B9%E3%83%88%E3%81%A7%E3%82%8F%E3%81%8B%E3%82%8B-GPU%E6%9C%80%E9%81%A9%E5%8C%96%E5%85%A5%E9%96%80-%E5%B0%8F%E5%8F%A3-%E8%B2%B4%E5%BC%98/dp/4862465048) @@ -215,11 +219,16 @@ - 2009 - [From Shader Code to a Teraflop: How Shader Cores Work](https://web.archive.org/web/20181008131455/http://s09.idav.ucdavis.edu/talks/02_kayvonf_gpuArchTalk09.pdf) - 2020 - [LOW-LEVEL OPTIMIZATIONS IN THE LAST OF US PART II](https://www.naughtydog.com/blog/naughty_dog_at_siggraph_2020) - CMU - 2017 - [Parallel Computer Architecture and Programming](http://15418.courses.cs.cmu.edu/spring2017/home), [Tsinghua ver. 
with video](http://15418.courses.cs.cmu.edu/tsinghua2017/home) ### Game Graphics Study - [Graphics Studies Compilation](http://www.adriancourreges.com/blog/2020/12/29/graphics-studies-compilation/) ### GPU Crash Debugging - 2018 - [Aftermath: Advances in GPU Crash Debugging](https://www.youtube.com/watch?v=VaGcs5-W6S4) - 2020 - (JP) [Device Removal の処方箋](https://cedil.cesa.or.jp/cedil_sessions/view/2258), [補足資料](https://shikihuiku.github.io/post/cedec2020_prescriptions_for_deviceremoval/) - 2023 - [GPU Crash Debugging in Unreal Engine: Tools, Techniques, and Best Practices | Unreal Fest 2023](https://www.youtube.com/watch?v=CyrGLMmVUAI) ### GPU Database - [GPU Specs Database by techpowerup](https://www.techpowerup.com/gpu-specs/) - [GPU database by Matthäus G. Chajdas](https://db.thegpu.guru/) @@ -234,6 +243,7 @@ - [GPUView](https://docs.microsoft.com/en-us/windows-hardware/drivers/display/using-gpuview) - [PIX](https://devblogs.microsoft.com/pix/introduction/) - [Adding performance instrumentation for PIX APIs](https://www.youtube.com/watch?v=ICM56FI97Ts) - [DRED](https://devblogs.microsoft.com/directx/dred/), [D3DDred.js](https://github.com/Microsoft/DirectX-Debugging-Tools) - Nvidia - [NVIDIA Developer Tools](https://developer.nvidia.com/tools-overview) - [NVIDIA Nsight Graphics](https://developer.nvidia.com/nsight-graphics) - [NVIDIA Nsight Aftermath SDK](https://developer.nvidia.com/nsight-aftermath) -
silvesthu revised this gist
Jan 17, 2024 . 1 changed file with 1 addition and 0 deletions.
@@ -135,6 +135,7 @@
- 2020 - [CONCURRENCY MODEL IN EXPLICIT GRAPHICS APIS](https://gpuopen.com/wp-content/uploads/2020/06/GPUOpen_Concurrency_vTUM.pdf)
- 2021 - [Understanding Graphs in Radeon GPU Profiler and GPUView](https://gpuopen.com/learn/understanding-graphs-in-radeon-gpu-profiler-and-gpuview/)
- 2023 - [Occupancy explained](https://gpuopen.com/learn/occupancy-explained/)
- 2024 - [Mesh shaders: optimization and best practices](https://gpuopen.com/learn/mesh_shaders/mesh_shaders-optimization_and_best_practices/)
- GCN
- 2013 - [GCN3 Instruction Set Architecture](http://developer.amd.com/wordpress/media/2013/12/AMD_GCN3_Instruction_Set_Architecture_rev1.1.pdf)
- 2019 - [AMD GCN ISA: a first dive](https://giordi91.github.io/post/vegaisa/)
silvesthu revised this gist
Jan 14, 2024 . 1 changed file with 2 additions and 3 deletions.
@@ -138,17 +138,16 @@
- GCN
- 2013 - [GCN3 Instruction Set Architecture](http://developer.amd.com/wordpress/media/2013/12/AMD_GCN3_Instruction_Set_Architecture_rev1.1.pdf)
- 2019 - [AMD GCN ISA: a first dive](https://giordi91.github.io/post/vegaisa/)
- RDNA
- 2019 - [INTRODUCING RDNA ARCHITECTURE](https://www.amd.com/system/files/documents/rdna-whitepaper.pdf)
- 2019 - [RDNA Architecture](https://gpuopen.com/wp-content/uploads/2019/08/RDNA_Architecture_public.pdf)
- 2020 - ["RDNA 1.0" Instruction Set Architecture](https://developer.amd.com/wp-content/resources/RDNA_Shader_ISA.pdf)
- 2020 - ["RDNA 2" Instruction Set Architecture](https://developer.amd.com/wp-content/resources/RDNA2_Shader_ISA_November2020.pdf)
- 2020 - [RDNA2 Performance Guide](https://gpuopen.com/performance/)
- 2022 - ["RDNA3" Instruction Set Architecture](https://www.amd.com/content/dam/amd/en/documents/radeon-tech-docs/instruction-set-architectures/rdna3-shader-instruction-set-architecture-feb-2023_0.pdf)
- Github
- [Platform Abstraction Library (PAL)](https://github.com/GPUOpen-Drivers/pal)
- [GPUOpen-Drivers/pal on Github](https://github.com/GPUOpen-Drivers/pal)
- [AMD Open Source Driver for Vulkan](https://github.com/GPUOpen-Drivers/AMDVLK)
- OpenCL
- 2013 - [AMD Accelerated Parallel Processing OpenCL Programming Guide](https://web.archive.org/web/20220426031939/http://developer.amd.com/wordpress/media/2013/07/AMD_Accelerated_Parallel_Processing_OpenCL_Programming_Guide-rev-2.7.pdf)
silvesthu revised this gist
Jan 13, 2024 . 1 changed file with 1 addition and 0 deletions.
@@ -68,6 +68,7 @@
- 2014 - [Real-time Rendering Blogs](http://svenandersson.se/2014/realtime-rendering-blogs.html)
- Fabian Giesen [@rygorous](https://twitter.com/rygorous)
- [Blog](https://fgiesen.wordpress.com/)
- 2010 - [Finish your derivations, please](https://fgiesen.wordpress.com/2010/10/21/finish-your-derivations-please/)
- 2011 - [A trip through the Graphics Pipeline 2011](https://fgiesen.wordpress.com/2011/07/09/a-trip-through-the-graphics-pipeline-2011-index/)
- Timothy Lottes
- 2016 - [Understanding Memory Coalescing on GCN](https://gpuopen.com/gcn-memory-coalescing/)
silvesthu revised this gist
Jan 13, 2024 . 1 changed file with 5 additions and 3 deletions.
@@ -243,8 +243,10 @@
- [Radeon GPU Detective](https://gpuopen.com/radeon-gpu-detective/)
- Intel
- [Intel Graphics Performance Analyzers](https://software.intel.com/content/www/us/en/develop/tools/graphics-performance-analyzers.html)
- Other related tools
- [RenderDoc](https://renderdoc.org/) Graphics debugger that allows quick and easy single-frame capture and detailed introspection
- [APITrace](https://apitrace.github.io/) Trace OpenGL, Direct3D, and DirectDraw APIs calls to a file and replay
- [PerfTest](https://github.com/sebbbi/perftest) A simple GPU shader memory operation performance test tool. Results on a wide range of GPUs are already available
- [D3d12info by Adam Sawicki](https://github.com/sawickiap/D3d12info) Get GPU information through DXGI and Direct3D 12 (D3D12) + AMD AGS, NVAPI, WinAPI, and some other sources

Thanks JoseEmilio-ARM for ARM part.
silvesthu revised this gist
Jan 12, 2024 . 1 changed file with 3 additions and 0 deletions.
@@ -102,6 +102,9 @@
- Anton Schreiner [@antonschrein](https://twitter.com/antonschrein)
- [Blog](https://aschrein.github.io/)
- 2019 - [What's up with my branch on GPU?](https://aschrein.github.io/jekyll/update/2019/06/13/whatsup-with-my-branches-on-gpu.html)
- Jendrik Illner [@jendrikillner](https://twitter.com/jendrikillner)
- [Blog](https://www.jendrikillner.com/)
- [Graphics Programming Weekly Article Database](https://www.jendrikillner.com/article_database/) Not specifically on optimization. Have a search bar.

### By Organization
silvesthu revised this gist
Jan 11, 2024 . 1 changed file with 8 additions and 8 deletions.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters. Learn more about bidirectional Unicode charactersOriginal file line number Diff line number Diff line change @@ -51,7 +51,6 @@ - 2018 - [Introduction to compute shaders](https://anteru.net/blog/2018/intro-to-compute-shaders/) - 2018 - [More compute shaders](https://anteru.net/blog/2018/more-compute-shaders/) - 2018 - [Even more compute shaders](https://anteru.net/blog/2018/even-more-compute-shaders/) - Matthijs De Smedt [@anji_nl](https://twitter.com/anji_nl) - 2016 - [PC GPU Performance Hot Spots](https://developer.nvidia.com/pc-gpu-performance-hot-spots) - Maurizio Cerrato [@speedwago](https://twitter.com/speedwago) @@ -61,7 +60,6 @@ - 2017 - [Optimizing GPU occupancy and resource usage with large thread groups](https://gpuopen.com/learn/optimizing-gpu-occupancy-resource-usage-large-thread-groups/) - 2018 - [DD2018: Sebastian Aaltonen - GPU based clay simulation and ray tracing tech in Claybook](https://www.youtube.com/watch?v=Xpf7Ua3UqOA) - 2018 - [This is how I managed to port Claybook from consoles to ~4x slower handheld](https://threadreaderapp.com/thread/1076765876148490240.html) - Layla Mah [@MissQuickstep](https://twitter.com/missquickstep) - 2013 - [The AMD GCN Architecture - A Crash Course](https://www.slideshare.net/DevCentralAMD/gs4106-the-amd-gcn-architecture-a-crash-course-by-layla-mah) - 2013 - [Powering the Next Generation of Graphics: The AMD GCN Architecture](https://www.gdcvault.com/play/1019294/Powering-the-Next-Generation-of) @@ -203,13 +201,13 @@ - [Principles of High Performance](https://developer.arm.com/solutions/graphics/developer-guides/principles-of-high-performance) - [Accelerating 2D Applications](https://developer.arm.com/solutions/graphics/developer-guides/accelerating-2d-applications) - GDC - [Advanced Graphics 
Summit](https://gdcvault.com/search.php#&category=free&firstfocus=&keyword=advanced+graphics%2Bsummit) Not specifically on optimization - 2016 - [Optimizing the Graphics Pipeline With Compute](https://www.gdcvault.com/play/1023109/Optimizing-the-Graphics-Pipeline-With) - (JP) CEDEC - 2016 - [GPU最適化入門](https://www.slideshare.net/ssuser2e676d/gpu-65502505) - (Book) [マンガとイラストでわかる! GPU最適化入門](https://www.amazon.co.jp/%E3%83%9E%E3%83%B3%E3%82%AC%E3%81%A8%E3%82%A4%E3%83%A9%E3%82%B9%E3%83%88%E3%81%A7%E3%82%8F%E3%81%8B%E3%82%8B-GPU%E6%9C%80%E9%81%A9%E5%8C%96%E5%85%A5%E9%96%80-%E5%B0%8F%E5%8F%A3-%E8%B2%B4%E5%BC%98/dp/4862465048) - Siggraph - [Advances in Real-Time Rendering in Games](https://advances.realtimerendering.com/) Not specifically on optimization - 2009 - [From Shader Code to a Teraflop: How Shader Cores Work](https://web.archive.org/web/20181008131455/http://s09.idav.ucdavis.edu/talks/02_kayvonf_gpuArchTalk09.pdf) - 2020 - [LOW-LEVEL OPTIMIZATIONS IN THE LAST OF US PART II](https://www.naughtydog.com/blog/naughty_dog_at_siggraph_2020) - CMU @@ -220,13 +218,14 @@ ### GPU Database - [GPU Specs Database by techpowerup](https://www.techpowerup.com/gpu-specs/) - [GPU database by Matthäus G. 
Chajdas](https://db.thegpu.guru/) - [GPUInfo by Sascha Willems](https://www.gpuinfo.org/) For Vulkan, OpenGL, OpenGL ES - (JP) [GPU Spec Database by HYPERでんち](https://dench.flatlib.jp/start#hardware) ### Tools - Online Shader Compiler - [Compiler Explorer (godbolt)](https://godbolt.org/) Support DXC, AMD RGA - [Shader Playground](http://shader-playground.timjones.io/) Support DXC, FXC, glslang, hlsl2glsl, hlslparser, IntelShaderAnalyzer, AMD RGA, slang, XShaderCompiler - Microsoft - [GPUView](https://docs.microsoft.com/en-us/windows-hardware/drivers/display/using-gpuview) - [PIX](https://devblogs.microsoft.com/pix/introduction/) @@ -242,6 +241,7 @@ - Intel - [Intel Graphics Performance Analyzers](https://software.intel.com/content/www/us/en/develop/tools/graphics-performance-analyzers.html) - Utility - [PerfTest](https://github.com/sebbbi/perftest) A simple GPU shader memory operation performance test tool - [D3d12info by Adam Sawicki](https://github.com/sawickiap/D3d12info) Get GPU information through DXGI and Direct3D 12 (D3D12) + AMD AGS, NVAPI, WinAPI, and some other sources Thanks JoseEmilio-ARM for ARM part. -
silvesthu revised this gist
Jan 11, 2024 . 1 changed file with 5 additions and 1 deletion.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters. Learn more about bidirectional Unicode charactersOriginal file line number Diff line number Diff line change @@ -45,6 +45,7 @@ - 2020 - [WHAT IS SHADER OCCUPANCY AND WHY DO WE CARE ABOUT IT?](https://interplayoflight.wordpress.com/2020/11/11/what-is-shader-occupancy-and-why-do-we-care-about-it/) - 2020 - [TO Z-PREPASS OR NOT TO Z-PREPASS](https://interplayoflight.wordpress.com/2020/12/21/to-z-prepass-or-not-to-z-prepass/) - 2022 - [SHADER TIPS AND TRICKS](https://interplayoflight.wordpress.com/2022/01/22/shader-tips-and-tricks/) - 2023 - [LOW-LEVEL THINKING IN HIGH-LEVEL SHADING LANGUAGES 2023](https://interplayoflight.wordpress.com/2023/12/29/low-level-thinking-in-high-level-shading-languages-2023/) - Matthäus G. Chajdas [@NIV_Anteru](https://twitter.com/niv_anteru) - [Blog](https://anteru.net/) - 2018 - [Introduction to compute shaders](https://anteru.net/blog/2018/intro-to-compute-shaders/) @@ -148,7 +149,7 @@ - [Platform Abstraction Library (PAL)](https://github.com/GPUOpen-Drivers/pal) - [AMD Open Source Driver for Vulkan](https://github.com/GPUOpen-Drivers/AMDVLK) - OpenCL - 2013 - [AMD Accelerated Parallel Processing OpenCL Programming Guide](https://web.archive.org/web/20220426031939/http://developer.amd.com/wordpress/media/2013/07/AMD_Accelerated_Parallel_Processing_OpenCL_Programming_Guide-rev-2.7.pdf) - Radeon GPU Analyzer / Radeon Raytracing Analyzer - 2017 - [Live VGPR Analysis with Radeon™ GPU Analyzer](https://gpuopen.com/learn/live-vgpr-analysis-radeon-gpu-analyzer/) - 2019 - [USING RADEON™ GPU ANALYZER WITH DIRECTX®12 GRAPHICS](https://gpuopen.com/learn/radeon-gpu-analyzer-2-3-direct3d-12-graphics/) @@ -229,6 +230,7 @@ - Microsoft - [GPUView](https://docs.microsoft.com/en-us/windows-hardware/drivers/display/using-gpuview) - 
[PIX](https://devblogs.microsoft.com/pix/introduction/) - [Adding performance instrumentation for PIX APIs](https://www.youtube.com/watch?v=ICM56FI97Ts) - Nvidia - [NVIDIA Developer Tools](https://developer.nvidia.com/tools-overview) - [NVIDIA Nsight Graphics](https://developer.nvidia.com/nsight-graphics) - [NVIDIA Nsight Aftermath SDK](https://developer.nvidia.com/nsight-aftermath) @@ -239,5 +241,7 @@ - [Radeon GPU Detective](https://gpuopen.com/radeon-gpu-detective/) - Intel - [Intel Graphics Performance Analyzers](https://software.intel.com/content/www/us/en/develop/tools/graphics-performance-analyzers.html) - Utility - [D3d12info by Adam Sawicki](https://github.com/sawickiap/D3d12info), get GPU information through DXGI and Direct3D 12 (D3D12) + AMD AGS, NVAPI, WinAPI, and some other sources Thanks JoseEmilio-ARM for ARM part. -
silvesthu revised this gist
Jan 1, 2024 . 1 changed file with 4 additions and 3 deletions.
@@ -60,6 +60,7 @@
- 2017 - [Optimizing GPU occupancy and resource usage with large thread groups](https://gpuopen.com/learn/optimizing-gpu-occupancy-resource-usage-large-thread-groups/)
- 2018 - [DD2018: Sebastian Aaltonen - GPU based clay simulation and ray tracing tech in Claybook](https://www.youtube.com/watch?v=Xpf7Ua3UqOA)
- 2018 - [This is how I managed to port Claybook from consoles to ~4x slower handheld](https://threadreaderapp.com/thread/1076765876148490240.html)
- 2020 - [PerfTest](https://github.com/sebbbi/perftest) A simple GPU shader memory operation performance test tool
- Layla Mah [@MissQuickstep](https://twitter.com/missquickstep)
- 2013 - [The AMD GCN Architecture - A Crash Course](https://www.slideshare.net/DevCentralAMD/gs4106-the-amd-gcn-architecture-a-crash-course-by-layla-mah)
- 2013 - [Powering the Next Generation of Graphics: The AMD GCN Architecture](https://www.gdcvault.com/play/1019294/Powering-the-Next-Generation-of)
@@ -216,9 +217,9 @@

### Game Graphics Study
- [Graphics Studies Compilation](http://www.adriancourreges.com/blog/2020/12/29/graphics-studies-compilation/)

### GPU Database
- [GPU Specs Database by techpowerup](https://www.techpowerup.com/gpu-specs/)
- [GPUInfo by Sascha Willems](https://www.gpuinfo.org/) for Vulkan, OpenGL, OpenGL ES
- (JP) [GPU Spec Database by HYPERでんち](https://dench.flatlib.jp/start#hardware)

### Tools
silvesthu revised this gist
Jan 1, 2024 . 1 changed file with 10 additions and 4 deletions.
@@ -149,9 +149,11 @@
  - OpenCL
    - [AMD Accelerated Parallel Processing OpenCL Programming Guide](http://developer.amd.com/wordpress/media/2013/07/AMD_Accelerated_Parallel_Processing_OpenCL_Programming_Guide-rev-2.7.pdf)
  - Radeon GPU Analyzer / Radeon Raytracing Analyzer
    - 2017 - [Live VGPR Analysis with Radeon™ GPU Analyzer](https://gpuopen.com/learn/live-vgpr-analysis-radeon-gpu-analyzer/)
    - 2019 - [USING RADEON™ GPU ANALYZER WITH DIRECTX®12 GRAPHICS](https://gpuopen.com/learn/radeon-gpu-analyzer-2-3-direct3d-12-graphics/)
    - 2019 - [USING RADEON™ GPU ANALYZER WITH DIRECT3D®12 COMPUTE](https://gpuopen.com/learn/radeon-gpu-analyzer-2-2-direct3d12-compute/)
    - 2022 - [Visualizing VGPR Pressure with Radeon™ GPU Analyzer 2.6](https://gpuopen.com/learn/visualizing-vgpr-pressure-with-rga-2-6/)
    - 2022 - [Improving raytracing performance with the Radeon™ Raytracing Analyzer (RRA)](https://gpuopen.com/learn/improving-rt-perf-with-rra/)
- Nvidia
  - [Developer Blog](https://developer.nvidia.com/blog) and Talks
    - 2012 - [GPU Performance Analysis and Optimization](http://on-demand.gputechconf.com/gtc/2012/presentations/S0514-GTC2012-GPU-Performance-Analysis.pdf)

@@ -169,14 +171,18 @@
    - 2023 - [Advanced API Performance: Shaders](https://developer.nvidia.com/blog/advanced-api-performance-shaders/)
  - Pascal
    - 2016 - [NVIDIA GeForce GTX 1080 Whitepaper](http://international.download.nvidia.com/geforce-com/international/pdfs/GeForce_GTX_1080_Whitepaper_FINAL.pdf)
    - 2023 - [Tuning CUDA Applications for Pascal](https://docs.nvidia.com/cuda/pascal-tuning-guide/index.html)
  - Turing
    - 2018 - [NVIDIA Turing Architecture In-Depth](https://devblogs.nvidia.com/nvidia-turing-architecture-in-depth/)
    - 2018 - [NVIDIA TURING GPU ARCHITECTURE](https://images.nvidia.com/aem-dam/en-zz/Solutions/design-visualization/technologies/turing-architecture/NVIDIA-Turing-Architecture-Whitepaper.pdf)
    - 2023 - [Tuning CUDA Applications for Turing](https://docs.nvidia.com/cuda/turing-compatibility-guide/index.html)
  - Ampere
    - 2020 - [NVIDIA AMPERE GA102 GPU ARCHITECTURE](https://images.nvidia.com/aem-dam/en-zz/Solutions/geforce/ampere/pdf/NVIDIA-ampere-GA102-GPU-Architecture-Whitepaper-V1.pdf)
    - 2023 - [Tuning CUDA Applications for NVIDIA Ampere GPU Architecture](https://docs.nvidia.com/cuda/ampere-tuning-guide/index.html)
  - Ada
    - 2022 - [NVIDIA ADA GPU ARCHITECTURE](https://images.nvidia.com/aem-dam/Solutions/geforce/ada/nvidia-ada-gpu-architecture.pdf)
    - 2022 - [SHADER EXECUTION REORDERING](https://developer.nvidia.com/sites/default/files/akamai/gameworks/ser-whitepaper.pdf)
    - 2023 - [Tuning CUDA Applications for NVIDIA Ada GPU Architecture](https://docs.nvidia.com/cuda/ada-tuning-guide/index.html)
  - CUDA
    - 2014 - [CUDA Pro Tip: Optimized Filtering with Warp-Aggregated Atomics](https://devblogs.nvidia.com/cuda-pro-tip-optimized-filtering-warp-aggregated-atomics/)
- Intel
silvesthu revised this gist
Dec 26, 2023 . 1 changed file with 4 additions and 1 deletion.
@@ -96,6 +96,7 @@
- Bart Wronski [@BartWronsk](https://twitter.com/BartWronsk)
  - [Blog](https://bartwronski.com/)
  - 2014 - [GCN – two ways of latency hiding and wave occupancy](https://bartwronski.com/2014/03/27/gcn-two-ways-of-latency-hiding-and-wave-occupancy/)
  - 2021 - [Is this a branch?](https://bartwronski.com/2021/01/18/is-this-a-branch/)
- Elizabeth Baumel [@Icetigris](https://twitter.com/icetigris)
  - 2016 - [Tightening Up the Graphics: Tools and Techniques for Debugging and Optimization](https://docs.google.com/presentation/d/1LQUMIld4SGoQVthnhT1scoA3k4Sg0as14G4NeSiSgFU/)
- Anton Schreiner [@antonschrein](https://twitter.com/antonschrein)

@@ -128,6 +129,8 @@
    - Curing Amnesia and Other GPU Maladies With AMD Developer Tools
    - Radeon™ ProRender Full Spectrum Rendering 2.0: The Universal Rendering API
  - 2020 - [CONCURRENCY MODEL IN EXPLICIT GRAPHICS APIS](https://gpuopen.com/wp-content/uploads/2020/06/GPUOpen_Concurrency_vTUM.pdf)
  - 2021 - [Understanding Graphs in Radeon GPU Profiler and GPUView](https://gpuopen.com/learn/understanding-graphs-in-radeon-gpu-profiler-and-gpuview/)
  - 2023 - [Occupancy explained](https://gpuopen.com/learn/occupancy-explained/)
- GCN
  - 2013 - [GCN3 Instruction Set Architecture](http://developer.amd.com/wordpress/media/2013/12/AMD_GCN3_Instruction_Set_Architecture_rev1.1.pdf)
  - 2019 - [AMD GCN ISA: a first dive](https://giordi91.github.io/post/vegaisa/)

@@ -139,7 +142,7 @@
  - 2020 - ["RDNA 1.0" Instruction Set Architecture](https://developer.amd.com/wp-content/resources/RDNA_Shader_ISA.pdf)
  - 2020 - ["RDNA 2" Instruction Set Architecture](https://developer.amd.com/wp-content/resources/RDNA2_Shader_ISA_November2020.pdf)
  - 2020 - [RDNA2 Performance Guide](https://gpuopen.com/performance/)
  - 2022 - ["RDNA3" Instruction Set Architecture](https://www.amd.com/content/dam/amd/en/documents/radeon-tech-docs/instruction-set-architectures/rdna3-shader-instruction-set-architecture-feb-2023_0.pdf)
- Driver
  - [Platform Abstraction Library (PAL)](https://github.com/GPUOpen-Drivers/pal)
  - [AMD Open Source Driver for Vulkan](https://github.com/GPUOpen-Drivers/AMDVLK)
silvesthu revised this gist
Sep 28, 2023 . 1 changed file with 3 additions and 1 deletion.
@@ -213,7 +213,9 @@
- (JP) [GPU Spec Database by HYPERでんち](https://dench.flatlib.jp/start#hardware)

### Tools

- Online Shader Compiler
  - [Compiler Explorer (godbolt)](https://godbolt.org/), support DXC, AMD RGA
  - [Shader Playground](http://shader-playground.timjones.io/), support DXC, FXC, glslang, hlsl2glsl, hlslparser, IntelShaderAnalyzer, AMD RGA, slang, XShaderCompiler
- Microsoft
  - [GPUView](https://docs.microsoft.com/en-us/windows-hardware/drivers/display/using-gpuview)
  - [PIX](https://devblogs.microsoft.com/pix/introduction/)
silvesthu revised this gist
Sep 2, 2023 . 1 changed file with 1 addition and 0 deletions.
@@ -163,6 +163,7 @@
  - 2021 - [Advanced API Performance](https://developer.nvidia.com/blog/tag/advanced-api-performance/)
  - 2022 - [Best Practices for Using NVIDIA RTX Ray Tracing (Updated)](https://developer.nvidia.com/blog/best-practices-for-using-nvidia-rtx-ray-tracing-updated/)
  - 2023 - [Practical Tips for Optimizing Ray Tracing](https://developer.nvidia.com/blog/practical-tips-for-optimizing-ray-tracing/)
  - 2023 - [Advanced API Performance: Shaders](https://developer.nvidia.com/blog/advanced-api-performance-shaders/)
- Pascal
  - 2016 - [NVIDIA GeForce GTX 1080 Whitepaper](http://international.download.nvidia.com/geforce-com/international/pdfs/GeForce_GTX_1080_Whitepaper_FINAL.pdf)
- Turing
silvesthu revised this gist
Sep 2, 2023 . 1 changed file with 2 additions and 1 deletion.
@@ -191,12 +191,13 @@
  - [Principles of High Performance](https://developer.arm.com/solutions/graphics/developer-guides/principles-of-high-performance)
  - [Accelerating 2D Applications](https://developer.arm.com/solutions/graphics/developer-guides/accelerating-2d-applications)
- GDC
  - [Advanced Graphics Summit](https://gdcvault.com/search.php#&category=free&firstfocus=&keyword=advanced+graphics%2Bsummit) talks, not specifically on optimization
  - 2016 - [Optimizing the Graphics Pipeline With Compute](https://www.gdcvault.com/play/1023109/Optimizing-the-Graphics-Pipeline-With)
- (JP) CEDEC
  - 2016 - [GPU最適化入門](https://www.slideshare.net/ssuser2e676d/gpu-65502505)
  - (Book) [マンガとイラストでわかる! GPU最適化入門](https://www.amazon.co.jp/%E3%83%9E%E3%83%B3%E3%82%AC%E3%81%A8%E3%82%A4%E3%83%A9%E3%82%B9%E3%83%88%E3%81%A7%E3%82%8F%E3%81%8B%E3%82%8B-GPU%E6%9C%80%E9%81%A9%E5%8C%96%E5%85%A5%E9%96%80-%E5%B0%8F%E5%8F%A3-%E8%B2%B4%E5%BC%98/dp/4862465048)
- Siggraph
  - [Advances in Real-Time Rendering in Games](https://advances.realtimerendering.com/), not specifically on optimization
  - 2009 - [From Shader Code to a Teraflop: How Shader Cores Work](https://web.archive.org/web/20181008131455/http://s09.idav.ucdavis.edu/talks/02_kayvonf_gpuArchTalk09.pdf)
  - 2020 - [LOW-LEVEL OPTIMIZATIONS IN THE LAST OF US PART II](https://www.naughtydog.com/blog/naughty_dog_at_siggraph_2020)
- CMU
silvesthu revised this gist
Sep 2, 2023 . 1 changed file with 23 additions and 26 deletions.
@@ -71,13 +71,11 @@
  - 2011 - [A trip through the Graphics Pipeline 2011](https://fgiesen.wordpress.com/2011/07/09/a-trip-through-the-graphics-pipeline-2011-index/)
- Timothy Lottes
  - 2016 - [Understanding Memory Coalescing on GCN](https://gpuopen.com/gcn-memory-coalescing/)
  - 2017 - [ADVANCED SHADER PROGRAMMING ON GCN](https://gpuopen.com/wp-content/uploads/2017/03/GDC2017-Advanced-Shader-Programming-On-GCN.pdf)
  - 2018 - [Engine Optimization Hot Lap](https://slideplayer.com/slide/17173687/)
- Robert Menzel [@renderpipeline](https://twitter.com/renderpipeline)
  - [Blog](http://renderingpipeline.com)
  - 2012 - [Low-Level GPU Documentation](https://web.archive.org/web/20160305145630/http://renderingpipeline.com/graphics-literature/low-level-gpu-documentation/)
- RasterGrid [@rastergrid](https://twitter.com/rastergrid)
  - [Blog](https://rastergrid.com/blog/)
  - 2021 - [Understanding GPU caches](https://rastergrid.com/blog/gpu-tech/2021/01/understanding-gpu-caches/)

@@ -100,33 +98,26 @@
  - 2014 - [GCN – two ways of latency hiding and wave occupancy](https://bartwronski.com/2014/03/27/gcn-two-ways-of-latency-hiding-and-wave-occupancy/)
- Elizabeth Baumel [@Icetigris](https://twitter.com/icetigris)
  - 2016 - [Tightening Up the Graphics: Tools and Techniques for Debugging and Optimization](https://docs.google.com/presentation/d/1LQUMIld4SGoQVthnhT1scoA3k4Sg0as14G4NeSiSgFU/)
- Anton Schreiner [@antonschrein](https://twitter.com/antonschrein)
  - [Blog](https://aschrein.github.io/)
  - 2019 - [What's up with my branch on GPU?](https://aschrein.github.io/jekyll/update/2019/06/13/whatsup-with-my-branches-on-gpu.html)

### By Organization

- AMD
  - [GPU Open](https://gpuopen.com/) and Talks
    - [Events Presentations](https://gpuopen.com/events/)
    - [AMD GPU ISA documentation (GCN,Vega,CDNA,RDNA,RDNA2)](https://gpuopen.com/amd-isa-documentation/)
    - 2014 - [Vertex Shader Tricks](https://www.slideshare.net/DevCentralAMD/vertex-shader-tricks-bill-bilodeau)
    - 2016 - [Leveraging asynchronous queues for concurrent execution](https://gpuopen.com/concurrent-execution-asynchronous-queues/)
    - 2016 - [AMD GCN Assembly: Cross-Lane Operations](https://gpuopen.com/learn/amd-gcn-assembly-cross-lane-operations/)
    - 2017 - [Wave Programming in D3D12 and Vulkan](https://gpuopen.com/wp-content/uploads/2017/07/GDC2017-Wave-Programming-D3D12-Vulkan.pdf)
    - 2017 - [D3D12 and Vulkan Done Right](https://gpuopen.com/wp-content/uploads/2017/03/GDC2017-D3D12-And-Vulkan-Done-Right.pdf)
    - 2017 - [Deep Dive: Asynchronous Compute](https://gpuopen.com/wp-content/uploads/2017/03/GDC2017-Asynchronous-Compute-Deep-Dive.pdf)
    - 2018 - [Optimize your engine using compute @ 4C Prague 2018](https://gpuopen.com/wp-content/uploads/2018/11/4C-Prague-Compute-Shaders.pptx) | [(Youtube)](https://www.youtube.com/watch?v=0DLOJPSxJEg)
    - 2018 - [Optimization with Radeon GPU Profiler - A Vulkan Case Study](https://gpuopen.com/wp-content/uploads/2018/01/Optimization-with-Radeon-GPU-Profiler.pptx)
    - 2019 - [DirectX 12 Optimization Techniques in Capcom’s RE ENGINE](https://gpuopen.com/gdc-presentations/2019/gdc-2019-s4-optimization-techniques-re2-dmc5.pdf)
    - 2019 - [A BLEND OF GCN OPTIMIZATION AND COLOR PROCESSING](https://gpuopen.com/gdc-presentations/2019/gdc-2019-s5-blend-of-gcn-optimization-and-color-processing.pdf)
    - 2019 - [AMD GPU Performance Revealed](https://gpuopen.com/gdc-presentations/2019/gdc-2019-s6-gpu-performance-revealed.pdf)
    - 2019 - [Triangles Are Precious](https://gpuopen.com/presentations/2019/nordic-game-2019-triangles-are-precious.pdf)
    - 2020 - [Let’s build](https://gpuopen.com/lets-build/)

@@ -162,8 +153,10 @@
- [Developer Blog](https://developer.nvidia.com/blog) and Talks
  - 2012 - [GPU Performance Analysis and Optimization](http://on-demand.gputechconf.com/gtc/2012/presentations/S0514-GTC2012-GPU-Performance-Analysis.pdf)
  - 2015 - [Constant Buffers without Constant Pain](https://developer.nvidia.com/content/constant-buffers-without-constant-pain-0)
  - 2016 - [Practical DirectX 12](https://developer.nvidia.com/sites/default/files/akamai/gameworks/blog/GDC16/GDC16_gthomas_adunn_Practical_DX12.pdf)
  - 2016 - [Reading Between The Threads: Shader Intrinsics](https://developer.nvidia.com/reading-between-threads-shader-intrinsics)
  - 2016 - [DX12 Do's And Don'ts](https://developer.nvidia.com/dx12-dos-and-donts)
  - 2016 - [High-Performance, Low-Overhead Rendering with OpenGL and Vulkan](https://www.gdcvault.com/play/1023516/High-performance-Low-Overhead-Rendering)
  - 2019 - [Tips and Tricks: Ray Tracing Best Practices](https://developer.nvidia.com/blog/rtx-best-practices/)
  - 2020 - [Optimizing Graphics Applications using Nsight Systems and Nsight Graphics](https://developer.nvidia.com/siggraph/2020/video/sigg01)
  - 2020 - [RTX Ray Tracing Best Practices](https://www.gdcvault.com/play/1026721/RTX-Ray-Tracing-Best-Practices)

@@ -197,13 +190,17 @@
  - [Arm GPU Best Practices Developer Guide](https://developer.arm.com/documentation/101897/0301)
  - [Principles of High Performance](https://developer.arm.com/solutions/graphics/developer-guides/principles-of-high-performance)
  - [Accelerating 2D Applications](https://developer.arm.com/solutions/graphics/developer-guides/accelerating-2d-applications)
- GDC
  - Search "Advanced Graphics" in [GDC Vault](https://gdcvault.com/), not generally about optimization though
  - 2016 - [Optimizing the Graphics Pipeline With Compute](https://www.gdcvault.com/play/1023109/Optimizing-the-Graphics-Pipeline-With)
- (JP) CEDEC
  - 2016 - [GPU最適化入門](https://www.slideshare.net/ssuser2e676d/gpu-65502505)
  - (Book) [マンガとイラストでわかる! GPU最適化入門](https://www.amazon.co.jp/%E3%83%9E%E3%83%B3%E3%82%AC%E3%81%A8%E3%82%A4%E3%83%A9%E3%82%B9%E3%83%88%E3%81%A7%E3%82%8F%E3%81%8B%E3%82%8B-GPU%E6%9C%80%E9%81%A9%E5%8C%96%E5%85%A5%E9%96%80-%E5%B0%8F%E5%8F%A3-%E8%B2%B4%E5%BC%98/dp/4862465048)
- Siggraph
  - 2009 - [From Shader Code to a Teraflop: How Shader Cores Work](https://web.archive.org/web/20181008131455/http://s09.idav.ucdavis.edu/talks/02_kayvonf_gpuArchTalk09.pdf)
  - 2020 - [LOW-LEVEL OPTIMIZATIONS IN THE LAST OF US PART II](https://www.naughtydog.com/blog/naughty_dog_at_siggraph_2020)
- CMU
  - 2017 - [Parallel Computer Architecture and Programming](http://15418.courses.cs.cmu.edu/tsinghua2017/home)

### Game Graphics Study
- [Graphics Studies Compilation](http://www.adriancourreges.com/blog/2020/12/29/graphics-studies-compilation/)
silvesthu revised this gist
Sep 2, 2023 . 1 changed file with 18 additions and 15 deletions.
@@ -8,6 +8,7 @@
- 2017 - [GPU Performance for Game Artists](http://fragmentbuffer.com/gpu-performance-for-game-artists/)
- 2019 - [Understanding the anatomy of GPUs using Pokémon](https://www.ovh.com/blog/understanding-the-anatomy-of-gpus-using-pokemon/)
- 2020 - [GPU ARCHITECTURE RESOURCES](https://interplayoflight.wordpress.com/2020/05/09/gpu-architecture-resources/)
- 2020 - [All the pipelines - journey through the GPU](https://www.youtube.com/watch?v=Y2KG_4OxDBg)

### By Author

@@ -25,6 +26,8 @@
  - 2018 - [Fixing the Hyperdrive: Maximizing Rendering Performance on NVIDIA GPUs (Presented by NVIDIA)](https://www.gdcvault.com/play/1024810/Fixing-the-Hyperdrive-Maximizing-Rendering)
  - 2019 - [Optimizing DX12/DXR GPU Workloads using Nsight Graphics: GPU Trace and the Peak-Performance-Percentage (P3) Method (Presented by NVIDIA)](https://www.gdcvault.com/browse/gdc-19/play/1026202)
  - 2020 - [Optimizing Compute Shaders for L2 Locality using Thread-Group ID Swizzling](https://developer.nvidia.com/blog/optimizing-compute-shaders-for-l2-locality-using-thread-group-id-swizzling/)
  - 2021 - Dana Elifaz - [The Next Level of Optimization Advice with Nsight Graphics: GPU Trace](https://www.nvidia.com/en-us/on-demand/session/gtcspring21-e32683/)
  - 2022 - [(GDC Paywall) Optimizing Ray Tracing GPU Workloads using Nsight Graphics: GPU Trace and Nsight Systems](https://www.gdcvault.com/search.php#&category=free&firstfocus=&keyword=Optimizing+Ray%2BTracing%2BGPU%2BWorkloads%2Busing%2BNsight%2BGraphics)
  - [D3D11 Vendor Hacks](https://docs.google.com/spreadsheets/d/1J_HIRVlYK8iI4u6AJrCeb66L5W36UDkd9ExSCku9s_o/edit#gid=0)
- Rys Sommefeldt [@ryszu](https://twitter.com/ryszu)
  - [Blog](https://rys.sommefeldt.com/)

@@ -111,20 +114,20 @@
  - 2017 - [Deep Dive: Asynchronous Compute](http://32ipi028l5q82yhj72224m8j.wpengine.netdna-cdn.com/wp-content/uploads/2017/03/GDC2017-Asynchronous-Compute-Deep-Dive.pdf)
  - 2019 - [DirectX 12 Optimization Techniques in Capcom’s RE ENGINE](https://gpuopen.com/gdc-presentations/2019/gdc-2019-s4-optimization-techniques-re2-dmc5.pdf)
  - 2019 - [A BLEND OF GCN OPTIMIZATION AND COLOR PROCESSING](https://gpuopen.com/gdc-presentations/2019/gdc-2019-s5-blend-of-gcn-optimization-and-color-processing.pdf)
- (JP) CEDEC
  - 2016 - [GPU最適化入門](https://www.slideshare.net/ssuser2e676d/gpu-65502505)
  - (Book) [マンガとイラストでわかる! GPU最適化入門](https://www.amazon.co.jp/%E3%83%9E%E3%83%B3%E3%82%AC%E3%81%A8%E3%82%A4%E3%83%A9%E3%82%B9%E3%83%88%E3%81%A7%E3%82%8F%E3%81%8B%E3%82%8B-GPU%E6%9C%80%E9%81%A9%E5%8C%96%E5%85%A5%E9%96%80-%E5%B0%8F%E5%8F%A3-%E8%B2%B4%E5%BC%98/dp/4862465048)
- Siggraph
  - 2020 - [LOW-LEVEL OPTIMIZATIONS IN THE LAST OF US PART II](https://www.naughtydog.com/blog/naughty_dog_at_siggraph_2020)
- AMD
  - [GPU Open](https://gpuopen.com/) and Talks
    - [Events Presentations](https://gpuopen.com/events/)
    - [AMD GPU ISA documentation (GCN,Vega,CDNA,RDNA,RDNA2)](https://gpuopen.com/amd-isa-documentation/)
    - 2016 - [Leveraging asynchronous queues for concurrent execution](https://gpuopen.com/concurrent-execution-asynchronous-queues/)
    - 2016 - [AMD GCN Assembly: Cross-Lane Operations](https://gpuopen.com/learn/amd-gcn-assembly-cross-lane-operations/)
    - 2018 - [Optimize your engine using compute @ 4C Prague 2018](https://gpuopen.com/wp-content/uploads/2018/11/4C-Prague-Compute-Shaders.pptx) | [(Youtube)](https://www.youtube.com/watch?v=0DLOJPSxJEg)
    - 2018 - [Optimization with Radeon GPU Profiler - A Vulkan Case Study](https://gpuopen.com/wp-content/uploads/2018/01/Optimization-with-Radeon-GPU-Profiler.pptx)
    - 2019 - [AMD GPU Performance Revealed](https://gpuopen.com/gdc-presentations/2019/gdc-2019-s6-gpu-performance-revealed.pdf)
    - 2019 - [Triangles Are Precious](https://gpuopen.com/presentations/2019/nordic-game-2019-triangles-are-precious.pdf)
    - 2020 - [Let’s build](https://gpuopen.com/lets-build/)
      - AMD Ryzen™ Processor Software Optimization

@@ -134,7 +137,6 @@
    - Curing Amnesia and Other GPU Maladies With AMD Developer Tools
    - Radeon™ ProRender Full Spectrum Rendering 2.0: The Universal Rendering API
  - 2020 - [CONCURRENCY MODEL IN EXPLICIT GRAPHICS APIS](https://gpuopen.com/wp-content/uploads/2020/06/GPUOpen_Concurrency_vTUM.pdf)
- GCN
  - 2013 - [GCN3 Instruction Set Architecture](http://developer.amd.com/wordpress/media/2013/12/AMD_GCN3_Instruction_Set_Architecture_rev1.1.pdf)
  - 2019 - [AMD GCN ISA: a first dive](https://giordi91.github.io/post/vegaisa/)

@@ -152,14 +154,19 @@
    - [AMD Open Source Driver for Vulkan](https://github.com/GPUOpen-Drivers/AMDVLK)
  - OpenCL
    - [AMD Accelerated Parallel Processing OpenCL Programming Guide](http://developer.amd.com/wordpress/media/2013/07/AMD_Accelerated_Parallel_Processing_OpenCL_Programming_Guide-rev-2.7.pdf)
  - Radeon GPU Analyzer / Radeon Raytracing Analyzer
    - [USING RADEON™ GPU ANALYZER WITH DIRECTX®12 GRAPHICS](https://gpuopen.com/learn/radeon-gpu-analyzer-2-3-direct3d-12-graphics/)
    - [USING RADEON™ GPU ANALYZER WITH DIRECT3D®12 COMPUTE](https://gpuopen.com/learn/radeon-gpu-analyzer-2-2-direct3d12-compute/)
    - [Improving raytracing performance with the Radeon™ Raytracing Analyzer (RRA)](https://gpuopen.com/learn/improving-rt-perf-with-rra/)
- Nvidia
  - [Developer Blog](https://developer.nvidia.com/blog) and Talks
    - 2012 - [GPU Performance Analysis and Optimization](http://on-demand.gputechconf.com/gtc/2012/presentations/S0514-GTC2012-GPU-Performance-Analysis.pdf)
    - 2015 - [Constant Buffers without Constant Pain](https://developer.nvidia.com/content/constant-buffers-without-constant-pain-0)
    - 2016 - [Reading Between The Threads: Shader Intrinsics](https://developer.nvidia.com/reading-between-threads-shader-intrinsics)
    - 2016 - [DX12 Do's And Don'ts](https://developer.nvidia.com/dx12-dos-and-donts)
    - 2019 - [Tips and Tricks: Ray Tracing Best Practices](https://developer.nvidia.com/blog/rtx-best-practices/)
    - 2020 - [Optimizing Graphics Applications using Nsight Systems and Nsight Graphics](https://developer.nvidia.com/siggraph/2020/video/sigg01)
    - 2020 - [RTX Ray Tracing Best Practices](https://www.gdcvault.com/play/1026721/RTX-Ray-Tracing-Best-Practices)
    - 2021 - [Advanced API Performance](https://developer.nvidia.com/blog/tag/advanced-api-performance/)
    - 2022 - [Best Practices for Using NVIDIA RTX Ray Tracing (Updated)](https://developer.nvidia.com/blog/best-practices-for-using-nvidia-rtx-ray-tracing-updated/)
    - 2023 - [Practical Tips for Optimizing Ray Tracing](https://developer.nvidia.com/blog/practical-tips-for-optimizing-ray-tracing/)

@@ -175,25 +182,21 @@
    - 2022 - [SHADER EXECUTION REORDERING](https://developer.nvidia.com/sites/default/files/akamai/gameworks/ser-whitepaper.pdf)
  - CUDA
    - 2014 - [CUDA Pro Tip: Optimized Filtering with Warp-Aggregated Atomics](https://devblogs.nvidia.com/cuda-pro-tip-optimized-filtering-warp-aggregated-atomics/)
- Intel
  - [Gamedev](https://software.intel.com/en-us/gamedev)
  - [Intel® Processor Graphics: Architecture & Programming](https://doc.lagout.org/electronics/Intel-Graphics-Architecture-ISA-and-microarchitecture.pdf)
- Microsoft
  - [DirectX-Specs](https://microsoft.github.io/DirectX-Specs/)
  - 2019 - [New in D3D12 – background shader optimizations](https://devblogs.microsoft.com/directx/background-shader-optimizations/)
- Khronos Group
  - [Vulkan Samples](https://github.com/KhronosGroup/Vulkan-Samples)
  - 2019 - [Optimising a AAA Vulkan Title on Desktop](https://www.youtube.com/watch?v=Hc_i6X3qU08)
  - 2020 - [Vulkan Ray Tracing Best Practices for Hybrid Rendering](https://www.khronos.org/blog/vulkan-ray-tracing-best-practices-for-hybrid-rendering)
- Arm
  - [Introducing the Arm architecture](https://developer.arm.com/documentation/102404/0201)
  - [Arm GPU Best Practices Developer Guide](https://developer.arm.com/documentation/101897/0301)
  - [Principles of High Performance](https://developer.arm.com/solutions/graphics/developer-guides/principles-of-high-performance)
  - [Accelerating 2D Applications](https://developer.arm.com/solutions/graphics/developer-guides/accelerating-2d-applications)
- CMU
  - 2017 - [Parallel Computer Architecture and Programming](http://15418.courses.cs.cmu.edu/tsinghua2017/home)
- Misc
silvesthu revised this gist
Sep 2, 2023 . 1 changed file with 3 additions and 2 deletions.
@@ -112,8 +112,9 @@
  - 2019 - [DirectX 12 Optimization Techniques in Capcom’s RE ENGINE](https://gpuopen.com/gdc-presentations/2019/gdc-2019-s4-optimization-techniques-re2-dmc5.pdf)
  - 2019 - [A BLEND OF GCN OPTIMIZATION AND COLOR PROCESSING](https://gpuopen.com/gdc-presentations/2019/gdc-2019-s5-blend-of-gcn-optimization-and-color-processing.pdf)
  - 2019 - [AMD GPU Performance Revealed](https://gpuopen.com/gdc-presentations/2019/gdc-2019-s6-gpu-performance-revealed.pdf)
- (JP) CEDEC
  - 2016 - [GPU最適化入門](https://www.slideshare.net/ssuser2e676d/gpu-65502505)
  - (Book) [マンガとイラストでわかる! GPU最適化入門](https://www.amazon.co.jp/%E3%83%9E%E3%83%B3%E3%82%AC%E3%81%A8%E3%82%A4%E3%83%A9%E3%82%B9%E3%83%88%E3%81%A7%E3%82%8F%E3%81%8B%E3%82%8B-GPU%E6%9C%80%E9%81%A9%E5%8C%96%E5%85%A5%E9%96%80-%E5%B0%8F%E5%8F%A3-%E8%B2%B4%E5%BC%98/dp/4862465048)
- Siggraph
  - 2020 - [LOW-LEVEL OPTIMIZATIONS IN THE LAST OF US PART II](https://www.naughtydog.com/blog/naughty_dog_at_siggraph_2020)
- AMD
silvesthu revised this gist
Sep 2, 2023 . 1 changed file with 9 additions and 9 deletions.
@@ -1,5 +1,14 @@
# GPU Optimization for GameDev

### Graphics Pipeline / GPU Architecture Overview

- 2011 - [A trip through the Graphics Pipeline 2011](https://fgiesen.wordpress.com/2011/07/09/a-trip-through-the-graphics-pipeline-2011-index/)
- 2015 - [Life of a triangle - NVIDIA's logical pipeline](https://developer.nvidia.com/content/life-triangle-nvidias-logical-pipeline)
- 2015 - [Render Hell 2.0](https://simonschreibt.de/gat/renderhell/)
- 2016 - [How bad are small triangles on GPU and why?](http://www.g-truc.net/post-0662.html)
- 2017 - [GPU Performance for Game Artists](http://fragmentbuffer.com/gpu-performance-for-game-artists/)
- 2019 - [Understanding the anatomy of GPUs using Pokémon](https://www.ovh.com/blog/understanding-the-anatomy-of-gpus-using-pokemon/)
- 2020 - [GPU ARCHITECTURE RESOURCES](https://interplayoflight.wordpress.com/2020/05/09/gpu-architecture-resources/)

### By Author

- Emil Persson [@_Humus_](https://twitter.com/_Humus_)

@@ -192,15 +201,6 @@
- 2019 - [Unity GPU culling experiments](https://www.mpc-rnd.com/unity-gpu-culling-experiments/)
- 2019 - [What's up with my branch on GPU?](https://aschrein.github.io/jekyll/update/2019/06/13/whatsup-with-my-branches-on-gpu.html)

### Game Graphics Study
- [Graphics Studies Compilation](http://www.adriancourreges.com/blog/2020/12/29/graphics-studies-compilation/)
silvesthu revised this gist
Sep 2, 2023 . 1 changed file with 17 additions and 5 deletions.
@@ -92,7 +92,7 @@
### By Organization

- GDC
  - Search "Advanced Graphics" in [GDC Vault](https://gdcvault.com/)
  - 2014 - [Vertex Shader Tricks](https://www.slideshare.net/DevCentralAMD/vertex-shader-tricks-bill-bilodeau)
  - 2016 - [Optimizing the Graphics Pipeline With Compute](https://www.gdcvault.com/play/1023109/Optimizing-the-Graphics-Pipeline-With)
  - 2016 - [High-Performance, Low-Overhead Rendering with OpenGL and Vulkan](https://www.gdcvault.com/play/1023516/High-performance-Low-Overhead-Rendering)

@@ -137,6 +137,9 @@
  - 2020 - ["RDNA 2" Instruction Set Architecture](https://developer.amd.com/wp-content/resources/RDNA2_Shader_ISA_November2020.pdf)
  - 2020 - [RDNA2 Performance Guide](https://gpuopen.com/performance/)
  - 2022 - ["RDNA3" Instruction Set Architecture](https://developer.amd.com/wp-content/resources/RDNA3_Shader_ISA_December2022.pdf)
- Driver
  - [Platform Abstraction Library (PAL)](https://github.com/GPUOpen-Drivers/pal)
  - [AMD Open Source Driver for Vulkan](https://github.com/GPUOpen-Drivers/AMDVLK)
- OpenCL
  - [AMD Accelerated Parallel Processing OpenCL Programming Guide](http://developer.amd.com/wordpress/media/2013/07/AMD_Accelerated_Parallel_Processing_OpenCL_Programming_Guide-rev-2.7.pdf)
- RADEON GPU ANALYZER

@@ -208,9 +211,18 @@
### Tools

- [Shader Playground](http://shader-playground.timjones.io/)
- Microsoft
  - [GPUView](https://docs.microsoft.com/en-us/windows-hardware/drivers/display/using-gpuview)
  - [PIX](https://devblogs.microsoft.com/pix/introduction/)
- Nvidia
  - [NVIDIA Developer Tools](https://developer.nvidia.com/tools-overview)
  - [NVIDIA Nsight Graphics](https://developer.nvidia.com/nsight-graphics)
  - [NVIDIA Nsight Aftermath SDK](https://developer.nvidia.com/nsight-aftermath)
- AMD
  - [Radeon Developer Tool Suite](https://gpuopen.com/introducing-radeon-developer-tool-suite/)
  - [Radeon GPU Analyzer](https://gpuopen.com/rga/)
  - [Radeon Raytracing Analyzer](https://gpuopen.com/radeon-raytracing-analyzer/)
  - [Radeon Memory Visualizer](https://gpuopen.com/rmv/)
  - [Radeon GPU Detective](https://gpuopen.com/radeon-gpu-detective/)
- Intel
  - [Intel Graphics Performance Analyzers](https://software.intel.com/content/www/us/en/develop/tools/graphics-performance-analyzers.html)

Thanks JoseEmilio-ARM for ARM part.
silvesthu revised this gist
Apr 23, 2023 . 1 changed file with 105 additions and 107 deletions.
@@ -4,174 +4,175 @@
- Emil Persson [@_Humus_](https://twitter.com/_Humus_)
  - [Blog](http://www.humus.name/)
  - 2013 - [Low-Level Thinking in High-Level Shading Languages](https://www.gdcvault.com/play/1018182/Low-Level-Thinking-in-High)
  - 2014 - [Low-Level Shader Optimization for Next-Gen and DX11](http://www.humus.name/Articles/Persson_LowlevelShaderOptimization.pptx)
  - 2018 - [Rule of optimization](https://twitter.com/_Humus_/status/1011964081069330432)
- Matt Pettineo [@mynameismjp](https://twitter.com/mynameismjp)
  - [Blog](https://therealmjp.github.io/)
  - 2018 - [Breaking Down Barriers](https://therealmjp.github.io/posts/breaking-down-barriers-part-1-whats-a-barrier/)
  - 2021 - [The Shader Permutation Problem](https://therealmjp.github.io/posts/shader-permutations-part1/)
- Louis Bavoil [@louisbavoil](https://twitter.com/louisbavoil)
  - 2018 - [The Peak-Performance-Percentage Analysis Method for Optimizing Any GPU Workload](https://devblogs.nvidia.com/the-peak-performance-analysis-method-for-optimizing-any-gpu-workload/)
  - 2018 - [Fixing the Hyperdrive: Maximizing Rendering Performance on NVIDIA GPUs (Presented by NVIDIA)](https://www.gdcvault.com/play/1024810/Fixing-the-Hyperdrive-Maximizing-Rendering)
  - 2019 - [Optimizing DX12/DXR GPU Workloads using Nsight Graphics: GPU Trace and the Peak-Performance-Percentage (P3) Method (Presented by NVIDIA)](https://www.gdcvault.com/browse/gdc-19/play/1026202)
  - 2020 - [Optimizing Compute Shaders for L2 Locality using Thread-Group ID Swizzling](https://developer.nvidia.com/blog/optimizing-compute-shaders-for-l2-locality-using-thread-group-id-swizzling/)
  - [D3D11 Vendor Hacks](https://docs.google.com/spreadsheets/d/1J_HIRVlYK8iI4u6AJrCeb66L5W36UDkd9ExSCku9s_o/edit#gid=0)
- Rys Sommefeldt [@ryszu](https://twitter.com/ryszu)
  - [Blog](https://rys.sommefeldt.com/)
  - 2018 - [Understanding GPU context rolls](https://gpuopen.com/understanding-gpu-context-rolls/)
- Michal Drobot [@michaldrobot](https://twitter.com/michaldrobot)
  - [Blog](https://michaldrobot.com/)
  - 2014 - [Low Level Optimizations for GCN – Digital Dragons 2014](https://michaldrobot.com/2014/05/12/low-level-optimizations-for-gcn-digital-dragons-2014-slides/)
  - 2014 - [GCN Execution Patterns in Full Screen Passes](https://michaldrobot.com/2014/04/01/gcn-execution-patterns-in-full-screen-passes/)
  - 2014 - [ShaderFastLibs](https://github.com/michaldrobot/ShaderFastLibs)
- Kostas Anagnostou [@KostasAAA](https://twitter.com/KostasAAA)
  - [Blog](https://interplayoflight.wordpress.com/)
  - 2018 - [DD2018: Kostas Anagnostou - Experiments in GPU occlusion culling](https://www.youtube.com/watch?v=U20dIA3SLTs)
  - 2020 - [GPU ARCHITECTURE RESOURCES](https://interplayoflight.wordpress.com/2020/05/09/gpu-architecture-resources/)
  - 2020 - [GPU ARCHITECTURE RESOURCES (twitter thread)](https://twitter.com/KostasAAA/status/1259153226043179011)
  - 2020 - [WHAT IS SHADER OCCUPANCY AND WHY DO WE CARE ABOUT IT?](https://interplayoflight.wordpress.com/2020/11/11/what-is-shader-occupancy-and-why-do-we-care-about-it/)
  - 2020 - [TO Z-PREPASS OR NOT TO Z-PREPASS](https://interplayoflight.wordpress.com/2020/12/21/to-z-prepass-or-not-to-z-prepass/)
  - 2022 - [SHADER TIPS AND TRICKS](https://interplayoflight.wordpress.com/2022/01/22/shader-tips-and-tricks/)
- Matthäus G. Chajdas [@NIV_Anteru](https://twitter.com/niv_anteru)
  - [Blog](https://anteru.net/)
  - 2018 - [Introduction to compute shaders](https://anteru.net/blog/2018/intro-to-compute-shaders/)
  - 2018 - [More compute shaders](https://anteru.net/blog/2018/more-compute-shaders/)
  - 2018 - [Even more compute shaders](https://anteru.net/blog/2018/even-more-compute-shaders/)
  - [GPU database](https://db.thegpu.guru/)
- Matthijs De Smedt [@anji_nl](https://twitter.com/anji_nl)
  - 2016 - [PC GPU Performance Hot Spots](https://developer.nvidia.com/pc-gpu-performance-hot-spots)
- Maurizio Cerrato [@speedwago](https://twitter.com/speedwago)
  - 2019 - [GPU Architectures](https://drive.google.com/file/d/12ahbqGXNfY3V-1Gj5cvne2AH4BFWZHGD/view)
- Sebastian Aaltonen [@SebAaltonen](https://twitter.com/SebAaltonen)
  - [Blog](https://www.secondorder.com/)
  - 2017 - [Optimizing GPU occupancy and resource usage with large thread groups](https://gpuopen.com/learn/optimizing-gpu-occupancy-resource-usage-large-thread-groups/)
  - 2018 - [DD2018: Sebastian Aaltonen - GPU based clay simulation and ray tracing tech in Claybook](https://www.youtube.com/watch?v=Xpf7Ua3UqOA)
  - 2018 - [This is how I managed to port Claybook from consoles to ~4x slower handheld](https://threadreaderapp.com/thread/1076765876148490240.html)
- Layla Mah [@MissQuickstep](https://twitter.com/missquickstep)
  - 2013 - [The AMD GCN Architecture - A Crash Course](https://www.slideshare.net/DevCentralAMD/gs4106-the-amd-gcn-architecture-a-crash-course-by-layla-mah)
  - 2013 - [Powering the Next Generation of Graphics: The AMD GCN Architecture](https://www.gdcvault.com/play/1019294/Powering-the-Next-Generation-of)
- Sven Andersson [@andsve](https://twitter.com/andsve)
  - [Blog](http://svenandersson.se/)
  - 2014 - [Real-time Rendering Blogs](http://svenandersson.se/2014/realtime-rendering-blogs.html)
- Fabian Giesen [@rygorous](https://twitter.com/rygorous)
  - [Blog](https://fgiesen.wordpress.com/)
  - 2011 - [A trip through the Graphics Pipeline 2011](https://fgiesen.wordpress.com/2011/07/09/a-trip-through-the-graphics-pipeline-2011-index/)
- Timothy Lottes
  - 2016 - [Understanding Memory Coalescing on GCN](https://gpuopen.com/gcn-memory-coalescing/)
  - 2017 - [ADVANCED SHADER PROGRAMMING ON GCN](http://32ipi028l5q82yhj72224m8j.wpengine.netdna-cdn.com/wp-content/uploads/2017/03/GDC2017-Advanced-Shader-Programming-On-GCN.pdf)
  - 2018 - [Engine Optimization Hot Lap](https://32ipi028l5q82yhj72224m8j-wpengine.netdna-ssl.com/wp-content/uploads/2018/05/gdc_2018_sponsored_engine_optimization_hot_lap.pptx)
- Robert Menzel [@renderpipeline](https://twitter.com/renderpipeline)
  - [Blog](http://renderingpipeline.com)
  - 2012 - [Low-Level GPU Documentation](http://renderingpipeline.com/graphics-literature/low-level-gpu-documentation/)
- Stephanie Hurlburt [@sehurlburt](http://stephaniehurlburt.com/blog)
  - 2016 - [Casual Introduction to Low-Level Graphics Programming](http://stephaniehurlburt.com/blog/2016/10/28/casual-introduction-to-low-level-graphics-programming)
- RasterGrid [@rastergrid](https://twitter.com/rastergrid)
  - [Blog](https://rastergrid.com/blog/)
  - 2021 - [Understanding GPU caches](https://rastergrid.com/blog/gpu-tech/2021/01/understanding-gpu-caches/)
- Adam Sawicki [@Reg__](https://twitter.com/Reg__)
  - [Blog](https://asawicki.info/)
  - 2020 - [A Better Way to Scalarize a Shader](https://asawicki.info/news_1735_a_better_way_to_scalarize_a_shader)
  - 2021 - [Efficient Use of GPU Memory in Modern Games](https://www.youtube.com/watch?v=ML0YC77bSOc)
- Matías N. Goldberg [@matiasgoldberg](https://twitter.com/matiasgoldberg)
  - [Blog](https://www.yosoygames.com.ar/wp/)
  - 2020 - [A little clarification on modern shader compile times](https://www.yosoygames.com.ar/wp/2020/08/a-little-clarification-on-modern-shader-compile-times/#tc-comment-title)
  - 2022 - [The road to 16-bit floats GPU is paved with our blood](https://www.yosoygames.com.ar/wp/2022/01/the-road-to-16-bit-floats-gpu-is-paved-with-our-blood/)
- Francesco Cifariello Ciardi [@FCifaCiar](https://twitter.com/FCifaCiar)
  - [Blog](https://flashypixels.wordpress.com/)
  - 2018 - [INTRO TO GPU SCALARIZATION](https://flashypixels.wordpress.com/2018/11/10/intro-to-gpu-scalarization-part-1/)
- Sébastien Lagarde [@SebLagarde](https://twitter.com/seblagarde)
  - [Blog](https://seblagarde.wordpress.com/)
  - 2014 - [Inverse trigonometric functions GPU optimization for AMD GCN architecture](https://seblagarde.wordpress.com/2014/12/01/inverse-trigonometric-functions-gpu-optimization-for-amd-gcn-architecture/)
- Bart Wronski [@BartWronsk](https://twitter.com/BartWronsk)
  - [Blog](https://bartwronski.com/)
  - 2014 - [GCN – two ways of latency hiding and wave occupancy](https://bartwronski.com/2014/03/27/gcn-two-ways-of-latency-hiding-and-wave-occupancy/)
- Elizabeth Baumel [@Icetigris](https://twitter.com/icetigris)
  - 2016 - [Tightening Up the Graphics: Tools and Techniques for Debugging and Optimization](https://docs.google.com/presentation/d/1LQUMIld4SGoQVthnhT1scoA3k4Sg0as14G4NeSiSgFU/)

### By Organization
- GDC
  - Search "Advanced Graphics" in [GDC Vault](https://gdcvault.com/) or in [GDC VAULT EXPLORER](https://yankooliveira.com/gdcvault/)
  - 2014 - [Vertex Shader Tricks](https://www.slideshare.net/DevCentralAMD/vertex-shader-tricks-bill-bilodeau)
  - 2016 - [Optimizing the Graphics Pipeline With Compute](https://www.gdcvault.com/play/1023109/Optimizing-the-Graphics-Pipeline-With)
  - 2016 - [High-Performance, Low-Overhead Rendering with OpenGL and Vulkan](https://www.gdcvault.com/play/1023516/High-performance-Low-Overhead-Rendering)
  - 2016 - [Practical DirectX 12](https://developer.nvidia.com/sites/default/files/akamai/gameworks/blog/GDC16/GDC16_gthomas_adunn_Practical_DX12.pdf)
  - 2017 - [Wave Programming in D3D12 and Vulkan](http://32ipi028l5q82yhj72224m8j.wpengine.netdna-cdn.com/wp-content/uploads/2017/07/GDC2017-Wave-Programming-D3D12-Vulkan.pdf)
  - 2017 - [D3D12 and Vulkan Done Right](http://32ipi028l5q82yhj72224m8j.wpengine.netdna-cdn.com/wp-content/uploads/2017/03/GDC2017-D3D12-And-Vulkan-Done-Right.pdf)
  - 2017 - [Deep Dive: Asynchronous Compute](http://32ipi028l5q82yhj72224m8j.wpengine.netdna-cdn.com/wp-content/uploads/2017/03/GDC2017-Asynchronous-Compute-Deep-Dive.pdf)
  - 2019 - [DirectX 12 Optimization Techniques in Capcom’s RE ENGINE](https://gpuopen.com/gdc-presentations/2019/gdc-2019-s4-optimization-techniques-re2-dmc5.pdf)
  - 2019 - [A BLEND OF GCN OPTIMIZATION AND COLOR PROCESSING](https://gpuopen.com/gdc-presentations/2019/gdc-2019-s5-blend-of-gcn-optimization-and-color-processing.pdf)
  - 2019 - [AMD GPU Performance Revealed](https://gpuopen.com/gdc-presentations/2019/gdc-2019-s6-gpu-performance-revealed.pdf)
- [JP] CEDEC
  - 2016 - [GPU最適化入門](https://www.slideshare.net/ssuser2e676d/gpu-65502505)
- Siggraph
  - 2020 - [LOW-LEVEL OPTIMIZATIONS IN THE LAST OF US PART II](https://www.naughtydog.com/blog/naughty_dog_at_siggraph_2020)
- AMD
  - [GPU Open](https://gpuopen.com/)
  - [Events Presentations](https://gpuopen.com/events/)
  - [AMD GPU ISA documentation (GCN,Vega,CDNA,RDNA,RDNA2)](https://gpuopen.com/amd-isa-documentation/)
  - 2016 - [Leveraging asynchronous queues for concurrent execution](https://gpuopen.com/concurrent-execution-asynchronous-queues/)
  - 2016 - [AMD GCN Assembly: Cross-Lane Operations](https://gpuopen.com/learn/amd-gcn-assembly-cross-lane-operations/)
  - 2018 - [Optimize your engine using compute @ 4C Prague 2018](https://gpuopen.com/wp-content/uploads/2018/11/4C-Prague-Compute-Shaders.pptx) | [(Youtube)](https://www.youtube.com/watch?v=0DLOJPSxJEg)
  - 2018 - [Optimization with Radeon GPU Profiler - A Vulkan Case Study](https://gpuopen.com/wp-content/uploads/2018/01/Optimization-with-Radeon-GPU-Profiler.pptx)
  - 2019 - [Triangles Are Precious](https://gpuopen.com/presentations/2019/nordic-game-2019-triangles-are-precious.pdf)
  - 2020 - [Let’s build](https://gpuopen.com/lets-build/)
    - AMD Ryzen™ Processor Software Optimization
    - Optimizing for the Radeon™ RDNA Architecture
    - From Source to ISA: A Trip Down the Shader Compiler Pipeline
    - A Review of GPUOpen Effects
    - Curing Amnesia and Other GPU Maladies With AMD Developer Tools
    - Radeon™ ProRender Full Spectrum Rendering 2.0: The Universal Rendering API
  - 2020 - [CONCURRENCY MODEL IN EXPLICIT GRAPHICS APIS](https://gpuopen.com/wp-content/uploads/2020/06/GPUOpen_Concurrency_vTUM.pdf)
  - 2020 - [All the Pipelines - Journey through the GPU](https://gpuopen.com/videos/graphics-pipeline/)
  - GCN
    - 2013 - [GCN3 Instruction Set Architecture](http://developer.amd.com/wordpress/media/2013/12/AMD_GCN3_Instruction_Set_Architecture_rev1.1.pdf)
    - 2019 - [AMD GCN ISA: a first dive](https://giordi91.github.io/post/vegaisa/)
    - [GPUOpen-Drivers/pal on Github](https://github.com/GPUOpen-Drivers/pal)
    - [AMD-FirePro/SDK on Github](https://github.com/AMD-FirePro/SDK/tree/master/documentation)
  - RDNA
    - 2019 - [INTRODUCING RDNA ARCHITECTURE](https://www.amd.com/system/files/documents/rdna-whitepaper.pdf)
    - 2019 - [RDNA Architecture](https://gpuopen.com/wp-content/uploads/2019/08/RDNA_Architecture_public.pdf)
    - 2020 - ["RDNA 1.0" Instruction Set Architecture](https://developer.amd.com/wp-content/resources/RDNA_Shader_ISA.pdf)
    - 2020 - ["RDNA 2" Instruction Set Architecture](https://developer.amd.com/wp-content/resources/RDNA2_Shader_ISA_November2020.pdf)
    - 2020 - [RDNA2 Performance Guide](https://gpuopen.com/performance/)
    - 2022 - ["RDNA3" Instruction Set Architecture](https://developer.amd.com/wp-content/resources/RDNA3_Shader_ISA_December2022.pdf)
  - OpenCL
    - [AMD Accelerated Parallel Processing OpenCL Programming Guide](http://developer.amd.com/wordpress/media/2013/07/AMD_Accelerated_Parallel_Processing_OpenCL_Programming_Guide-rev-2.7.pdf)
  - RADEON GPU ANALYZER
    - [USING RADEON™ GPU ANALYZER WITH DIRECTX®12 GRAPHICS](https://gpuopen.com/learn/radeon-gpu-analyzer-2-3-direct3d-12-graphics/)
    - [USING RADEON™ GPU ANALYZER WITH DIRECT3D®12 COMPUTE](https://gpuopen.com/learn/radeon-gpu-analyzer-2-2-direct3d12-compute/)
- Nvidia
  - [Developer Blog](https://developer.nvidia.com/blog)
  - 2015 - [Constant Buffers without Constant Pain](https://developer.nvidia.com/content/constant-buffers-without-constant-pain-0)
  - 2016 - [Reading Between The Threads: Shader Intrinsics](https://developer.nvidia.com/reading-between-threads-shader-intrinsics)
  - 2016 - [DX12 Do's And Don'ts](https://developer.nvidia.com/dx12-dos-and-donts)
  - 2021 - [Advanced API Performance](https://developer.nvidia.com/blog/tag/advanced-api-performance/)
  - 2022 - [Best Practices for Using NVIDIA RTX Ray Tracing (Updated)](https://developer.nvidia.com/blog/best-practices-for-using-nvidia-rtx-ray-tracing-updated/)
  - 2023 - [Practical Tips for Optimizing Ray Tracing](https://developer.nvidia.com/blog/practical-tips-for-optimizing-ray-tracing/)
  - Pascal
    - 2016 - [NVIDIA GeForce GTX 1080 Whitepaper](http://international.download.nvidia.com/geforce-com/international/pdfs/GeForce_GTX_1080_Whitepaper_FINAL.pdf)
  - Turing
    - 2018 - [NVIDIA Turing Architecture In-Depth](https://devblogs.nvidia.com/nvidia-turing-architecture-in-depth/)
    - 2018 - [NVIDIA TURING GPU ARCHITECTURE](https://images.nvidia.com/aem-dam/en-zz/Solutions/design-visualization/technologies/turing-architecture/NVIDIA-Turing-Architecture-Whitepaper.pdf)
  - Ampere
    - 2021 - [NVIDIA AMPERE GA102 GPU ARCHITECTURE](https://images.nvidia.com/aem-dam/en-zz/Solutions/geforce/ampere/pdf/NVIDIA-ampere-GA102-GPU-Architecture-Whitepaper-V1.pdf)
  - Ada
    - 2022 - [NVIDIA ADA GPU ARCHITECTURE](https://images.nvidia.com/aem-dam/Solutions/geforce/ada/nvidia-ada-gpu-architecture.pdf)
    - 2022 - [SHADER EXECUTION REORDERING](https://developer.nvidia.com/sites/default/files/akamai/gameworks/ser-whitepaper.pdf)
  - CUDA
    - 2014 - [CUDA Pro Tip: Optimized Filtering with Warp-Aggregated Atomics](https://devblogs.nvidia.com/cuda-pro-tip-optimized-filtering-warp-aggregated-atomics/)
  - Talks
    - 2012 - [GPU Performance Analysis and Optimization](http://on-demand.gputechconf.com/gtc/2012/presentations/S0514-GTC2012-GPU-Performance-Analysis.pdf)
    - 2020 - [Optimizing Graphics Applications using Nsight Systems and Nsight Graphics](https://developer.nvidia.com/siggraph/2020/video/sigg01)
    - 2021 - [The Next Level of Optimization Advice with Nsight Graphics: GPU Trace](https://www.nvidia.com/en-us/on-demand/session/gtcspring21-e32683/)
    - 2022 - [(GDC Paywall) Optimizing Ray Tracing GPU Workloads using Nsight Graphics: GPU Trace and Nsight Systems](https://schedule.gdconf.com/session/optimizing-ray-tracing-gpu-workloads-using-nsight-graphics-gpu-trace-and-nsight-systems-presented-by-nvidia/886315)
- Intel
  - [Gamedev](https://software.intel.com/en-us/gamedev)
  - [Intel® Processor Graphics: Architecture & Programming](https://doc.lagout.org/electronics/Intel-Graphics-Architecture-ISA-and-microarchitecture.pdf)
- Microsoft
  - [DirectX-Specs](https://microsoft.github.io/DirectX-Specs/)
  - 2019 - [New in D3D12 – background shader optimizations](https://devblogs.microsoft.com/directx/background-shader-optimizations/)
- Arm
  - [Mali GPU Best Practices](https://developer.arm.com/solutions/graphics/developer-guides/mali-gpu-best-practices)
  - [Best Practices for Mobile Game Art Assets](https://developer.arm.com/solutions/graphics/developer-guides/best-practices-for-mobile-game-art-assets-1)
@@ -181,32 +182,29 @@
- Khronos Group
  - [Vulkan Samples](https://github.com/KhronosGroup/Vulkan-Samples)
- CMU
  - 2017 - [Parallel Computer Architecture and Programming](http://15418.courses.cs.cmu.edu/tsinghua2017/home)
- Misc
  - 2009 - [From Shader Code to a Teraflop: How Shader Cores Work](https://web.archive.org/web/20181008131455/http://s09.idav.ucdavis.edu/talks/02_kayvonf_gpuArchTalk09.pdf)
  - 2017 - [Demystifying Asynchronous Compute](https://www.reddit.com/r/nvidia/comments/50dqd5/demystifying_asynchronous_compute/)
  - 2019 - [Unity GPU culling experiments](https://www.mpc-rnd.com/unity-gpu-culling-experiments/)
  - 2019 - [What's up with my branch on GPU?](https://aschrein.github.io/jekyll/update/2019/06/13/whatsup-with-my-branches-on-gpu.html)

### Graphics Pipeline / GPU Architecture Overview
- 2011 - [A trip through the Graphics Pipeline 2011](https://fgiesen.wordpress.com/2011/07/09/a-trip-through-the-graphics-pipeline-2011-index/)
- 2015 - [Life of a triangle - NVIDIA's logical pipeline](https://developer.nvidia.com/content/life-triangle-nvidias-logical-pipeline)
- 2015 - [Render Hell 2.0](https://simonschreibt.de/gat/renderhell/)
- 2016 - [How bad are small triangles on GPU and why?](http://www.g-truc.net/post-0662.html)
- 2017 - [GPU Performance for Game Artists](http://fragmentbuffer.com/gpu-performance-for-game-artists/)
- 2019 - [Understanding the anatomy of GPUs using Pokémon](https://www.ovh.com/blog/understanding-the-anatomy-of-gpus-using-pokemon/)
- 2020 - [GPU ARCHITECTURE RESOURCES](https://interplayoflight.wordpress.com/2020/05/09/gpu-architecture-resources/)

### Game Graphics Study
- [Graphics Studies Compilation](http://www.adriancourreges.com/blog/2020/12/29/graphics-studies-compilation/)

### Database
- [PerfTest: GPU shader memory operation performance test tool (with results)](https://github.com/sebbbi/perftest)
- [GPUInfo](https://www.gpuinfo.org/) for Vulkan, OpenGL, OpenGL ES
- (JP) [GPU Spec Database by HYPERでんち](https://dench.flatlib.jp/start#hardware)

### Tools
- [Shader Playground](http://shader-playground.timjones.io/)
-
silvesthu revised this gist
Jan 17, 2023 . 1 changed file with 6 additions and 3 deletions.
@@ -133,8 +133,7 @@
  - RDNA
    - <2019> [INTRODUCING RDNA ARCHITECTURE](https://www.amd.com/system/files/documents/rdna-whitepaper.pdf)
    - <2019> [RDNA Architecture](https://gpuopen.com/wp-content/uploads/2019/08/RDNA_Architecture_public.pdf)
    - <2020> ["RDNA 1.0" Instruction Set Architecture](https://developer.amd.com/wp-content/resources/RDNA_Shader_ISA.pdf)
    - <2020> ["RDNA 2" Instruction Set Architecture](https://developer.amd.com/wp-content/resources/RDNA2_Shader_ISA_November2020.pdf)
    - <2020> [RDNA2 Performance Guide](https://gpuopen.com/performance/)
    - <2022> ["RDNA3" Instruction Set Architecture](https://developer.amd.com/wp-content/resources/RDNA3_Shader_ISA_December2022.pdf)
@@ -155,7 +154,11 @@ Architecture](https://developer.amd.com/wp-content/resources/RDNA_Shader_ISA.pdf
    - <2016> [NVIDIA GeForce GTX 1080 Whitepaper](http://international.download.nvidia.com/geforce-com/international/pdfs/GeForce_GTX_1080_Whitepaper_FINAL.pdf)
  - Turing
    - <2018> [NVIDIA Turing Architecture In-Depth](https://devblogs.nvidia.com/nvidia-turing-architecture-in-depth/)
    - <2018> [NVIDIA TURING GPU ARCHITECTURE](https://images.nvidia.com/aem-dam/en-zz/Solutions/design-visualization/technologies/turing-architecture/NVIDIA-Turing-Architecture-Whitepaper.pdf)
  - Ampere
    - <2021> [NVIDIA AMPERE GA102 GPU ARCHITECTURE](https://images.nvidia.com/aem-dam/en-zz/Solutions/geforce/ampere/pdf/NVIDIA-ampere-GA102-GPU-Architecture-Whitepaper-V1.pdf)
  - Ada
    - <2022> [NVIDIA ADA GPU ARCHITECTURE](https://images.nvidia.com/aem-dam/Solutions/geforce/ada/nvidia-ada-gpu-architecture.pdf)
  - CUDA
    - <2014> [CUDA Pro Tip: Optimized Filtering with Warp-Aggregated Atomics](https://devblogs.nvidia.com/cuda-pro-tip-optimized-filtering-warp-aggregated-atomics/)
  - Talks
-
silvesthu revised this gist
Jan 15, 2023 . 1 changed file with 2 additions and 0 deletions.
@@ -86,6 +86,8 @@
- Bart Wronski [@BartWronsk](https://twitter.com/BartWronsk)
  - [Blog](https://bartwronski.com/)
  - <2014> [GCN – two ways of latency hiding and wave occupancy](https://bartwronski.com/2014/03/27/gcn-two-ways-of-latency-hiding-and-wave-occupancy/)
- Elizabeth Baumel [@Icetigris](https://twitter.com/icetigris)
  - <2016> [Tightening Up the Graphics: Tools and Techniques for Debugging and Optimization](https://docs.google.com/presentation/d/1LQUMIld4SGoQVthnhT1scoA3k4Sg0as14G4NeSiSgFU/)

### By Organization
-
silvesthu revised this gist
Jan 9, 2023 . 1 changed file with 1 addition and 0 deletions.
@@ -148,6 +148,7 @@ Architecture](https://developer.amd.com/wp-content/resources/RDNA_Shader_ISA.pdf
  - <2016> [DX12 Do's And Don'ts](https://developer.nvidia.com/dx12-dos-and-donts)
  - <2021> [Advanced API Performance](https://developer.nvidia.com/blog/tag/advanced-api-performance/)
  - <2022> [Best Practices for Using NVIDIA RTX Ray Tracing (Updated)](https://developer.nvidia.com/blog/best-practices-for-using-nvidia-rtx-ray-tracing-updated/)
  - <2023> [Practical Tips for Optimizing Ray Tracing](https://developer.nvidia.com/blog/practical-tips-for-optimizing-ray-tracing/)
  - Pascal
    - <2016> [NVIDIA GeForce GTX 1080 Whitepaper](http://international.download.nvidia.com/geforce-com/international/pdfs/GeForce_GTX_1080_Whitepaper_FINAL.pdf)
  - Turing
-
silvesthu revised this gist
Jan 9, 2023 . 1 changed file with 1 addition and 3 deletions.
@@ -192,14 +192,12 @@ Architecture](https://developer.amd.com/wp-content/resources/RDNA_Shader_ISA.pdf
- <2020> [GPU ARCHITECTURE RESOURCES](https://interplayoflight.wordpress.com/2020/05/09/gpu-architecture-resources/)

### Graphics Study
- [Graphics Studies Compilation](http://www.adriancourreges.com/blog/2020/12/29/graphics-studies-compilation/)

### For Artist
- [WIP] [Unreal Art Optimization](https://unrealartoptimization.github.io/book/pipelines/pixel/)

### Database
- [PerfTest: GPU shader memory operation performance test tool (with results)](https://github.com/sebbbi/perftest)
- [GPUInfo](https://www.gpuinfo.org/) for Vulkan, OpenGL, OpenGL ES
- [JP] [GPU Spec Database by HYPERでんち](https://dench.flatlib.jp/start#hardware)
-
silvesthu revised this gist
Jan 8, 2023 . 1 changed file with 2 additions and 1 deletion.
@@ -134,7 +134,8 @@
    - <2020> ["RDNA 1.0" Instruction Set Architecture](https://developer.amd.com/wp-content/resources/RDNA_Shader_ISA.pdf)
    - <2020> ["RDNA 2" Instruction Set Architecture](https://developer.amd.com/wp-content/resources/RDNA2_Shader_ISA_November2020.pdf)
    - <2020> [RDNA2 Performance Guide](https://gpuopen.com/performance/)
    - <2022> ["RDNA3" Instruction Set Architecture](https://developer.amd.com/wp-content/resources/RDNA3_Shader_ISA_December2022.pdf)
  - OpenCL
    - [AMD Accelerated Parallel Processing OpenCL Programming Guide](http://developer.amd.com/wordpress/media/2013/07/AMD_Accelerated_Parallel_Processing_OpenCL_Programming_Guide-rev-2.7.pdf)
  - RADEON GPU ANALYZER
-
silvesthu revised this gist
Aug 15, 2022 . 1 changed file with 4 additions and 1 deletion.
@@ -154,8 +154,11 @@ Architecture](https://developer.amd.com/wp-content/resources/RDNA_Shader_ISA.pdf
    - <2018> [NVIDIA TURING GPU ARCHITECTURE](https://www.nvidia.com/content/dam/en-zz/Solutions/design-visualization/technologies/turing-architecture/NVIDIA-Turing-Architecture-Whitepaper.pdf)
  - CUDA
    - <2014> [CUDA Pro Tip: Optimized Filtering with Warp-Aggregated Atomics](https://devblogs.nvidia.com/cuda-pro-tip-optimized-filtering-warp-aggregated-atomics/)
  - Talks
    - <2012> [GPU Performance Analysis and Optimization](http://on-demand.gputechconf.com/gtc/2012/presentations/S0514-GTC2012-GPU-Performance-Analysis.pdf)
    - <2020> [Optimizing Graphics Applications using Nsight Systems and Nsight Graphics](https://developer.nvidia.com/siggraph/2020/video/sigg01)
    - <2021> [The Next Level of Optimization Advice with Nsight Graphics: GPU Trace](https://www.nvidia.com/en-us/on-demand/session/gtcspring21-e32683/)
    - <2022> [(GDC Paywall) Optimizing Ray Tracing GPU Workloads using Nsight Graphics: GPU Trace and Nsight Systems](https://schedule.gdconf.com/session/optimizing-ray-tracing-gpu-workloads-using-nsight-graphics-gpu-trace-and-nsight-systems-presented-by-nvidia/886315)
- Intel
  - [Gamedev](https://software.intel.com/en-us/gamedev)
  - [Intel® Processor Graphics: Architecture & Programming](https://doc.lagout.org/electronics/Intel-Graphics-Architecture-ISA-and-microarchitecture.pdf)
-
silvesthu revised this gist
Aug 15, 2022 . 1 changed file with 1 addition and 1 deletion.
@@ -145,8 +145,8 @@ Architecture](https://developer.amd.com/wp-content/resources/RDNA_Shader_ISA.pdf
  - <2015> [Constant Buffers without Constant Pain](https://developer.nvidia.com/content/constant-buffers-without-constant-pain-0)
  - <2016> [Reading Between The Threads: Shader Intrinsics](https://developer.nvidia.com/reading-between-threads-shader-intrinsics)
  - <2016> [DX12 Do's And Don'ts](https://developer.nvidia.com/dx12-dos-and-donts)
  - <2021> [Advanced API Performance](https://developer.nvidia.com/blog/tag/advanced-api-performance/)
  - <2022> [Best Practices for Using NVIDIA RTX Ray Tracing (Updated)](https://developer.nvidia.com/blog/best-practices-for-using-nvidia-rtx-ray-tracing-updated/)
  - Pascal
    - <2016> [NVIDIA GeForce GTX 1080 Whitepaper](http://international.download.nvidia.com/geforce-com/international/pdfs/GeForce_GTX_1080_Whitepaper_FINAL.pdf)
  - Turing