Revisions

  1. @silvesthu silvesthu revised this gist May 19, 2024. 1 changed file with 2 additions and 1 deletion.
    3 changes: 2 additions & 1 deletion GPUOptimizationForGameDev.md
    Original file line number Diff line number Diff line change
    @@ -246,6 +246,7 @@
    - [GPU Specs Database by techpowerup](https://www.techpowerup.com/gpu-specs/)
    - [GPU database by Matthäus G. Chajdas](https://db.thegpu.guru/)
    - [GPUInfo by Sascha Willems](https://www.gpuinfo.org/) For Vulkan, OpenGL, OpenGL ES
    - [D3d12infoDB by Dmytro Bulatov](https://d3d12infodb.boolka.dev/index.html) Database based on D3d12info in Tools section below
    - (JP) [GPU Spec Database by HYPERでんち](https://dench.flatlib.jp/start#hardware)

    ### Tools
    @@ -273,4 +274,4 @@
    - [PerfTest](https://github.com/sebbbi/perftest) A simple GPU shader memory operation performance test tool. Results on a wide range of GPUs are already available
    - [D3d12info by Adam Sawicki](https://github.com/sawickiap/D3d12info) Get GPU information through DXGI and Direct3D 12 (D3D12) + AMD AGS, NVAPI, WinAPI, and some other sources

    Thanks JoseEmilio-ARM for ARM part.
    Thanks JoseEmilio-ARM for ARM part.
  2. @silvesthu silvesthu revised this gist May 12, 2024. 1 changed file with 18 additions and 8 deletions.
    26 changes: 18 additions & 8 deletions GPUOptimizationForGameDev.md
    @@ -21,6 +21,7 @@
    - [Blog](https://therealmjp.github.io/)
    - 2018 - [Breaking Down Barriers](https://therealmjp.github.io/posts/breaking-down-barriers-part-1-whats-a-barrier/)
    - 2021 - [The Shader Permutation Problem](https://therealmjp.github.io/posts/shader-permutations-part1/)
    - 2022 - [GPU Memory Pools in D3D12](https://therealmjp.github.io/posts/gpu-memory-pool/)
    - Louis Bavoil [@louisbavoil](https://twitter.com/louisbavoil)
    - [D3D11 Vendor Hacks](https://docs.google.com/spreadsheets/d/1J_HIRVlYK8iI4u6AJrCeb66L5W36UDkd9ExSCku9s_o/edit#gid=0)
    - 2018 - [The Peak-Performance-Percentage Analysis Method for Optimizing Any GPU Workload](https://devblogs.nvidia.com/the-peak-performance-analysis-method-for-optimizing-any-gpu-workload/)
    @@ -34,7 +35,7 @@
    - 2018 - [Understanding GPU context rolls](https://gpuopen.com/understanding-gpu-context-rolls/)
    - Michal Drobot [@michaldrobot](https://twitter.com/michaldrobot)
    - [Blog](https://michaldrobot.com/)
    - 2014 - [Low Level Optimizations for GCN – Digital Dragons 2014](https://michaldrobot.com/2014/05/12/low-level-optimizations-for-gcn-digital-dragons-2014-slides/)
    - 2014 - [Low Level Optimizations for GCN – Digital Dragons 2014](https://michaldrobot.com/2014/05/12/low-level-optimizations-for-gcn-digital-dragons-2014-slides/) - [Video](https://www.youtube.com/watch?v=Bmy3Tt3Ottc)
    - 2014 - [GCN Execution Patterns in Full Screen Passes](https://michaldrobot.com/2014/04/01/gcn-execution-patterns-in-full-screen-passes/)
    - 2014 - [ShaderFastLibs](https://github.com/michaldrobot/ShaderFastLibs)
    - Kostas Anagnostou [@KostasAAA](https://twitter.com/KostasAAA)
    @@ -151,13 +152,6 @@
    - 2020 - ["RDNA 2" Instruction Set Architecture](https://developer.amd.com/wp-content/resources/RDNA2_Shader_ISA_November2020.pdf)
    - 2020 - [RDNA2 Performance Guide](https://gpuopen.com/performance/)
    - 2022 - ["RDNA3" Instruction Set Architecture](https://www.amd.com/content/dam/amd/en/documents/radeon-tech-docs/instruction-set-architectures/rdna3-shader-instruction-set-architecture-feb-2023_0.pdf)
    - From application (CPU) to hardware (GPU)
    - [AMD Open Source Driver for Vulkan](https://github.com/GPUOpen-Drivers/AMDVLK) User Mode Driver (Vulkan), amdvlk64
    - [Vulkan API Layer (XGL)](https://github.com/GPUOpen-Drivers/xgl)
    - [LLVM-Based Pipeline Compiler (LLPC)](https://github.com/GPUOpen-Drivers/llpc#llvm-based-pipeline-compiler-llpc)
    - [Platform Abstraction Library (PAL)](https://github.com/GPUOpen-Drivers/pal)
    - [linux/drivers/gpu/drm/amd/](https://github.com/torvalds/linux/tree/master/drivers/gpu/drm/amd) Kernel Mode Driver (Linux), libdrm_amdgpu
    - [Micro engine scheduler (MES) firmware](https://gpuopen.com/download/documentation/micro_engine_scheduler.pdf)
    - OpenCL
    - 2013 - [AMD Accelerated Parallel Processing OpenCL Programming Guide](https://web.archive.org/web/20220426031939/http://developer.amd.com/wordpress/media/2013/07/AMD_Accelerated_Parallel_Processing_OpenCL_Programming_Guide-rev-2.7.pdf)
    - Radeon GPU Analyzer / Radeon Raytracing Analyzer
    @@ -166,6 +160,14 @@
    - 2019 - [USING RADEON™ GPU ANALYZER WITH DIRECT3D®12 COMPUTE](https://gpuopen.com/learn/radeon-gpu-analyzer-2-2-direct3d12-compute/)
    - 2022 - [Visualizing VGPR Pressure with Radeon™ GPU Analyzer 2.6](https://gpuopen.com/learn/visualizing-vgpr-pressure-with-rga-2-6/)
    - 2022 - [Improving raytracing performance with the Radeon™ Raytracing Analyzer (RRA)](https://gpuopen.com/learn/improving-rt-perf-with-rra/)
    - Driver Stack
    - [User Mode Driver for Vulkan (AMDVLK) by AMD](https://github.com/GPUOpen-Drivers/AMDVLK)
    - [Vulkan API Layer (XGL)](https://github.com/GPUOpen-Drivers/xgl)
    - [LLVM-Based Pipeline Compiler (LLPC)](https://github.com/GPUOpen-Drivers/llpc#llvm-based-pipeline-compiler-llpc)
    - [Platform Abstraction Library (PAL)](https://github.com/GPUOpen-Drivers/pal)
    - [User Mode Driver for Vulkan (RADV) by Mesa](https://docs.mesa3d.org/drivers/radv.html)
    - [Kernel Mode Driver by Linux, libdrm_amdgpu](https://github.com/torvalds/linux/tree/master/drivers/gpu/drm/amd)
    - [Micro engine scheduler (MES) firmware](https://gpuopen.com/download/documentation/micro_engine_scheduler.pdf)
    - Nvidia
    - [Developer Blog](https://developer.nvidia.com/blog) and Talks
    - 2012 - [GPU Performance Analysis and Optimization](http://on-demand.gputechconf.com/gtc/2012/presentations/S0514-GTC2012-GPU-Performance-Analysis.pdf)
    @@ -197,6 +199,11 @@
    - 2023 - [Tuning CUDA Applications for NVIDIA Ada GPU Architecture](https://docs.nvidia.com/cuda/ada-tuning-guide/index.html)
    - CUDA
    - 2014 - [CUDA Pro Tip: Optimized Filtering with Warp-Aggregated Atomics](https://devblogs.nvidia.com/cuda-pro-tip-optimized-filtering-warp-aggregated-atomics/)
    - Driver Stack
    - [User Mode Driver for Vulkan (NVK) by Mesa](https://docs.mesa3d.org/drivers/nvk.html)
    - [Kernel Mode Driver by Linux (Nouveau)](https://github.com/torvalds/linux/tree/master/drivers/gpu/drm/nouveau)
    - [Kernel Mode Driver by Nvidia](https://github.com/NVIDIA/open-gpu-kernel-modules)
    - [Documentation of NVIDIA chip/hardware interfaces (open-gpu-doc)](https://github.com/nvidia/open-gpu-doc)
    - Intel
    - [Gamedev](https://software.intel.com/en-us/gamedev)
    - [Intel® Processor Graphics: Architecture & Programming](https://doc.lagout.org/electronics/Intel-Graphics-Architecture-ISA-and-microarchitecture.pdf)
    @@ -207,6 +214,9 @@
    - [Vulkan Samples](https://github.com/KhronosGroup/Vulkan-Samples)
    - 2019 - [Optimising a AAA Vulkan Title on Desktop](https://www.youtube.com/watch?v=Hc_i6X3qU08)
    - 2020 - [Vulkan Ray Tracing Best Practices for Hybrid Rendering](https://www.khronos.org/blog/vulkan-ray-tracing-best-practices-for-hybrid-rendering)
    - Apple
    - 2023 - [Explore GPU advancements in M3 and A17 Pro](https://developer.apple.com/videos/play/tech-talks/111375)
    - 2023 - [Learn performance best practices for Metal shaders](https://developer.apple.com/videos/play/tech-talks/111373)
    - Arm
    - [Introducing the Arm architecture](https://developer.arm.com/documentation/102404/0201)
    - [Arm GPU Best Practices Developer Guide](https://developer.arm.com/documentation/101897/0301)
  3. @silvesthu silvesthu revised this gist May 11, 2024. 1 changed file with 8 additions and 5 deletions.
    13 changes: 8 additions & 5 deletions GPUOptimizationForGameDev.md
    @@ -117,7 +117,7 @@
    - AMD
    - [GPU Open](https://gpuopen.com/) and Talks
    - [Events Presentations](https://gpuopen.com/events/)
    - [AMD GPU ISA documentation (GCN,Vega,CDNA,RDNA,RDNA2)](https://gpuopen.com/amd-isa-documentation/)
    - [AMD GPU architecture programming documentation (Instruction Set Architecture)](https://gpuopen.com/amd-gpu-architecture-programming-documentation/)
    - 2014 - [Vertex Shader Tricks](https://www.slideshare.net/DevCentralAMD/vertex-shader-tricks-bill-bilodeau)
    - 2016 - [Leveraging asynchronous queues for concurrent execution](https://gpuopen.com/concurrent-execution-asynchronous-queues/)
    - 2016 - [AMD GCN Assembly: Cross-Lane Operations](https://gpuopen.com/learn/amd-gcn-assembly-cross-lane-operations/)
    @@ -151,10 +151,13 @@
    - 2020 - ["RDNA 2" Instruction Set Architecture](https://developer.amd.com/wp-content/resources/RDNA2_Shader_ISA_November2020.pdf)
    - 2020 - [RDNA2 Performance Guide](https://gpuopen.com/performance/)
    - 2022 - ["RDNA3" Instruction Set Architecture](https://www.amd.com/content/dam/amd/en/documents/radeon-tech-docs/instruction-set-architectures/rdna3-shader-instruction-set-architecture-feb-2023_0.pdf)
    - Github
    - [Platform Abstraction Library (PAL)](https://github.com/GPUOpen-Drivers/pal)
    - [GPUOpen-Drivers/pal on Github](https://github.com/GPUOpen-Drivers/pal)
    - [AMD Open Source Driver for Vulkan](https://github.com/GPUOpen-Drivers/AMDVLK)
    - From application (CPU) to hardware (GPU)
    - [AMD Open Source Driver for Vulkan](https://github.com/GPUOpen-Drivers/AMDVLK) User Mode Driver (Vulkan), amdvlk64
    - [Vulkan API Layer (XGL)](https://github.com/GPUOpen-Drivers/xgl)
    - [LLVM-Based Pipeline Compiler (LLPC)](https://github.com/GPUOpen-Drivers/llpc#llvm-based-pipeline-compiler-llpc)
    - [Platform Abstraction Library (PAL)](https://github.com/GPUOpen-Drivers/pal)
    - [linux/drivers/gpu/drm/amd/](https://github.com/torvalds/linux/tree/master/drivers/gpu/drm/amd) Kernel Mode Driver (Linux), libdrm_amdgpu
    - [Micro engine scheduler (MES) firmware](https://gpuopen.com/download/documentation/micro_engine_scheduler.pdf)
    - OpenCL
    - 2013 - [AMD Accelerated Parallel Processing OpenCL Programming Guide](https://web.archive.org/web/20220426031939/http://developer.amd.com/wordpress/media/2013/07/AMD_Accelerated_Parallel_Processing_OpenCL_Programming_Guide-rev-2.7.pdf)
    - Radeon GPU Analyzer / Radeon Raytracing Analyzer
  4. @silvesthu silvesthu revised this gist Jan 19, 2024. 1 changed file with 15 additions and 5 deletions.
    20 changes: 15 additions & 5 deletions GPUOptimizationForGameDev.md
    @@ -22,13 +22,13 @@
    - 2018 - [Breaking Down Barriers](https://therealmjp.github.io/posts/breaking-down-barriers-part-1-whats-a-barrier/)
    - 2021 - [The Shader Permutation Problem](https://therealmjp.github.io/posts/shader-permutations-part1/)
    - Louis Bavoil [@louisbavoil](https://twitter.com/louisbavoil)
    - [D3D11 Vendor Hacks](https://docs.google.com/spreadsheets/d/1J_HIRVlYK8iI4u6AJrCeb66L5W36UDkd9ExSCku9s_o/edit#gid=0)
    - 2018 - [The Peak-Performance-Percentage Analysis Method for Optimizing Any GPU Workload](https://devblogs.nvidia.com/the-peak-performance-analysis-method-for-optimizing-any-gpu-workload/)
    - 2018 - [Fixing the Hyperdrive: Maximizing Rendering Performance on NVIDIA GPUs (Presented by NVIDIA)](https://www.gdcvault.com/play/1024810/Fixing-the-Hyperdrive-Maximizing-Rendering)
    - 2019 - [Optimizing DX12/DXR GPU Workloads using Nsight Graphics: GPU Trace and the Peak-Performance-Percentage (P3) Method (Presented by NVIDIA)](https://www.gdcvault.com/browse/gdc-19/play/1026202)
    - 2020 - [Optimizing Compute Shaders for L2 Locality using Thread-Group ID Swizzling](https://developer.nvidia.com/blog/optimizing-compute-shaders-for-l2-locality-using-thread-group-id-swizzling/)
    - 2021 - Dana Elifaz - [The Next Level of Optimization Advice with Nsight Graphics: GPU Trace](https://www.nvidia.com/en-us/on-demand/session/gtcspring21-e32683/)
    - 2022 - [(GDC Paywall) Optimizing Ray Tracing GPU Workloads using Nsight Graphics: GPU Trace and Nsight Systems](https://www.gdcvault.com/search.php#&category=free&firstfocus=&keyword=Optimizing+Ray%2BTracing%2BGPU%2BWorkloads%2Busing%2BNsight%2BGraphics)
    - [D3D11 Vendor Hacks](https://docs.google.com/spreadsheets/d/1J_HIRVlYK8iI4u6AJrCeb66L5W36UDkd9ExSCku9s_o/edit#gid=0)
    - Rys Sommefeldt [@ryszu](https://twitter.com/ryszu)
    - [Blog](https://rys.sommefeldt.com/)
    - 2018 - [Understanding GPU context rolls](https://gpuopen.com/understanding-gpu-context-rolls/)
    @@ -56,7 +56,6 @@
    - Maurizio Cerrato [@speedwago](https://twitter.com/speedwago)
    - 2019 - [GPU Architectures](https://drive.google.com/file/d/12ahbqGXNfY3V-1Gj5cvne2AH4BFWZHGD/view)
    - Sebastian Aaltonen [@SebAaltonen](https://twitter.com/SebAaltonen)
    - [Blog](https://www.secondorder.com/)
    - 2017 - [Optimizing GPU occupancy and resource usage with large thread groups](https://gpuopen.com/learn/optimizing-gpu-occupancy-resource-usage-large-thread-groups/)
    - 2018 - [DD2018: Sebastian Aaltonen - GPU based clay simulation and ray tracing tech in Claybook](https://www.youtube.com/watch?v=Xpf7Ua3UqOA)
    - 2018 - [This is how I managed to port Claybook from consoles to ~4x slower handheld](https://threadreaderapp.com/thread/1076765876148490240.html)
    @@ -70,7 +69,7 @@
    - [Blog](https://fgiesen.wordpress.com/)
    - 2010 - [Finish your derivations, please](https://fgiesen.wordpress.com/2010/10/21/finish-your-derivations-please/)
    - 2011 - [A trip through the Graphics Pipeline 2011](https://fgiesen.wordpress.com/2011/07/09/a-trip-through-the-graphics-pipeline-2011-index/)
    - Timothy Lottes
    - Timothy Lottes [@NOTimothyLottes](https://twitter.com/NOTimothyLottes)
    - 2016 - [Understanding Memory Coalescing on GCN](https://gpuopen.com/gcn-memory-coalescing/)
    - 2017 - [ADVANCED SHADER PROGRAMMING ON GCN](https://gpuopen.com/wp-content/uploads/2017/03/GDC2017-Advanced-Shader-Programming-On-GCN.pdf)
    - 2018 - [Engine Optimization Hot Lap](https://slideplayer.com/slide/17173687/)
    @@ -106,6 +105,12 @@
    - Jendrik Illner [@jendrikillner](https://twitter.com/jendrikillner)
    - [Blog](https://www.jendrikillner.com/)
    - [Graphics Programming Weekly Article Database](https://www.jendrikillner.com/article_database/) Not specifically on optimization. Has a search bar.
    - Hans-Kristian [@Themaister](https://twitter.com/Themaister)
    - [Blog](https://themaister.net/blog/)
    - 2024 - [Modernizing Granite’s mesh rendering](https://themaister.net/blog/2024/01/17/modernizing-granites-mesh-rendering/)
    - Graham Wihlidal [@gwihlidal](https://twitter.com/gwihlidal)
    - [Blog](https://www.wihlidal.com/blog/)
    - 2016 - [Optimizing the Graphics Pipeline With Compute](https://www.gdcvault.com/play/1023109/Optimizing-the-Graphics-Pipeline-With)

    ### By Organization

    @@ -206,7 +211,6 @@
    - [Accelerating 2D Applications](https://developer.arm.com/solutions/graphics/developer-guides/accelerating-2d-applications)
    - GDC
    - [Advanced Graphics Summit](https://gdcvault.com/search.php#&category=free&firstfocus=&keyword=advanced+graphics%2Bsummit) Not specifically on optimization
    - 2016 - [Optimizing the Graphics Pipeline With Compute](https://www.gdcvault.com/play/1023109/Optimizing-the-Graphics-Pipeline-With)
    - (JP) CEDEC
    - 2016 - [GPU最適化入門](https://www.slideshare.net/ssuser2e676d/gpu-65502505)
    - (Book) [マンガとイラストでわかる! GPU最適化入門](https://www.amazon.co.jp/%E3%83%9E%E3%83%B3%E3%82%AC%E3%81%A8%E3%82%A4%E3%83%A9%E3%82%B9%E3%83%88%E3%81%A7%E3%82%8F%E3%81%8B%E3%82%8B-GPU%E6%9C%80%E9%81%A9%E5%8C%96%E5%85%A5%E9%96%80-%E5%B0%8F%E5%8F%A3-%E8%B2%B4%E5%BC%98/dp/4862465048)
    @@ -215,11 +219,16 @@
    - 2009 - [From Shader Code to a Teraflop: How Shader Cores Work](https://web.archive.org/web/20181008131455/http://s09.idav.ucdavis.edu/talks/02_kayvonf_gpuArchTalk09.pdf)
    - 2020 - [LOW-LEVEL OPTIMIZATIONS IN THE LAST OF US PART II](https://www.naughtydog.com/blog/naughty_dog_at_siggraph_2020)
    - CMU
    - 2017 - [Parallel Computer Architecture and Programming](http://15418.courses.cs.cmu.edu/tsinghua2017/home)
    - 2017 - [Parallel Computer Architecture and Programming](http://15418.courses.cs.cmu.edu/spring2017/home), [Tsinghua ver. with video](http://15418.courses.cs.cmu.edu/tsinghua2017/home)

    ### Game Graphics Study
    - [Graphics Studies Compilation](http://www.adriancourreges.com/blog/2020/12/29/graphics-studies-compilation/)

    ### GPU Crash Debugging
    - 2018 - [Aftermath: Advances in GPU Crash Debugging](https://www.youtube.com/watch?v=VaGcs5-W6S4)
    - 2020 - (JP) [Device Removal の処方箋](https://cedil.cesa.or.jp/cedil_sessions/view/2258), [補足資料](https://shikihuiku.github.io/post/cedec2020_prescriptions_for_deviceremoval/)
    - 2023 - [GPU Crash Debugging in Unreal Engine: Tools, Techniques, and Best Practices | Unreal Fest 2023](https://www.youtube.com/watch?v=CyrGLMmVUAI)

    ### GPU Database
    - [GPU Specs Database by techpowerup](https://www.techpowerup.com/gpu-specs/)
    - [GPU database by Matthäus G. Chajdas](https://db.thegpu.guru/)
    @@ -234,6 +243,7 @@
    - [GPUView](https://docs.microsoft.com/en-us/windows-hardware/drivers/display/using-gpuview)
    - [PIX](https://devblogs.microsoft.com/pix/introduction/)
    - [Adding performance instrumentation for PIX APIs](https://www.youtube.com/watch?v=ICM56FI97Ts)
    - [DRED](https://devblogs.microsoft.com/directx/dred/), [D3DDred.js](https://github.com/Microsoft/DirectX-Debugging-Tools)
    - Nvidia - [NVIDIA Developer Tools](https://developer.nvidia.com/tools-overview)
    - [NVIDIA Nsight Graphics](https://developer.nvidia.com/nsight-graphics)
    - [NVIDIA Nsight Aftermath SDK](https://developer.nvidia.com/nsight-aftermath)
  5. @silvesthu silvesthu revised this gist Jan 17, 2024. 1 changed file with 1 addition and 0 deletions.
    1 change: 1 addition & 0 deletions GPUOptimizationForGameDev.md
    @@ -135,6 +135,7 @@
    - 2020 - [CONCURRENCY MODEL IN EXPLICIT GRAPHICS APIS](https://gpuopen.com/wp-content/uploads/2020/06/GPUOpen_Concurrency_vTUM.pdf)
    - 2021 - [Understanding Graphs in Radeon GPU Profiler and GPUView](https://gpuopen.com/learn/understanding-graphs-in-radeon-gpu-profiler-and-gpuview/)
    - 2023 - [Occupancy explained](https://gpuopen.com/learn/occupancy-explained/)
    - 2024 - [Mesh shaders: optimization and best practices](https://gpuopen.com/learn/mesh_shaders/mesh_shaders-optimization_and_best_practices/)
    - GCN
    - 2013 - [GCN3 Instruction Set Architecture](http://developer.amd.com/wordpress/media/2013/12/AMD_GCN3_Instruction_Set_Architecture_rev1.1.pdf)
    - 2019 - [AMD GCN ISA: a first dive](https://giordi91.github.io/post/vegaisa/)
  6. @silvesthu silvesthu revised this gist Jan 14, 2024. 1 changed file with 2 additions and 3 deletions.
    5 changes: 2 additions & 3 deletions GPUOptimizationForGameDev.md
    @@ -138,17 +138,16 @@
    - GCN
    - 2013 - [GCN3 Instruction Set Architecture](http://developer.amd.com/wordpress/media/2013/12/AMD_GCN3_Instruction_Set_Architecture_rev1.1.pdf)
    - 2019 - [AMD GCN ISA: a first dive](https://giordi91.github.io/post/vegaisa/)
    - [GPUOpen-Drivers/pal on Github](https://github.com/GPUOpen-Drivers/pal)
    - [AMD-FirePro/SDK on Github](https://github.com/AMD-FirePro/SDK/tree/master/documentation)
    - RDNA
    - 2019 - [INTRODUCING RDNA ARCHITECTURE](https://www.amd.com/system/files/documents/rdna-whitepaper.pdf)
    - 2019 - [RDNA Architecture](https://gpuopen.com/wp-content/uploads/2019/08/RDNA_Architecture_public.pdf)
    - 2020 - ["RDNA 1.0" Instruction Set Architecture](https://developer.amd.com/wp-content/resources/RDNA_Shader_ISA.pdf)
    - 2020 - ["RDNA 2" Instruction Set Architecture](https://developer.amd.com/wp-content/resources/RDNA2_Shader_ISA_November2020.pdf)
    - 2020 - [RDNA2 Performance Guide](https://gpuopen.com/performance/)
    - 2022 - ["RDNA3" Instruction Set Architecture](https://www.amd.com/content/dam/amd/en/documents/radeon-tech-docs/instruction-set-architectures/rdna3-shader-instruction-set-architecture-feb-2023_0.pdf)
    - Driver
    - Github
    - [Platform Abstraction Library (PAL)](https://github.com/GPUOpen-Drivers/pal)
    - [GPUOpen-Drivers/pal on Github](https://github.com/GPUOpen-Drivers/pal)
    - [AMD Open Source Driver for Vulkan](https://github.com/GPUOpen-Drivers/AMDVLK)
    - OpenCL
    - 2013 - [AMD Accelerated Parallel Processing OpenCL Programming Guide](https://web.archive.org/web/20220426031939/http://developer.amd.com/wordpress/media/2013/07/AMD_Accelerated_Parallel_Processing_OpenCL_Programming_Guide-rev-2.7.pdf)
  7. @silvesthu silvesthu revised this gist Jan 13, 2024. 1 changed file with 1 addition and 0 deletions.
    1 change: 1 addition & 0 deletions GPUOptimizationForGameDev.md
    @@ -68,6 +68,7 @@
    - 2014 - [Real-time Rendering Blogs](http://svenandersson.se/2014/realtime-rendering-blogs.html)
    - Fabian Giesen [@rygorous](https://twitter.com/rygorous)
    - [Blog](https://fgiesen.wordpress.com/)
    - 2010 - [Finish your derivations, please](https://fgiesen.wordpress.com/2010/10/21/finish-your-derivations-please/)
    - 2011 - [A trip through the Graphics Pipeline 2011](https://fgiesen.wordpress.com/2011/07/09/a-trip-through-the-graphics-pipeline-2011-index/)
    - Timothy Lottes
    - 2016 - [Understanding Memory Coalescing on GCN](https://gpuopen.com/gcn-memory-coalescing/)
  8. @silvesthu silvesthu revised this gist Jan 13, 2024. 1 changed file with 5 additions and 3 deletions.
    8 changes: 5 additions & 3 deletions GPUOptimizationForGameDev.md
    @@ -243,8 +243,10 @@
    - [Radeon GPU Detective](https://gpuopen.com/radeon-gpu-detective/)
    - Intel
    - [Intel Graphics Performance Analyzers](https://software.intel.com/content/www/us/en/develop/tools/graphics-performance-analyzers.html)
    - Utility
    - [PerfTest](https://github.com/sebbbi/perftest) A simple GPU shader memory operation performance test tool
    - [D3d12info by Adam Sawicki](https://github.com/sawickiap/D3d12info) Get GPU information through DXGI and Direct3D 12 (D3D12) + AMD AGS, NVAPI, WinAPI, and some other sources
    - Other related tools
    - [RenderDoc](https://renderdoc.org/) Graphics debugger that allows quick and easy single-frame capture and detailed introspection
    - [APITrace](https://apitrace.github.io/) Trace OpenGL, Direct3D, and DirectDraw APIs calls to a file and replay
    - [PerfTest](https://github.com/sebbbi/perftest) A simple GPU shader memory operation performance test tool. Results on a wide range of GPUs are already available
    - [D3d12info by Adam Sawicki](https://github.com/sawickiap/D3d12info) Get GPU information through DXGI and Direct3D 12 (D3D12) + AMD AGS, NVAPI, WinAPI, and some other sources

    Thanks JoseEmilio-ARM for ARM part.
  9. @silvesthu silvesthu revised this gist Jan 12, 2024. 1 changed file with 3 additions and 0 deletions.
    3 changes: 3 additions & 0 deletions GPUOptimizationForGameDev.md
    @@ -102,6 +102,9 @@
    - Anton Schreiner [@antonschrein](https://twitter.com/antonschrein)
    - [Blog](https://aschrein.github.io/)
    - 2019 - [What's up with my branch on GPU?](https://aschrein.github.io/jekyll/update/2019/06/13/whatsup-with-my-branches-on-gpu.html)
    - Jendrik Illner [@jendrikillner](https://twitter.com/jendrikillner)
    - [Blog](https://www.jendrikillner.com/)
    - [Graphics Programming Weekly Article Database](https://www.jendrikillner.com/article_database/) Not specifically on optimization. Has a search bar.

    ### By Organization

  10. @silvesthu silvesthu revised this gist Jan 11, 2024. 1 changed file with 8 additions and 8 deletions.
    16 changes: 8 additions & 8 deletions GPUOptimizationForGameDev.md
    @@ -51,7 +51,6 @@
    - 2018 - [Introduction to compute shaders](https://anteru.net/blog/2018/intro-to-compute-shaders/)
    - 2018 - [More compute shaders](https://anteru.net/blog/2018/more-compute-shaders/)
    - 2018 - [Even more compute shaders](https://anteru.net/blog/2018/even-more-compute-shaders/)
    - [GPU database](https://db.thegpu.guru/)
    - Matthijs De Smedt [@anji_nl](https://twitter.com/anji_nl)
    - 2016 - [PC GPU Performance Hot Spots](https://developer.nvidia.com/pc-gpu-performance-hot-spots)
    - Maurizio Cerrato [@speedwago](https://twitter.com/speedwago)
    @@ -61,7 +60,6 @@
    - 2017 - [Optimizing GPU occupancy and resource usage with large thread groups](https://gpuopen.com/learn/optimizing-gpu-occupancy-resource-usage-large-thread-groups/)
    - 2018 - [DD2018: Sebastian Aaltonen - GPU based clay simulation and ray tracing tech in Claybook](https://www.youtube.com/watch?v=Xpf7Ua3UqOA)
    - 2018 - [This is how I managed to port Claybook from consoles to ~4x slower handheld](https://threadreaderapp.com/thread/1076765876148490240.html)
    - 2020 - [PerfTest](https://github.com/sebbbi/perftest) A simple GPU shader memory operation performance test tool
    - Layla Mah [@MissQuickstep](https://twitter.com/missquickstep)
    - 2013 - [The AMD GCN Architecture - A Crash Course](https://www.slideshare.net/DevCentralAMD/gs4106-the-amd-gcn-architecture-a-crash-course-by-layla-mah)
    - 2013 - [Powering the Next Generation of Graphics: The AMD GCN Architecture](https://www.gdcvault.com/play/1019294/Powering-the-Next-Generation-of)
    @@ -203,13 +201,13 @@
    - [Principles of High Performance](https://developer.arm.com/solutions/graphics/developer-guides/principles-of-high-performance)
    - [Accelerating 2D Applications](https://developer.arm.com/solutions/graphics/developer-guides/accelerating-2d-applications)
    - GDC
    - [Advanced Graphics Summit](https://gdcvault.com/search.php#&category=free&firstfocus=&keyword=advanced+graphics%2Bsummit) talks, not specifically on optimization
    - [Advanced Graphics Summit](https://gdcvault.com/search.php#&category=free&firstfocus=&keyword=advanced+graphics%2Bsummit) Not specifically on optimization
    - 2016 - [Optimizing the Graphics Pipeline With Compute](https://www.gdcvault.com/play/1023109/Optimizing-the-Graphics-Pipeline-With)
    - (JP) CEDEC
    - 2016 - [GPU最適化入門](https://www.slideshare.net/ssuser2e676d/gpu-65502505)
    - (Book) [マンガとイラストでわかる! GPU最適化入門](https://www.amazon.co.jp/%E3%83%9E%E3%83%B3%E3%82%AC%E3%81%A8%E3%82%A4%E3%83%A9%E3%82%B9%E3%83%88%E3%81%A7%E3%82%8F%E3%81%8B%E3%82%8B-GPU%E6%9C%80%E9%81%A9%E5%8C%96%E5%85%A5%E9%96%80-%E5%B0%8F%E5%8F%A3-%E8%B2%B4%E5%BC%98/dp/4862465048)
    - Siggraph
    - [Advances in Real-Time Rendering in Games](https://advances.realtimerendering.com/), not specifically on optimization
    - [Advances in Real-Time Rendering in Games](https://advances.realtimerendering.com/) Not specifically on optimization
    - 2009 - [From Shader Code to a Teraflop: How Shader Cores Work](https://web.archive.org/web/20181008131455/http://s09.idav.ucdavis.edu/talks/02_kayvonf_gpuArchTalk09.pdf)
    - 2020 - [LOW-LEVEL OPTIMIZATIONS IN THE LAST OF US PART II](https://www.naughtydog.com/blog/naughty_dog_at_siggraph_2020)
    - CMU
    @@ -220,13 +218,14 @@

    ### GPU Database
    - [GPU Specs Database by techpowerup](https://www.techpowerup.com/gpu-specs/)
    - [GPUInfo by Sascha Willems](https://www.gpuinfo.org/) for Vulkan, OpenGL, OpenGL ES
    - [GPU database by Matthäus G. Chajdas](https://db.thegpu.guru/)
    - [GPUInfo by Sascha Willems](https://www.gpuinfo.org/) For Vulkan, OpenGL, OpenGL ES
    - (JP) [GPU Spec Database by HYPERでんち](https://dench.flatlib.jp/start#hardware)

    ### Tools
    - Online Shader Compiler
    - [Compiler Explorer (godbolt)](https://godbolt.org/), support DXC, AMD RGA
    - [Shader Playground](http://shader-playground.timjones.io/), support DXC, FXC, glslang, hlsl2glsl, hlslparser, IntelShaderAnalyzer, AMD RGA, slang, XShaderCompiler
    - [Compiler Explorer (godbolt)](https://godbolt.org/) Support DXC, AMD RGA
    - [Shader Playground](http://shader-playground.timjones.io/) Support DXC, FXC, glslang, hlsl2glsl, hlslparser, IntelShaderAnalyzer, AMD RGA, slang, XShaderCompiler
    - Microsoft
    - [GPUView](https://docs.microsoft.com/en-us/windows-hardware/drivers/display/using-gpuview)
    - [PIX](https://devblogs.microsoft.com/pix/introduction/)
    @@ -242,6 +241,7 @@
    - Intel
    - [Intel Graphics Performance Analyzers](https://software.intel.com/content/www/us/en/develop/tools/graphics-performance-analyzers.html)
    - Utility
    - [D3d12info by Adam Sawicki](https://github.com/sawickiap/D3d12info), get GPU information through DXGI and Direct3D 12 (D3D12) + AMD AGS, NVAPI, WinAPI, and some other sources
    - [PerfTest](https://github.com/sebbbi/perftest) A simple GPU shader memory operation performance test tool
    - [D3d12info by Adam Sawicki](https://github.com/sawickiap/D3d12info) Get GPU information through DXGI and Direct3D 12 (D3D12) + AMD AGS, NVAPI, WinAPI, and some other sources

    Thanks JoseEmilio-ARM for ARM part.
  11. @silvesthu silvesthu revised this gist Jan 11, 2024. 1 changed file with 5 additions and 1 deletion.
    6 changes: 5 additions & 1 deletion GPUOptimizationForGameDev.md
    @@ -45,6 +45,7 @@
    - 2020 - [WHAT IS SHADER OCCUPANCY AND WHY DO WE CARE ABOUT IT?](https://interplayoflight.wordpress.com/2020/11/11/what-is-shader-occupancy-and-why-do-we-care-about-it/)
    - 2020 - [TO Z-PREPASS OR NOT TO Z-PREPASS](https://interplayoflight.wordpress.com/2020/12/21/to-z-prepass-or-not-to-z-prepass/)
    - 2022 - [SHADER TIPS AND TRICKS](https://interplayoflight.wordpress.com/2022/01/22/shader-tips-and-tricks/)
    - 2023 - [LOW-LEVEL THINKING IN HIGH-LEVEL SHADING LANGUAGES 2023](https://interplayoflight.wordpress.com/2023/12/29/low-level-thinking-in-high-level-shading-languages-2023/)
    - Matthäus G. Chajdas [@NIV_Anteru](https://twitter.com/niv_anteru)
    - [Blog](https://anteru.net/)
    - 2018 - [Introduction to compute shaders](https://anteru.net/blog/2018/intro-to-compute-shaders/)
    @@ -148,7 +149,7 @@
    - [Platform Abstraction Library (PAL)](https://github.com/GPUOpen-Drivers/pal)
    - [AMD Open Source Driver for Vulkan](https://github.com/GPUOpen-Drivers/AMDVLK)
    - OpenCL
    - [AMD Accelerated Parallel Processing OpenCL Programming Guide](http://developer.amd.com/wordpress/media/2013/07/AMD_Accelerated_Parallel_Processing_OpenCL_Programming_Guide-rev-2.7.pdf)
    - 2013 - [AMD Accelerated Parallel Processing OpenCL Programming Guide](https://web.archive.org/web/20220426031939/http://developer.amd.com/wordpress/media/2013/07/AMD_Accelerated_Parallel_Processing_OpenCL_Programming_Guide-rev-2.7.pdf)
    - Radeon GPU Analyzer / Radeon Raytracing Analyzer
    - 2017 - [Live VGPR Analysis with Radeon™ GPU Analyzer](https://gpuopen.com/learn/live-vgpr-analysis-radeon-gpu-analyzer/)
    - 2019 - [USING RADEON™ GPU ANALYZER WITH DIRECTX®12 GRAPHICS](https://gpuopen.com/learn/radeon-gpu-analyzer-2-3-direct3d-12-graphics/)
    @@ -229,6 +230,7 @@
    - Microsoft
    - [GPUView](https://docs.microsoft.com/en-us/windows-hardware/drivers/display/using-gpuview)
    - [PIX](https://devblogs.microsoft.com/pix/introduction/)
    - [Adding performance instrumentation for PIX APIs](https://www.youtube.com/watch?v=ICM56FI97Ts)
    - Nvidia - [NVIDIA Developer Tools](https://developer.nvidia.com/tools-overview)
    - [NVIDIA Nsight Graphics](https://developer.nvidia.com/nsight-graphics)
    - [NVIDIA Nsight Aftermath SDK](https://developer.nvidia.com/nsight-aftermath)
    @@ -239,5 +241,7 @@
    - [Radeon GPU Detective](https://gpuopen.com/radeon-gpu-detective/)
    - Intel
    - [Intel Graphics Performance Analyzers](https://software.intel.com/content/www/us/en/develop/tools/graphics-performance-analyzers.html)
    - Utility
    - [D3d12info by Adam Sawicki](https://github.com/sawickiap/D3d12info), get GPU information through DXGI and Direct3D 12 (D3D12) + AMD AGS, NVAPI, WinAPI, and some other sources

    Thanks JoseEmilio-ARM for ARM part.
  12. @silvesthu silvesthu revised this gist Jan 1, 2024. 1 changed file with 4 additions and 3 deletions.
    7 changes: 4 additions & 3 deletions GPUOptimizationForGameDev.md
    @@ -60,6 +60,7 @@
    - 2017 - [Optimizing GPU occupancy and resource usage with large thread groups](https://gpuopen.com/learn/optimizing-gpu-occupancy-resource-usage-large-thread-groups/)
    - 2018 - [DD2018: Sebastian Aaltonen - GPU based clay simulation and ray tracing tech in Claybook](https://www.youtube.com/watch?v=Xpf7Ua3UqOA)
    - 2018 - [This is how I managed to port Claybook from consoles to ~4x slower handheld](https://threadreaderapp.com/thread/1076765876148490240.html)
    - 2020 - [PerfTest](https://github.com/sebbbi/perftest) A simple GPU shader memory operation performance test tool
    - Layla Mah [@MissQuickstep](https://twitter.com/missquickstep)
    - 2013 - [The AMD GCN Architecture - A Crash Course](https://www.slideshare.net/DevCentralAMD/gs4106-the-amd-gcn-architecture-a-crash-course-by-layla-mah)
    - 2013 - [Powering the Next Generation of Graphics: The AMD GCN Architecture](https://www.gdcvault.com/play/1019294/Powering-the-Next-Generation-of)
    @@ -216,9 +217,9 @@
    ### Game Graphics Study
    - [Graphics Studies Compilation](http://www.adriancourreges.com/blog/2020/12/29/graphics-studies-compilation/)

    ### Database
    - [PerfTest: GPU shader memory operation performance test tool (with results)](https://github.com/sebbbi/perftest)
    - [GPUInfo](https://www.gpuinfo.org/) for Vulkan, OpenGL, OpenGL ES
    ### GPU Database
    - [GPU Specs Database by techpowerup](https://www.techpowerup.com/gpu-specs/)
    - [GPUInfo by Sascha Willems](https://www.gpuinfo.org/) for Vulkan, OpenGL, OpenGL ES
    - (JP) [GPU Spec Database by HYPERでんち](https://dench.flatlib.jp/start#hardware)

    ### Tools
  13. @silvesthu silvesthu revised this gist Jan 1, 2024. 1 changed file with 10 additions and 4 deletions.
    14 changes: 10 additions & 4 deletions GPUOptimizationForGameDev.md
    @@ -149,9 +149,11 @@
    - OpenCL
    - [AMD Accelerated Parallel Processing OpenCL Programming Guide](http://developer.amd.com/wordpress/media/2013/07/AMD_Accelerated_Parallel_Processing_OpenCL_Programming_Guide-rev-2.7.pdf)
    - Radeon GPU Analyzer / Radeon Raytracing Analyzer
    - [USING RADEON™ GPU ANALYZER WITH DIRECTX®12 GRAPHICS](https://gpuopen.com/learn/radeon-gpu-analyzer-2-3-direct3d-12-graphics/)
    - [USING RADEON™ GPU ANALYZER WITH DIRECT3D®12 COMPUTE](https://gpuopen.com/learn/radeon-gpu-analyzer-2-2-direct3d12-compute/)
    - [Improving raytracing performance with the Radeon™ Raytracing Analyzer (RRA)](https://gpuopen.com/learn/improving-rt-perf-with-rra/)
    - 2017 - [Live VGPR Analysis with Radeon™ GPU Analyzer](https://gpuopen.com/learn/live-vgpr-analysis-radeon-gpu-analyzer/)
    - 2019 - [USING RADEON™ GPU ANALYZER WITH DIRECTX®12 GRAPHICS](https://gpuopen.com/learn/radeon-gpu-analyzer-2-3-direct3d-12-graphics/)
    - 2019 - [USING RADEON™ GPU ANALYZER WITH DIRECT3D®12 COMPUTE](https://gpuopen.com/learn/radeon-gpu-analyzer-2-2-direct3d12-compute/)
    - 2022 - [Visualizing VGPR Pressure with Radeon™ GPU Analyzer 2.6](https://gpuopen.com/learn/visualizing-vgpr-pressure-with-rga-2-6/)
    - 2022 - [Improving raytracing performance with the Radeon™ Raytracing Analyzer (RRA)](https://gpuopen.com/learn/improving-rt-perf-with-rra/)
    - Nvidia
    - [Developer Blog](https://developer.nvidia.com/blog) and Talks
    - 2012 - [GPU Performance Analysis and Optimization](http://on-demand.gputechconf.com/gtc/2012/presentations/S0514-GTC2012-GPU-Performance-Analysis.pdf)
    @@ -169,14 +171,18 @@
    - 2023 - [Advanced API Performance: Shaders](https://developer.nvidia.com/blog/advanced-api-performance-shaders/)
    - Pascal
    - 2016 - [NVIDIA GeForce GTX 1080 Whitepaper](http://international.download.nvidia.com/geforce-com/international/pdfs/GeForce_GTX_1080_Whitepaper_FINAL.pdf)
    - 2023 - [Tuning CUDA Applications for Pascal](https://docs.nvidia.com/cuda/pascal-tuning-guide/index.html)
    - Turing
    - 2018 - [NVIDIA Turing Architecture In-Depth](https://devblogs.nvidia.com/nvidia-turing-architecture-in-depth/)
    - 2018 - [NVIDIA TURING GPU ARCHITECTURE](https://images.nvidia.com/aem-dam/en-zz/Solutions/design-visualization/technologies/turing-architecture/NVIDIA-Turing-Architecture-Whitepaper.pdf)
    - 2023 - [Tuning CUDA Applications for Turing](https://docs.nvidia.com/cuda/turing-compatibility-guide/index.html)
    - Ampere
    - 2021 - [NVIDIA AMPERE GA102 GPU ARCHITECTURE](https://images.nvidia.com/aem-dam/en-zz/Solutions/geforce/ampere/pdf/NVIDIA-ampere-GA102-GPU-Architecture-Whitepaper-V1.pdf)
    - 2020 - [NVIDIA AMPERE GA102 GPU ARCHITECTURE](https://images.nvidia.com/aem-dam/en-zz/Solutions/geforce/ampere/pdf/NVIDIA-ampere-GA102-GPU-Architecture-Whitepaper-V1.pdf)
    - 2023 - [Tuning CUDA Applications for NVIDIA Ampere GPU Architecture](https://docs.nvidia.com/cuda/ampere-tuning-guide/index.html)
    - Ada
    - 2022 - [NVIDIA ADA GPU ARCHITECTURE](https://images.nvidia.com/aem-dam/Solutions/geforce/ada/nvidia-ada-gpu-architecture.pdf)
    - 2022 - [SHADER EXECUTION REORDERING](https://developer.nvidia.com/sites/default/files/akamai/gameworks/ser-whitepaper.pdf)
    - 2023 - [Tuning CUDA Applications for NVIDIA Ada GPU Architecture](https://docs.nvidia.com/cuda/ada-tuning-guide/index.html)
    - CUDA
    - 2014 - [CUDA Pro Tip: Optimized Filtering with Warp-Aggregated Atomics](https://devblogs.nvidia.com/cuda-pro-tip-optimized-filtering-warp-aggregated-atomics/)
    - Intel
  14. @silvesthu silvesthu revised this gist Dec 26, 2023. 1 changed file with 4 additions and 1 deletion.
    5 changes: 4 additions & 1 deletion GPUOptimizationForGameDev.md
    @@ -96,6 +96,7 @@
    - Bart Wronski [@BartWronsk](https://twitter.com/BartWronsk)
    - [Blog](https://bartwronski.com/)
    - 2014 - [GCN – two ways of latency hiding and wave occupancy](https://bartwronski.com/2014/03/27/gcn-two-ways-of-latency-hiding-and-wave-occupancy/)
    - 2021 - [Is this a branch?](https://bartwronski.com/2021/01/18/is-this-a-branch/)
    - Elizabeth Baumel [@Icetigris](https://twitter.com/icetigris)
    - 2016 - [Tightening Up the Graphics: Tools and Techniques for Debugging and Optimization](https://docs.google.com/presentation/d/1LQUMIld4SGoQVthnhT1scoA3k4Sg0as14G4NeSiSgFU/)
    - Anton Schreiner [@antonschrein](https://twitter.com/antonschrein)
    @@ -128,6 +129,8 @@
    - Curing Amnesia and Other GPU Maladies With AMD Developer Tools
    - Radeon™ ProRender Full Spectrum Rendering 2.0: The Universal Rendering API
    - 2020 - [CONCURRENCY MODEL IN EXPLICIT GRAPHICS APIS](https://gpuopen.com/wp-content/uploads/2020/06/GPUOpen_Concurrency_vTUM.pdf)
    - 2021 - [Understanding Graphs in Radeon GPU Profiler and GPUView](https://gpuopen.com/learn/understanding-graphs-in-radeon-gpu-profiler-and-gpuview/)
    - 2023 - [Occupancy explained](https://gpuopen.com/learn/occupancy-explained/)
    - GCN
    - 2013 - [GCN3 Instruction Set Architecture](http://developer.amd.com/wordpress/media/2013/12/AMD_GCN3_Instruction_Set_Architecture_rev1.1.pdf)
    - 2019 - [AMD GCN ISA: a first dive](https://giordi91.github.io/post/vegaisa/)
    @@ -139,7 +142,7 @@
    - 2020 - ["RDNA 1.0" Instruction Set Architecture](https://developer.amd.com/wp-content/resources/RDNA_Shader_ISA.pdf)
    - 2020 - ["RDNA 2" Instruction Set Architecture](https://developer.amd.com/wp-content/resources/RDNA2_Shader_ISA_November2020.pdf)
    - 2020 - [RDNA2 Performance Guide](https://gpuopen.com/performance/)
    - 2022 - ["RDNA3" Instruction Set Architecture](https://developer.amd.com/wp-content/resources/RDNA3_Shader_ISA_December2022.pdf)
    - 2022 - ["RDNA3" Instruction Set Architecture](https://www.amd.com/content/dam/amd/en/documents/radeon-tech-docs/instruction-set-architectures/rdna3-shader-instruction-set-architecture-feb-2023_0.pdf)
    - Driver
    - [Platform Abstraction Library (PAL)](https://github.com/GPUOpen-Drivers/pal)
    - [AMD Open Source Driver for Vulkan](https://github.com/GPUOpen-Drivers/AMDVLK)
  15. @silvesthu silvesthu revised this gist Sep 28, 2023. 1 changed file with 3 additions and 1 deletion.
    4 changes: 3 additions & 1 deletion GPUOptimizationForGameDev.md
    @@ -213,7 +213,9 @@
    - (JP) [GPU Spec Database by HYPERでんち](https://dench.flatlib.jp/start#hardware)

    ### Tools
    - [Shader Playground](http://shader-playground.timjones.io/)
    - Online Shader Compiler
    - [Compiler Explorer (godbolt)](https://godbolt.org/), supports DXC, AMD RGA
    - [Shader Playground](http://shader-playground.timjones.io/), supports DXC, FXC, glslang, hlsl2glsl, hlslparser, IntelShaderAnalyzer, AMD RGA, slang, XShaderCompiler
    - Microsoft
    - [GPUView](https://docs.microsoft.com/en-us/windows-hardware/drivers/display/using-gpuview)
    - [PIX](https://devblogs.microsoft.com/pix/introduction/)
  16. @silvesthu silvesthu revised this gist Sep 2, 2023. 1 changed file with 1 addition and 0 deletions.
    1 change: 1 addition & 0 deletions GPUOptimizationForGameDev.md
    @@ -163,6 +163,7 @@
    - 2021 - [Advanced API Performance](https://developer.nvidia.com/blog/tag/advanced-api-performance/)
    - 2022 - [Best Practices for Using NVIDIA RTX Ray Tracing (Updated)](https://developer.nvidia.com/blog/best-practices-for-using-nvidia-rtx-ray-tracing-updated/)
    - 2023 - [Practical Tips for Optimizing Ray Tracing](https://developer.nvidia.com/blog/practical-tips-for-optimizing-ray-tracing/)
    - 2023 - [Advanced API Performance: Shaders](https://developer.nvidia.com/blog/advanced-api-performance-shaders/)
    - Pascal
    - 2016 - [NVIDIA GeForce GTX 1080 Whitepaper](http://international.download.nvidia.com/geforce-com/international/pdfs/GeForce_GTX_1080_Whitepaper_FINAL.pdf)
    - Turing
  17. @silvesthu silvesthu revised this gist Sep 2, 2023. 1 changed file with 2 additions and 1 deletion.
    3 changes: 2 additions & 1 deletion GPUOptimizationForGameDev.md
    @@ -191,12 +191,13 @@
    - [Principles of High Performance](https://developer.arm.com/solutions/graphics/developer-guides/principles-of-high-performance)
    - [Accelerating 2D Applications](https://developer.arm.com/solutions/graphics/developer-guides/accelerating-2d-applications)
    - GDC
    - Search "Advanced Graphics" in [GDC Vault](https://gdcvault.com/), not generally about optimization though
    - [Advanced Graphics Summit](https://gdcvault.com/search.php#&category=free&firstfocus=&keyword=advanced+graphics%2Bsummit) talks, not specifically on optimization
    - 2016 - [Optimizing the Graphics Pipeline With Compute](https://www.gdcvault.com/play/1023109/Optimizing-the-Graphics-Pipeline-With)
    - (JP) CEDEC
    - 2016 - [GPU最適化入門](https://www.slideshare.net/ssuser2e676d/gpu-65502505)
    - (Book) [マンガとイラストでわかる! GPU最適化入門](https://www.amazon.co.jp/%E3%83%9E%E3%83%B3%E3%82%AC%E3%81%A8%E3%82%A4%E3%83%A9%E3%82%B9%E3%83%88%E3%81%A7%E3%82%8F%E3%81%8B%E3%82%8B-GPU%E6%9C%80%E9%81%A9%E5%8C%96%E5%85%A5%E9%96%80-%E5%B0%8F%E5%8F%A3-%E8%B2%B4%E5%BC%98/dp/4862465048)
    - Siggraph
    - [Advances in Real-Time Rendering in Games](https://advances.realtimerendering.com/), not specifically on optimization
    - 2009 - [From Shader Code to a Teraflop: How Shader Cores Work](https://web.archive.org/web/20181008131455/http://s09.idav.ucdavis.edu/talks/02_kayvonf_gpuArchTalk09.pdf)
    - 2020 - [LOW-LEVEL OPTIMIZATIONS IN THE LAST OF US PART II](https://www.naughtydog.com/blog/naughty_dog_at_siggraph_2020)
    - CMU
  18. @silvesthu silvesthu revised this gist Sep 2, 2023. 1 changed file with 23 additions and 26 deletions.
    49 changes: 23 additions & 26 deletions GPUOptimizationForGameDev.md
    @@ -71,13 +71,11 @@
    - 2011 - [A trip through the Graphics Pipeline 2011](https://fgiesen.wordpress.com/2011/07/09/a-trip-through-the-graphics-pipeline-2011-index/)
    - Timothy Lottes
    - 2016 - [Understanding Memory Coalescing on GCN](https://gpuopen.com/gcn-memory-coalescing/)
    - 2017 - [ADVANCED SHADER PROGRAMMING ON GCN](http://32ipi028l5q82yhj72224m8j.wpengine.netdna-cdn.com/wp-content/uploads/2017/03/GDC2017-Advanced-Shader-Programming-On-GCN.pdf)
    - 2018 - [Engine Optimization Hot Lap](https://32ipi028l5q82yhj72224m8j-wpengine.netdna-ssl.com/wp-content/uploads/2018/05/gdc_2018_sponsored_engine_optimization_hot_lap.pptx)
    - 2017 - [ADVANCED SHADER PROGRAMMING ON GCN](https://gpuopen.com/wp-content/uploads/2017/03/GDC2017-Advanced-Shader-Programming-On-GCN.pdf)
    - 2018 - [Engine Optimization Hot Lap](https://slideplayer.com/slide/17173687/)
    - Robert Menzel [@renderpipeline](https://twitter.com/renderpipeline)
    - [Blog](http://renderingpipeline.com)
    - 2012 - [Low-Level GPU Documentation](http://renderingpipeline.com/graphics-literature/low-level-gpu-documentation/)
    - Stephanie Hurlburt [@sehurlburt](http://stephaniehurlburt.com/blog)
    - 2016 - [Casual Introduction to Low-Level Graphics Programming](http://stephaniehurlburt.com/blog/2016/10/28/casual-introduction-to-low-level-graphics-programming)
    - 2012 - [Low-Level GPU Documentation](https://web.archive.org/web/20160305145630/http://renderingpipeline.com/graphics-literature/low-level-gpu-documentation/)
    - RasterGrid [@rastergrid](https://twitter.com/rastergrid)
    - [Blog](https://rastergrid.com/blog/)
    - 2021 - [Understanding GPU caches](https://rastergrid.com/blog/gpu-tech/2021/01/understanding-gpu-caches/)
    @@ -100,33 +98,26 @@
    - 2014 - [GCN – two ways of latency hiding and wave occupancy](https://bartwronski.com/2014/03/27/gcn-two-ways-of-latency-hiding-and-wave-occupancy/)
    - Elizabeth Baumel [@Icetigris](https://twitter.com/icetigris)
    - 2016 - [Tightening Up the Graphics: Tools and Techniques for Debugging and Optimization](https://docs.google.com/presentation/d/1LQUMIld4SGoQVthnhT1scoA3k4Sg0as14G4NeSiSgFU/)
    - Anton Schreiner [@antonschrein](https://twitter.com/antonschrein)
    - [Blog](https://aschrein.github.io/)
    - 2019 - [What's up with my branch on GPU?](https://aschrein.github.io/jekyll/update/2019/06/13/whatsup-with-my-branches-on-gpu.html)

    ### By Organization

    - GDC
    - Search "Advanced Graphics" in [GDC Vault](https://gdcvault.com/)
    - 2014 - [Vertex Shader Tricks](https://www.slideshare.net/DevCentralAMD/vertex-shader-tricks-bill-bilodeau)
    - 2016 - [Optimizing the Graphics Pipeline With Compute](https://www.gdcvault.com/play/1023109/Optimizing-the-Graphics-Pipeline-With)
    - 2016 - [High-Performance, Low-Overhead Rendering with OpenGL and Vulkan](https://www.gdcvault.com/play/1023516/High-performance-Low-Overhead-Rendering)
    - 2016 - [Practical DirectX 12](https://developer.nvidia.com/sites/default/files/akamai/gameworks/blog/GDC16/GDC16_gthomas_adunn_Practical_DX12.pdf)
    - 2017 - [Wave Programming in D3D12 and Vulkan](http://32ipi028l5q82yhj72224m8j.wpengine.netdna-cdn.com/wp-content/uploads/2017/07/GDC2017-Wave-Programming-D3D12-Vulkan.pdf)
    - 2017 - [D3D12 and Vulkan Done Right](http://32ipi028l5q82yhj72224m8j.wpengine.netdna-cdn.com/wp-content/uploads/2017/03/GDC2017-D3D12-And-Vulkan-Done-Right.pdf)
    - 2017 - [Deep Dive: Asynchronous Compute](http://32ipi028l5q82yhj72224m8j.wpengine.netdna-cdn.com/wp-content/uploads/2017/03/GDC2017-Asynchronous-Compute-Deep-Dive.pdf)
    - 2019 - [DirectX 12 Optimization Techniques in Capcom’s RE ENGINE](https://gpuopen.com/gdc-presentations/2019/gdc-2019-s4-optimization-techniques-re2-dmc5.pdf)
    - 2019 - [A BLEND OF GCN OPTIMIZATION AND COLOR PROCESSING](https://gpuopen.com/gdc-presentations/2019/gdc-2019-s5-blend-of-gcn-optimization-and-color-processing.pdf)
    - (JP) CEDEC
    - 2016 - [GPU最適化入門](https://www.slideshare.net/ssuser2e676d/gpu-65502505)
    - (Book) [マンガとイラストでわかる! GPU最適化入門](https://www.amazon.co.jp/%E3%83%9E%E3%83%B3%E3%82%AC%E3%81%A8%E3%82%A4%E3%83%A9%E3%82%B9%E3%83%88%E3%81%A7%E3%82%8F%E3%81%8B%E3%82%8B-GPU%E6%9C%80%E9%81%A9%E5%8C%96%E5%85%A5%E9%96%80-%E5%B0%8F%E5%8F%A3-%E8%B2%B4%E5%BC%98/dp/4862465048)
    - Siggraph
    - 2020 - [LOW-LEVEL OPTIMIZATIONS IN THE LAST OF US PART II](https://www.naughtydog.com/blog/naughty_dog_at_siggraph_2020)
    - AMD
    - [GPU Open](https://gpuopen.com/) and Talks
    - [Events Presentations](https://gpuopen.com/events/)
    - [AMD GPU ISA documentation (GCN,Vega,CDNA,RDNA,RDNA2)](https://gpuopen.com/amd-isa-documentation/)
    - 2014 - [Vertex Shader Tricks](https://www.slideshare.net/DevCentralAMD/vertex-shader-tricks-bill-bilodeau)
    - 2016 - [Leveraging asynchronous queues for concurrent execution](https://gpuopen.com/concurrent-execution-asynchronous-queues/)
    - 2016 - [AMD GCN Assembly: Cross-Lane Operations](https://gpuopen.com/learn/amd-gcn-assembly-cross-lane-operations/)
    - 2017 - [Wave Programming in D3D12 and Vulkan](https://gpuopen.com/wp-content/uploads/2017/07/GDC2017-Wave-Programming-D3D12-Vulkan.pdf)
    - 2017 - [D3D12 and Vulkan Done Right](https://gpuopen.com/wp-content/uploads/2017/03/GDC2017-D3D12-And-Vulkan-Done-Right.pdf)
    - 2017 - [Deep Dive: Asynchronous Compute](https://gpuopen.com/wp-content/uploads/2017/03/GDC2017-Asynchronous-Compute-Deep-Dive.pdf)
    - 2018 - [Optimize your engine using compute @ 4C Prague 2018](https://gpuopen.com/wp-content/uploads/2018/11/4C-Prague-Compute-Shaders.pptx) | [(Youtube)](https://www.youtube.com/watch?v=0DLOJPSxJEg)
    - 2018 - [Optimization with Radeon GPU Profiler - A Vulkan Case Study](https://gpuopen.com/wp-content/uploads/2018/01/Optimization-with-Radeon-GPU-Profiler.pptx)
    - 2019 - [DirectX 12 Optimization Techniques in Capcom’s RE ENGINE](https://gpuopen.com/gdc-presentations/2019/gdc-2019-s4-optimization-techniques-re2-dmc5.pdf)
    - 2019 - [A BLEND OF GCN OPTIMIZATION AND COLOR PROCESSING](https://gpuopen.com/gdc-presentations/2019/gdc-2019-s5-blend-of-gcn-optimization-and-color-processing.pdf)
    - 2019 - [AMD GPU Performance Revealed](https://gpuopen.com/gdc-presentations/2019/gdc-2019-s6-gpu-performance-revealed.pdf)
    - 2019 - [Triangles Are Precious](https://gpuopen.com/presentations/2019/nordic-game-2019-triangles-are-precious.pdf)
    - 2020 - [Let’s build](https://gpuopen.com/lets-build/)
    @@ -162,8 +153,10 @@
    - [Developer Blog](https://developer.nvidia.com/blog) and Talks
    - 2012 - [GPU Performance Analysis and Optimization](http://on-demand.gputechconf.com/gtc/2012/presentations/S0514-GTC2012-GPU-Performance-Analysis.pdf)
    - 2015 - [Constant Buffers without Constant Pain](https://developer.nvidia.com/content/constant-buffers-without-constant-pain-0)
    - 2016 - [Practical DirectX 12](https://developer.nvidia.com/sites/default/files/akamai/gameworks/blog/GDC16/GDC16_gthomas_adunn_Practical_DX12.pdf)
    - 2016 - [Reading Between The Threads: Shader Intrinsics](https://developer.nvidia.com/reading-between-threads-shader-intrinsics)
    - 2016 - [DX12 Do's And Don'ts](https://developer.nvidia.com/dx12-dos-and-donts)
    - 2016 - [High-Performance, Low-Overhead Rendering with OpenGL and Vulkan](https://www.gdcvault.com/play/1023516/High-performance-Low-Overhead-Rendering)
    - 2019 - [Tips and Tricks: Ray Tracing Best Practices](https://developer.nvidia.com/blog/rtx-best-practices/)
    - 2020 - [Optimizing Graphics Applications using Nsight Systems and Nsight Graphics](https://developer.nvidia.com/siggraph/2020/video/sigg01)
    - 2020 - [RTX Ray Tracing Best Practices](https://www.gdcvault.com/play/1026721/RTX-Ray-Tracing-Best-Practices)
    @@ -197,13 +190,17 @@
    - [Arm GPU Best Practices Developer Guide](https://developer.arm.com/documentation/101897/0301)
    - [Principles of High Performance](https://developer.arm.com/solutions/graphics/developer-guides/principles-of-high-performance)
    - [Accelerating 2D Applications](https://developer.arm.com/solutions/graphics/developer-guides/accelerating-2d-applications)
    - GDC
    - Search "Advanced Graphics" in [GDC Vault](https://gdcvault.com/), not generally about optimization though
    - 2016 - [Optimizing the Graphics Pipeline With Compute](https://www.gdcvault.com/play/1023109/Optimizing-the-Graphics-Pipeline-With)
    - (JP) CEDEC
    - 2016 - [GPU最適化入門](https://www.slideshare.net/ssuser2e676d/gpu-65502505)
    - (Book) [マンガとイラストでわかる! GPU最適化入門](https://www.amazon.co.jp/%E3%83%9E%E3%83%B3%E3%82%AC%E3%81%A8%E3%82%A4%E3%83%A9%E3%82%B9%E3%83%88%E3%81%A7%E3%82%8F%E3%81%8B%E3%82%8B-GPU%E6%9C%80%E9%81%A9%E5%8C%96%E5%85%A5%E9%96%80-%E5%B0%8F%E5%8F%A3-%E8%B2%B4%E5%BC%98/dp/4862465048)
    - Siggraph
    - 2009 - [From Shader Code to a Teraflop: How Shader Cores Work](https://web.archive.org/web/20181008131455/http://s09.idav.ucdavis.edu/talks/02_kayvonf_gpuArchTalk09.pdf)
    - 2020 - [LOW-LEVEL OPTIMIZATIONS IN THE LAST OF US PART II](https://www.naughtydog.com/blog/naughty_dog_at_siggraph_2020)
    - CMU
    - 2017 - [Parallel Computer Architecture and Programming](http://15418.courses.cs.cmu.edu/tsinghua2017/home)
    - Misc
    - 2009 - [From Shader Code to a Teraflop: How Shader Cores Work](https://web.archive.org/web/20181008131455/http://s09.idav.ucdavis.edu/talks/02_kayvonf_gpuArchTalk09.pdf)
    - 2017 - [Demystifying Asynchronous Compute](https://www.reddit.com/r/nvidia/comments/50dqd5/demystifying_asynchronous_compute/)
    - 2019 - [Unity GPU culling experiments](https://www.mpc-rnd.com/unity-gpu-culling-experiments/)
    - 2019 - [What's up with my branch on GPU?](https://aschrein.github.io/jekyll/update/2019/06/13/whatsup-with-my-branches-on-gpu.html)

    ### Game Graphics Study
    - [Graphics Studies Compilation](http://www.adriancourreges.com/blog/2020/12/29/graphics-studies-compilation/)
  19. @silvesthu silvesthu revised this gist Sep 2, 2023. 1 changed file with 18 additions and 15 deletions.
    33 changes: 18 additions & 15 deletions GPUOptimizationForGameDev.md
    @@ -8,6 +8,7 @@
    - 2017 - [GPU Performance for Game Artists](http://fragmentbuffer.com/gpu-performance-for-game-artists/)
    - 2019 - [Understanding the anatomy of GPUs using Pokémon](https://www.ovh.com/blog/understanding-the-anatomy-of-gpus-using-pokemon/)
    - 2020 - [GPU ARCHITECTURE RESOURCES](https://interplayoflight.wordpress.com/2020/05/09/gpu-architecture-resources/)
    - 2020 - [All the pipelines - journey through the GPU](https://www.youtube.com/watch?v=Y2KG_4OxDBg)

    ### By Author

    @@ -25,6 +26,8 @@
    - 2018 - [Fixing the Hyperdrive: Maximizing Rendering Performance on NVIDIA GPUs (Presented by NVIDIA)](https://www.gdcvault.com/play/1024810/Fixing-the-Hyperdrive-Maximizing-Rendering)
    - 2019 - [Optimizing DX12/DXR GPU Workloads using Nsight Graphics: GPU Trace and the Peak-Performance-Percentage (P3) Method (Presented by NVIDIA)](https://www.gdcvault.com/browse/gdc-19/play/1026202)
    - 2020 - [Optimizing Compute Shaders for L2 Locality using Thread-Group ID Swizzling](https://developer.nvidia.com/blog/optimizing-compute-shaders-for-l2-locality-using-thread-group-id-swizzling/)
    - 2021 - Dana Elifaz - [The Next Level of Optimization Advice with Nsight Graphics: GPU Trace](https://www.nvidia.com/en-us/on-demand/session/gtcspring21-e32683/)
    - 2022 - [(GDC Paywall) Optimizing Ray Tracing GPU Workloads using Nsight Graphics: GPU Trace and Nsight Systems](https://www.gdcvault.com/search.php#&category=free&firstfocus=&keyword=Optimizing+Ray%2BTracing%2BGPU%2BWorkloads%2Busing%2BNsight%2BGraphics)
    - [D3D11 Vendor Hacks](https://docs.google.com/spreadsheets/d/1J_HIRVlYK8iI4u6AJrCeb66L5W36UDkd9ExSCku9s_o/edit#gid=0)
    - Rys Sommefeldt [@ryszu](https://twitter.com/ryszu)
    - [Blog](https://rys.sommefeldt.com/)
    @@ -111,20 +114,20 @@
    - 2017 - [Deep Dive: Asynchronous Compute](http://32ipi028l5q82yhj72224m8j.wpengine.netdna-cdn.com/wp-content/uploads/2017/03/GDC2017-Asynchronous-Compute-Deep-Dive.pdf)
    - 2019 - [DirectX 12 Optimization Techniques in Capcom’s RE ENGINE](https://gpuopen.com/gdc-presentations/2019/gdc-2019-s4-optimization-techniques-re2-dmc5.pdf)
    - 2019 - [A BLEND OF GCN OPTIMIZATION AND COLOR PROCESSING](https://gpuopen.com/gdc-presentations/2019/gdc-2019-s5-blend-of-gcn-optimization-and-color-processing.pdf)
    - 2019 - [AMD GPU Performance Revealed](https://gpuopen.com/gdc-presentations/2019/gdc-2019-s6-gpu-performance-revealed.pdf)
    - (JP) CEDEC
    - 2016 - [GPU最適化入門](https://www.slideshare.net/ssuser2e676d/gpu-65502505)
    - (Book) [マンガとイラストでわかる! GPU最適化入門](https://www.amazon.co.jp/%E3%83%9E%E3%83%B3%E3%82%AC%E3%81%A8%E3%82%A4%E3%83%A9%E3%82%B9%E3%83%88%E3%81%A7%E3%82%8F%E3%81%8B%E3%82%8B-GPU%E6%9C%80%E9%81%A9%E5%8C%96%E5%85%A5%E9%96%80-%E5%B0%8F%E5%8F%A3-%E8%B2%B4%E5%BC%98/dp/4862465048)
    - Siggraph
    - 2020 - [LOW-LEVEL OPTIMIZATIONS IN THE LAST OF US PART II](https://www.naughtydog.com/blog/naughty_dog_at_siggraph_2020)
    - AMD
    - [GPU Open](https://gpuopen.com/)
    - [GPU Open](https://gpuopen.com/) and Talks
    - [Events Presentations](https://gpuopen.com/events/)
    - [AMD GPU ISA documentation (GCN,Vega,CDNA,RDNA,RDNA2)](https://gpuopen.com/amd-isa-documentation/)
    - 2016 - [Leveraging asynchronous queues for concurrent execution](https://gpuopen.com/concurrent-execution-asynchronous-queues/)
    - 2016 - [AMD GCN Assembly: Cross-Lane Operations](https://gpuopen.com/learn/amd-gcn-assembly-cross-lane-operations/)
    - 2018 - [Optimize your engine using compute @ 4C Prague 2018](https://gpuopen.com/wp-content/uploads/2018/11/4C-Prague-Compute-Shaders.pptx) | [(Youtube)](https://www.youtube.com/watch?v=0DLOJPSxJEg)
    - 2018 - [Optimization with Radeon GPU Profiler - A Vulkan Case Study](https://gpuopen.com/wp-content/uploads/2018/01/Optimization-with-Radeon-GPU-Profiler.pptx)
    - 2019 - [AMD GPU Performance Revealed](https://gpuopen.com/gdc-presentations/2019/gdc-2019-s6-gpu-performance-revealed.pdf)
    - 2019 - [Triangles Are Precious](https://gpuopen.com/presentations/2019/nordic-game-2019-triangles-are-precious.pdf)
    - 2020 - [Let’s build](https://gpuopen.com/lets-build/)
    - AMD Ryzen™ Processor Software Optimization
    @@ -134,7 +137,6 @@
    - Curing Amnesia and Other GPU Maladies With AMD Developer Tools
    - Radeon™ ProRender Full Spectrum Rendering 2.0: The Universal Rendering API
    - 2020 - [CONCURRENCY MODEL IN EXPLICIT GRAPHICS APIS](https://gpuopen.com/wp-content/uploads/2020/06/GPUOpen_Concurrency_vTUM.pdf)
    - 2020 - [All the Pipelines - Journey through the GPU](https://gpuopen.com/videos/graphics-pipeline/)
    - GCN
    - 2013 - [GCN3 Instruction Set Architecture](http://developer.amd.com/wordpress/media/2013/12/AMD_GCN3_Instruction_Set_Architecture_rev1.1.pdf)
    - 2019 - [AMD GCN ISA: a first dive](https://giordi91.github.io/post/vegaisa/)
    @@ -152,14 +154,19 @@
    - [AMD Open Source Driver for Vulkan](https://github.com/GPUOpen-Drivers/AMDVLK)
    - OpenCL
    - [AMD Accelerated Parallel Processing OpenCL Programming Guide](http://developer.amd.com/wordpress/media/2013/07/AMD_Accelerated_Parallel_Processing_OpenCL_Programming_Guide-rev-2.7.pdf)
    - RADEON GPU ANALYZER
    - Radeon GPU Analyzer / Radeon Raytracing Analyzer
    - [USING RADEON™ GPU ANALYZER WITH DIRECTX®12 GRAPHICS](https://gpuopen.com/learn/radeon-gpu-analyzer-2-3-direct3d-12-graphics/)
    - [USING RADEON™ GPU ANALYZER WITH DIRECT3D®12 COMPUTE](https://gpuopen.com/learn/radeon-gpu-analyzer-2-2-direct3d12-compute/)
    - [Improving raytracing performance with the Radeon™ Raytracing Analyzer (RRA)](https://gpuopen.com/learn/improving-rt-perf-with-rra/)
    - Nvidia
    - [Developer Blog](https://developer.nvidia.com/blog)
    - [Developer Blog](https://developer.nvidia.com/blog) and Talks
    - 2012 - [GPU Performance Analysis and Optimization](http://on-demand.gputechconf.com/gtc/2012/presentations/S0514-GTC2012-GPU-Performance-Analysis.pdf)
    - 2015 - [Constant Buffers without Constant Pain](https://developer.nvidia.com/content/constant-buffers-without-constant-pain-0)
    - 2016 - [Reading Between The Threads: Shader Intrinsics](https://developer.nvidia.com/reading-between-threads-shader-intrinsics)
    - 2016 - [DX12 Do's And Don'ts](https://developer.nvidia.com/dx12-dos-and-donts)
    - 2019 - [Tips and Tricks: Ray Tracing Best Practices](https://developer.nvidia.com/blog/rtx-best-practices/)
    - 2020 - [Optimizing Graphics Applications using Nsight Systems and Nsight Graphics](https://developer.nvidia.com/siggraph/2020/video/sigg01)
    - 2020 - [RTX Ray Tracing Best Practices](https://www.gdcvault.com/play/1026721/RTX-Ray-Tracing-Best-Practices)
    - 2021 - [Advanced API Performance](https://developer.nvidia.com/blog/tag/advanced-api-performance/)
    - 2022 - [Best Practices for Using NVIDIA RTX Ray Tracing (Updated)](https://developer.nvidia.com/blog/best-practices-for-using-nvidia-rtx-ray-tracing-updated/)
    - 2023 - [Practical Tips for Optimizing Ray Tracing](https://developer.nvidia.com/blog/practical-tips-for-optimizing-ray-tracing/)
    @@ -175,25 +182,21 @@
    - 2022 - [SHADER EXECUTION REORDERING](https://developer.nvidia.com/sites/default/files/akamai/gameworks/ser-whitepaper.pdf)
    - CUDA
    - 2014 - [CUDA Pro Tip: Optimized Filtering with Warp-Aggregated Atomics](https://devblogs.nvidia.com/cuda-pro-tip-optimized-filtering-warp-aggregated-atomics/)
    - Talks
    - 2012 - [GPU Performance Analysis and Optimization](http://on-demand.gputechconf.com/gtc/2012/presentations/S0514-GTC2012-GPU-Performance-Analysis.pdf)
    - 2020 - [Optimizing Graphics Applications using Nsight Systems and Nsight Graphics](https://developer.nvidia.com/siggraph/2020/video/sigg01)
    - 2021 - [The Next Level of Optimization Advice with Nsight Graphics: GPU Trace](https://www.nvidia.com/en-us/on-demand/session/gtcspring21-e32683/)
    - 2022 - [(GDC Paywall) Optimizing Ray Tracing GPU Workloads using Nsight Graphics: GPU Trace and Nsight Systems](https://schedule.gdconf.com/session/optimizing-ray-tracing-gpu-workloads-using-nsight-graphics-gpu-trace-and-nsight-systems-presented-by-nvidia/886315)
    - Intel
    - [Gamedev](https://software.intel.com/en-us/gamedev)
    - [Intel® Processor Graphics: Architecture & Programming](https://doc.lagout.org/electronics/Intel-Graphics-Architecture-ISA-and-microarchitecture.pdf)
    - Microsoft
    - [DirectX-Specs](https://microsoft.github.io/DirectX-Specs/)
    - 2019 - [New in D3D12 – background shader optimizations](https://devblogs.microsoft.com/directx/background-shader-optimizations/)
    - Khronos Group
    - [Vulkan Samples](https://github.com/KhronosGroup/Vulkan-Samples)
    - 2019 - [Optimising a AAA Vulkan Title on Desktop](https://www.youtube.com/watch?v=Hc_i6X3qU08)
    - 2020 - [Vulkan Ray Tracing Best Practices for Hybrid Rendering](https://www.khronos.org/blog/vulkan-ray-tracing-best-practices-for-hybrid-rendering)
    - Arm
    - [Mali GPU Best Practices](https://developer.arm.com/solutions/graphics/developer-guides/mali-gpu-best-practices)
    - [Best Practices for Mobile Game Art Assets](https://developer.arm.com/solutions/graphics/developer-guides/best-practices-for-mobile-game-art-assets-1)
    - [Introducing the Arm architecture](https://developer.arm.com/documentation/102404/0201)
    - [Arm GPU Best Practices Developer Guide](https://developer.arm.com/documentation/101897/0301)
    - [Principles of High Performance](https://developer.arm.com/solutions/graphics/developer-guides/principles-of-high-performance)
    - [Accelerating 2D Applications](https://developer.arm.com/solutions/graphics/developer-guides/accelerating-2d-applications)
    - [Arm Vulkan Guides](https://developer.arm.com/solutions/graphics/apis/vulkan)
    - Khronos Group
    - [Vulkan Samples](https://github.com/KhronosGroup/Vulkan-Samples)
    - CMU
    - 2017 - [Parallel Computer Architecture and Programming](http://15418.courses.cs.cmu.edu/tsinghua2017/home)
    - Misc
  20. @silvesthu silvesthu revised this gist Sep 2, 2023. 1 changed file with 3 additions and 2 deletions.
    5 changes: 3 additions & 2 deletions GPUOptimizationForGameDev.md
    @@ -112,8 +112,9 @@
    - 2019 - [DirectX 12 Optimization Techniques in Capcom’s RE ENGINE](https://gpuopen.com/gdc-presentations/2019/gdc-2019-s4-optimization-techniques-re2-dmc5.pdf)
    - 2019 - [A BLEND OF GCN OPTIMIZATION AND COLOR PROCESSING](https://gpuopen.com/gdc-presentations/2019/gdc-2019-s5-blend-of-gcn-optimization-and-color-processing.pdf)
    - 2019 - [AMD GPU Performance Revealed](https://gpuopen.com/gdc-presentations/2019/gdc-2019-s6-gpu-performance-revealed.pdf)
    - [JP] CEDEC
    - 2016 - [GPU最適化入門](https://www.slideshare.net/ssuser2e676d/gpu-65502505) (Introduction to GPU Optimization)
    - (JP) CEDEC
    - 2016 - [GPU最適化入門](https://www.slideshare.net/ssuser2e676d/gpu-65502505) (Introduction to GPU Optimization)
    - (Book) [マンガとイラストでわかる! GPU最適化入門](https://www.amazon.co.jp/%E3%83%9E%E3%83%B3%E3%82%AC%E3%81%A8%E3%82%A4%E3%83%A9%E3%82%B9%E3%83%88%E3%81%A7%E3%82%8F%E3%81%8B%E3%82%8B-GPU%E6%9C%80%E9%81%A9%E5%8C%96%E5%85%A5%E9%96%80-%E5%B0%8F%E5%8F%A3-%E8%B2%B4%E5%BC%98/dp/4862465048) (Understand It Through Manga and Illustrations! Introduction to GPU Optimization)
    - SIGGRAPH
    - 2020 - [LOW-LEVEL OPTIMIZATIONS IN THE LAST OF US PART II](https://www.naughtydog.com/blog/naughty_dog_at_siggraph_2020)
    - AMD
  21. @silvesthu silvesthu revised this gist Sep 2, 2023. 1 changed file with 9 additions and 9 deletions.
    18 changes: 9 additions & 9 deletions GPUOptimizationForGameDev.md
    @@ -1,5 +1,14 @@
    # GPU Optimization for GameDev

    ### Graphics Pipeline / GPU Architecture Overview
    - 2011 - [A trip through the Graphics Pipeline 2011](https://fgiesen.wordpress.com/2011/07/09/a-trip-through-the-graphics-pipeline-2011-index/)
    - 2015 - [Life of a triangle - NVIDIA's logical pipeline](https://developer.nvidia.com/content/life-triangle-nvidias-logical-pipeline)
    - 2015 - [Render Hell 2.0](https://simonschreibt.de/gat/renderhell/)
    - 2016 - [How bad are small triangles on GPU and why?](http://www.g-truc.net/post-0662.html)
    - 2017 - [GPU Performance for Game Artists](http://fragmentbuffer.com/gpu-performance-for-game-artists/)
    - 2019 - [Understanding the anatomy of GPUs using Pokémon](https://www.ovh.com/blog/understanding-the-anatomy-of-gpus-using-pokemon/)
    - 2020 - [GPU ARCHITECTURE RESOURCES](https://interplayoflight.wordpress.com/2020/05/09/gpu-architecture-resources/)

    ### By Author

    - Emil Persson [@_Humus_](https://twitter.com/_Humus_)
    @@ -192,15 +201,6 @@
    - 2019 - [Unity GPU culling experiments](https://www.mpc-rnd.com/unity-gpu-culling-experiments/)
    - 2019 - [What's up with my branch on GPU?](https://aschrein.github.io/jekyll/update/2019/06/13/whatsup-with-my-branches-on-gpu.html)

    ### Graphics Pipeline / GPU Architecture Overview
    - 2011 - [A trip through the Graphics Pipeline 2011](https://fgiesen.wordpress.com/2011/07/09/a-trip-through-the-graphics-pipeline-2011-index/)
    - 2015 - [Life of a triangle - NVIDIA's logical pipeline](https://developer.nvidia.com/content/life-triangle-nvidias-logical-pipeline)
    - 2015 - [Render Hell 2.0](https://simonschreibt.de/gat/renderhell/)
    - 2016 - [How bad are small triangles on GPU and why?](http://www.g-truc.net/post-0662.html)
    - 2017 - [GPU Performance for Game Artists](http://fragmentbuffer.com/gpu-performance-for-game-artists/)
    - 2019 - [Understanding the anatomy of GPUs using Pokémon](https://www.ovh.com/blog/understanding-the-anatomy-of-gpus-using-pokemon/)
    - 2020 - [GPU ARCHITECTURE RESOURCES](https://interplayoflight.wordpress.com/2020/05/09/gpu-architecture-resources/)

    ### Game Graphics Study
    - [Graphics Studies Compilation](http://www.adriancourreges.com/blog/2020/12/29/graphics-studies-compilation/)

  22. @silvesthu silvesthu revised this gist Sep 2, 2023. 1 changed file with 17 additions and 5 deletions.
    22 changes: 17 additions & 5 deletions GPUOptimizationForGameDev.md
    @@ -92,7 +92,7 @@
    ### By Organization

    - GDC
    - Search "Advanced Graphics" in [GDC Vault](https://gdcvault.com/) or in [GDC VAULT EXPLORER](https://yankooliveira.com/gdcvault/)
    - Search "Advanced Graphics" in [GDC Vault](https://gdcvault.com/)
    - 2014 - [Vertex Shader Tricks](https://www.slideshare.net/DevCentralAMD/vertex-shader-tricks-bill-bilodeau)
    - 2016 - [Optimizing the Graphics Pipeline With Compute](https://www.gdcvault.com/play/1023109/Optimizing-the-Graphics-Pipeline-With)
    - 2016 - [High-Performance, Low-Overhead Rendering with OpenGL and Vulkan](https://www.gdcvault.com/play/1023516/High-performance-Low-Overhead-Rendering)
    @@ -137,6 +137,9 @@
    - 2020 - ["RDNA 2" Instruction Set Architecture](https://developer.amd.com/wp-content/resources/RDNA2_Shader_ISA_November2020.pdf)
    - 2020 - [RDNA2 Performance Guide](https://gpuopen.com/performance/)
    - 2022 - ["RDNA3" Instruction Set Architecture](https://developer.amd.com/wp-content/resources/RDNA3_Shader_ISA_December2022.pdf)
    - Driver
    - [Platform Abstraction Library (PAL)](https://github.com/GPUOpen-Drivers/pal)
    - [AMD Open Source Driver for Vulkan](https://github.com/GPUOpen-Drivers/AMDVLK)
    - OpenCL
    - [AMD Accelerated Parallel Processing OpenCL Programming Guide](http://developer.amd.com/wordpress/media/2013/07/AMD_Accelerated_Parallel_Processing_OpenCL_Programming_Guide-rev-2.7.pdf)
    - RADEON GPU ANALYZER
    @@ -208,9 +211,18 @@

    ### Tools
    - [Shader Playground](http://shader-playground.timjones.io/)
    - [NVIDIA Nsight Graphics](https://developer.nvidia.com/nsight-graphics)
    - [Radeon GPU Analyzer](https://gpuopen.com/rga/)
    - [Intel Graphics Performance Analyzers](https://software.intel.com/content/www/us/en/develop/tools/graphics-performance-analyzers.html)
    - [GPUView](https://docs.microsoft.com/en-us/windows-hardware/drivers/display/using-gpuview)
    - Microsoft
    - [GPUView](https://docs.microsoft.com/en-us/windows-hardware/drivers/display/using-gpuview)
    - [PIX](https://devblogs.microsoft.com/pix/introduction/)
    - Nvidia - [NVIDIA Developer Tools](https://developer.nvidia.com/tools-overview)
    - [NVIDIA Nsight Graphics](https://developer.nvidia.com/nsight-graphics)
    - [NVIDIA Nsight Aftermath SDK](https://developer.nvidia.com/nsight-aftermath)
    - AMD - [Radeon Developer Tool Suite](https://gpuopen.com/introducing-radeon-developer-tool-suite/)
    - [Radeon GPU Analyzer](https://gpuopen.com/rga/)
    - [Radeon Raytracing Analyzer](https://gpuopen.com/radeon-raytracing-analyzer/)
    - [Radeon Memory Visualizer](https://gpuopen.com/rmv/)
    - [Radeon GPU Detective](https://gpuopen.com/radeon-gpu-detective/)
    - Intel
    - [Intel Graphics Performance Analyzers](https://software.intel.com/content/www/us/en/develop/tools/graphics-performance-analyzers.html)

    Thanks JoseEmilio-ARM for ARM part.
  23. @silvesthu silvesthu revised this gist Apr 23, 2023. 1 changed file with 105 additions and 107 deletions.
    212 changes: 105 additions & 107 deletions GPUOptimizationForGameDev.md
    @@ -4,174 +4,175 @@

    - Emil Persson [@_Humus_](https://twitter.com/_Humus_)
    - [Blog](http://www.humus.name/)
    - <2013> [Low-Level Thinking in High-Level Shading Languages](https://www.gdcvault.com/play/1018182/Low-Level-Thinking-in-High)
    - <2014> [Low-Level Shader Optimization for Next-Gen and DX11](http://www.humus.name/Articles/Persson_LowlevelShaderOptimization.pptx)
    - <2018> [Rule of optimization](https://twitter.com/_Humus_/status/1011964081069330432)
    - 2013 - [Low-Level Thinking in High-Level Shading Languages](https://www.gdcvault.com/play/1018182/Low-Level-Thinking-in-High)
    - 2014 - [Low-Level Shader Optimization for Next-Gen and DX11](http://www.humus.name/Articles/Persson_LowlevelShaderOptimization.pptx)
    - 2018 - [Rule of optimization](https://twitter.com/_Humus_/status/1011964081069330432)
    - Matt Pettineo [@mynameismjp](https://twitter.com/mynameismjp)
    - [Blog](https://therealmjp.github.io/)
    - <2018> [Breaking Down Barriers](https://therealmjp.github.io/posts/breaking-down-barriers-part-1-whats-a-barrier/)
    - <2021> [The Shader Permutation Problem](https://therealmjp.github.io/posts/shader-permutations-part1/)
    - 2018 - [Breaking Down Barriers](https://therealmjp.github.io/posts/breaking-down-barriers-part-1-whats-a-barrier/)
    - 2021 - [The Shader Permutation Problem](https://therealmjp.github.io/posts/shader-permutations-part1/)
    - Louis Bavoil [@louisbavoil](https://twitter.com/louisbavoil)
    - <2018> [The Peak-Performance-Percentage Analysis Method for Optimizing Any GPU Workload](https://devblogs.nvidia.com/the-peak-performance-analysis-method-for-optimizing-any-gpu-workload/)
    - <2018> [Fixing the Hyperdrive: Maximizing Rendering Performance on NVIDIA GPUs (Presented by NVIDIA)](https://www.gdcvault.com/play/1024810/Fixing-the-Hyperdrive-Maximizing-Rendering)
    - <2019> [Optimizing DX12/DXR GPU Workloads using Nsight Graphics: GPU Trace and the Peak-Performance-Percentage (P3) Method (Presented by NVIDIA)](https://www.gdcvault.com/browse/gdc-19/play/1026202)
    - <2020> [Optimizing Compute Shaders for L2 Locality using Thread-Group ID Swizzling](https://developer.nvidia.com/blog/optimizing-compute-shaders-for-l2-locality-using-thread-group-id-swizzling/)
    - 2018 - [The Peak-Performance-Percentage Analysis Method for Optimizing Any GPU Workload](https://devblogs.nvidia.com/the-peak-performance-analysis-method-for-optimizing-any-gpu-workload/)
    - 2018 - [Fixing the Hyperdrive: Maximizing Rendering Performance on NVIDIA GPUs (Presented by NVIDIA)](https://www.gdcvault.com/play/1024810/Fixing-the-Hyperdrive-Maximizing-Rendering)
    - 2019 - [Optimizing DX12/DXR GPU Workloads using Nsight Graphics: GPU Trace and the Peak-Performance-Percentage (P3) Method (Presented by NVIDIA)](https://www.gdcvault.com/browse/gdc-19/play/1026202)
    - 2020 - [Optimizing Compute Shaders for L2 Locality using Thread-Group ID Swizzling](https://developer.nvidia.com/blog/optimizing-compute-shaders-for-l2-locality-using-thread-group-id-swizzling/)
    - [D3D11 Vendor Hacks](https://docs.google.com/spreadsheets/d/1J_HIRVlYK8iI4u6AJrCeb66L5W36UDkd9ExSCku9s_o/edit#gid=0)
    - Rys Sommefeldt [@ryszu](https://twitter.com/ryszu)
    - [Blog](https://rys.sommefeldt.com/)
    - <2018> [Understanding GPU context rolls](https://gpuopen.com/understanding-gpu-context-rolls/)
    - 2018 - [Understanding GPU context rolls](https://gpuopen.com/understanding-gpu-context-rolls/)
    - Michal Drobot [@michaldrobot](https://twitter.com/michaldrobot)
    - [Blog](https://michaldrobot.com/)
    - <2014> [Low Level Optimizations for GCN – Digital Dragons 2014](https://michaldrobot.com/2014/05/12/low-level-optimizations-for-gcn-digital-dragons-2014-slides/)
    - <2014> [GCN Execution Patterns in Full Screen Passes](https://michaldrobot.com/2014/04/01/gcn-execution-patterns-in-full-screen-passes/)
    - <2014> [ShaderFastLibs](https://github.com/michaldrobot/ShaderFastLibs)
    - 2014 - [Low Level Optimizations for GCN – Digital Dragons 2014](https://michaldrobot.com/2014/05/12/low-level-optimizations-for-gcn-digital-dragons-2014-slides/)
    - 2014 - [GCN Execution Patterns in Full Screen Passes](https://michaldrobot.com/2014/04/01/gcn-execution-patterns-in-full-screen-passes/)
    - 2014 - [ShaderFastLibs](https://github.com/michaldrobot/ShaderFastLibs)
    - Kostas Anagnostou [@KostasAAA](https://twitter.com/KostasAAA)
    - [Blog](https://interplayoflight.wordpress.com/)
    - <2018> [DD2018: Kostas Anagnostou - Experiments in GPU occlusion culling](https://www.youtube.com/watch?v=U20dIA3SLTs)
    - <2020> [GPU ARCHITECTURE RESOURCES](https://interplayoflight.wordpress.com/2020/05/09/gpu-architecture-resources/)
    - <2020> [GPU ARCHITECTURE RESOURCES (twitter thread)](https://twitter.com/KostasAAA/status/1259153226043179011)
    - <2020> [WHAT IS SHADER OCCUPANCY AND WHY DO WE CARE ABOUT IT?](https://interplayoflight.wordpress.com/2020/11/11/what-is-shader-occupancy-and-why-do-we-care-about-it/)
    - <2020> [TO Z-PREPASS OR NOT TO Z-PREPASS](https://interplayoflight.wordpress.com/2020/12/21/to-z-prepass-or-not-to-z-prepass/)
    - <2022> [SHADER TIPS AND TRICKS](https://interplayoflight.wordpress.com/2022/01/22/shader-tips-and-tricks/)
    - 2018 - [DD2018: Kostas Anagnostou - Experiments in GPU occlusion culling](https://www.youtube.com/watch?v=U20dIA3SLTs)
    - 2020 - [GPU ARCHITECTURE RESOURCES](https://interplayoflight.wordpress.com/2020/05/09/gpu-architecture-resources/)
    - 2020 - [GPU ARCHITECTURE RESOURCES (twitter thread)](https://twitter.com/KostasAAA/status/1259153226043179011)
    - 2020 - [WHAT IS SHADER OCCUPANCY AND WHY DO WE CARE ABOUT IT?](https://interplayoflight.wordpress.com/2020/11/11/what-is-shader-occupancy-and-why-do-we-care-about-it/)
    - 2020 - [TO Z-PREPASS OR NOT TO Z-PREPASS](https://interplayoflight.wordpress.com/2020/12/21/to-z-prepass-or-not-to-z-prepass/)
    - 2022 - [SHADER TIPS AND TRICKS](https://interplayoflight.wordpress.com/2022/01/22/shader-tips-and-tricks/)
    - Matthäus G. Chajdas [@NIV_Anteru](https://twitter.com/niv_anteru)
    - [Blog](https://anteru.net/)
    - <2018> [Introduction to compute shaders](https://anteru.net/blog/2018/intro-to-compute-shaders/)
    - <2018> [More compute shaders](https://anteru.net/blog/2018/more-compute-shaders/)
    - <2018> [Even more compute shaders](https://anteru.net/blog/2018/even-more-compute-shaders/)
    - 2018 - [Introduction to compute shaders](https://anteru.net/blog/2018/intro-to-compute-shaders/)
    - 2018 - [More compute shaders](https://anteru.net/blog/2018/more-compute-shaders/)
    - 2018 - [Even more compute shaders](https://anteru.net/blog/2018/even-more-compute-shaders/)
    - [GPU database](https://db.thegpu.guru/)
    - Matthijs De Smedt [@anji_nl](https://twitter.com/anji_nl)
    - <2016> [PC GPU Performance Hot Spots](https://developer.nvidia.com/pc-gpu-performance-hot-spots)
    - 2016 - [PC GPU Performance Hot Spots](https://developer.nvidia.com/pc-gpu-performance-hot-spots)
    - Maurizio Cerrato [@speedwago](https://twitter.com/speedwago)
    - <2019> [GPU Architectures](https://drive.google.com/file/d/12ahbqGXNfY3V-1Gj5cvne2AH4BFWZHGD/view)
    - 2019 - [GPU Architectures](https://drive.google.com/file/d/12ahbqGXNfY3V-1Gj5cvne2AH4BFWZHGD/view)
    - Sebastian Aaltonen [@SebAaltonen](https://twitter.com/SebAaltonen)
    - [Blog](https://www.secondorder.com/)
    - <2017> [Optimizing GPU occupancy and resource usage with large thread groups](https://gpuopen.com/learn/optimizing-gpu-occupancy-resource-usage-large-thread-groups/)
    - <2018> [DD2018: Sebastian Aaltonen - GPU based clay simulation and ray tracing tech in Claybook](https://www.youtube.com/watch?v=Xpf7Ua3UqOA)
    - <2018> [This is how I managed to port Claybook from consoles to ~4x slower handheld](https://threadreaderapp.com/thread/1076765876148490240.html)
    - 2017 - [Optimizing GPU occupancy and resource usage with large thread groups](https://gpuopen.com/learn/optimizing-gpu-occupancy-resource-usage-large-thread-groups/)
    - 2018 - [DD2018: Sebastian Aaltonen - GPU based clay simulation and ray tracing tech in Claybook](https://www.youtube.com/watch?v=Xpf7Ua3UqOA)
    - 2018 - [This is how I managed to port Claybook from consoles to ~4x slower handheld](https://threadreaderapp.com/thread/1076765876148490240.html)
    - Layla Mah [@MissQuickstep](https://twitter.com/missquickstep)
    - <2013> [The AMD GCN Architecture - A Crash Course](https://www.slideshare.net/DevCentralAMD/gs4106-the-amd-gcn-architecture-a-crash-course-by-layla-mah)
    - <2013> [Powering the Next Generation of Graphics: The AMD GCN Architecture](https://www.gdcvault.com/play/1019294/Powering-the-Next-Generation-of)
    - 2013 - [The AMD GCN Architecture - A Crash Course](https://www.slideshare.net/DevCentralAMD/gs4106-the-amd-gcn-architecture-a-crash-course-by-layla-mah)
    - 2013 - [Powering the Next Generation of Graphics: The AMD GCN Architecture](https://www.gdcvault.com/play/1019294/Powering-the-Next-Generation-of)
    - Sven Andersson [@andsve](https://twitter.com/andsve)
    - [Blog](http://svenandersson.se/)
    - <2014> [Real-time Rendering Blogs](http://svenandersson.se/2014/realtime-rendering-blogs.html)
    - 2014 - [Real-time Rendering Blogs](http://svenandersson.se/2014/realtime-rendering-blogs.html)
    - Fabian Giesen [@rygorous](https://twitter.com/rygorous)
    - [Blog](https://fgiesen.wordpress.com/)
    - <2011> [A trip through the Graphics Pipeline 2011](https://fgiesen.wordpress.com/2011/07/09/a-trip-through-the-graphics-pipeline-2011-index/)
    - 2011 - [A trip through the Graphics Pipeline 2011](https://fgiesen.wordpress.com/2011/07/09/a-trip-through-the-graphics-pipeline-2011-index/)
    - Timothy Lottes
    - <2016> [Understanding Memory Coalescing on GCN](https://gpuopen.com/gcn-memory-coalescing/)
    - <2017> [ADVANCED SHADER PROGRAMMING ON GCN](http://32ipi028l5q82yhj72224m8j.wpengine.netdna-cdn.com/wp-content/uploads/2017/03/GDC2017-Advanced-Shader-Programming-On-GCN.pdf)
    - <2018> [Engine Optimization Hot Lap](https://32ipi028l5q82yhj72224m8j-wpengine.netdna-ssl.com/wp-content/uploads/2018/05/gdc_2018_sponsored_engine_optimization_hot_lap.pptx)
    - 2016 - [Understanding Memory Coalescing on GCN](https://gpuopen.com/gcn-memory-coalescing/)
    - 2017 - [ADVANCED SHADER PROGRAMMING ON GCN](http://32ipi028l5q82yhj72224m8j.wpengine.netdna-cdn.com/wp-content/uploads/2017/03/GDC2017-Advanced-Shader-Programming-On-GCN.pdf)
    - 2018 - [Engine Optimization Hot Lap](https://32ipi028l5q82yhj72224m8j-wpengine.netdna-ssl.com/wp-content/uploads/2018/05/gdc_2018_sponsored_engine_optimization_hot_lap.pptx)
    - Robert Menzel [@renderpipeline](https://twitter.com/renderpipeline)
    - [Blog](http://renderingpipeline.com)
    - <2012> [Low-Level GPU Documentation](http://renderingpipeline.com/graphics-literature/low-level-gpu-documentation/)
    - 2012 - [Low-Level GPU Documentation](http://renderingpipeline.com/graphics-literature/low-level-gpu-documentation/)
    - Stephanie Hurlburt [@sehurlburt](http://stephaniehurlburt.com/blog)
    - <2016> [Casual Introduction to Low-Level Graphics Programming](http://stephaniehurlburt.com/blog/2016/10/28/casual-introduction-to-low-level-graphics-programming)
    - 2016 - [Casual Introduction to Low-Level Graphics Programming](http://stephaniehurlburt.com/blog/2016/10/28/casual-introduction-to-low-level-graphics-programming)
    - RasterGrid [@rastergrid](https://twitter.com/rastergrid)
    - [Blog](https://rastergrid.com/blog/)
    - <2021> [Understanding GPU caches](https://rastergrid.com/blog/gpu-tech/2021/01/understanding-gpu-caches/)
    - 2021 - [Understanding GPU caches](https://rastergrid.com/blog/gpu-tech/2021/01/understanding-gpu-caches/)
    - Adam Sawicki [@Reg__](https://twitter.com/Reg__)
    - [Blog](https://asawicki.info/)
    - <2020> [A Better Way to Scalarize a Shader](https://asawicki.info/news_1735_a_better_way_to_scalarize_a_shader)
    - <2021> [Efficient Use of GPU Memory in Modern Games](https://www.youtube.com/watch?v=ML0YC77bSOc)
    - 2020 - [A Better Way to Scalarize a Shader](https://asawicki.info/news_1735_a_better_way_to_scalarize_a_shader)
    - 2021 - [Efficient Use of GPU Memory in Modern Games](https://www.youtube.com/watch?v=ML0YC77bSOc)
    - Matías N. Goldberg [@matiasgoldberg](https://twitter.com/matiasgoldberg)
    - [Blog](https://www.yosoygames.com.ar/wp/)
    - <2020> [A little clarification on modern shader compile times](https://www.yosoygames.com.ar/wp/2020/08/a-little-clarification-on-modern-shader-compile-times/#tc-comment-title)
    - <2022> [The road to 16-bit floats GPU is paved with our blood](https://www.yosoygames.com.ar/wp/2022/01/the-road-to-16-bit-floats-gpu-is-paved-with-our-blood/)
    - 2020 - [A little clarification on modern shader compile times](https://www.yosoygames.com.ar/wp/2020/08/a-little-clarification-on-modern-shader-compile-times/#tc-comment-title)
    - 2022 - [The road to 16-bit floats GPU is paved with our blood](https://www.yosoygames.com.ar/wp/2022/01/the-road-to-16-bit-floats-gpu-is-paved-with-our-blood/)
    - Francesco Cifariello Ciardi [@FCifaCiar](https://twitter.com/FCifaCiar)
    - [Blog](https://flashypixels.wordpress.com/)
    - <2018> [INTRO TO GPU SCALARIZATION](https://flashypixels.wordpress.com/2018/11/10/intro-to-gpu-scalarization-part-1/)
    - 2018 - [INTRO TO GPU SCALARIZATION](https://flashypixels.wordpress.com/2018/11/10/intro-to-gpu-scalarization-part-1/)
    - Sébastien Lagarde [@SebLagarde](https://twitter.com/seblagarde)
    - [Blog](https://seblagarde.wordpress.com/)
    - <2014> [Inverse trigonometric functions GPU optimization for AMD GCN architecture](https://seblagarde.wordpress.com/2014/12/01/inverse-trigonometric-functions-gpu-optimization-for-amd-gcn-architecture/)
    - 2014 - [Inverse trigonometric functions GPU optimization for AMD GCN architecture](https://seblagarde.wordpress.com/2014/12/01/inverse-trigonometric-functions-gpu-optimization-for-amd-gcn-architecture/)
    - Bart Wronski [@BartWronsk](https://twitter.com/BartWronsk)
    - [Blog](https://bartwronski.com/)
    - <2014> [GCN – two ways of latency hiding and wave occupancy](https://bartwronski.com/2014/03/27/gcn-two-ways-of-latency-hiding-and-wave-occupancy/)
    - 2014 - [GCN – two ways of latency hiding and wave occupancy](https://bartwronski.com/2014/03/27/gcn-two-ways-of-latency-hiding-and-wave-occupancy/)
    - Elizabeth Baumel [@Icetigris](https://twitter.com/icetigris)
    - <2016> [Tightening Up the Graphics: Tools and Techniques for Debugging and Optimization](https://docs.google.com/presentation/d/1LQUMIld4SGoQVthnhT1scoA3k4Sg0as14G4NeSiSgFU/)
    - 2016 - [Tightening Up the Graphics: Tools and Techniques for Debugging and Optimization](https://docs.google.com/presentation/d/1LQUMIld4SGoQVthnhT1scoA3k4Sg0as14G4NeSiSgFU/)

    ### By Organization

    - GDC
    - Search "Advanced Graphics" in [GDC Vault](https://gdcvault.com/) or in [GDC VAULT EXPLORER](https://yankooliveira.com/gdcvault/)
    - <2014> [Vertex Shader Tricks](https://www.slideshare.net/DevCentralAMD/vertex-shader-tricks-bill-bilodeau)
    - <2016> [Optimizing the Graphics Pipeline With Compute](https://www.gdcvault.com/play/1023109/Optimizing-the-Graphics-Pipeline-With)
    - <2016> [High-Performance, Low-Overhead Rendering with OpenGL and Vulkan](https://www.gdcvault.com/play/1023516/High-performance-Low-Overhead-Rendering)
    - <2016> [Practical DirectX 12](https://developer.nvidia.com/sites/default/files/akamai/gameworks/blog/GDC16/GDC16_gthomas_adunn_Practical_DX12.pdf)
    - <2017> [Wave Programming in D3D12 and Vulkan](http://32ipi028l5q82yhj72224m8j.wpengine.netdna-cdn.com/wp-content/uploads/2017/07/GDC2017-Wave-Programming-D3D12-Vulkan.pdf)
    - <2017> [D3D12 and Vulkan Done Right](http://32ipi028l5q82yhj72224m8j.wpengine.netdna-cdn.com/wp-content/uploads/2017/03/GDC2017-D3D12-And-Vulkan-Done-Right.pdf)
    - <2017> [Deep Dive: Asynchronous Compute](http://32ipi028l5q82yhj72224m8j.wpengine.netdna-cdn.com/wp-content/uploads/2017/03/GDC2017-Asynchronous-Compute-Deep-Dive.pdf)
    - <2019> [DirectX 12 Optimization Techniques in Capcom’s RE ENGINE](https://gpuopen.com/gdc-presentations/2019/gdc-2019-s4-optimization-techniques-re2-dmc5.pdf)
    - <2019> [A BLEND OF GCN OPTIMIZATION AND COLOR PROCESSING](https://gpuopen.com/gdc-presentations/2019/gdc-2019-s5-blend-of-gcn-optimization-and-color-processing.pdf)
    - <2019> [AMD GPU Performance Revealed](https://gpuopen.com/gdc-presentations/2019/gdc-2019-s6-gpu-performance-revealed.pdf)
    - 2014 - [Vertex Shader Tricks](https://www.slideshare.net/DevCentralAMD/vertex-shader-tricks-bill-bilodeau)
    - 2016 - [Optimizing the Graphics Pipeline With Compute](https://www.gdcvault.com/play/1023109/Optimizing-the-Graphics-Pipeline-With)
    - 2016 - [High-Performance, Low-Overhead Rendering with OpenGL and Vulkan](https://www.gdcvault.com/play/1023516/High-performance-Low-Overhead-Rendering)
    - 2016 - [Practical DirectX 12](https://developer.nvidia.com/sites/default/files/akamai/gameworks/blog/GDC16/GDC16_gthomas_adunn_Practical_DX12.pdf)
    - 2017 - [Wave Programming in D3D12 and Vulkan](http://32ipi028l5q82yhj72224m8j.wpengine.netdna-cdn.com/wp-content/uploads/2017/07/GDC2017-Wave-Programming-D3D12-Vulkan.pdf)
    - 2017 - [D3D12 and Vulkan Done Right](http://32ipi028l5q82yhj72224m8j.wpengine.netdna-cdn.com/wp-content/uploads/2017/03/GDC2017-D3D12-And-Vulkan-Done-Right.pdf)
    - 2017 - [Deep Dive: Asynchronous Compute](http://32ipi028l5q82yhj72224m8j.wpengine.netdna-cdn.com/wp-content/uploads/2017/03/GDC2017-Asynchronous-Compute-Deep-Dive.pdf)
    - 2019 - [DirectX 12 Optimization Techniques in Capcom’s RE ENGINE](https://gpuopen.com/gdc-presentations/2019/gdc-2019-s4-optimization-techniques-re2-dmc5.pdf)
    - 2019 - [A BLEND OF GCN OPTIMIZATION AND COLOR PROCESSING](https://gpuopen.com/gdc-presentations/2019/gdc-2019-s5-blend-of-gcn-optimization-and-color-processing.pdf)
    - 2019 - [AMD GPU Performance Revealed](https://gpuopen.com/gdc-presentations/2019/gdc-2019-s6-gpu-performance-revealed.pdf)
    - [JP] CEDEC
    - <2016> [GPU最適化入門](https://www.slideshare.net/ssuser2e676d/gpu-65502505) (Introduction to GPU Optimization)
    - 2016 - [GPU最適化入門](https://www.slideshare.net/ssuser2e676d/gpu-65502505) (Introduction to GPU Optimization)
    - SIGGRAPH
    - <2020> [LOW-LEVEL OPTIMIZATIONS IN THE LAST OF US PART II](https://www.naughtydog.com/blog/naughty_dog_at_siggraph_2020)
    - 2020 - [LOW-LEVEL OPTIMIZATIONS IN THE LAST OF US PART II](https://www.naughtydog.com/blog/naughty_dog_at_siggraph_2020)
    - AMD
    - [GPU Open](https://gpuopen.com/)
    - [Events Presentations](https://gpuopen.com/events/)
    - [AMD GPU ISA documentation (GCN,Vega,CDNA,RDNA,RDNA2)](https://gpuopen.com/amd-isa-documentation/)
    - <2016> [Leveraging asynchronous queues for concurrent execution](https://gpuopen.com/concurrent-execution-asynchronous-queues/)
    - <2016> [AMD GCN Assembly: Cross-Lane Operations](https://gpuopen.com/learn/amd-gcn-assembly-cross-lane-operations/)
    - <2018> [Optimize your engine using compute @ 4C Prague 2018](https://gpuopen.com/wp-content/uploads/2018/11/4C-Prague-Compute-Shaders.pptx) | [(YouTube)](https://www.youtube.com/watch?v=0DLOJPSxJEg)
    - <2018> [Optimization with Radeon GPU Profiler - A Vulkan Case Study](https://gpuopen.com/wp-content/uploads/2018/01/Optimization-with-Radeon-GPU-Profiler.pptx)
    - <2019> [Triangles Are Precious](https://gpuopen.com/presentations/2019/nordic-game-2019-triangles-are-precious.pdf)
    - <2020> [Let’s build](https://gpuopen.com/lets-build/)
    - 2016 - [Leveraging asynchronous queues for concurrent execution](https://gpuopen.com/concurrent-execution-asynchronous-queues/)
    - 2016 - [AMD GCN Assembly: Cross-Lane Operations](https://gpuopen.com/learn/amd-gcn-assembly-cross-lane-operations/)
    - 2018 - [Optimize your engine using compute @ 4C Prague 2018](https://gpuopen.com/wp-content/uploads/2018/11/4C-Prague-Compute-Shaders.pptx) | [(YouTube)](https://www.youtube.com/watch?v=0DLOJPSxJEg)
    - 2018 - [Optimization with Radeon GPU Profiler - A Vulkan Case Study](https://gpuopen.com/wp-content/uploads/2018/01/Optimization-with-Radeon-GPU-Profiler.pptx)
    - 2019 - [Triangles Are Precious](https://gpuopen.com/presentations/2019/nordic-game-2019-triangles-are-precious.pdf)
    - 2020 - [Let’s build](https://gpuopen.com/lets-build/)
    - AMD Ryzen™ Processor Software Optimization
    - Optimizing for the Radeon™ RDNA Architecture
    - From Source to ISA: A Trip Down the Shader Compiler Pipeline
    - A Review of GPUOpen Effects
    - Curing Amnesia and Other GPU Maladies With AMD Developer Tools
    - Radeon™ ProRender Full Spectrum Rendering 2.0: The Universal Rendering API
    - <2020> [CONCURRENCY MODEL IN EXPLICIT GRAPHICS APIS](https://gpuopen.com/wp-content/uploads/2020/06/GPUOpen_Concurrency_vTUM.pdf)
    - <2020> [All the Pipelines - Journey through the GPU](https://gpuopen.com/videos/graphics-pipeline/)
    - 2020 - [CONCURRENCY MODEL IN EXPLICIT GRAPHICS APIS](https://gpuopen.com/wp-content/uploads/2020/06/GPUOpen_Concurrency_vTUM.pdf)
    - 2020 - [All the Pipelines - Journey through the GPU](https://gpuopen.com/videos/graphics-pipeline/)
    - GCN
    - <2013> [GCN3 Instruction Set Architecture](http://developer.amd.com/wordpress/media/2013/12/AMD_GCN3_Instruction_Set_Architecture_rev1.1.pdf)
    - <2019> [AMD GCN ISA: a first dive](https://giordi91.github.io/post/vegaisa/)
    - 2013 - [GCN3 Instruction Set Architecture](http://developer.amd.com/wordpress/media/2013/12/AMD_GCN3_Instruction_Set_Architecture_rev1.1.pdf)
    - 2019 - [AMD GCN ISA: a first dive](https://giordi91.github.io/post/vegaisa/)
    - [GPUOpen-Drivers/pal on GitHub](https://github.com/GPUOpen-Drivers/pal)
    - [AMD-FirePro/SDK on GitHub](https://github.com/AMD-FirePro/SDK/tree/master/documentation)
    - RDNA
    - <2019> [INTRODUCING RDNA ARCHITECTURE](https://www.amd.com/system/files/documents/rdna-whitepaper.pdf)
    - <2019> [RDNA Architecture](https://gpuopen.com/wp-content/uploads/2019/08/RDNA_Architecture_public.pdf)
    - <2020> ["RDNA 1.0" Instruction Set Architecture](https://developer.amd.com/wp-content/resources/RDNA_Shader_ISA.pdf)
    - <2020> ["RDNA 2" Instruction Set Architecture](https://developer.amd.com/wp-content/resources/RDNA2_Shader_ISA_November2020.pdf)
    - <2020> [RDNA2 Performance Guide](https://gpuopen.com/performance/)
    - <2022> ["RDNA3" Instruction Set Architecture](https://developer.amd.com/wp-content/resources/RDNA3_Shader_ISA_December2022.pdf)
    - 2019 - [INTRODUCING RDNA ARCHITECTURE](https://www.amd.com/system/files/documents/rdna-whitepaper.pdf)
    - 2019 - [RDNA Architecture](https://gpuopen.com/wp-content/uploads/2019/08/RDNA_Architecture_public.pdf)
    - 2020 - ["RDNA 1.0" Instruction Set Architecture](https://developer.amd.com/wp-content/resources/RDNA_Shader_ISA.pdf)
    - 2020 - ["RDNA 2" Instruction Set Architecture](https://developer.amd.com/wp-content/resources/RDNA2_Shader_ISA_November2020.pdf)
    - 2020 - [RDNA2 Performance Guide](https://gpuopen.com/performance/)
    - 2022 - ["RDNA3" Instruction Set Architecture](https://developer.amd.com/wp-content/resources/RDNA3_Shader_ISA_December2022.pdf)
    - OpenCL
    - [AMD Accelerated Parallel Processing OpenCL Programming Guide](http://developer.amd.com/wordpress/media/2013/07/AMD_Accelerated_Parallel_Processing_OpenCL_Programming_Guide-rev-2.7.pdf)
    - RADEON GPU ANALYZER
    - [USING RADEON™ GPU ANALYZER WITH DIRECTX®12 GRAPHICS](https://gpuopen.com/learn/radeon-gpu-analyzer-2-3-direct3d-12-graphics/)
    - [USING RADEON™ GPU ANALYZER WITH DIRECT3D®12 COMPUTE](https://gpuopen.com/learn/radeon-gpu-analyzer-2-2-direct3d12-compute/)
    - Nvidia
    - [Developer Blog](https://developer.nvidia.com/blog)
    - <2015> [Constant Buffers without Constant Pain](https://developer.nvidia.com/content/constant-buffers-without-constant-pain-0)
    - <2016> [Reading Between The Threads: Shader Intrinsics](https://developer.nvidia.com/reading-between-threads-shader-intrinsics)
    - <2016> [DX12 Do's And Don'ts](https://developer.nvidia.com/dx12-dos-and-donts)
    - <2021> [Advanced API Performance](https://developer.nvidia.com/blog/tag/advanced-api-performance/)
    - <2022> [Best Practices for Using NVIDIA RTX Ray Tracing (Updated)](https://developer.nvidia.com/blog/best-practices-for-using-nvidia-rtx-ray-tracing-updated/)
    - <2023> [Practical Tips for Optimizing Ray Tracing](https://developer.nvidia.com/blog/practical-tips-for-optimizing-ray-tracing/)
    - 2015 - [Constant Buffers without Constant Pain](https://developer.nvidia.com/content/constant-buffers-without-constant-pain-0)
    - 2016 - [Reading Between The Threads: Shader Intrinsics](https://developer.nvidia.com/reading-between-threads-shader-intrinsics)
    - 2016 - [DX12 Do's And Don'ts](https://developer.nvidia.com/dx12-dos-and-donts)
    - 2021 - [Advanced API Performance](https://developer.nvidia.com/blog/tag/advanced-api-performance/)
    - 2022 - [Best Practices for Using NVIDIA RTX Ray Tracing (Updated)](https://developer.nvidia.com/blog/best-practices-for-using-nvidia-rtx-ray-tracing-updated/)
    - 2023 - [Practical Tips for Optimizing Ray Tracing](https://developer.nvidia.com/blog/practical-tips-for-optimizing-ray-tracing/)
    - Pascal
    - <2016> [NVIDIA GeForce GTX 1080 Whitepaper](http://international.download.nvidia.com/geforce-com/international/pdfs/GeForce_GTX_1080_Whitepaper_FINAL.pdf)
    - 2016 - [NVIDIA GeForce GTX 1080 Whitepaper](http://international.download.nvidia.com/geforce-com/international/pdfs/GeForce_GTX_1080_Whitepaper_FINAL.pdf)
    - Turing
    - <2018> [NVIDIA Turing Architecture In-Depth](https://devblogs.nvidia.com/nvidia-turing-architecture-in-depth/)
    - <2018> [NVIDIA TURING GPU ARCHITECTURE](https://images.nvidia.com/aem-dam/en-zz/Solutions/design-visualization/technologies/turing-architecture/NVIDIA-Turing-Architecture-Whitepaper.pdf)
    - 2018 - [NVIDIA Turing Architecture In-Depth](https://devblogs.nvidia.com/nvidia-turing-architecture-in-depth/)
    - 2018 - [NVIDIA TURING GPU ARCHITECTURE](https://images.nvidia.com/aem-dam/en-zz/Solutions/design-visualization/technologies/turing-architecture/NVIDIA-Turing-Architecture-Whitepaper.pdf)
    - Ampere
    - <2021> [NVIDIA AMPERE GA102 GPU ARCHITECTURE](https://images.nvidia.com/aem-dam/en-zz/Solutions/geforce/ampere/pdf/NVIDIA-ampere-GA102-GPU-Architecture-Whitepaper-V1.pdf)
    - 2021 - [NVIDIA AMPERE GA102 GPU ARCHITECTURE](https://images.nvidia.com/aem-dam/en-zz/Solutions/geforce/ampere/pdf/NVIDIA-ampere-GA102-GPU-Architecture-Whitepaper-V1.pdf)
    - Ada
    - <2022> [NVIDIA ADA GPU ARCHITECTURE](https://images.nvidia.com/aem-dam/Solutions/geforce/ada/nvidia-ada-gpu-architecture.pdf)
    - 2022 - [NVIDIA ADA GPU ARCHITECTURE](https://images.nvidia.com/aem-dam/Solutions/geforce/ada/nvidia-ada-gpu-architecture.pdf)
    - 2022 - [SHADER EXECUTION REORDERING](https://developer.nvidia.com/sites/default/files/akamai/gameworks/ser-whitepaper.pdf)
    - CUDA
    - <2014> [CUDA Pro Tip: Optimized Filtering with Warp-Aggregated Atomics](https://devblogs.nvidia.com/cuda-pro-tip-optimized-filtering-warp-aggregated-atomics/)
    - 2014 - [CUDA Pro Tip: Optimized Filtering with Warp-Aggregated Atomics](https://devblogs.nvidia.com/cuda-pro-tip-optimized-filtering-warp-aggregated-atomics/)
    - Talks
    - <2012> [GPU Performance Analysis and Optimization](http://on-demand.gputechconf.com/gtc/2012/presentations/S0514-GTC2012-GPU-Performance-Analysis.pdf)
    - <2020> [Optimizing Graphics Applications using Nsight Systems and Nsight Graphics](https://developer.nvidia.com/siggraph/2020/video/sigg01)
    - <2021> [The Next Level of Optimization Advice with Nsight Graphics: GPU Trace](https://www.nvidia.com/en-us/on-demand/session/gtcspring21-e32683/)
    - <2022> [(GDC Paywall) Optimizing Ray Tracing GPU Workloads using Nsight Graphics: GPU Trace and Nsight Systems](https://schedule.gdconf.com/session/optimizing-ray-tracing-gpu-workloads-using-nsight-graphics-gpu-trace-and-nsight-systems-presented-by-nvidia/886315)
    - 2012 - [GPU Performance Analysis and Optimization](http://on-demand.gputechconf.com/gtc/2012/presentations/S0514-GTC2012-GPU-Performance-Analysis.pdf)
    - 2020 - [Optimizing Graphics Applications using Nsight Systems and Nsight Graphics](https://developer.nvidia.com/siggraph/2020/video/sigg01)
    - 2021 - [The Next Level of Optimization Advice with Nsight Graphics: GPU Trace](https://www.nvidia.com/en-us/on-demand/session/gtcspring21-e32683/)
    - 2022 - [(GDC Paywall) Optimizing Ray Tracing GPU Workloads using Nsight Graphics: GPU Trace and Nsight Systems](https://schedule.gdconf.com/session/optimizing-ray-tracing-gpu-workloads-using-nsight-graphics-gpu-trace-and-nsight-systems-presented-by-nvidia/886315)
    - Intel
    - [Gamedev](https://software.intel.com/en-us/gamedev)
    - [Intel® Processor Graphics: Architecture & Programming](https://doc.lagout.org/electronics/Intel-Graphics-Architecture-ISA-and-microarchitecture.pdf)
    - Microsoft
    - [DirectX-Specs](https://microsoft.github.io/DirectX-Specs/)
    - <2019> [New in D3D12 – background shader optimizations](https://devblogs.microsoft.com/directx/background-shader-optimizations/)
    - 2019 - [New in D3D12 – background shader optimizations](https://devblogs.microsoft.com/directx/background-shader-optimizations/)
    - Arm
    - [Mali GPU Best Practices](https://developer.arm.com/solutions/graphics/developer-guides/mali-gpu-best-practices)
    - [Best Practices for Mobile Game Art Assets](https://developer.arm.com/solutions/graphics/developer-guides/best-practices-for-mobile-game-art-assets-1)
    @@ -181,32 +182,29 @@
    - Khronos Group
    - [Vulkan Samples](https://github.com/KhronosGroup/Vulkan-Samples)
    - CMU
    - <2017> [Parallel Computer Architecture and Programming](http://15418.courses.cs.cmu.edu/tsinghua2017/home)
    - 2017 - [Parallel Computer Architecture and Programming](http://15418.courses.cs.cmu.edu/tsinghua2017/home)
    - Misc
    - <2009> [From Shader Code to a Teraflop: How Shader Cores Work](https://web.archive.org/web/20181008131455/http://s09.idav.ucdavis.edu/talks/02_kayvonf_gpuArchTalk09.pdf)
    - <2017> [Demystifying Asynchronous Compute](https://www.reddit.com/r/nvidia/comments/50dqd5/demystifying_asynchronous_compute/)
    - <2019> [Unity GPU culling experiments](https://www.mpc-rnd.com/unity-gpu-culling-experiments/)
    - <2019> [What's up with my branch on GPU?](https://aschrein.github.io/jekyll/update/2019/06/13/whatsup-with-my-branches-on-gpu.html)
    - 2009 - [From Shader Code to a Teraflop: How Shader Cores Work](https://web.archive.org/web/20181008131455/http://s09.idav.ucdavis.edu/talks/02_kayvonf_gpuArchTalk09.pdf)
    - 2017 - [Demystifying Asynchronous Compute](https://www.reddit.com/r/nvidia/comments/50dqd5/demystifying_asynchronous_compute/)
    - 2019 - [Unity GPU culling experiments](https://www.mpc-rnd.com/unity-gpu-culling-experiments/)
    - 2019 - [What's up with my branch on GPU?](https://aschrein.github.io/jekyll/update/2019/06/13/whatsup-with-my-branches-on-gpu.html)

    ### Graphics Pipeline / GPU Architecture Overview
    - <2011> [A trip through the Graphics Pipeline 2011](https://fgiesen.wordpress.com/2011/07/09/a-trip-through-the-graphics-pipeline-2011-index/)
    - <2015> [Life of a triangle - NVIDIA's logical pipeline](https://developer.nvidia.com/content/life-triangle-nvidias-logical-pipeline)
    - <2015> [Render Hell 2.0](https://simonschreibt.de/gat/renderhell/)
    - <2016> [How bad are small triangles on GPU and why?](http://www.g-truc.net/post-0662.html)
    - <2017> [GPU Performance for Game Artists](http://fragmentbuffer.com/gpu-performance-for-game-artists/)
    - <2019> [Understanding the anatomy of GPUs using Pokémon](https://www.ovh.com/blog/understanding-the-anatomy-of-gpus-using-pokemon/)
    - <2020> [GPU ARCHITECTURE RESOURCES](https://interplayoflight.wordpress.com/2020/05/09/gpu-architecture-resources/)
    - 2011 - [A trip through the Graphics Pipeline 2011](https://fgiesen.wordpress.com/2011/07/09/a-trip-through-the-graphics-pipeline-2011-index/)
    - 2015 - [Life of a triangle - NVIDIA's logical pipeline](https://developer.nvidia.com/content/life-triangle-nvidias-logical-pipeline)
    - 2015 - [Render Hell 2.0](https://simonschreibt.de/gat/renderhell/)
    - 2016 - [How bad are small triangles on GPU and why?](http://www.g-truc.net/post-0662.html)
    - 2017 - [GPU Performance for Game Artists](http://fragmentbuffer.com/gpu-performance-for-game-artists/)
    - 2019 - [Understanding the anatomy of GPUs using Pokémon](https://www.ovh.com/blog/understanding-the-anatomy-of-gpus-using-pokemon/)
    - 2020 - [GPU ARCHITECTURE RESOURCES](https://interplayoflight.wordpress.com/2020/05/09/gpu-architecture-resources/)

    ### Graphics Study
    ### Game Graphics Study
    - [Graphics Studies Compilation](http://www.adriancourreges.com/blog/2020/12/29/graphics-studies-compilation/)

    ### For Artist
    - [WIP] [Unreal Art Optimization](https://unrealartoptimization.github.io/book/pipelines/pixel/)

    ### Database
    - [PerfTest: GPU shader memory operation performance test tool (with results)](https://github.com/sebbbi/perftest)
    - [GPUInfo](https://www.gpuinfo.org/) for Vulkan, OpenGL, OpenGL ES
    - [JP] [GPU Spec Database by HYPERでんち](https://dench.flatlib.jp/start#hardware)
    - (JP) [GPU Spec Database by HYPERでんち](https://dench.flatlib.jp/start#hardware)

    ### Tools
    - [Shader Playground](http://shader-playground.timjones.io/)
  24. @silvesthu silvesthu revised this gist Jan 17, 2023. 1 changed file with 6 additions and 3 deletions.
    9 changes: 6 additions & 3 deletions GPUOptimizationForGameDev.md
    @@ -133,8 +133,7 @@
    - RDNA
    - <2019> [INTRODUCING RDNA ARCHITECTURE](https://www.amd.com/system/files/documents/rdna-whitepaper.pdf)
    - <2019> [RDNA Architecture](https://gpuopen.com/wp-content/uploads/2019/08/RDNA_Architecture_public.pdf)
    - <2020> ["RDNA 1.0" Instruction Set
    Architecture](https://developer.amd.com/wp-content/resources/RDNA_Shader_ISA.pdf)
    - <2020> ["RDNA 1.0" Instruction Set Architecture](https://developer.amd.com/wp-content/resources/RDNA_Shader_ISA.pdf)
    - <2020> ["RDNA 2" Instruction Set Architecture](https://developer.amd.com/wp-content/resources/RDNA2_Shader_ISA_November2020.pdf)
    - <2020> [RDNA2 Performance Guide](https://gpuopen.com/performance/)
    - <2022> ["RDNA3" Instruction Set Architecture](https://developer.amd.com/wp-content/resources/RDNA3_Shader_ISA_December2022.pdf)
    @@ -155,7 +154,11 @@ Architecture](https://developer.amd.com/wp-content/resources/RDNA_Shader_ISA.pdf
    - <2016> [NVIDIA GeForce GTX 1080 Whitepaper](http://international.download.nvidia.com/geforce-com/international/pdfs/GeForce_GTX_1080_Whitepaper_FINAL.pdf)
    - Turing
    - <2018> [NVIDIA Turing Architecture In-Depth](https://devblogs.nvidia.com/nvidia-turing-architecture-in-depth/)
    - <2018> [NVIDIA TURING GPU ARCHITECTURE](https://www.nvidia.com/content/dam/en-zz/Solutions/design-visualization/technologies/turing-architecture/NVIDIA-Turing-Architecture-Whitepaper.pdf)
    - <2018> [NVIDIA TURING GPU ARCHITECTURE](https://images.nvidia.com/aem-dam/en-zz/Solutions/design-visualization/technologies/turing-architecture/NVIDIA-Turing-Architecture-Whitepaper.pdf)
    - Ampere
    - <2021> [NVIDIA AMPERE GA102 GPU ARCHITECTURE](https://images.nvidia.com/aem-dam/en-zz/Solutions/geforce/ampere/pdf/NVIDIA-ampere-GA102-GPU-Architecture-Whitepaper-V1.pdf)
    - Ada
    - <2022> [NVIDIA ADA GPU ARCHITECTURE](https://images.nvidia.com/aem-dam/Solutions/geforce/ada/nvidia-ada-gpu-architecture.pdf)
    - CUDA
    - <2014> [CUDA Pro Tip: Optimized Filtering with Warp-Aggregated Atomics](https://devblogs.nvidia.com/cuda-pro-tip-optimized-filtering-warp-aggregated-atomics/)
    - Talks
  25. @silvesthu silvesthu revised this gist Jan 15, 2023. 1 changed file with 2 additions and 0 deletions.
    2 changes: 2 additions & 0 deletions GPUOptimizationForGameDev.md
    @@ -86,6 +86,8 @@
    - Bart Wronski [@BartWronsk](https://twitter.com/BartWronsk)
    - [Blog](https://bartwronski.com/)
    - <2014> [GCN – two ways of latency hiding and wave occupancy](https://bartwronski.com/2014/03/27/gcn-two-ways-of-latency-hiding-and-wave-occupancy/)
    - Elizabeth Baumel [@Icetigris](https://twitter.com/icetigris)
    - <2016> [Tightening Up the Graphics: Tools and Techniques for Debugging and Optimization](https://docs.google.com/presentation/d/1LQUMIld4SGoQVthnhT1scoA3k4Sg0as14G4NeSiSgFU/)

    ### By Organization

  26. @silvesthu silvesthu revised this gist Jan 9, 2023. 1 changed file with 1 addition and 0 deletions.
    1 change: 1 addition & 0 deletions GPUOptimizationForGameDev.md
    @@ -148,6 +148,7 @@ Architecture](https://developer.amd.com/wp-content/resources/RDNA_Shader_ISA.pdf
    - <2016> [DX12 Do's And Don'ts](https://developer.nvidia.com/dx12-dos-and-donts)
    - <2021> [Advanced API Performance](https://developer.nvidia.com/blog/tag/advanced-api-performance/)
    - <2022> [Best Practices for Using NVIDIA RTX Ray Tracing (Updated)](https://developer.nvidia.com/blog/best-practices-for-using-nvidia-rtx-ray-tracing-updated/)
    - <2023> [Practical Tips for Optimizing Ray Tracing](https://developer.nvidia.com/blog/practical-tips-for-optimizing-ray-tracing/)
    - Pascal
    - <2016> [NVIDIA GeForce GTX 1080 Whitepaper](http://international.download.nvidia.com/geforce-com/international/pdfs/GeForce_GTX_1080_Whitepaper_FINAL.pdf)
    - Turing
  27. @silvesthu silvesthu revised this gist Jan 9, 2023. 1 changed file with 1 addition and 3 deletions.
    4 changes: 1 addition & 3 deletions GPUOptimizationForGameDev.md
    @@ -192,14 +192,12 @@ Architecture](https://developer.amd.com/wp-content/resources/RDNA_Shader_ISA.pdf
    - <2020> [GPU ARCHITECTURE RESOURCES](https://interplayoflight.wordpress.com/2020/05/09/gpu-architecture-resources/)

    ### Graphics Study
    - [Graphics Studies Compilation
    ](http://www.adriancourreges.com/blog/2020/12/29/graphics-studies-compilation/)
    - [Graphics Studies Compilation](http://www.adriancourreges.com/blog/2020/12/29/graphics-studies-compilation/)

    ### For Artist
    - [WIP] [Unreal Art Optimization](https://unrealartoptimization.github.io/book/pipelines/pixel/)

    ### Database

    - [PerfTest: GPU shader memory operation performance test tool (with results)](https://github.com/sebbbi/perftest)
    - [GPUInfo](https://www.gpuinfo.org/) for Vulkan, OpenGL, OpenGL ES
    - [JP] [GPU Spec Database by HYPERでんち](https://dench.flatlib.jp/start#hardware)
  28. @silvesthu silvesthu revised this gist Jan 8, 2023. 1 changed file with 2 additions and 1 deletion.
    3 changes: 2 additions & 1 deletion GPUOptimizationForGameDev.md
    @@ -134,7 +134,8 @@
    - <2020> ["RDNA 1.0" Instruction Set
    Architecture](https://developer.amd.com/wp-content/resources/RDNA_Shader_ISA.pdf)
    - <2020> ["RDNA 2" Instruction Set Architecture](https://developer.amd.com/wp-content/resources/RDNA2_Shader_ISA_November2020.pdf)
    - [RDNA2 Performance Guide](https://gpuopen.com/performance/)
    - <2020> [RDNA2 Performance Guide](https://gpuopen.com/performance/)
    - <2022> ["RDNA3" Instruction Set Architecture](https://developer.amd.com/wp-content/resources/RDNA3_Shader_ISA_December2022.pdf)
    - OpenCL
    - [AMD Accelerated Parallel Processing OpenCL Programming Guide](http://developer.amd.com/wordpress/media/2013/07/AMD_Accelerated_Parallel_Processing_OpenCL_Programming_Guide-rev-2.7.pdf)
    - RADEON GPU ANALYZER
  29. @silvesthu silvesthu revised this gist Aug 15, 2022. 1 changed file with 4 additions and 1 deletion.
    5 changes: 4 additions & 1 deletion GPUOptimizationForGameDev.md
    @@ -154,8 +154,11 @@ Architecture](https://developer.amd.com/wp-content/resources/RDNA_Shader_ISA.pdf
    - <2018> [NVIDIA TURING GPU ARCHITECTURE](https://www.nvidia.com/content/dam/en-zz/Solutions/design-visualization/technologies/turing-architecture/NVIDIA-Turing-Architecture-Whitepaper.pdf)
    - CUDA
    - <2014> [CUDA Pro Tip: Optimized Filtering with Warp-Aggregated Atomics](https://devblogs.nvidia.com/cuda-pro-tip-optimized-filtering-warp-aggregated-atomics/)
    - GTC
    - Talks
    - <2012> [GPU Performance Analysis and Optimization](http://on-demand.gputechconf.com/gtc/2012/presentations/S0514-GTC2012-GPU-Performance-Analysis.pdf)
    - <2020> [Optimizing Graphics Applications using Nsight Systems and Nsight Graphics](https://developer.nvidia.com/siggraph/2020/video/sigg01)
    - <2021> [The Next Level of Optimization Advice with Nsight Graphics: GPU Trace](https://www.nvidia.com/en-us/on-demand/session/gtcspring21-e32683/)
    - <2022> [(GDC Paywall) Optimizing Ray Tracing GPU Workloads using Nsight Graphics: GPU Trace and Nsight Systems](https://schedule.gdconf.com/session/optimizing-ray-tracing-gpu-workloads-using-nsight-graphics-gpu-trace-and-nsight-systems-presented-by-nvidia/886315)
    - Intel
    - [Gamedev](https://software.intel.com/en-us/gamedev)
    - [Intel® Processor Graphics: Architecture & Programming](https://doc.lagout.org/electronics/Intel-Graphics-Architecture-ISA-and-microarchitecture.pdf)
  30. @silvesthu silvesthu revised this gist Aug 15, 2022. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion GPUOptimizationForGameDev.md
    @@ -145,8 +145,8 @@ Architecture](https://developer.amd.com/wp-content/resources/RDNA_Shader_ISA.pdf
    - <2015> [Constant Buffers without Constant Pain](https://developer.nvidia.com/content/constant-buffers-without-constant-pain-0)
    - <2016> [Reading Between The Threads: Shader Intrinsics](https://developer.nvidia.com/reading-between-threads-shader-intrinsics)
    - <2016> [DX12 Do's And Don'ts](https://developer.nvidia.com/dx12-dos-and-donts)
    - <2020> [Best Practices: Using NVIDIA RTX Ray Tracing](https://developer.nvidia.com/blog/best-practices-using-nvidia-rtx-ray-tracing/)
    - <2021> [Advanced API Performance](https://developer.nvidia.com/blog/tag/advanced-api-performance/)
    - <2022> [Best Practices for Using NVIDIA RTX Ray Tracing (Updated)](https://developer.nvidia.com/blog/best-practices-for-using-nvidia-rtx-ray-tracing-updated/)
    - Pascal
    - <2016> [NVIDIA GeForce GTX 1080 Whitepaper](http://international.download.nvidia.com/geforce-com/international/pdfs/GeForce_GTX_1080_Whitepaper_FINAL.pdf)
    - Turing