Revisions
silvesthu revised this gist
Jul 1, 2025 . 1 changed file with 7 additions and 1 deletion.
@@ -122,11 +122,16 @@
- 2024 - [Harnessing Wave Intrinsics For Good (And Evil)](https://github.com/AlexSabourinDev/cranberry_blog/blob/master/HarnessingWaveIntrinsicsForGoodAndEvil.pdf) [Video](https://www.youtube.com/watch?v=U6t33RLa0XM)
- Chips and Cheese [@chipsandcheese9](https://x.com/chipsandcheese9)
- [Blog](https://chipsandcheese.com/)
- [Memory Latency Data](https://jsmemtest.chipsandcheese.com/latencydata)
- RDNA4
- 2025 - [AMD's RDNA4 Architecture (Video)](https://chipsandcheese.com/p/amds-rdna4-architecture-video)
- 2025 - [RDNA 4's "Out-of-Order" Memory Accesses](https://chipsandcheese.com/p/rdna-4s-out-of-order-memory-accesses)
- 2025 - [Dynamic Register Allocation on AMD's RDNA 4 GPU Architecture](https://chipsandcheese.com/p/dynamic-register-allocation-on-amds)
- 2025 - [RDNA 4’s Raytracing Improvements](https://chipsandcheese.com/p/rdna-4s-raytracing-improvements)
- 2025 - [Blackwell: Nvidia’s Massive GPU](https://chipsandcheese.com/p/blackwell-nvidias-massive-gpu)
- Emilio López [@redorav](https://x.com/redorav)
- [Blog](https://www.elopezr.com/)
- 2025 - [The Art Of Packing Data](https://www.elopezr.com/the-art-of-packing-data/)

### By Organization

@@ -196,9 +201,10 @@
- 2015 - [Constant Buffers without Constant Pain](https://developer.nvidia.com/content/constant-buffers-without-constant-pain-0)
- 2016 - [Practical DirectX 12](https://developer.nvidia.com/sites/default/files/akamai/gameworks/blog/GDC16/GDC16_gthomas_adunn_Practical_DX12.pdf)
- 2016 - [Reading Between The Threads: Shader Intrinsics](https://developer.nvidia.com/reading-between-threads-shader-intrinsics)
- 2016 - [DX12 Do's And Don'ts](https://web.archive.org/web/20240105013427/https://developer.nvidia.com/dx12-dos-and-donts)
- 2016 - [High-Performance, Low-Overhead Rendering with OpenGL and Vulkan](https://www.gdcvault.com/play/1023516/High-performance-Low-Overhead-Rendering)
- 2019 - [Tips and Tricks: Ray Tracing Best Practices](https://developer.nvidia.com/blog/rtx-best-practices/)
- 2019 - [Tips and Tricks: Vulkan Dos and Don’ts](https://developer.nvidia.com/blog/vulkan-dos-donts/)
- 2020 - [Optimizing Graphics Applications using Nsight Systems and Nsight Graphics](https://developer.nvidia.com/siggraph/2020/video/sigg01)
- 2020 - [RTX Ray Tracing Best Practices](https://www.gdcvault.com/play/1026721/RTX-Ray-Tracing-Best-Practices)
- 2021 - [Advanced API Performance](https://developer.nvidia.com/blog/tag/advanced-api-performance/)
silvesthu revised this gist
Jun 23, 2025 . 1 changed file with 1 addition and 1 deletion.
@@ -6,7 +6,7 @@
- 2015 - [Render Hell 2.0](https://simonschreibt.de/gat/renderhell/)
- 2016 - [How bad are small triangles on GPU and why?](http://www.g-truc.net/post-0662.html)
- 2017 - [GPU Performance for Game Artists](http://fragmentbuffer.com/gpu-performance-for-game-artists/)
- 2019 - [Understanding the anatomy of GPUs using Pokémon](https://blog.ovhcloud.com/understanding-the-anatomy-of-gpus-using-pokemon/)
- 2020 - [GPU ARCHITECTURE RESOURCES](https://interplayoflight.wordpress.com/2020/05/09/gpu-architecture-resources/)
- 2020 - [All the pipelines - journey through the GPU](https://www.youtube.com/watch?v=Y2KG_4OxDBg)
silvesthu revised this gist
Jun 16, 2025 . 1 changed file with 9 additions and 7 deletions.
@@ -31,7 +31,7 @@
- 2019 - [Optimizing DX12/DXR GPU Workloads using Nsight Graphics: GPU Trace and the Peak-Performance-Percentage (P3) Method (Presented by NVIDIA)](https://www.gdcvault.com/browse/gdc-19/play/1026202)
- 2020 - [Optimizing Compute Shaders for L2 Locality using Thread-Group ID Swizzling](https://developer.nvidia.com/blog/optimizing-compute-shaders-for-l2-locality-using-thread-group-id-swizzling/)
- 2021 - Dana Elifaz - [The Next Level of Optimization Advice with Nsight Graphics: GPU Trace](https://www.nvidia.com/en-us/on-demand/session/gtcspring21-e32683/)
- 2022 - [Optimizing Ray Tracing GPU Workloads using Nsight Graphics: GPU Trace and Nsight Systems](https://gdcvault.com/play/1027811/Optimizing-Ray-Tracing-GPU-Workloads)
- Rys Sommefeldt [@ryszu](https://twitter.com/ryszu)
- [Blog](https://rys.sommefeldt.com/)
- 2018 - [Understanding GPU context rolls](https://gpuopen.com/understanding-gpu-context-rolls/)
@@ -43,12 +43,14 @@
- Kostas Anagnostou [@KostasAAA](https://twitter.com/KostasAAA)
- [Blog](https://interplayoflight.wordpress.com/)
- 2018 - [DD2018: Kostas Anagnostou - Experiments in GPU occlusion culling](https://www.youtube.com/watch?v=U20dIA3SLTs)
- 2020 - [GPU architecture resources](https://interplayoflight.wordpress.com/2020/05/09/gpu-architecture-resources/)
- 2020 - [GPU architecture resources (twitter thread)](https://twitter.com/KostasAAA/status/1259153226043179011)
- 2020 - [What is shader occupancy and why do we care about it?](https://interplayoflight.wordpress.com/2020/11/11/what-is-shader-occupancy-and-why-do-we-care-about-it/)
- 2020 - [To z-prepass or not to z-prepass](https://interplayoflight.wordpress.com/2020/12/21/to-z-prepass-or-not-to-z-prepass/)
- 2022 - [Shader tips and tricks](https://interplayoflight.wordpress.com/2022/01/22/shader-tips-and-tricks/)
- 2023 - [Low-level thinking in high-level shading languages 2023](https://interplayoflight.wordpress.com/2023/12/29/low-level-thinking-in-high-level-shading-languages-2023/)
- 2025 - [The hidden cost of shader instructions](https://interplayoflight.wordpress.com/2025/01/19/the-hidden-cost-of-shader-instructions/)
- 2025 - [Async compute all the things](https://interplayoflight.wordpress.com/2025/05/27/async-compute-all-the-things/)
- Matthäus G. Chajdas [@NIV_Anteru](https://twitter.com/niv_anteru)
- [Blog](https://anteru.net/)
- 2018 - [Introduction to compute shaders](https://anteru.net/blog/2018/intro-to-compute-shaders/)
silvesthu revised this gist
Apr 27, 2025 . 1 changed file with 2 additions and 1 deletion.
@@ -129,7 +129,7 @@
### By Organization

- AMD
- [GPU Open](https://gpuopen.com/)
- [Events Presentations](https://gpuopen.com/events/)
- [AMD GPU architecture programming documentation (Instruction Set Architecture)](https://gpuopen.com/amd-gpu-architecture-programming-documentation/)
- [Reading AMD GPU ISA](https://rocm.blogs.amd.com/software-tools-optimization/amdgcn-isa/README.html)
@@ -179,6 +179,7 @@
- 2022 - [Visualizing VGPR Pressure with Radeon™ GPU Analyzer 2.6](https://gpuopen.com/learn/visualizing-vgpr-pressure-with-rga-2-6/)
- 2022 - [Improving raytracing performance with the Radeon™ Raytracing Analyzer (RRA)](https://gpuopen.com/learn/improving-rt-perf-with-rra/)
- Driver Stack
- [ROCm™ Blogs](https://rocm.blogs.amd.com/)
- [User Mode Driver for Vulkan (AMDVLK) by AMD](https://github.com/GPUOpen-Drivers/AMDVLK)
- [Vulkan API Layer (XGL)](https://github.com/GPUOpen-Drivers/xgl)
- [LLVM-Based Pipeline Compiler (LLPC)](https://github.com/GPUOpen-Drivers/llpc#llvm-based-pipeline-compiler-llpc)
silvesthu revised this gist
Apr 27, 2025 . 1 changed file with 7 additions and 0 deletions.
@@ -118,6 +118,13 @@
- Alexandre Sabourin [@AlexSneezeKing](https://mastodon.gamedev.place/@AlexSneezeKing)
- [Blog](https://github.com/AlexSabourinDev/cranberry_blog/tree/master)
- 2024 - [Harnessing Wave Intrinsics For Good (And Evil)](https://github.com/AlexSabourinDev/cranberry_blog/blob/master/HarnessingWaveIntrinsicsForGoodAndEvil.pdf) [Video](https://www.youtube.com/watch?v=U6t33RLa0XM)
- Chips and Cheese [@chipsandcheese9](https://x.com/chipsandcheese9)
- [Blog](https://chipsandcheese.com/)
- RDNA4
- 2025 - [AMD's RDNA4 Architecture (Video)](https://chipsandcheese.com/p/amds-rdna4-architecture-video)
- 2025 - [RDNA 4's "Out-of-Order" Memory Accesses](https://chipsandcheese.com/p/rdna-4s-out-of-order-memory-accesses)
- 2025 - [Dynamic Register Allocation on AMD's RDNA 4 GPU Architecture](https://chipsandcheese.com/p/dynamic-register-allocation-on-amds)
- 2025 - [RDNA 4’s Raytracing Improvements](https://chipsandcheese.com/p/rdna-4s-raytracing-improvements)

### By Organization
silvesthu revised this gist
Apr 14, 2025 . 1 changed file with 4 additions and 2 deletions.
@@ -23,6 +23,7 @@
- 2019 - [Half The Precision, Twice The Fun: Working With FP16 In HLSL](https://therealmjp.github.io/posts/shader-fp16/)
- 2021 - [The Shader Permutation Problem](https://therealmjp.github.io/posts/shader-permutations-part1/)
- 2022 - [GPU Memory Pools in D3D12](https://therealmjp.github.io/posts/gpu-memory-pool/)
- 2025 - [To Early-Z, or Not To Early-Z](https://therealmjp.github.io/posts/to-earlyz-or-not-to-earlyz/)
- Louis Bavoil [@louisbavoil](https://twitter.com/louisbavoil)
- [D3D11 Vendor Hacks](https://docs.google.com/spreadsheets/d/1J_HIRVlYK8iI4u6AJrCeb66L5W36UDkd9ExSCku9s_o/edit#gid=0)
- 2018 - [The Peak-Performance-Percentage Analysis Method for Optimizing Any GPU Workload](https://devblogs.nvidia.com/the-peak-performance-analysis-method-for-optimizing-any-gpu-workload/)
@@ -140,8 +141,8 @@
- 2019 - [AMD GPU Performance Revealed](https://gpuopen.com/gdc-presentations/2019/gdc-2019-s6-gpu-performance-revealed.pdf)
- 2019 - [Triangles Are Precious](https://gpuopen.com/presentations/2019/nordic-game-2019-triangles-are-precious.pdf)
- 2020 - [Let’s build](https://gpuopen.com/lets-build/)
- [Optimizing for the Radeon™ RDNA Architecture](https://gpuopen.com/wp-content/uploads/slides/GPUOpen_Let%E2%80%99sBuild2020_Optimizing%20for%20the%20Radeon%20RDNA%20Architecture.pdf) [Video](https://www.youtube.com/watch?v=7eEKLUhoTQs)
- [From Source to ISA: A Trip Down the Shader Compiler Pipeline](https://gpuopen.com/wp-content/uploads/slides/GPUOpen_Let%E2%80%99sBuild2020_A%20Trip%20Down%20the%20GPU%20Compiler%20Pipeline.pdf) [Video](https://www.youtube.com/watch?v=_ilAL-1-moA)
- 2020 - [CONCURRENCY MODEL IN EXPLICIT GRAPHICS APIS](https://gpuopen.com/wp-content/uploads/2020/06/GPUOpen_Concurrency_vTUM.pdf)
- 2021 - [Understanding Graphs in Radeon GPU Profiler and GPUView](https://gpuopen.com/learn/understanding-graphs-in-radeon-gpu-profiler-and-gpuview/)
- 2022 - [Let's talk about (GPU) crashes](https://gpuopen.com/presentations/2022/Reboot%20Blue%202022%20-%20Lets%20talk%20about%20GPU%20crashes.pdf)
@@ -160,6 +161,7 @@
- 2020 - ["RDNA 2" Instruction Set Architecture](https://developer.amd.com/wp-content/resources/RDNA2_Shader_ISA_November2020.pdf)
- 2022 - ["RDNA3" Instruction Set Architecture](https://www.amd.com/content/dam/amd/en/documents/radeon-tech-docs/instruction-set-architectures/rdna3-shader-instruction-set-architecture-feb-2023_0.pdf)
- 2024 - ["RDNA3.5" Instruction Set Architecture](https://www.amd.com/content/dam/amd/en/documents/radeon-tech-docs/instruction-set-architectures/rdna35_instruction_set_architecture.pdf)
- 2025 - ["RDNA4" Instruction Set Architecture](https://www.amd.com/content/dam/amd/en/documents/radeon-tech-docs/instruction-set-architectures/rdna4-instruction-set-architecture.pdf)
- [RDNA Performance Guide](https://gpuopen.com/learn/rdna-performance-guide/)
- OpenCL
- 2013 - [AMD Accelerated Parallel Processing OpenCL Programming Guide](https://web.archive.org/web/20220426031939/http://developer.amd.com/wordpress/media/2013/07/AMD_Accelerated_Parallel_Processing_OpenCL_Programming_Guide-rev-2.7.pdf)
silvesthu revised this gist
Feb 19, 2025 . 1 changed file with 4 additions and 1 deletion.
@@ -114,6 +114,9 @@
- Graham Wihlidal [@gwihlidal](https://twitter.com/gwihlidal)
- [Blog](https://www.wihlidal.com/blog/)
- 2016 - [Optimizing the Graphics Pipeline With Compute](https://www.gdcvault.com/play/1023109/Optimizing-the-Graphics-Pipeline-With)
- Alexandre Sabourin [@AlexSneezeKing](https://mastodon.gamedev.place/@AlexSneezeKing)
- [Blog](https://github.com/AlexSabourinDev/cranberry_blog/tree/master)
- 2024 - [Harnessing Wave Intrinsics For Good (And Evil)](https://github.com/AlexSabourinDev/cranberry_blog/blob/master/HarnessingWaveIntrinsicsForGoodAndEvil.pdf) [Video](https://www.youtube.com/watch?v=U6t33RLa0XM)

### By Organization

@@ -144,8 +147,8 @@
- 2022 - [Let's talk about (GPU) crashes](https://gpuopen.com/presentations/2022/Reboot%20Blue%202022%20-%20Lets%20talk%20about%20GPU%20crashes.pdf)
- 2022 - [Compute Shaders @ GIC](https://www.youtube.com/watch?v=eDLilzy2mq0)
- 2023 - [Occupancy explained](https://gpuopen.com/learn/occupancy-explained/)
- 2024 - [Occupancy explained through Insert picture the AMD RDNA architecture](https://gpuopen.com/presentations/2024/GPC24_Occupancy_explained.pdf)
- 2024 - [Mesh shaders: optimization and best practices](https://gpuopen.com/learn/mesh_shaders/mesh_shaders-optimization_and_best_practices/)
- GCN
- 2013 - [GCN3 Instruction Set Architecture](http://developer.amd.com/wordpress/media/2013/12/AMD_GCN3_Instruction_Set_Architecture_rev1.1.pdf)
- 2019 - [AMD GCN ISA: a first dive](https://giordi91.github.io/post/vegaisa/)
silvesthu revised this gist
Feb 15, 2025 . 1 changed file with 2 additions and 0 deletions.
@@ -244,6 +244,8 @@
- [Advanced Graphics Summit](https://gdcvault.com/search.php#&category=free&firstfocus=&keyword=advanced+graphics%2Bsummit) Not specifically on optimization
- Digital Dragon
- [Video](https://www.youtube.com/@DigitalDragonsForGamedev/playlists) Not specifically on optimization
- Graphics Programming Conference
- [Video](https://www.youtube.com/@GraphicsProgrammingConference) Not specifically on optimization
- (JP) CEDEC
- 2016 - [GPU最適化入門](https://www.slideshare.net/ssuser2e676d/gpu-65502505)
- (Book) [マンガとイラストでわかる! GPU最適化入門](https://www.amazon.co.jp/%E3%83%9E%E3%83%B3%E3%82%AC%E3%81%A8%E3%82%A4%E3%83%A9%E3%82%B9%E3%83%88%E3%81%A7%E3%82%8F%E3%81%8B%E3%82%8B-GPU%E6%9C%80%E9%81%A9%E5%8C%96%E5%85%A5%E9%96%80-%E5%B0%8F%E5%8F%A3-%E8%B2%B4%E5%BC%98/dp/4862465048)
silvesthu revised this gist
Feb 15, 2025 . 1 changed file with 3 additions and 4 deletions.
@@ -124,6 +124,7 @@
- [Reading AMD GPU ISA](https://rocm.blogs.amd.com/software-tools-optimization/amdgcn-isa/README.html)
- [Machine-readable ISA documentation](https://gpuopen.com/machine-readable-isa/)
- 2014 - [Vertex Shader Tricks](https://www.slideshare.net/DevCentralAMD/vertex-shader-tricks-bill-bilodeau)
- 2016 - [Getting the Most Out of Delta Color Compression](https://gpuopen.com/learn/dcc-overview/)
- 2016 - [Leveraging asynchronous queues for concurrent execution](https://gpuopen.com/concurrent-execution-asynchronous-queues/)
- 2016 - [AMD GCN Assembly: Cross-Lane Operations](https://gpuopen.com/learn/amd-gcn-assembly-cross-lane-operations/)
- 2017 - [Wave Programming in D3D12 and Vulkan](https://gpuopen.com/wp-content/uploads/2017/07/GDC2017-Wave-Programming-D3D12-Vulkan.pdf)
@@ -136,12 +137,8 @@
- 2019 - [AMD GPU Performance Revealed](https://gpuopen.com/gdc-presentations/2019/gdc-2019-s6-gpu-performance-revealed.pdf)
- 2019 - [Triangles Are Precious](https://gpuopen.com/presentations/2019/nordic-game-2019-triangles-are-precious.pdf)
- 2020 - [Let’s build](https://gpuopen.com/lets-build/)
- Optimizing for the Radeon™ RDNA Architecture
- From Source to ISA: A Trip Down the Shader Compiler Pipeline
- 2020 - [CONCURRENCY MODEL IN EXPLICIT GRAPHICS APIS](https://gpuopen.com/wp-content/uploads/2020/06/GPUOpen_Concurrency_vTUM.pdf)
- 2021 - [Understanding Graphs in Radeon GPU Profiler and GPUView](https://gpuopen.com/learn/understanding-graphs-in-radeon-gpu-profiler-and-gpuview/)
- 2022 - [Let's talk about (GPU) crashes](https://gpuopen.com/presentations/2022/Reboot%20Blue%202022%20-%20Lets%20talk%20about%20GPU%20crashes.pdf)
@@ -209,6 +206,8 @@
- 2022 - [NVIDIA ADA GPU ARCHITECTURE](https://images.nvidia.com/aem-dam/Solutions/geforce/ada/nvidia-ada-gpu-architecture.pdf)
- 2022 - [SHADER EXECUTION REORDERING](https://developer.nvidia.com/sites/default/files/akamai/gameworks/ser-whitepaper.pdf)
- 2023 - [Tuning CUDA Applications for NVIDIA Ada GPU Architecture](https://docs.nvidia.com/cuda/ada-tuning-guide/index.html)
- Blackwell
- 2025 - [NVIDIA RTX BLACKWELL GPU ARCHITECTURE](https://images.nvidia.com/aem-dam/Solutions/geforce/blackwell/nvidia-rtx-blackwell-gpu-architecture.pdf)
- CUDA
- 2014 - [CUDA Pro Tip: Optimized Filtering with Warp-Aggregated Atomics](https://devblogs.nvidia.com/cuda-pro-tip-optimized-filtering-warp-aggregated-atomics/)
- 2017 - [CUDA kernel-level experiments in NVIDIA Nsight](https://docs.nvidia.com/nsight-visual-studio-edition/5.3/Content/Analysis/Report/CudaExperiments/Kernel_Level_Experiments.htm) on Issue Efficiency, Memory Statistics, Pipe Utilization, etc.
silvesthu revised this gist
Dec 17, 2024 . 1 changed file with 1 addition and 0 deletions.
@@ -75,6 +75,7 @@
- 2016 - [Understanding Memory Coalescing on GCN](https://gpuopen.com/gcn-memory-coalescing/)
- 2017 - [ADVANCED SHADER PROGRAMMING ON GCN](https://gpuopen.com/wp-content/uploads/2017/03/GDC2017-Advanced-Shader-Programming-On-GCN.pdf)
- 2018 - [Engine Optimization Hot Lap](https://slideplayer.com/slide/17173687/)
- 2024 - [Fixing The GPU](https://docs.google.com/document/d/1MyvFNjbaJl62v_RJ3oWhGIsUIOY74XYlcqlYHbZQriM)
- Robert Menzel [@renderpipeline](https://twitter.com/renderpipeline)
- [Blog](https://web.archive.org/web/20220307175030/http://renderingpipeline.com/)
- 2012 - [Low-Level GPU Documentation](https://web.archive.org/web/20160305145630/http://renderingpipeline.com/graphics-literature/low-level-gpu-documentation/)
silvesthu revised this gist
Dec 11, 2024 . 1 changed file with 5 additions and 1 deletion.
@@ -143,13 +143,15 @@
- Radeon™ ProRender Full Spectrum Rendering 2.0: The Universal Rendering API
- 2020 - [CONCURRENCY MODEL IN EXPLICIT GRAPHICS APIS](https://gpuopen.com/wp-content/uploads/2020/06/GPUOpen_Concurrency_vTUM.pdf)
- 2021 - [Understanding Graphs in Radeon GPU Profiler and GPUView](https://gpuopen.com/learn/understanding-graphs-in-radeon-gpu-profiler-and-gpuview/)
- 2022 - [Let's talk about (GPU) crashes](https://gpuopen.com/presentations/2022/Reboot%20Blue%202022%20-%20Lets%20talk%20about%20GPU%20crashes.pdf)
- 2022 - [Compute Shaders @ GIC](https://www.youtube.com/watch?v=eDLilzy2mq0)
- 2023 - [Occupancy explained](https://gpuopen.com/learn/occupancy-explained/)
- 2024 - [Mesh shaders: optimization and best practices](https://gpuopen.com/learn/mesh_shaders/mesh_shaders-optimization_and_best_practices/)
- 2024 - [Occupancy explained through Insert picture the AMD RDNA architecture](https://gpuopen.com/presentations/2024/GPC24_Occupancy_explained.pdf)
- GCN
- 2013 - [GCN3 Instruction Set Architecture](http://developer.amd.com/wordpress/media/2013/12/AMD_GCN3_Instruction_Set_Architecture_rev1.1.pdf)
- 2019 - [AMD GCN ISA: a first dive](https://giordi91.github.io/post/vegaisa/)
- 2020 - [Understanding AMD GPU ISA](https://drive.google.com/file/d/1O9yTRZgsCFODH9II_PsC_6chROYduSPT/view) [Video](https://www.youtube.com/watch?v=HYrs_TGWgz4)
- RDNA
- 2019 - [INTRODUCING RDNA ARCHITECTURE](https://web.archive.org/web/20240306074306/https://www.amd.com/system/files/documents/rdna-whitepaper.pdf)
- 2019 - [RDNA Architecture](https://gpuopen.com/wp-content/uploads/2019/08/RDNA_Architecture_public.pdf)
@@ -176,6 +178,7 @@
- [Micro engine scheduler (MES) firmware](https://gpuopen.com/download/documentation/micro_engine_scheduler.pdf)
- Nvidia
- [Developer Blog](https://developer.nvidia.com/blog) and Talks
- [Advanced API Performance](https://developer.nvidia.com/blog/tag/advanced-api-performance/) on various topics
- 2012 - [GPU Performance Analysis and Optimization](http://on-demand.gputechconf.com/gtc/2012/presentations/S0514-GTC2012-GPU-Performance-Analysis.pdf)
- 2015 - [Constant Buffers without Constant Pain](https://developer.nvidia.com/content/constant-buffers-without-constant-pain-0)
- 2016 - [Practical DirectX 12](https://developer.nvidia.com/sites/default/files/akamai/gameworks/blog/GDC16/GDC16_gthomas_adunn_Practical_DX12.pdf)
@@ -188,7 +191,8 @@
- 2021 - [Advanced API Performance](https://developer.nvidia.com/blog/tag/advanced-api-performance/)
- 2022 - [Best Practices for Using NVIDIA RTX Ray Tracing (Updated)](https://developer.nvidia.com/blog/best-practices-for-using-nvidia-rtx-ray-tracing-updated/)
- 2023 - [Practical Tips for Optimizing Ray Tracing](https://developer.nvidia.com/blog/practical-tips-for-optimizing-ray-tracing/)
- 2023 - [Avoiding Stalls and Hitches in DirectX 12](https://www.youtube.com/watch?v=f0a9mN4HQCI)
- 2023 - [How to Improve Shader Performance by Resolving LDC Divergence](https://www.youtube.com/watch?v=HSsPJ4qK6AU)
- 2024 - [Shader Debugging Made Easy with NVIDIA Nsight Graphics](https://developer.nvidia.com/blog/shader-debugging-made-easy-with-nvidia-nsight-graphics/)
- Pascal
- 2016 - [NVIDIA GeForce GTX 1080 Whitepaper](http://international.download.nvidia.com/geforce-com/international/pdfs/GeForce_GTX_1080_Whitepaper_FINAL.pdf)
silvesthu revised this gist
Nov 21, 2024 . 1 changed file with 2 additions and 0 deletions.
@@ -146,6 +146,7 @@
- 2022 - [Compute Shaders @ GIC](https://www.youtube.com/watch?v=eDLilzy2mq0)
- 2023 - [Occupancy explained](https://gpuopen.com/learn/occupancy-explained/)
- 2024 - [Mesh shaders: optimization and best practices](https://gpuopen.com/learn/mesh_shaders/mesh_shaders-optimization_and_best_practices/)
- 2024 - [Occupancy explained through Insert picture the AMD RDNA architecture](https://gpuopen.com/presentations/2024/GPC24_Occupancy_explained.pdf)
- GCN
- 2013 - [GCN3 Instruction Set Architecture](http://developer.amd.com/wordpress/media/2013/12/AMD_GCN3_Instruction_Set_Architecture_rev1.1.pdf)
- 2019 - [AMD GCN ISA: a first dive](https://giordi91.github.io/post/vegaisa/)
@@ -263,6 +264,7 @@
- [GPU database by Matthäus G. Chajdas](https://db.thegpu.guru/)
- [GPUInfo by Sascha Willems](https://www.gpuinfo.org/) For Vulkan, OpenGL, OpenGL ES
- [D3d12infoDB by Dmytro Bulatov](https://d3d12infodb.boolka.dev/index.html) Database based on D3d12info in Tools section below
- [Feature Table](https://d3d12infodb.boolka.dev/FeatureTable.html)
- (JP) [GPU Spec Database by HYPERでんち](https://dench.flatlib.jp/start#hardware)

### Tools
silvesthu revised this gist
Oct 20, 2024 . 1 changed file with 5 additions and 7 deletions.
@@ -187,7 +187,8 @@
- 2021 - [Advanced API Performance](https://developer.nvidia.com/blog/tag/advanced-api-performance/)
- 2022 - [Best Practices for Using NVIDIA RTX Ray Tracing (Updated)](https://developer.nvidia.com/blog/best-practices-for-using-nvidia-rtx-ray-tracing-updated/)
- 2023 - [Practical Tips for Optimizing Ray Tracing](https://developer.nvidia.com/blog/practical-tips-for-optimizing-ray-tracing/)
- 2023 - [Advanced API Performance](https://developer.nvidia.com/blog/tag/advanced-api-performance/) on Various topics
- 2024 - [Shader Debugging Made Easy with NVIDIA Nsight Graphics](https://developer.nvidia.com/blog/shader-debugging-made-easy-with-nvidia-nsight-graphics/)
- Pascal
- 2016 - [NVIDIA GeForce GTX 1080 Whitepaper](http://international.download.nvidia.com/geforce-com/international/pdfs/GeForce_GTX_1080_Whitepaper_FINAL.pdf)
- 2023 - [Tuning CUDA Applications for Pascal](https://docs.nvidia.com/cuda/pascal-tuning-guide/index.html)
@@ -280,20 +281,17 @@
- [NVIDIA Nsight Aftermath SDK](https://developer.nvidia.com/nsight-aftermath)
- AMD
- [Radeon Developer Tool Suite](https://gpuopen.com/introducing-radeon-developer-tool-suite/)
- [Radeon GPU Profiler (RGP)](https://gpuopen.com/rgp/) Low-level optimization tool
- [Radeon Memory Visualizer (RMV)](https://gpuopen.com/rmv/)
- [Radeon Developer Panel (RDP)](https://gpuopen.com/rdp/)
- [Driver Experiments](https://gpuopen.com/learn/rdts-driver-experiments/) Low-level control of the AMD Adrenalin driver
- [Radeon GPU Analyzer (RGA)](https://gpuopen.com/rga/) Offline compiler and performance analysis tool
- [Radeon Raytracing Analyzer (RRA)](https://gpuopen.com/radeon-raytracing-analyzer/)
- [Radeon GPU Detective (RGD)](https://gpuopen.com/radeon-gpu-detective/) Post-mortem analysis of GPU crashes
- 2024 - [Post-Mortem GPU Crash Analysis With AMD Radeon GPU Detective (RGD)](https://gpuopen.com/gdc-presentations/2024/GDC2024_Radeon_GPU_Detective.pdf)
- 2024 - [Game Optimization with The Radeon Developer Tool Suite](https://gpuopen.com/gdc-presentations/2024/GDC2024_Game_Optimization_with_The_Radeon_Developer_Tool_Suite.pdf)
- [GPU Reshape](https://gpuopen.com/gpu-reshape/) On-the-fly instrumentation of GPU operations with instruction level validation of potentially undefined behavior
- 2024 - [Introducing GPU Reshape](https://gpuopen.com/gdc-presentations/2024/GDC2024_Shader_Instrumentation_with_GPU_Reshape.pdf) - [Video](https://www.youtube.com/watch?v=qPNh7WrfFvc)
- Intel
- [Intel Graphics Performance Analyzers](https://software.intel.com/content/www/us/en/develop/tools/graphics-performance-analyzers.html)
- Other related tools
silvesthu revised this gist
Oct 20, 2024 . 1 changed file with 20 additions and 7 deletions.
@@ -117,9 +117,11 @@
### By Organization

- AMD
- [GPU Open](https://gpuopen.com/), [ROCm™ Blogs](https://rocm.blogs.amd.com/)
- [Events Presentations](https://gpuopen.com/events/)
- [AMD GPU architecture programming documentation (Instruction Set Architecture)](https://gpuopen.com/amd-gpu-architecture-programming-documentation/)
- [Reading AMD GPU ISA](https://rocm.blogs.amd.com/software-tools-optimization/amdgcn-isa/README.html)
- [Machine-readable ISA documentation](https://gpuopen.com/machine-readable-isa/)
- 2014 - [Vertex Shader Tricks](https://www.slideshare.net/DevCentralAMD/vertex-shader-tricks-bill-bilodeau)
- 2016 - [Leveraging asynchronous queues for concurrent execution](https://gpuopen.com/concurrent-execution-asynchronous-queues/)
- 2016 - [AMD GCN Assembly: Cross-Lane Operations](https://gpuopen.com/learn/amd-gcn-assembly-cross-lane-operations/)
@@ -274,13 +276,24 @@
- Nvidia
- [NVIDIA Developer Tools](https://developer.nvidia.com/tools-overview)
- [NVIDIA Nsight Graphics](https://developer.nvidia.com/nsight-graphics)
- [NVIDIA Nsight Systems](https://developer.nvidia.com/nsight-systems)
- [NVIDIA Nsight Aftermath SDK](https://developer.nvidia.com/nsight-aftermath)
- AMD
- [Radeon Developer Tool Suite](https://gpuopen.com/introducing-radeon-developer-tool-suite/)
- Talks
- [GAME OPTIMIZATION WITH THE AMD RADEON DEVELOPER TOOL SUITE](https://gpuopen.com/gdc-presentations/2024/GDC2024_Game_Optimization_with_The_Radeon_Developer_Tool_Suite.pdf)
- [Radeon GPU Profiler (RGP)](https://gpuopen.com/rgp/) Low-level optimization tool
- [Radeon Memory Visualizer (RMV)](https://gpuopen.com/rmv/)
- [Radeon Developer Panel (RDP)](https://gpuopen.com/rdp/)
- [Driver Experiments](https://gpuopen.com/learn/rdts-driver-experiments/) Low-level control of the AMD Adrenalin driver
- [Radeon GPU Analyzer (RGA)](https://gpuopen.com/rga/) Offline compiler and performance analysis tool
- [Radeon Raytracing Analyzer (RRA)](https://gpuopen.com/radeon-raytracing-analyzer/)
- [Radeon GPU Detective (RGD)](https://gpuopen.com/radeon-gpu-detective/) Post-mortem analysis of GPU crashes
- Talks
- [POST-MORTEM GPU CRASH ANALYSIS WITH AMD RADEON GPU DETECTIVE (RGD)](https://gpuopen.com/gdc-presentations/2024/GDC2024_Radeon_GPU_Detective.pdf)
- [GPU Reshape](https://gpuopen.com/gpu-reshape/) On-the-fly instrumentation of GPU operations with instruction level validation of potentially undefined behavior
- Talks
- [INTRODUCING GPU RESHAPE](https://gpuopen.com/gdc-presentations/2024/GDC2024_Shader_Instrumentation_with_GPU_Reshape.pdf) - [Video](https://www.youtube.com/watch?v=qPNh7WrfFvc)
- Intel
- [Intel Graphics Performance Analyzers](https://software.intel.com/content/www/us/en/develop/tools/graphics-performance-analyzers.html)
- Other related tools
silvesthu revised this gist
Oct 20, 2024 . 1 changed file with 2 additions and 1 deletion.
@@ -228,7 +228,8 @@ - [DirectX-Specs](https://microsoft.github.io/DirectX-Specs/) - 2019 - [New in D3D12 – background shader optimizations](https://devblogs.microsoft.com/directx/background-shader-optimizations/) - Khronos Group - [Vulkan Documentation](https://docs.vulkan.org/spec/latest/index.html) - [Github](https://github.com/KhronosGroup/Vulkan-Samples) - [Performance samples](https://docs.vulkan.org/samples/latest/samples/performance/README.html) - [Github](https://github.com/KhronosGroup/Vulkan-Samples/tree/main/samples/performance) - 2019 - [Optimising a AAA Vulkan Title on Desktop](https://www.youtube.com/watch?v=Hc_i6X3qU08) - 2020 - [Vulkan Ray Tracing Best Practices for Hybrid Rendering](https://www.khronos.org/blog/vulkan-ray-tracing-best-practices-for-hybrid-rendering) - GDC -
silvesthu revised this gist
Oct 20, 2024 . 1 changed file with 4 additions and 1 deletion.
@@ -20,6 +20,7 @@ - Matt Pettineo [@mynameismjp](https://twitter.com/mynameismjp) - [Blog](https://therealmjp.github.io/) - 2018 - [Breaking Down Barriers](https://therealmjp.github.io/posts/breaking-down-barriers-part-1-whats-a-barrier/) - 2019 - [Half The Precision, Twice The Fun: Working With FP16 In HLSL](https://therealmjp.github.io/posts/shader-fp16/) - 2021 - [The Shader Permutation Problem](https://therealmjp.github.io/posts/shader-permutations-part1/) - 2022 - [GPU Memory Pools in D3D12](https://therealmjp.github.io/posts/gpu-memory-pool/) - Louis Bavoil [@louisbavoil](https://twitter.com/louisbavoil) @@ -207,6 +208,8 @@ - [Kernel Mode Driver by Linux (Nouveau)](https://github.com/torvalds/linux/tree/master/drivers/gpu/drm/nouveau) - [Kernel Mode Driver by Nvidia](https://github.com/NVIDIA/open-gpu-kernel-modules) - [Documentation of NVIDIA chip/hardware interfaces (open-gpu-doc)](https://github.com/nvidia/open-gpu-doc) - Misc - [Pack/Unpack shader utility in donut](https://github.com/NVIDIAGameWorks/donut/blob/main/include/donut/shaders/packing.hlsli) - Apple - [Metal Shading Language Specification](https://developer.apple.com/metal/Metal-Shading-Language-Specification.pdf) - 2016 - [Advanced Metal Shader Optimization](https://developer.apple.com/videos/play/wwdc2016/606/) @@ -231,7 +234,7 @@ - GDC - [Advanced Graphics Summit](https://gdcvault.com/search.php#&category=free&firstfocus=&keyword=advanced+graphics%2Bsummit) Not specifically on optimization - Digital Dragon - [Video](https://www.youtube.com/@DigitalDragonsForGamedev/playlists) Not specifically on optimization - (JP) CEDEC - 2016 -
[GPU最適化入門](https://www.slideshare.net/ssuser2e676d/gpu-65502505) - (Book) [マンガとイラストでわかる! GPU最適化入門](https://www.amazon.co.jp/%E3%83%9E%E3%83%B3%E3%82%AC%E3%81%A8%E3%82%A4%E3%83%A9%E3%82%B9%E3%83%88%E3%81%A7%E3%82%8F%E3%81%8B%E3%82%8B-GPU%E6%9C%80%E9%81%A9%E5%8C%96%E5%85%A5%E9%96%80-%E5%B0%8F%E5%8F%A3-%E8%B2%B4%E5%BC%98/dp/4862465048) -
silvesthu revised this gist
Oct 20, 2024 . 1 changed file with 13 additions and 6 deletions.
@@ -75,7 +75,7 @@ - 2017 - [ADVANCED SHADER PROGRAMMING ON GCN](https://gpuopen.com/wp-content/uploads/2017/03/GDC2017-Advanced-Shader-Programming-On-GCN.pdf) - 2018 - [Engine Optimization Hot Lap](https://slideplayer.com/slide/17173687/) - Robert Menzel [@renderpipeline](https://twitter.com/renderpipeline) - [Blog](https://web.archive.org/web/20220307175030/http://renderingpipeline.com/) - 2012 - [Low-Level GPU Documentation](https://web.archive.org/web/20160305145630/http://renderingpipeline.com/graphics-literature/low-level-gpu-documentation/) - RasterGrid [@rastergrid](https://twitter.com/rastergrid) - [Blog](https://rastergrid.com/blog/) @@ -140,19 +140,20 @@ - Radeon™ ProRender Full Spectrum Rendering 2.0: The Universal Rendering API - 2020 - [CONCURRENCY MODEL IN EXPLICIT GRAPHICS APIS](https://gpuopen.com/wp-content/uploads/2020/06/GPUOpen_Concurrency_vTUM.pdf) - 2021 - [Understanding Graphs in Radeon GPU Profiler and GPUView](https://gpuopen.com/learn/understanding-graphs-in-radeon-gpu-profiler-and-gpuview/) - 2022 - [Compute Shaders @ GIC](https://www.youtube.com/watch?v=eDLilzy2mq0) - 2023 - [Occupancy explained](https://gpuopen.com/learn/occupancy-explained/) - 2024 - [Mesh shaders: optimization and best practices](https://gpuopen.com/learn/mesh_shaders/mesh_shaders-optimization_and_best_practices/) - GCN - 2013 - [GCN3 Instruction Set Architecture](http://developer.amd.com/wordpress/media/2013/12/AMD_GCN3_Instruction_Set_Architecture_rev1.1.pdf) - 2019 - [AMD GCN ISA: a first dive](https://giordi91.github.io/post/vegaisa/) - RDNA - 2019 - [INTRODUCING RDNA
ARCHITECTURE](https://web.archive.org/web/20240306074306/https://www.amd.com/system/files/documents/rdna-whitepaper.pdf) - 2019 - [RDNA Architecture](https://gpuopen.com/wp-content/uploads/2019/08/RDNA_Architecture_public.pdf) - 2020 - ["RDNA 1.0" Instruction Set Architecture](https://developer.amd.com/wp-content/resources/RDNA_Shader_ISA.pdf) - 2020 - ["RDNA 2" Instruction Set Architecture](https://developer.amd.com/wp-content/resources/RDNA2_Shader_ISA_November2020.pdf) - 2022 - ["RDNA3" Instruction Set Architecture](https://www.amd.com/content/dam/amd/en/documents/radeon-tech-docs/instruction-set-architectures/rdna3-shader-instruction-set-architecture-feb-2023_0.pdf) - 2024 - ["RDNA3.5" Instruction Set Architecture](https://www.amd.com/content/dam/amd/en/documents/radeon-tech-docs/instruction-set-architectures/rdna35_instruction_set_architecture.pdf) - [RDNA Performance Guide](https://gpuopen.com/learn/rdna-performance-guide/) - OpenCL - 2013 - [AMD Accelerated Parallel Processing OpenCL Programming Guide](https://web.archive.org/web/20220426031939/http://developer.amd.com/wordpress/media/2013/07/AMD_Accelerated_Parallel_Processing_OpenCL_Programming_Guide-rev-2.7.pdf) - Radeon GPU Analyzer / Radeon Raytracing Analyzer @@ -200,6 +201,7 @@ - 2023 - [Tuning CUDA Applications for NVIDIA Ada GPU Architecture](https://docs.nvidia.com/cuda/ada-tuning-guide/index.html) - CUDA - 2014 - [CUDA Pro Tip: Optimized Filtering with Warp-Aggregated Atomics](https://devblogs.nvidia.com/cuda-pro-tip-optimized-filtering-warp-aggregated-atomics/) - 2017 - [CUDA kernel-level experiments in NVIDIA Nsight](https://docs.nvidia.com/nsight-visual-studio-edition/5.3/Content/Analysis/Report/CudaExperiments/Kernel_Level_Experiments.htm) on Issue Efficiency, Memory Statistics, Pipe Utilization, etc. 
- Driver Stack - [User Mode Driver for Vulkan (NVK) by Mesa](https://docs.mesa3d.org/drivers/nvk.html) - [Kernel Mode Driver by Linux (Nouveau)](https://github.com/torvalds/linux/tree/master/drivers/gpu/drm/nouveau) @@ -228,6 +230,8 @@ - 2020 - [Vulkan Ray Tracing Best Practices for Hybrid Rendering](https://www.khronos.org/blog/vulkan-ray-tracing-best-practices-for-hybrid-rendering) - GDC - [Advanced Graphics Summit](https://gdcvault.com/search.php#&category=free&firstfocus=&keyword=advanced+graphics%2Bsummit) Not specifically on optimization - Digital Dragon - Video [https://www.youtube.com/@DigitalDragonsForGamedev/playlists] Not specifically on optimization - (JP) CEDEC - 2016 - [GPU最適化入門](https://www.slideshare.net/ssuser2e676d/gpu-65502505) - (Book) [マンガとイラストでわかる! GPU最適化入門](https://www.amazon.co.jp/%E3%83%9E%E3%83%B3%E3%82%AC%E3%81%A8%E3%82%A4%E3%83%A9%E3%82%B9%E3%83%88%E3%81%A7%E3%82%8F%E3%81%8B%E3%82%8B-GPU%E6%9C%80%E9%81%A9%E5%8C%96%E5%85%A5%E9%96%80-%E5%B0%8F%E5%8F%A3-%E8%B2%B4%E5%BC%98/dp/4862465048) @@ -236,7 +240,8 @@ - 2009 - [From Shader Code to a Teraflop: How Shader Cores Work](https://web.archive.org/web/20181008131455/http://s09.idav.ucdavis.edu/talks/02_kayvonf_gpuArchTalk09.pdf) - 2020 - [LOW-LEVEL OPTIMIZATIONS IN THE LAST OF US PART II](https://www.naughtydog.com/blog/naughty_dog_at_siggraph_2020) - CMU - 2017 - [Parallel Computer Architecture and Programming](https://web.archive.org/web/20240720165805/http://15418.courses.cs.cmu.edu/spring2017/home) - 2017 - [Parallel Computer Architecture and Programming Tsinghua ver.](https://web.archive.org/web/20240720165807/http://15418.courses.cs.cmu.edu/tsinghua2017/home) - [Video](https://www.youtube.com/@csyonghe/videos) ### Game Graphics Study - [Graphics Studies Compilation](http://www.adriancourreges.com/blog/2020/12/29/graphics-studies-compilation/) @@ -262,14 +267,16 @@ - [PIX](https://devblogs.microsoft.com/pix/introduction/) - [Adding performance instrumentation for PIX 
APIs](https://www.youtube.com/watch?v=ICM56FI97Ts) - [DRED](https://devblogs.microsoft.com/directx/dred/), [D3DDred.js](https://github.com/Microsoft/DirectX-Debugging-Tools) - Nvidia - [NVIDIA Developer Tools](https://developer.nvidia.com/tools-overview) - [NVIDIA Nsight Graphics](https://developer.nvidia.com/nsight-graphics) - [NVIDIA Nsight Aftermath SDK](https://developer.nvidia.com/nsight-aftermath) - AMD - [Radeon Developer Tool Suite](https://gpuopen.com/introducing-radeon-developer-tool-suite/) - [Radeon GPU Analyzer](https://gpuopen.com/rga/) - [Radeon Raytracing Analyzer](https://gpuopen.com/radeon-raytracing-analyzer/) - [Radeon Memory Visualizer](https://gpuopen.com/rmv/) - [Radeon GPU Detective](https://gpuopen.com/radeon-gpu-detective/) Post-mortem analysis of GPU crashes - [Driver Experiments](https://gpuopen.com/learn/rdts-driver-experiments/) Low-level control of the AMD Adrenalin driver - Intel - [Intel Graphics Performance Analyzers](https://software.intel.com/content/www/us/en/develop/tools/graphics-performance-analyzers.html) - Other related tools -
silvesthu revised this gist
Oct 20, 2024 . 1 changed file with 13 additions and 11 deletions.
@@ -205,31 +205,33 @@ - [Kernel Mode Driver by Linux (Nouveau)](https://github.com/torvalds/linux/tree/master/drivers/gpu/drm/nouveau) - [Kernel Mode Driver by Nvidia](https://github.com/NVIDIA/open-gpu-kernel-modules) - [Documentation of NVIDIA chip/hardware interfaces (open-gpu-doc)](https://github.com/nvidia/open-gpu-doc) - Apple - [Metal Shading Language Specification](https://developer.apple.com/metal/Metal-Shading-Language-Specification.pdf) - 2016 - [Advanced Metal Shader Optimization](https://developer.apple.com/videos/play/wwdc2016/606/) - 2023 - [Explore GPU advancements in M3 and A17 Pro](https://developer.apple.com/videos/play/tech-talks/111375) - 2023 - [Learn performance best practices for Metal shaders](https://developer.apple.com/videos/play/tech-talks/111373) - Intel - [# Intel® Game Dev](https://www.intel.com/content/www/us/en/developer/topic-technology/gamedev/overview.html) - 2015 - [Intel® Processor Graphics: Architecture & Programming](https://doc.lagout.org/electronics/Intel-Graphics-Architecture-ISA-and-microarchitecture.pdf) - 2024 - [Intel® Graphics Performance Analyzers User Guide](https://www.intel.com/content/www/us/en/docs/gpa/user-guide/2024-3/overview.html) - Arm - [Introducing the Arm architecture](https://developer.arm.com/documentation/102404/0201) - [Arm GPU Best Practices Developer Guide](https://developer.arm.com/documentation/101897/0301) - [Principles of High Performance](https://developer.arm.com/solutions/graphics/developer-guides/principles-of-high-performance) - [Accelerating 2D Applications](https://developer.arm.com/solutions/graphics/developer-guides/accelerating-2d-applications) - Microsoft -
[DirectX-Specs](https://microsoft.github.io/DirectX-Specs/) - 2019 - [New in D3D12 – background shader optimizations](https://devblogs.microsoft.com/directx/background-shader-optimizations/) - Khronos Group - [Vulkan Samples](https://github.com/KhronosGroup/Vulkan-Samples) - 2019 - [Optimising a AAA Vulkan Title on Desktop](https://www.youtube.com/watch?v=Hc_i6X3qU08) - 2020 - [Vulkan Ray Tracing Best Practices for Hybrid Rendering](https://www.khronos.org/blog/vulkan-ray-tracing-best-practices-for-hybrid-rendering) - GDC - [Advanced Graphics Summit](https://gdcvault.com/search.php#&category=free&firstfocus=&keyword=advanced+graphics%2Bsummit) Not specifically on optimization - (JP) CEDEC - 2016 - [GPU最適化入門](https://www.slideshare.net/ssuser2e676d/gpu-65502505) - (Book) [マンガとイラストでわかる! GPU最適化入門](https://www.amazon.co.jp/%E3%83%9E%E3%83%B3%E3%82%AC%E3%81%A8%E3%82%A4%E3%83%A9%E3%82%B9%E3%83%88%E3%81%A7%E3%82%8F%E3%81%8B%E3%82%8B-GPU%E6%9C%80%E9%81%A9%E5%8C%96%E5%85%A5%E9%96%80-%E5%B0%8F%E5%8F%A3-%E8%B2%B4%E5%BC%98/dp/4862465048) - SIGGRAPH - [Advances in Real-Time Rendering in Games](https://advances.realtimerendering.com/) Not specifically on optimization - 2009 - [From Shader Code to a Teraflop: How Shader Cores Work](https://web.archive.org/web/20181008131455/http://s09.idav.ucdavis.edu/talks/02_kayvonf_gpuArchTalk09.pdf) - 2020 - [LOW-LEVEL OPTIMIZATIONS IN THE LAST OF US PART II](https://www.naughtydog.com/blog/naughty_dog_at_siggraph_2020) -
silvesthu revised this gist
Sep 23, 2024 . 1 changed file with 2 additions and 0 deletions.
@@ -152,6 +152,7 @@ - 2020 - ["RDNA 2" Instruction Set Architecture](https://developer.amd.com/wp-content/resources/RDNA2_Shader_ISA_November2020.pdf) - 2020 - [RDNA2 Performance Guide](https://gpuopen.com/performance/) - 2022 - ["RDNA3" Instruction Set Architecture](https://www.amd.com/content/dam/amd/en/documents/radeon-tech-docs/instruction-set-architectures/rdna3-shader-instruction-set-architecture-feb-2023_0.pdf) - 2024 - ["RDNA3.5" Instruction Set Architecture](https://www.amd.com/content/dam/amd/en/documents/radeon-tech-docs/instruction-set-architectures/rdna35_instruction_set_architecture.pdf) - OpenCL - 2013 - [AMD Accelerated Parallel Processing OpenCL Programming Guide](https://web.archive.org/web/20220426031939/http://developer.amd.com/wordpress/media/2013/07/AMD_Accelerated_Parallel_Processing_OpenCL_Programming_Guide-rev-2.7.pdf) - Radeon GPU Analyzer / Radeon Raytracing Analyzer @@ -215,6 +216,7 @@ - 2019 - [Optimising a AAA Vulkan Title on Desktop](https://www.youtube.com/watch?v=Hc_i6X3qU08) - 2020 - [Vulkan Ray Tracing Best Practices for Hybrid Rendering](https://www.khronos.org/blog/vulkan-ray-tracing-best-practices-for-hybrid-rendering) - Apple - [Metal Shading Language Specification](https://developer.apple.com/metal/Metal-Shading-Language-Specification.pdf) - 2023 - [Explore GPU advancements in M3 and A17 Pro](https://developer.apple.com/videos/play/tech-talks/111375) - 2023 - [Learn performance best practices for Metal shaders](https://developer.apple.com/videos/play/tech-talks/111373) - Arm -
silvesthu revised this gist
May 19, 2024 . 1 changed file with 2 additions and 1 deletion.
@@ -246,6 +246,7 @@ - [GPU Specs Database by techpowerup](https://www.techpowerup.com/gpu-specs/) - [GPU database by Matthäus G. Chajdas](https://db.thegpu.guru/) - [GPUInfo by Sascha Willems](https://www.gpuinfo.org/) For Vulkan, OpenGL, OpenGL ES - [D3d12infoDB by Dmytro Bulatov](https://d3d12infodb.boolka.dev/index.html) Database based on D3d12info in Tools section below - (JP) [GPU Spec Database by HYPERでんち](https://dench.flatlib.jp/start#hardware) ### Tools @@ -273,4 +274,4 @@ - [PerfTest](https://github.com/sebbbi/perftest) A simple GPU shader memory operation performance test tool. Results on a wide range of GPUs are already available - [D3d12info by Adam Sawicki](https://github.com/sawickiap/D3d12info) Get GPU information through DXGI and Direct3D 12 (D3D12) + AMD AGS, NVAPI, WinAPI, and some other sources Thanks JoseEmilio-ARM for ARM part. -
silvesthu revised this gist
May 12, 2024 . 1 changed file with 18 additions and 8 deletions.
@@ -21,6 +21,7 @@ - [Blog](https://therealmjp.github.io/) - 2018 - [Breaking Down Barriers](https://therealmjp.github.io/posts/breaking-down-barriers-part-1-whats-a-barrier/) - 2021 - [The Shader Permutation Problem](https://therealmjp.github.io/posts/shader-permutations-part1/) - 2022 - [GPU Memory Pools in D3D12](https://therealmjp.github.io/posts/gpu-memory-pool/) - Louis Bavoil [@louisbavoil](https://twitter.com/louisbavoil) - [D3D11 Vendor Hacks](https://docs.google.com/spreadsheets/d/1J_HIRVlYK8iI4u6AJrCeb66L5W36UDkd9ExSCku9s_o/edit#gid=0) - 2018 - [The Peak-Performance-Percentage Analysis Method for Optimizing Any GPU Workload](https://devblogs.nvidia.com/the-peak-performance-analysis-method-for-optimizing-any-gpu-workload/) @@ -34,7 +35,7 @@ - 2018 - [Understanding GPU context rolls](https://gpuopen.com/understanding-gpu-context-rolls/) - Michal Drobot [@michaldrobot](https://twitter.com/michaldrobot) - [Blog](https://michaldrobot.com/) - 2014 - [Low Level Optimizations for GCN – Digital Dragons 2014](https://michaldrobot.com/2014/05/12/low-level-optimizations-for-gcn-digital-dragons-2014-slides/) - [Video](https://www.youtube.com/watch?v=Bmy3Tt3Ottc) - 2014 - [GCN Execution Patterns in Full Screen Passes](https://michaldrobot.com/2014/04/01/gcn-execution-patterns-in-full-screen-passes/) - 2014 - [ShaderFastLibs](https://github.com/michaldrobot/ShaderFastLibs) - Kostas Anagnostou [@KostasAAA](https://twitter.com/KostasAAA) @@ -151,13 +152,6 @@ - 2020 - ["RDNA 2" Instruction Set Architecture](https://developer.amd.com/wp-content/resources/RDNA2_Shader_ISA_November2020.pdf) - 2020 - [RDNA2 Performance Guide](https://gpuopen.com/performance/) - 2022
- ["RDNA3" Instruction Set Architecture](https://www.amd.com/content/dam/amd/en/documents/radeon-tech-docs/instruction-set-architectures/rdna3-shader-instruction-set-architecture-feb-2023_0.pdf) - OpenCL - 2013 - [AMD Accelerated Parallel Processing OpenCL Programming Guide](https://web.archive.org/web/20220426031939/http://developer.amd.com/wordpress/media/2013/07/AMD_Accelerated_Parallel_Processing_OpenCL_Programming_Guide-rev-2.7.pdf) - Radeon GPU Analyzer / Radeon Raytracing Analyzer @@ -166,6 +160,14 @@ - 2019 - [USING RADEON™ GPU ANALYZER WITH DIRECT3D®12 COMPUTE](https://gpuopen.com/learn/radeon-gpu-analyzer-2-2-direct3d12-compute/) - 2022 - [Visualizing VGPR Pressure with Radeon™ GPU Analyzer 2.6](https://gpuopen.com/learn/visualizing-vgpr-pressure-with-rga-2-6/) - 2022 - [Improving raytracing performance with the Radeon™ Raytracing Analyzer (RRA)](https://gpuopen.com/learn/improving-rt-perf-with-rra/) - Driver Stack - [User Mode Driver for Vulkan (AMDVLK) by AMD](https://github.com/GPUOpen-Drivers/AMDVLK) - [Vulkan API Layer (XGL)](https://github.com/GPUOpen-Drivers/xgl) - [LLVM-Based Pipeline Compiler (LLPC)](https://github.com/GPUOpen-Drivers/llpc#llvm-based-pipeline-compiler-llpc) - [Platform Abstraction Library (PAL)](https://github.com/GPUOpen-Drivers/pal) - [User Mode Driver for Vulkan (RADV) by Mesa](https://docs.mesa3d.org/drivers/radv.html) - [Kernel Mode Driver by Linux, libdrm_amdgpu](https://github.com/torvalds/linux/tree/master/drivers/gpu/drm/amd) - [Micro engine scheduler (MES) firmware](https://gpuopen.com/download/documentation/micro_engine_scheduler.pdf) - Nvidia - [Developer Blog](https://developer.nvidia.com/blog) and Talks - 2012 - [GPU Performance Analysis and Optimization](http://on-demand.gputechconf.com/gtc/2012/presentations/S0514-GTC2012-GPU-Performance-Analysis.pdf) @@ -197,6 +199,11 @@ - 2023 - [Tuning CUDA Applications for NVIDIA Ada GPU Architecture](https://docs.nvidia.com/cuda/ada-tuning-guide/index.html) - CUDA - 2014 - 
[CUDA Pro Tip: Optimized Filtering with Warp-Aggregated Atomics](https://devblogs.nvidia.com/cuda-pro-tip-optimized-filtering-warp-aggregated-atomics/) - Driver Stack - [User Mode Driver for Vulkan (NVK) by Mesa](https://docs.mesa3d.org/drivers/nvk.html) - [Kernel Mode Driver by Linux (Nouveau)](https://github.com/torvalds/linux/tree/master/drivers/gpu/drm/nouveau) - [Kernel Mode Driver by Nvidia](https://github.com/NVIDIA/open-gpu-kernel-modules) - [Documentation of NVIDIA chip/hardware interfaces (open-gpu-doc)](https://github.com/nvidia/open-gpu-doc) - Intel - [Gamedev](https://software.intel.com/en-us/gamedev) - [Intel® Processor Graphics: Architecture & Programming](https://doc.lagout.org/electronics/Intel-Graphics-Architecture-ISA-and-microarchitecture.pdf) @@ -207,6 +214,9 @@ - [Vulkan Samples](https://github.com/KhronosGroup/Vulkan-Samples) - 2019 - [Optimising a AAA Vulkan Title on Desktop](https://www.youtube.com/watch?v=Hc_i6X3qU08) - 2020 - [Vulkan Ray Tracing Best Practices for Hybrid Rendering](https://www.khronos.org/blog/vulkan-ray-tracing-best-practices-for-hybrid-rendering) - Apple - 2023 - [Explore GPU advancements in M3 and A17 Pro](https://developer.apple.com/videos/play/tech-talks/111375) - 2023 - [Learn performance best practices for Metal shaders](https://developer.apple.com/videos/play/tech-talks/111373) - Arm - [Introducing the Arm architecture](https://developer.arm.com/documentation/102404/0201) - [Arm GPU Best Practices Developer Guide](https://developer.arm.com/documentation/101897/0301) -
silvesthu revised this gist
May 11, 2024 . 1 changed file with 8 additions and 5 deletions.
@@ -117,7 +117,7 @@ - AMD - [GPU Open](https://gpuopen.com/) and Talks - [Events Presentations](https://gpuopen.com/events/) - [AMD GPU architecture programming documentation (Instruction Set Architecture)](https://gpuopen.com/amd-gpu-architecture-programming-documentation/) - 2014 - [Vertex Shader Tricks](https://www.slideshare.net/DevCentralAMD/vertex-shader-tricks-bill-bilodeau) - 2016 - [Leveraging asynchronous queues for concurrent execution](https://gpuopen.com/concurrent-execution-asynchronous-queues/) - 2016 - [AMD GCN Assembly: Cross-Lane Operations](https://gpuopen.com/learn/amd-gcn-assembly-cross-lane-operations/) @@ -151,10 +151,13 @@ - 2020 - ["RDNA 2" Instruction Set Architecture](https://developer.amd.com/wp-content/resources/RDNA2_Shader_ISA_November2020.pdf) - 2020 - [RDNA2 Performance Guide](https://gpuopen.com/performance/) - 2022 - ["RDNA3" Instruction Set Architecture](https://www.amd.com/content/dam/amd/en/documents/radeon-tech-docs/instruction-set-architectures/rdna3-shader-instruction-set-architecture-feb-2023_0.pdf) - From application (CPU) to hardware (GPU) - [AMD Open Source Driver for Vulkan](https://github.com/GPUOpen-Drivers/AMDVLK) User Mode Driver (Vulkan), amdvlk64 - [Vulkan API Layer (XGL)](https://github.com/GPUOpen-Drivers/xgl) - [LLVM-Based Pipeline Compiler (LLPC)](https://github.com/GPUOpen-Drivers/llpc#llvm-based-pipeline-compiler-llpc) - [Platform Abstraction Library (PAL)](https://github.com/GPUOpen-Drivers/pal) - [linux/drivers/gpu/drm/amd/](https://github.com/torvalds/linux/tree/master/drivers/gpu/drm/amd) Kernel Mode Driver (Linux), libdrm_amdgpu - [Micro engine scheduler (MES)
firmware](https://gpuopen.com/download/documentation/micro_engine_scheduler.pdf) - OpenCL - 2013 - [AMD Accelerated Parallel Processing OpenCL Programming Guide](https://web.archive.org/web/20220426031939/http://developer.amd.com/wordpress/media/2013/07/AMD_Accelerated_Parallel_Processing_OpenCL_Programming_Guide-rev-2.7.pdf) - Radeon GPU Analyzer / Radeon Raytracing Analyzer -
silvesthu revised this gist
Jan 19, 2024 . 1 changed file with 15 additions and 5 deletions.
@@ -22,13 +22,13 @@ - 2018 - [Breaking Down Barriers](https://therealmjp.github.io/posts/breaking-down-barriers-part-1-whats-a-barrier/) - 2021 - [The Shader Permutation Problem](https://therealmjp.github.io/posts/shader-permutations-part1/) - Louis Bavoil [@louisbavoil](https://twitter.com/louisbavoil) - [D3D11 Vendor Hacks](https://docs.google.com/spreadsheets/d/1J_HIRVlYK8iI4u6AJrCeb66L5W36UDkd9ExSCku9s_o/edit#gid=0) - 2018 - [The Peak-Performance-Percentage Analysis Method for Optimizing Any GPU Workload](https://devblogs.nvidia.com/the-peak-performance-analysis-method-for-optimizing-any-gpu-workload/) - 2018 - [Fixing the Hyperdrive: Maximizing Rendering Performance on NVIDIA GPUs (Presented by NVIDIA)](https://www.gdcvault.com/play/1024810/Fixing-the-Hyperdrive-Maximizing-Rendering) - 2019 - [Optimizing DX12/DXR GPU Workloads using Nsight Graphics: GPU Trace and the Peak-Performance-Percentage (P3) Method (Presented by NVIDIA)](https://www.gdcvault.com/browse/gdc-19/play/1026202) - 2020 - [Optimizing Compute Shaders for L2 Locality using Thread-Group ID Swizzling](https://developer.nvidia.com/blog/optimizing-compute-shaders-for-l2-locality-using-thread-group-id-swizzling/) - 2021 - Dana Elifaz - [The Next Level of Optimization Advice with Nsight Graphics: GPU Trace](https://www.nvidia.com/en-us/on-demand/session/gtcspring21-e32683/) - 2022 - [(GDC Paywall) Optimizing Ray Tracing GPU Workloads using Nsight Graphics: GPU Trace and Nsight Systems](https://www.gdcvault.com/search.php#&category=free&firstfocus=&keyword=Optimizing+Ray%2BTracing%2BGPU%2BWorkloads%2Busing%2BNsight%2BGraphics) - Rys Sommefeldt [@ryszu](https://twitter.com/ryszu) -
[Blog](https://rys.sommefeldt.com/) - 2018 - [Understanding GPU context rolls](https://gpuopen.com/understanding-gpu-context-rolls/) @@ -56,7 +56,6 @@ - Maurizio Cerrato [@speedwago](https://twitter.com/speedwago) - 2019 - [GPU Architectures](https://drive.google.com/file/d/12ahbqGXNfY3V-1Gj5cvne2AH4BFWZHGD/view) - Sebastian Aaltonen [@SebAaltonen](https://twitter.com/SebAaltonen) - 2017 - [Optimizing GPU occupancy and resource usage with large thread groups](https://gpuopen.com/learn/optimizing-gpu-occupancy-resource-usage-large-thread-groups/) - 2018 - [DD2018: Sebastian Aaltonen - GPU based clay simulation and ray tracing tech in Claybook](https://www.youtube.com/watch?v=Xpf7Ua3UqOA) - 2018 - [This is how I managed to port Claybook from consoles to ~4x slower handheld](https://threadreaderapp.com/thread/1076765876148490240.html) @@ -70,7 +69,7 @@ - [Blog](https://fgiesen.wordpress.com/) - 2010 - [Finish your derivations, please](https://fgiesen.wordpress.com/2010/10/21/finish-your-derivations-please/) - 2011 - [A trip through the Graphics Pipeline 2011](https://fgiesen.wordpress.com/2011/07/09/a-trip-through-the-graphics-pipeline-2011-index/) - Timothy Lottes [@NOTimothyLottes](https://twitter.com/NOTimothyLottes) - 2016 - [Understanding Memory Coalescing on GCN](https://gpuopen.com/gcn-memory-coalescing/) - 2017 - [ADVANCED SHADER PROGRAMMING ON GCN](https://gpuopen.com/wp-content/uploads/2017/03/GDC2017-Advanced-Shader-Programming-On-GCN.pdf) - 2018 - [Engine Optimization Hot Lap](https://slideplayer.com/slide/17173687/) @@ -106,6 +105,12 @@ - Jendrik Illner [@jendrikillner](https://twitter.com/jendrikillner) - [Blog](https://www.jendrikillner.com/) - [Graphics Programming Weekly Article Database](https://www.jendrikillner.com/article_database/) Not specifically on optimization. Have a search bar. 
- Hans-Kristian [@Themaister](https://twitter.com/Themaister) - [Blog](https://themaister.net/blog/) - 2024 - [Modernizing Granite’s mesh rendering](https://themaister.net/blog/2024/01/17/modernizing-granites-mesh-rendering/) - Graham Wihlidal [@gwihlidal](https://twitter.com/gwihlidal) - [Blog](https://www.wihlidal.com/blog/) - 2016 - [Optimizing the Graphics Pipeline With Compute](https://www.gdcvault.com/play/1023109/Optimizing-the-Graphics-Pipeline-With) ### By Organization @@ -206,7 +211,6 @@ - [Accelerating 2D Applications](https://developer.arm.com/solutions/graphics/developer-guides/accelerating-2d-applications) - GDC - [Advanced Graphics Summit](https://gdcvault.com/search.php#&category=free&firstfocus=&keyword=advanced+graphics%2Bsummit) Not specifically on optimization - (JP) CEDEC - 2016 - [GPU最適化入門](https://www.slideshare.net/ssuser2e676d/gpu-65502505) - (Book) [マンガとイラストでわかる! GPU最適化入門](https://www.amazon.co.jp/%E3%83%9E%E3%83%B3%E3%82%AC%E3%81%A8%E3%82%A4%E3%83%A9%E3%82%B9%E3%83%88%E3%81%A7%E3%82%8F%E3%81%8B%E3%82%8B-GPU%E6%9C%80%E9%81%A9%E5%8C%96%E5%85%A5%E9%96%80-%E5%B0%8F%E5%8F%A3-%E8%B2%B4%E5%BC%98/dp/4862465048) @@ -215,11 +219,16 @@ - 2009 - [From Shader Code to a Teraflop: How Shader Cores Work](https://web.archive.org/web/20181008131455/http://s09.idav.ucdavis.edu/talks/02_kayvonf_gpuArchTalk09.pdf) - 2020 - [LOW-LEVEL OPTIMIZATIONS IN THE LAST OF US PART II](https://www.naughtydog.com/blog/naughty_dog_at_siggraph_2020) - CMU - 2017 - [Parallel Computer Architecture and Programming](http://15418.courses.cs.cmu.edu/spring2017/home), [Tsinghua ver. 
with video](http://15418.courses.cs.cmu.edu/tsinghua2017/home) ### Game Graphics Study - [Graphics Studies Compilation](http://www.adriancourreges.com/blog/2020/12/29/graphics-studies-compilation/) ### GPU Crash Debugging - 2018 - [Aftermath: Advances in GPU Crash Debugging](https://www.youtube.com/watch?v=VaGcs5-W6S4) - 2020 - (JP) [Device Removal の処方箋](https://cedil.cesa.or.jp/cedil_sessions/view/2258), [補足資料](https://shikihuiku.github.io/post/cedec2020_prescriptions_for_deviceremoval/) - 2023 - [GPU Crash Debugging in Unreal Engine: Tools, Techniques, and Best Practices | Unreal Fest 2023](https://www.youtube.com/watch?v=CyrGLMmVUAI) ### GPU Database - [GPU Specs Database by techpowerup](https://www.techpowerup.com/gpu-specs/) - [GPU database by Matthäus G. Chajdas](https://db.thegpu.guru/) @@ -234,6 +243,7 @@ - [GPUView](https://docs.microsoft.com/en-us/windows-hardware/drivers/display/using-gpuview) - [PIX](https://devblogs.microsoft.com/pix/introduction/) - [Adding performance instrumentation for PIX APIs](https://www.youtube.com/watch?v=ICM56FI97Ts) - [DRED](https://devblogs.microsoft.com/directx/dred/), [D3DDred.js](https://github.com/Microsoft/DirectX-Debugging-Tools) - Nvidia - [NVIDIA Developer Tools](https://developer.nvidia.com/tools-overview) - [NVIDIA Nsight Graphics](https://developer.nvidia.com/nsight-graphics) - [NVIDIA Nsight Aftermath SDK](https://developer.nvidia.com/nsight-aftermath) -
silvesthu revised this gist
Jan 17, 2024 . 1 changed file with 1 addition and 0 deletions.
@@ -135,6 +135,7 @@
- 2020 - [CONCURRENCY MODEL IN EXPLICIT GRAPHICS APIS](https://gpuopen.com/wp-content/uploads/2020/06/GPUOpen_Concurrency_vTUM.pdf)
- 2021 - [Understanding Graphs in Radeon GPU Profiler and GPUView](https://gpuopen.com/learn/understanding-graphs-in-radeon-gpu-profiler-and-gpuview/)
- 2023 - [Occupancy explained](https://gpuopen.com/learn/occupancy-explained/)
- 2024 - [Mesh shaders: optimization and best practices](https://gpuopen.com/learn/mesh_shaders/mesh_shaders-optimization_and_best_practices/)
- GCN
  - 2013 - [GCN3 Instruction Set Architecture](http://developer.amd.com/wordpress/media/2013/12/AMD_GCN3_Instruction_Set_Architecture_rev1.1.pdf)
  - 2019 - [AMD GCN ISA: a first dive](https://giordi91.github.io/post/vegaisa/)
-
silvesthu revised this gist
Jan 14, 2024 . 1 changed file with 2 additions and 3 deletions.
@@ -138,17 +138,16 @@
- GCN
  - 2013 - [GCN3 Instruction Set Architecture](http://developer.amd.com/wordpress/media/2013/12/AMD_GCN3_Instruction_Set_Architecture_rev1.1.pdf)
  - 2019 - [AMD GCN ISA: a first dive](https://giordi91.github.io/post/vegaisa/)
- RDNA
  - 2019 - [INTRODUCING RDNA ARCHITECTURE](https://www.amd.com/system/files/documents/rdna-whitepaper.pdf)
  - 2019 - [RDNA Architecture](https://gpuopen.com/wp-content/uploads/2019/08/RDNA_Architecture_public.pdf)
  - 2020 - ["RDNA 1.0" Instruction Set Architecture](https://developer.amd.com/wp-content/resources/RDNA_Shader_ISA.pdf)
  - 2020 - ["RDNA 2" Instruction Set Architecture](https://developer.amd.com/wp-content/resources/RDNA2_Shader_ISA_November2020.pdf)
  - 2020 - [RDNA2 Performance Guide](https://gpuopen.com/performance/)
  - 2022 - ["RDNA3" Instruction Set Architecture](https://www.amd.com/content/dam/amd/en/documents/radeon-tech-docs/instruction-set-architectures/rdna3-shader-instruction-set-architecture-feb-2023_0.pdf)
- Github
  - [Platform Abstraction Library (PAL)](https://github.com/GPUOpen-Drivers/pal)
  - [GPUOpen-Drivers/pal on Github](https://github.com/GPUOpen-Drivers/pal)
  - [AMD Open Source Driver for Vulkan](https://github.com/GPUOpen-Drivers/AMDVLK)
- OpenCL
  - 2013 - [AMD Accelerated Parallel Processing OpenCL Programming Guide](https://web.archive.org/web/20220426031939/http://developer.amd.com/wordpress/media/2013/07/AMD_Accelerated_Parallel_Processing_OpenCL_Programming_Guide-rev-2.7.pdf)
-
silvesthu revised this gist
Jan 13, 2024 . 1 changed file with 1 addition and 0 deletions.
@@ -68,6 +68,7 @@
  - 2014 - [Real-time Rendering Blogs](http://svenandersson.se/2014/realtime-rendering-blogs.html)
- Fabian Giesen [@rygorous](https://twitter.com/rygorous)
  - [Blog](https://fgiesen.wordpress.com/)
  - 2010 - [Finish your derivations, please](https://fgiesen.wordpress.com/2010/10/21/finish-your-derivations-please/)
  - 2011 - [A trip through the Graphics Pipeline 2011](https://fgiesen.wordpress.com/2011/07/09/a-trip-through-the-graphics-pipeline-2011-index/)
- Timothy Lottes
  - 2016 - [Understanding Memory Coalescing on GCN](https://gpuopen.com/gcn-memory-coalescing/)
-
silvesthu revised this gist
Jan 13, 2024 . 1 changed file with 5 additions and 3 deletions.
@@ -243,8 +243,10 @@
  - [Radeon GPU Detective](https://gpuopen.com/radeon-gpu-detective/)
- Intel
  - [Intel Graphics Performance Analyzers](https://software.intel.com/content/www/us/en/develop/tools/graphics-performance-analyzers.html)
- Other related tools
  - [RenderDoc](https://renderdoc.org/) Graphics debugger that allows quick and easy single-frame capture and detailed introspection
  - [APITrace](https://apitrace.github.io/) Trace OpenGL, Direct3D, and DirectDraw API calls to a file and replay them
  - [PerfTest](https://github.com/sebbbi/perftest) A simple GPU shader memory operation performance test tool. Results on a wide range of GPUs are already available
  - [D3d12info by Adam Sawicki](https://github.com/sawickiap/D3d12info) Get GPU information through DXGI and Direct3D 12 (D3D12) + AMD AGS, NVAPI, WinAPI, and some other sources

Thanks to JoseEmilio-ARM for the ARM part.
-
silvesthu revised this gist
Jan 12, 2024 . 1 changed file with 3 additions and 0 deletions.
@@ -102,6 +102,9 @@
- Anton Schreiner [@antonschrein](https://twitter.com/antonschrein)
  - [Blog](https://aschrein.github.io/)
  - 2019 - [What's up with my branch on GPU?](https://aschrein.github.io/jekyll/update/2019/06/13/whatsup-with-my-branches-on-gpu.html)
- Jendrik Illner [@jendrikillner](https://twitter.com/jendrikillner)
  - [Blog](https://www.jendrikillner.com/)
  - [Graphics Programming Weekly Article Database](https://www.jendrikillner.com/article_database/) Not specifically on optimization. Has a search bar.

### By Organization
-
silvesthu revised this gist
Jan 11, 2024 . 1 changed file with 8 additions and 8 deletions.
@@ -51,7 +51,6 @@
  - 2018 - [Introduction to compute shaders](https://anteru.net/blog/2018/intro-to-compute-shaders/)
  - 2018 - [More compute shaders](https://anteru.net/blog/2018/more-compute-shaders/)
  - 2018 - [Even more compute shaders](https://anteru.net/blog/2018/even-more-compute-shaders/)
- Matthijs De Smedt [@anji_nl](https://twitter.com/anji_nl)
  - 2016 - [PC GPU Performance Hot Spots](https://developer.nvidia.com/pc-gpu-performance-hot-spots)
- Maurizio Cerrato [@speedwago](https://twitter.com/speedwago)
@@ -61,7 +60,6 @@
  - 2017 - [Optimizing GPU occupancy and resource usage with large thread groups](https://gpuopen.com/learn/optimizing-gpu-occupancy-resource-usage-large-thread-groups/)
  - 2018 - [DD2018: Sebastian Aaltonen - GPU based clay simulation and ray tracing tech in Claybook](https://www.youtube.com/watch?v=Xpf7Ua3UqOA)
  - 2018 - [This is how I managed to port Claybook from consoles to ~4x slower handheld](https://threadreaderapp.com/thread/1076765876148490240.html)
- Layla Mah [@MissQuickstep](https://twitter.com/missquickstep)
  - 2013 - [The AMD GCN Architecture - A Crash Course](https://www.slideshare.net/DevCentralAMD/gs4106-the-amd-gcn-architecture-a-crash-course-by-layla-mah)
  - 2013 - [Powering the Next Generation of Graphics: The AMD GCN Architecture](https://www.gdcvault.com/play/1019294/Powering-the-Next-Generation-of)
@@ -203,13 +201,13 @@
  - [Principles of High Performance](https://developer.arm.com/solutions/graphics/developer-guides/principles-of-high-performance)
  - [Accelerating 2D Applications](https://developer.arm.com/solutions/graphics/developer-guides/accelerating-2d-applications)
- GDC
  - [Advanced Graphics Summit](https://gdcvault.com/search.php#&category=free&firstfocus=&keyword=advanced+graphics%2Bsummit) Not specifically on optimization
  - 2016 - [Optimizing the Graphics Pipeline With Compute](https://www.gdcvault.com/play/1023109/Optimizing-the-Graphics-Pipeline-With)
- (JP) CEDEC
  - 2016 - [GPU最適化入門](https://www.slideshare.net/ssuser2e676d/gpu-65502505)
  - (Book) [マンガとイラストでわかる! GPU最適化入門](https://www.amazon.co.jp/%E3%83%9E%E3%83%B3%E3%82%AC%E3%81%A8%E3%82%A4%E3%83%A9%E3%82%B9%E3%83%88%E3%81%A7%E3%82%8F%E3%81%8B%E3%82%8B-GPU%E6%9C%80%E9%81%A9%E5%8C%96%E5%85%A5%E9%96%80-%E5%B0%8F%E5%8F%A3-%E8%B2%B4%E5%BC%98/dp/4862465048)
- Siggraph
  - [Advances in Real-Time Rendering in Games](https://advances.realtimerendering.com/) Not specifically on optimization
  - 2009 - [From Shader Code to a Teraflop: How Shader Cores Work](https://web.archive.org/web/20181008131455/http://s09.idav.ucdavis.edu/talks/02_kayvonf_gpuArchTalk09.pdf)
  - 2020 - [LOW-LEVEL OPTIMIZATIONS IN THE LAST OF US PART II](https://www.naughtydog.com/blog/naughty_dog_at_siggraph_2020)
- CMU
@@ -220,13 +218,14 @@

### GPU Database
- [GPU Specs Database by techpowerup](https://www.techpowerup.com/gpu-specs/)
- [GPU database by Matthäus G. Chajdas](https://db.thegpu.guru/)
- [GPUInfo by Sascha Willems](https://www.gpuinfo.org/) For Vulkan, OpenGL, OpenGL ES
- (JP) [GPU Spec Database by HYPERでんち](https://dench.flatlib.jp/start#hardware)

### Tools
- Online Shader Compiler
  - [Compiler Explorer (godbolt)](https://godbolt.org/) Supports DXC, AMD RGA
  - [Shader Playground](http://shader-playground.timjones.io/) Supports DXC, FXC, glslang, hlsl2glsl, hlslparser, IntelShaderAnalyzer, AMD RGA, slang, XShaderCompiler
- Microsoft
  - [GPUView](https://docs.microsoft.com/en-us/windows-hardware/drivers/display/using-gpuview)
  - [PIX](https://devblogs.microsoft.com/pix/introduction/)
@@ -242,6 +241,7 @@
- Intel
  - [Intel Graphics Performance Analyzers](https://software.intel.com/content/www/us/en/develop/tools/graphics-performance-analyzers.html)
- Utility
  - [PerfTest](https://github.com/sebbbi/perftest) A simple GPU shader memory operation performance test tool
  - [D3d12info by Adam Sawicki](https://github.com/sawickiap/D3d12info) Get GPU information through DXGI and Direct3D 12 (D3D12) + AMD AGS, NVAPI, WinAPI, and some other sources

Thanks to JoseEmilio-ARM for the ARM part.
-
silvesthu revised this gist
Jan 11, 2024 . 1 changed file with 5 additions and 1 deletion.
@@ -45,6 +45,7 @@
  - 2020 - [WHAT IS SHADER OCCUPANCY AND WHY DO WE CARE ABOUT IT?](https://interplayoflight.wordpress.com/2020/11/11/what-is-shader-occupancy-and-why-do-we-care-about-it/)
  - 2020 - [TO Z-PREPASS OR NOT TO Z-PREPASS](https://interplayoflight.wordpress.com/2020/12/21/to-z-prepass-or-not-to-z-prepass/)
  - 2022 - [SHADER TIPS AND TRICKS](https://interplayoflight.wordpress.com/2022/01/22/shader-tips-and-tricks/)
  - 2023 - [LOW-LEVEL THINKING IN HIGH-LEVEL SHADING LANGUAGES 2023](https://interplayoflight.wordpress.com/2023/12/29/low-level-thinking-in-high-level-shading-languages-2023/)
- Matthäus G. Chajdas [@NIV_Anteru](https://twitter.com/niv_anteru)
  - [Blog](https://anteru.net/)
  - 2018 - [Introduction to compute shaders](https://anteru.net/blog/2018/intro-to-compute-shaders/)
@@ -148,7 +149,7 @@
  - [Platform Abstraction Library (PAL)](https://github.com/GPUOpen-Drivers/pal)
  - [AMD Open Source Driver for Vulkan](https://github.com/GPUOpen-Drivers/AMDVLK)
- OpenCL
  - 2013 - [AMD Accelerated Parallel Processing OpenCL Programming Guide](https://web.archive.org/web/20220426031939/http://developer.amd.com/wordpress/media/2013/07/AMD_Accelerated_Parallel_Processing_OpenCL_Programming_Guide-rev-2.7.pdf)
- Radeon GPU Analyzer / Radeon Raytracing Analyzer
  - 2017 - [Live VGPR Analysis with Radeon™ GPU Analyzer](https://gpuopen.com/learn/live-vgpr-analysis-radeon-gpu-analyzer/)
  - 2019 - [USING RADEON™ GPU ANALYZER WITH DIRECTX®12 GRAPHICS](https://gpuopen.com/learn/radeon-gpu-analyzer-2-3-direct3d-12-graphics/)
@@ -229,6 +230,7 @@
- Microsoft
  - [GPUView](https://docs.microsoft.com/en-us/windows-hardware/drivers/display/using-gpuview)
  - [PIX](https://devblogs.microsoft.com/pix/introduction/)
  - [Adding performance instrumentation for PIX APIs](https://www.youtube.com/watch?v=ICM56FI97Ts)
- Nvidia
  - [NVIDIA Developer Tools](https://developer.nvidia.com/tools-overview)
  - [NVIDIA Nsight Graphics](https://developer.nvidia.com/nsight-graphics)
  - [NVIDIA Nsight Aftermath SDK](https://developer.nvidia.com/nsight-aftermath)
@@ -239,5 +241,7 @@
  - [Radeon GPU Detective](https://gpuopen.com/radeon-gpu-detective/)
- Intel
  - [Intel Graphics Performance Analyzers](https://software.intel.com/content/www/us/en/develop/tools/graphics-performance-analyzers.html)
- Utility
  - [D3d12info by Adam Sawicki](https://github.com/sawickiap/D3d12info), get GPU information through DXGI and Direct3D 12 (D3D12) + AMD AGS, NVAPI, WinAPI, and some other sources

Thanks to JoseEmilio-ARM for the ARM part.