@wuyakuma
wuyakuma / InfiniteGrid.shader
Created October 13, 2023 01:22 — forked from bgolus/InfiniteGrid.shader
Infinite Grid shader with procedural grid with configurable divisions and major and minor lines markings.
Shader "Unlit/InfiniteGrid"
{
Properties
{
[Toggle] _WorldUV ("Use World Space UV", Float) = 1.0
_GridScale ("Grid Scale", Float) = 1.0
_GridBias ("Grid Bias", Float) = 0.5
_GridDiv ("Grid Divisions", Float) = 10.0
_BaseColor ("Base Color", Color) = (0,0,0,1)
_LineColor ("Line Color", Color) = (1,1,1,1)
wuyakuma / Simulation_Projection.md
Created April 12, 2022 13:39 — forked from vassvik/Simulation_Projection.md
Realtime Fluid Simulation: Projection

The core of most real-time fluid simulators, like the one in EmberGen, is based on the "Stable Fluids" algorithm by Jos Stam, which to my knowledge was first presented at SIGGRAPH '99. This is a post about one part of this algorithm that's often underestimated: projection.

Stable Fluids

The Stable Fluids algorithm solves a subset of the famous Navier-Stokes equations, which describe how fluids move and interact. In particular, it typically solves what are called the "incompressible Euler equations", in which viscous forces are ignored.
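
For reference, the incompressible Euler equations are commonly written as below. The symbols are notation assumed here rather than taken from the original post: velocity $\mathbf{u}$, pressure $p$, constant density $\rho$, and external force per unit mass $\mathbf{f}$.

```latex
\begin{aligned}
\frac{\partial \mathbf{u}}{\partial t} + (\mathbf{u} \cdot \nabla)\,\mathbf{u} &= -\frac{1}{\rho}\,\nabla p + \mathbf{f}, \\
\nabla \cdot \mathbf{u} &= 0.
\end{aligned}
```

The second line is the incompressibility constraint, and projection is the step that enforces it.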

wuyakuma / FastUniformLoadWithWaveOps.txt
Created November 26, 2018 07:25 — forked from sebbbi/FastUniformLoadWithWaveOps.txt
Fast uniform load with wave ops (up to 64x speedup)
In shader programming, you often need every pixel in a compute shader group (tile) to iterate over the same array in memory.
Tiled deferred lighting is the most common case: each 8x8 tile loops over a light list culled for that tile.
Simplified HLSL code looks like this:
Buffer<float4> lightDatas;
Texture2D<uint2> lightStartCounts;
RWTexture2D<float4> output;
[numthreads(8, 8, 1)]
vec2 DFG_Cloth(float roughness, float NoV) {
const vec4 c0 = vec4(0.24, 0.93, 0.01, 0.20);
const vec4 c1 = vec4(2.00, -1.30, 0.40, 0.03);
float s = 1.0 - NoV;
float e = s - c0.y;
float g = c0.x * exp2(-(e * e) / (2.0 * c0.z)) + s * c0.w;
float n = roughness * c1.x + c1.y;
float r = max(1.0 - n * n, c1.z) * g;
wuyakuma / hash_fnv1a.h
Created November 6, 2017 05:35 — forked from ruby0x1/hash_fnv1a.h
FNV1a c++11 constexpr compile time hash functions, 32 and 64 bit
#pragma once
#include <stdint.h>
//fnv1a 32 and 64 bit hash functions
// key is the data to hash, len is the size of the data (or how much of it to hash against)
// code license: public domain or equivalent
// post: https://notes.underscorediscovery.com/constexpr-fnv1a/
inline const uint32_t hash_32_fnv1a(const void* key, const uint32_t len) {
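
The preview above is cut off before the function body. As a rough illustration only, a 32-bit FNV-1a hash over a byte buffer can be sketched as follows, assuming the standard published FNV constants (offset basis 0x811c9dc5, prime 0x01000193). The name `fnv1a_32_sketch` is hypothetical and this is not the gist's own implementation:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not the gist's code): 32-bit FNV-1a over `len` bytes of `key`.
inline uint32_t fnv1a_32_sketch(const void* key, std::size_t len) {
    const unsigned char* data = static_cast<const unsigned char*>(key);
    uint32_t hash = 0x811c9dc5u;      // standard 32-bit FNV offset basis
    for (std::size_t i = 0; i < len; ++i) {
        hash ^= data[i];              // FNV-1a: XOR the byte in first...
        hash *= 0x01000193u;          // ...then multiply by the 32-bit FNV prime
    }
    return hash;
}
```

Hashing an empty buffer returns the offset basis unchanged, which makes a convenient sanity check.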
wuyakuma / d_ggx.glsl
Created September 6, 2017 02:47 — forked from romainguy/d_ggx.glsl
D_GGX in mediump/half float
float D_GGX(float linearRoughness, float NoH, const vec3 h) {
// Walter et al. 2007, "Microfacet Models for Refraction through Rough Surfaces"
// In mediump, there are two problems computing 1.0 - NoH^2
// 1) 1.0 - NoH^2 suffers floating point cancellation when NoH^2 is close to 1 (highlights)
// 2) NoH doesn't have enough precision around 1.0
// Both problems can be fixed by computing 1-NoH^2 in highp and providing NoH in highp as well
// However, we can do better using Lagrange's identity:
// ||a x b||^2 = ||a||^2 ||b||^2 - (a . b)^2
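
The preview stops at the identity. As a hedged sketch of why it helps: for unit-length vectors, Lagrange's identity gives 1 - (n . h)^2 = ||n x h||^2, so the cancellation-prone subtraction can be replaced by a squared cross-product length. The C++ below illustrates this with a hypothetical `Vec3` type and helper names that are not part of the gist:

```cpp
#include <cassert>
#include <cmath>

// Hypothetical minimal vector type, for illustration only.
struct Vec3 { float x, y, z; };

inline float dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

inline Vec3 cross(const Vec3& a, const Vec3& b) {
    return { a.y * b.z - a.z * b.y,
             a.z * b.x - a.x * b.z,
             a.x * b.y - a.y * b.x };
}

// Assumes n and h are unit length. By Lagrange's identity,
// ||n x h||^2 = ||n||^2 ||h||^2 - (n . h)^2 = 1 - (n . h)^2.
inline float oneMinusNoHSquared(const Vec3& n, const Vec3& h) {
    const Vec3 c = cross(n, h);
    return dot(c, c);
}
```

The cross-product form never subtracts two nearly equal numbers, which is the cancellation problem the comments above describe.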
wuyakuma / gpu_arch_resources
Created March 6, 2017 06:12 — forked from jhaberstro/gpu_arch_resources
GPU Architecture Learning Resources
http://courses.cms.caltech.edu/cs179/
http://www.amd.com/Documents/GCN_Architecture_whitepaper.pdf
https://community.arm.com/graphics/b/blog
http://cdn.imgtec.com/sdk-documentation/PowerVR+Hardware.Architecture+Overview+for+Developers.pdf
http://cdn.imgtec.com/sdk-documentation/PowerVR+Series5.Architecture+Guide+for+Developers.pdf
https://www.imgtec.com/blog/a-look-at-the-powervr-graphics-architecture-tile-based-rendering/
https://www.imgtec.com/blog/the-dr-in-tbdr-deferred-rendering-in-rogue/
http://developer.amd.com/tools-and-sdks/opencl-zone/amd-accelerated-parallel-processing-app-sdk/opencl-optimization-guide/#50401334_pgfId-412605
https://fgiesen.wordpress.com/2011/07/09/a-trip-through-the-graphics-pipeline-2011-index/
https://community.arm.com/graphics/b/documents/posts/moving-mobile-graphics#siggraph2015
wuyakuma / Tex2DCatmullRom.hlsl
Created September 20, 2016 03:33 — forked from TheRealMJP/Tex2DCatmullRom.hlsl
An HLSL function for sampling a 2D texture with Catmull-Rom filtering, using 9 texture samples instead of 16
// Samples a texture with Catmull-Rom filtering, using 9 texture fetches instead of 16.
// See http://vec3.ca/bicubic-filtering-in-fewer-taps/ for more details
float4 SampleTextureCatmullRom(in Texture2D<float4> tex, in SamplerState linearSampler, in float2 uv, in float2 texSize)
{
// We're going to sample a 4x4 grid of texels surrounding the target UV coordinate. We'll do this by rounding
// down the sample location to get the exact center of our "starting" texel. The starting texel will be at
// location [1, 1] in the grid, where [0, 0] is the top left corner.
float2 samplePos = uv * texSize;
float2 texPos1 = floor(samplePos - 0.5f) + 0.5f;
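
To make the rounding step concrete, here is a one-dimensional C++ sketch of the same arithmetic. The function name `startingTexelCenter` is hypothetical and the snippet only mirrors the two lines above, not the rest of the filter:

```cpp
#include <cassert>
#include <cmath>

// Hypothetical 1D version of the rounding step: returns the center (in texel
// units) of the texel at or just before the sample position.
inline float startingTexelCenter(float uv, float texSize) {
    const float samplePos = uv * texSize;        // sample position in texel units
    return std::floor(samplePos - 0.5f) + 0.5f;  // snap down to a texel center
}
```

For example, with `uv = 0.3` on an 8-texel-wide texture the sample position is 2.4, and the starting texel center is 1.5, leaving a fractional offset of about 0.9 between the two.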