wen-long · August 23, 2024 18:14 · Aug 23, 2024 · Aug 21, 2024 · Aug 19, 2024
diff --git a/learn hash.txt b/learn hash.txt
@@ -1,3 +1,6 @@
+hash 重要吗?
+Rust杂谈 HashMap性能比不上Java? https://www.bilibili.com/video/BV1RTiceCEr6/
+
 https://github.com/rurban/smhasher  
 可以作为索引, 如果有专门讲 hash 的书没有提到 smhasher, 要么过时要么不专业
 务必 clone 代码自己在自己的机器上测试并得出结论, 不要过于关心其他人的结果

diff --git a/learn hash.txt b/learn hash.txt
@@ -7,6 +7,8 @@ smhasher 的 Summary 竟然将 rapidhash 放在首位, 这与我的实际测试
 吹牛逼和傲慢在 hash 算法这个领域也有出现
 
 在我的机器上, xxh3 性能是 rapidhash 的两倍. 运行在单核虚拟机, 宿主机宣称为 AMD Ryzen 7950x
+homebrew 原提供的 xxhash -Os 参数不能提供最佳性能,已经修复为 -O3
+详见 https://github.com/Homebrew/homebrew-core/pull/181691
 
 ```
 $ xxhsum -b

diff --git a/learn hash.txt b/learn hash.txt
@@ -0,0 +1,170 @@
+https://github.com/rurban/smhasher  
+可以作为索引, 如果有专门讲 hash 的书没有提到 smhasher, 要么过时要么不专业
+务必 clone 代码自己在自己的机器上测试并得出结论, 不要过于关心其他人的结果
+smhasher 的 Summary 竟然将 rapidhash 放在首位, 这与我的实际测试不符
+
+
+吹牛逼和傲慢在 hash 算法这个领域也有出现
+
+在我的机器上, xxh3 性能是 rapidhash 的两倍. 运行在单核虚拟机, 宿主机宣称为 AMD Ryzen 7950x
+
+```
+$ xxhsum -b
+xxhsum 0.8.1 by Yann Collet
+compiled as 64-bit x86_64 autoVec little endian with GCC 11.2.0
+Sample of 100 KB...
+ 1#XXH32                         :     102400 ->    99518 it/s ( 9718.6 MB/s)
+ 3#XXH64                         :     102400 ->   143061 it/s (13970.8 MB/s)
+ 5#XXH3_64b                      :     102400 ->   610795 it/s (59647.9 MB/s)
+11#XXH128                        :     102400 ->   604307 it/s (59014.3 MB/s)
+
+processor       : 0
+vendor_id       : AuthenticAMD
+cpu family      : 25
+model           : 1
+model name      : AMD EPYC-Milan Processor
+stepping        : 1
+microcode       : 0x1000065
+cpu MHz         : 4499.996
+cache size      : 512 KB
+physical id     : 0
+siblings        : 1
+core id         : 0
+cpu cores       : 1
+apicid          : 0
+initial apicid  : 0
+fpu             : yes
+fpu_exception   : yes
+cpuid level     : 13
+wp              : yes
+flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm rep_good nopl cpuid extd_apicid tsc_known_freq pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm cmp_legacy cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw topoext perfctr_core ssbd ibrs ibpb stibp vmmcall fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves avx512_bf16 clzero xsaveerptr wbnoinvd arat avx512vbmi umip pku ospke avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg avx512_vpopcntdq rdpid fsrm arch_capabilities
+bugs            : sysret_ss_attrs null_seg spectre_v1 spectre_v2 spec_store_bypass srso
+bogomips        : 8999.99
+TLB size        : 1024 4K pages
+clflush size    : 64
+cache_alignment : 64
+address sizes   : 40 bits physical, 48 bits virtual
+power management:
+
+--- Testing xxh3 "xxHash v3, 64-bit" GOOD
+
+[[[ Speed Tests ]]]
+
+Bulk speed test - 262144-byte keys
+Alignment  7 - 14.657 bytes/cycle - 41934.22 MiB/sec @ 3 ghz
+Alignment  6 - 14.724 bytes/cycle - 42124.75 MiB/sec @ 3 ghz
+Alignment  5 - 14.683 bytes/cycle - 42008.81 MiB/sec @ 3 ghz
+Alignment  4 - 14.846 bytes/cycle - 42475.55 MiB/sec @ 3 ghz
+Alignment  3 - 14.635 bytes/cycle - 41871.10 MiB/sec @ 3 ghz
+Alignment  2 - 14.678 bytes/cycle - 41993.26 MiB/sec @ 3 ghz
+Alignment  1 - 14.573 bytes/cycle - 41692.77 MiB/sec @ 3 ghz
+Alignment  0 - 17.671 bytes/cycle - 50557.43 MiB/sec @ 3 ghz
+Average      - 15.058 bytes/cycle - 43082.23 MiB/sec @ 3 ghz
+
+--- Testing rapidhash "rapidhash v1" GOOD
+
+[[[ Speed Tests ]]]
+
+Bulk speed test - 262144-byte keys
+Alignment  7 -  8.065 bytes/cycle - 23075.42 MiB/sec @ 3 ghz
+Alignment  6 -  8.061 bytes/cycle - 23062.59 MiB/sec @ 3 ghz
+Alignment  5 -  8.066 bytes/cycle - 23076.17 MiB/sec @ 3 ghz
+Alignment  4 -  8.107 bytes/cycle - 23194.39 MiB/sec @ 3 ghz
+Alignment  3 -  8.069 bytes/cycle - 23084.34 MiB/sec @ 3 ghz
+Alignment  2 -  8.072 bytes/cycle - 23094.01 MiB/sec @ 3 ghz
+Alignment  1 -  8.070 bytes/cycle - 23089.31 MiB/sec @ 3 ghz
+Alignment  0 -  8.101 bytes/cycle - 23177.19 MiB/sec @ 3 ghz
+Average      -  8.076 bytes/cycle - 23106.68 MiB/sec @ 3 ghz
+```
+
+另一台 Ryzen5 PRO 4650G
+
+```
+./SMHasher  --test=Speed xxh3
+--- Testing xxh3 "xxHash v3, 64-bit" GOOD
+
+[[[ Speed Tests ]]]
+
+Bulk speed test - 262144-byte keys
+Alignment  7 - 13.483 bytes/cycle - 38574.27 MiB/sec @ 3 ghz
+Alignment  6 - 13.476 bytes/cycle - 38556.37 MiB/sec @ 3 ghz
+Alignment  5 - 13.406 bytes/cycle - 38354.21 MiB/sec @ 3 ghz
+Alignment  4 - 13.398 bytes/cycle - 38331.72 MiB/sec @ 3 ghz
+Alignment  3 - 13.452 bytes/cycle - 38486.57 MiB/sec @ 3 ghz
+Alignment  2 - 13.452 bytes/cycle - 38487.14 MiB/sec @ 3 ghz
+Alignment  1 - 13.427 bytes/cycle - 38414.08 MiB/sec @ 3 ghz
+Alignment  0 - 16.680 bytes/cycle - 47721.07 MiB/sec @ 3 ghz
+Average      - 13.847 bytes/cycle - 39615.68 MiB/sec @ 3 ghz
+
+./SMHasher  --test=Speed rapidhash
+--- Testing rapidhash "rapidhash v1" GOOD
+
+[[[ Speed Tests ]]]
+
+Bulk speed test - 262144-byte keys
+Alignment  7 -  7.747 bytes/cycle - 22165.33 MiB/sec @ 3 ghz
+Alignment  6 -  7.742 bytes/cycle - 22150.55 MiB/sec @ 3 ghz
+Alignment  5 -  7.753 bytes/cycle - 22182.48 MiB/sec @ 3 ghz
+Alignment  4 -  7.767 bytes/cycle - 22222.63 MiB/sec @ 3 ghz
+Alignment  3 -  7.816 bytes/cycle - 22360.94 MiB/sec @ 3 ghz
+Alignment  2 -  7.808 bytes/cycle - 22338.03 MiB/sec @ 3 ghz
+Alignment  1 -  7.805 bytes/cycle - 22329.86 MiB/sec @ 3 ghz
+Alignment  0 -  7.819 bytes/cycle - 22371.62 MiB/sec @ 3 ghz
+Average      -  7.782 bytes/cycle - 22265.18 MiB/sec @ 3 ghz
+```
+
+
+学术用语 pseudorandom function (PRF)
+
+
+XXH3 诞生帖
+https://fastcompression.blogspot.com/2019/03/presenting-xxh3.html
+https://news.ycombinator.com/item?id=19402602
+
+阅读比较新的 hash 算法作者本人发布的帖子
+被认可和广泛采纳的 hash 可能宣传该 hash 如何 beat 了其他 hash
+指令集
+对于大/小输入的区别
+hash 的吞吐和延迟
+
+追求“多快好省”的 hash 算法, 因为存在又慢又差的 hash 算法
+
+
+SipHash 诞生记, 从 Hash DoS 攻击讲起, 当时太多的软件基础不能抵抗该攻击
+[29C3: Hash-flooding DoS reloaded: attacks and defenses (EN)](https://www.youtube.com/watch?v=Vdrab3sB7MU)
+SipHash 早期文档 https://web.archive.org/web/20190207205346/https://www.131002.net/siphash/
+
+SipHash 没有被 go 代码维护者足够 like, golang 标准库不会提供 SipHash
+https://github.com/golang/go/issues/19659
+
+有人提议 golang map 使用 SipHash 以加强安全, 因维护者观点不同被拒绝
+
+> 1. golang map 的用户(开发者)是安全的第一责任人, key 如果使用用户输入是 map 用户的责任
+> 2. 当前没有对 golang map Hash-flooding 的 POC
+
+https://github.com/golang/go/issues/9365
+
+
+尽管上面两条情况没有发生变化, go 还是修改了 aeshash 实现来避免攻击, 是因为心虚吗? 既然要修改 aeshash, 为何不使用 SipHash?
+https://github.com/golang/go/commit/91059de095703ebc4ce6b8bad7a0a40dedeef7dc
+
+hash 实现可以充分使用高级指令集, golang map 内部 hash 算法会使用 aes 指令(不是用于加密)
+https://github.com/tildeleb/aeshash
+
+内部 hash 可以实现的很混乱, 毕竟 key 无需暴露和传递
+
+怎么就算 cryptographic hash function?
+linux 内核文档 https://docs.kernel.org/security/siphash.html 提到 SipHash is a cryptographically secure PRF
+
+不同于 golang 对 SipHash 无端的抵触, linux 内核决定用 SipHash 替换 "all of these hashing functions"
+https://lwn.net/Articles/711167/
+
+https://github.com/BLAKE3-team/BLAKE3
+
+hash 的演进方向
+1. beat 竞争对手, 加入 smhasher
+2. 发帖演讲宣传
+3. 公布正式文档
+4. 提供便于使用的二进制, 固定实现, fingerprint functions
+5. 采用高级指令集
+6. 并行化