Revisions
- Artefact2 revised this gist Mar 15, 2024. 1 changed file with 15 additions and 39 deletions.

@@ -43,42 +43,18 @@ See the wiki upstream: https://github.com/ggerganov/llama.cpp/wiki/Feature-matri

# ROCm benchmarks for Mistral-7B

* Last updated 2024-03-15 (bench #6083).

| | **GiB** | **pp512 -ngl 99** | **tg128 -ngl 99** | **pp512 -ngl 0** | **tg128 -ngl 0** | **pp512 -ngl 0 #6083** |
|------------|---------|-------------------|-------------------|------------------|------------------|------------------------|
| **IQ1_S** | 1.50 | 709.29 | 74.85 | 324.35 | 15.66 | 585.61 |
| **IQ2_XS** | 2.05 | 704.52 | 58.44 | 316.10 | 15.11 | 557.68 |
| **IQ3_XS** | 2.79 | 682.72 | 45.79 | 300.61 | 10.49 | 527.83 |
| **IQ4_XS** | 3.64 | 712.96 | 64.17 | 292.36 | 11.06 | 495.92 |
| **Q4_0** | 3.83 | 870.44 | 63.42 | 310.94 | 10.44 | 554.56 |
| **Q5_K** | 4.78 | 691.40 | 46.52 | 273.83 | 8.54 | 453.58 |
| **Q6_K** | 5.53 | 661.98 | 47.57 | 261.16 | 7.34 | 415.22 |
| **Q8_0** | 7.17 | 881.95 | 39.74 | 270.70 | 5.74 | 440.44 |
| **f16** | 13.49 | | | 211.12 | 3.06 | 303.60 |
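As a reading aid for the last column, the ratio between the two `pp512 -ngl 0` columns can be worked out directly from the table. The sketch below is plain arithmetic over the published numbers; the `speedup` helper and the dictionary layout are illustrative names chosen here, not part of llama.cpp or its benchmark tool.

```python
# pp512 throughput (tokens/s) with -ngl 0, copied from the table above as
# (before, with #6083). The names below are illustrative only.
PP512_NGL0 = {
    "IQ1_S": (324.35, 585.61),
    "IQ2_XS": (316.10, 557.68),
    "IQ3_XS": (300.61, 527.83),
    "IQ4_XS": (292.36, 495.92),
    "Q4_0": (310.94, 554.56),
    "Q5_K": (273.83, 453.58),
    "Q6_K": (261.16, 415.22),
    "Q8_0": (270.70, 440.44),
    "f16": (211.12, 303.60),
}


def speedup(before: float, after: float) -> float:
    """Return the throughput ratio after / before."""
    return after / before


speedups = {quant: speedup(*pair) for quant, pair in PP512_NGL0.items()}

# Print the quants from largest to smallest gain.
for quant, ratio in sorted(speedups.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{quant:7s} {ratio:.2f}x")
```

On these numbers every quant gains, with ratios ranging from roughly 1.4x (f16) to roughly 1.8x (IQ1_S).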
- Artefact2 revised this gist Mar 11, 2024. 1 changed file with 1 addition and 1 deletion.

@@ -1,6 +1,6 @@

# Which GGUF is right for me? (Opinionated)

Good question! I am collecting human data on how quantization affects outputs. See here for more information: https://github.com/ggerganov/llama.cpp/discussions/5962

In the meantime, use the largest that fully fits in your GPU. If you can comfortably fit Q4_K_S, try using a model with more parameters.
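The "largest that fully fits" rule can be sketched as a small lookup. This is a rough estimate only, assuming file size ≈ parameters × bits per weight / 8, with the 7.24 B parameter count and the bits-per-weight figures taken from the Mistral-7B tables elsewhere in this gist. The `overhead_gib` allowance for KV cache and buffers is a hypothetical placeholder, and the function names are illustrative.

```python
GIB = 1024 ** 3
PARAMS = 7.24e9  # Mistral-7B parameter count as reported in the benchmark table

# Bits per weight for a few quants, copied from the KL-divergence table.
BITS_PER_WEIGHT = {
    "Q2_K": 3.00,
    "Q3_K_M": 3.89,
    "Q4_K_S": 4.57,
    "Q5_K_M": 5.67,
    "Q6_K": 6.57,
}


def estimated_size_gib(bpw: float, params: float = PARAMS) -> float:
    """Approximate model size in GiB from bits per weight."""
    return params * bpw / 8 / GIB


def largest_fitting_quant(vram_gib: float, overhead_gib: float = 1.5):
    """Return the largest quant whose estimated size fits, or None.

    overhead_gib is an assumed allowance for KV cache and scratch
    buffers; the real figure depends on context length and backend.
    """
    budget = vram_gib - overhead_gib
    sizes = {name: estimated_size_gib(bpw) for name, bpw in BITS_PER_WEIGHT.items()}
    fitting = {name: size for name, size in sizes.items() if size <= budget}
    if not fitting:
        return None
    return max(fitting, key=fitting.get)


print(largest_fitting_quant(8.0))
print(largest_fitting_quant(6.0))
```

With a large card (say 24 GiB) every entry fits, which is the case where the advice above is to move to a model with more parameters instead.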
- Artefact2 revised this gist Mar 5, 2024. 1 changed file with 1 addition and 10 deletions.

@@ -6,16 +6,7 @@ In the meantime, use the largest that fully fits in your GPU. If you can comfort

# llama.cpp feature matrix

See the wiki upstream: https://github.com/ggerganov/llama.cpp/wiki/Feature-matrix

# KL-divergence statistics for Mistral-7B
- Artefact2 revised this gist Mar 4, 2024. 1 changed file with 3 additions and 8 deletions.

@@ -1,13 +1,8 @@

# Which GGUF is right for me? (Opinionated)

Good question! I am collecting human data on how quantization affects outputs. Contact me if you want to help.

In the meantime, use the largest that fully fits in your GPU. If you can comfortably fit Q4_K_S, try using a model with more parameters.

# llama.cpp feature matrix
- Artefact2 revised this gist Mar 3, 2024. 1 changed file with 43 additions and 1 deletion.

@@ -53,4 +53,46 @@

| **Q4_K_M** | 4.83 | 0.0075 | 0.0885 | 0.0576 | 0.0060 |
| **Q5_K_S** | 5.52 | 0.0045 | 0.0393 | 0.0454 | 0.0005 |
| **Q5_K_M** | 5.67 | 0.0043 | 0.0368 | 0.0444 | 0.0005 |
| **Q6_K** | 6.57 | 0.0032 | 0.0222 | 0.0394 | −0.0008 |

# ROCm benchmarks for Mistral-7B

* TODO: add fancy graph
* Last updated 2024-03-03.

| model | size | params | backend | ngl | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | ---------- | ---------------: |
| llama 7B IQ1_S - 1.5625 bpw | 1.50 GiB | 7.24 B | ROCm | 99 | pp 512 | 709.29 ± 1.88 |
| llama 7B IQ1_S - 1.5625 bpw | 1.50 GiB | 7.24 B | ROCm | 99 | tg 128 | 74.85 ± 0.02 |
| llama 7B IQ2_XS - 2.3125 bpw | 2.05 GiB | 7.24 B | ROCm | 99 | pp 512 | 704.52 ± 1.67 |
| llama 7B IQ2_XS - 2.3125 bpw | 2.05 GiB | 7.24 B | ROCm | 99 | tg 128 | 58.44 ± 0.07 |
| llama 7B IQ3_XS - 3.3 bpw | 2.79 GiB | 7.24 B | ROCm | 99 | pp 512 | 682.72 ± 1.98 |
| llama 7B IQ3_XS - 3.3 bpw | 2.79 GiB | 7.24 B | ROCm | 99 | tg 128 | 45.79 ± 0.05 |
| llama 7B IQ4_XS - 4.25 bpw | 3.64 GiB | 7.24 B | ROCm | 99 | pp 512 | 712.96 ± 0.98 |
| llama 7B IQ4_XS - 4.25 bpw | 3.64 GiB | 7.24 B | ROCm | 99 | tg 128 | 64.17 ± 0.06 |
| llama 7B Q4_0 | 3.83 GiB | 7.24 B | ROCm | 99 | pp 512 | 870.44 ± 0.40 |
| llama 7B Q4_0 | 3.83 GiB | 7.24 B | ROCm | 99 | tg 128 | 63.42 ± 0.02 |
| llama 7B Q5_K - Medium | 4.78 GiB | 7.24 B | ROCm | 99 | pp 512 | 691.40 ± 0.09 |
| llama 7B Q5_K - Medium | 4.78 GiB | 7.24 B | ROCm | 99 | tg 128 | 46.52 ± 0.00 |
| llama 7B Q6_K | 5.53 GiB | 7.24 B | ROCm | 99 | pp 512 | 661.98 ± 0.15 |
| llama 7B Q6_K | 5.53 GiB | 7.24 B | ROCm | 99 | tg 128 | 47.57 ± 0.00 |
| llama 7B Q8_0 | 7.17 GiB | 7.24 B | ROCm | 99 | pp 512 | 881.95 ± 0.17 |
| llama 7B Q8_0 | 7.17 GiB | 7.24 B | ROCm | 99 | tg 128 | 39.74 ± 0.12 |
| llama 7B IQ1_S - 1.5625 bpw | 1.50 GiB | 7.24 B | ROCm | 0 | pp 512 | 324.35 ± 2.72 |
| llama 7B IQ1_S - 1.5625 bpw | 1.50 GiB | 7.24 B | ROCm | 0 | tg 128 | 15.66 ± 0.08 |
| llama 7B IQ2_XS - 2.3125 bpw | 2.05 GiB | 7.24 B | ROCm | 0 | pp 512 | 316.10 ± 1.21 |
| llama 7B IQ2_XS - 2.3125 bpw | 2.05 GiB | 7.24 B | ROCm | 0 | tg 128 | 15.11 ± 0.05 |
| llama 7B IQ3_XS - 3.3 bpw | 2.79 GiB | 7.24 B | ROCm | 0 | pp 512 | 300.61 ± 1.21 |
| llama 7B IQ3_XS - 3.3 bpw | 2.79 GiB | 7.24 B | ROCm | 0 | tg 128 | 10.49 ± 0.12 |
| llama 7B IQ4_XS - 4.25 bpw | 3.64 GiB | 7.24 B | ROCm | 0 | pp 512 | 292.36 ± 9.67 |
| llama 7B IQ4_XS - 4.25 bpw | 3.64 GiB | 7.24 B | ROCm | 0 | tg 128 | 11.06 ± 0.06 |
| llama 7B Q4_0 | 3.83 GiB | 7.24 B | ROCm | 0 | pp 512 | 310.94 ± 2.01 |
| llama 7B Q4_0 | 3.83 GiB | 7.24 B | ROCm | 0 | tg 128 | 10.44 ± 0.19 |
| llama 7B Q5_K - Medium | 4.78 GiB | 7.24 B | ROCm | 0 | pp 512 | 273.83 ± 1.47 |
| llama 7B Q5_K - Medium | 4.78 GiB | 7.24 B | ROCm | 0 | tg 128 | 8.54 ± 0.04 |
| llama 7B Q6_K | 5.53 GiB | 7.24 B | ROCm | 0 | pp 512 | 261.16 ± 1.06 |
| llama 7B Q6_K | 5.53 GiB | 7.24 B | ROCm | 0 | tg 128 | 7.34 ± 0.20 |
| llama 7B Q8_0 | 7.17 GiB | 7.24 B | ROCm | 0 | pp 512 | 270.70 ± 2.32 |
| llama 7B Q8_0 | 7.17 GiB | 7.24 B | ROCm | 0 | tg 128 | 5.74 ± 0.04 |
| llama 7B F16 | 13.49 GiB | 7.24 B | ROCm | 0 | pp 512 | 211.12 ± 0.74 |
| llama 7B F16 | 13.49 GiB | 7.24 B | ROCm | 0 | tg 128 | 3.06 ± 0.03 |
- Artefact2 revised this gist Feb 27, 2024. 1 changed file with 8 additions and 8 deletions.

@@ -11,16 +11,16 @@

# llama.cpp feature matrix

* Last updated 2024-02-27.
* Improvements/corrections welcome!

| | **CPU (AVX2)** | **CPU (ARM NEON)** | **Metal** | **cuBLAS** | **rocBLAS** | **SYCL** | **CLBlast** | **Vulkan** | **Kompute** |
|:--------------------:|:--------------:|--------------------|:---------:|:----------:|:----------------:|----------|:-----------:|:----------:|:-----------:|
| **K-quants** | ✅ | ✅ | ✅ | ✅ | ✅ | ❓ | ✅ | ✅ | 🚫 |
| **I-quants** | ✅ (SLOW) | ✅ | ✅ (SLOW) | ✅ | ✅ | 🚫 | 🚫 | 🚫 | 🚫 |
| **Multi-GPU** | N/A | N/A | N/A | ✅ | ❓ | 🚫 | ❓ | ✅ | ❓ |
| **K cache quants** | ✅ | ❓ | ❓ | ✅ | Only q8_0 (SLOW) | ❓ | ✅ | 🚫 | 🚫 |
| **MoE architecture** | ✅ | ❓ | ✅ | ✅ | ✅ | ❓ | Only -ngl 0 | 🚫 | 🚫 |

# KL-divergence statistics for Mistral-7B
- Artefact2 revised this gist Feb 27, 2024. 1 changed file with 1 addition and 1 deletion.

@@ -25,7 +25,7 @@

# KL-divergence statistics for Mistral-7B

* Last updated 2024-02-27 (add IQ4_XS).
* imatrix from wiki.train, 200*512 tokens.
* KL-divergence measured on wiki.test.
- Artefact2 revised this gist Feb 27, 2024. 1 changed file with 4 additions and 3 deletions.

@@ -29,7 +29,7 @@

* imatrix from wiki.train, 200*512 tokens.
* KL-divergence measured on wiki.test.

| | **Bits per weight** | **KL-divergence median** | **KL-divergence q99** | **Top tokens differ** | **ln(PPL(Q)/PPL(base))** |
|-------------|---------------------|--------------------------|-----------------------|-----------------------|--------------------------|

@@ -41,15 +41,16 @@

| **Q2_K_S** | 2.79 | 0.0829 | 1.5111 | 0.1735 | 0.1600 |
| **Q2_K** | 3.00 | 0.0588 | 1.0337 | 0.1492 | 0.1103 |
| **IQ3_XXS** | 3.21 | 0.0330 | 0.5492 | 0.1137 | 0.0589 |
| **IQ3_XS** | 3.32 | 0.0296 | 0.4550 | 0.1071 | 0.0458 |
| **Q3_K_S** | 3.50 | 0.0304 | 0.4481 | 0.1068 | 0.0511 |
| **IQ3_S** | 3.52 | 0.0205 | 0.3018 | 0.0895 | 0.0306 |
| **IQ3_M** | 3.63 | 0.0186 | 0.2740 | 0.0859 | 0.0268 |
| **Q3_K_M** | 3.89 | 0.0171 | 0.2546 | 0.0839 | 0.0258 |
| **Q3_K_L** | 4.22 | 0.0152 | 0.2202 | 0.0797 | 0.0205 |
| **IQ4_XS** | 4.32 | 0.0088 | 0.1082 | 0.0606 | 0.0079 |
| **IQ4_NL** | 4.56 | 0.0085 | 0.1077 | 0.0605 | 0.0074 |
| **Q4_K_S** | 4.57 | 0.0083 | 0.1012 | 0.0600 | 0.0081 |
| **Q4_K_M** | 4.83 | 0.0075 | 0.0885 | 0.0576 | 0.0060 |
| **Q5_K_S** | 5.52 | 0.0045 | 0.0393 | 0.0454 | 0.0005 |
| **Q5_K_M** | 5.67 | 0.0043 | 0.0368 | 0.0444 | 0.0005 |
| **Q6_K** | 6.57 | 0.0032 | 0.0222 | 0.0394 | −0.0008 |
- Artefact2 revised this gist Feb 27, 2024. 1 changed file with 7 additions and 7 deletions.

@@ -14,13 +14,13 @@

* Last updated 2024-02-26.
* Improvements/corrections welcome!

| | **CPU (AVX2)** | **CPU (ARM NEON)** | **Metal** | **rocBLAS** | **cuBLAS** | **CLBlast** | **SYCL** | **Vulkan** | **Kompute** |
|:--------------------:|:--------------:|---------------------|:---------:|:----------------:|:----------:|:-----------:|----------|:----------:|:-----------:|
| **K-quants** | ✅ | ❓ | ✅ | ✅ | ✅ | ✅ | ❓ | ✅ | 🚫 |
| **I-quants** | ✅ (SLOW) | ❓ | ✅ | ✅ | ✅ | ❓ | ❓ | 🚫 | 🚫 |
| **Multi-GPU** | N/A | N/A | N/A | ❓ | ✅ | ❓ | 🚫 | ✅ | ❓ |
| **K cache quants** | ✅ | ❓ | ❓ | Only q8_0 (SLOW) | ✅ | ❓ | ❓ | ❓ | ❓ |
| **MoE architecture** | ✅ | ❓ | ✅ | ✅ | ✅ | ❓ | ❓ | 🚫 | ❓ |

# KL-divergence statistics for Mistral-7B
- Artefact2 revised this gist Feb 27, 2024. 1 changed file with 7 additions and 6 deletions.

@@ -14,12 +14,13 @@

* Last updated 2024-02-26.
* Improvements/corrections welcome!

| | **CPU (AVX2)** | **cuBLAS** | **rocBLAS** | **Metal** | **CLBlast** | **SYCL** | **Vulkan** | **Kompute** |
|:--------------------:|:--------------:|:----------:|:----------------:|:---------:|:-----------:|----------|:----------:|:-----------:|
| **K-quants** | ✅ | ✅ | ✅ | ✅ | ✅ | ❓ | ✅ | 🚫 |
| **I-quants** | ✅ (SLOW) | ✅ | ✅ | ✅ | ❓ | ❓ | 🚫 | 🚫 |
| **Multi-GPU** | N/A | ✅ | ❓ | N/A | ❓ | 🚫 | ✅ | ❓ |
| **K cache quants** | ✅ | ✅ | Only q8_0 (SLOW) | ❓ | ❓ | ❓ | ❓ | ❓ |
| **MoE architecture** | ✅ | ✅ | ✅ | ✅ | ❓ | ❓ | 🚫 | ❓ |

# KL-divergence statistics for Mistral-7B
- Artefact2 revised this gist Feb 27, 2024. 1 changed file with 6 additions and 8 deletions.

@@ -14,14 +14,12 @@

* Last updated 2024-02-26.
* Improvements/corrections welcome!

| | **CPU (AVX2)** | **cuBLAS** | **rocBLAS** | **Metal** | **CLBlast** | **SYCL** | **Vulkan** | **Kompute** |
|:--------------------:|:--------------:|:----------:|:-----------:|:---------:|:-----------:|----------|:----------:|:-----------:|
| **K-quants** | ✅ | ✅ | ✅ | ✅ | ✅ | ❓ | ✅ | 🚫 |
| **I-quants** | ✅ (SLOW) | ✅ | ✅ | ✅ | ❓ | ❓ | 🚫 | 🚫 |
| **Multi-GPU** | N/A | ✅ | ❓ | N/A | ❓ | 🚫 | ✅ | ❓ |
| **MoE architecture** | ✅ | ✅ | ✅ | ✅ | ❓ | ❓ | 🚫 | ❓ |

# KL-divergence statistics for Mistral-7B
- Artefact2 revised this gist Feb 26, 2024. 1 changed file with 15 additions and 0 deletions.

@@ -9,6 +9,21 @@

* **I am fully offloading (running on GPU): use the largest one that fits.** If you can comfortably fit Q4_K_S with room to spare, consider using another model with more parameters instead.

# llama.cpp feature matrix

* Last updated 2024-02-26.
* Improvements/corrections welcome!

| | **CPU (AVX2)** | **cuBLAS** | **rocBLAS** | **Metal** | **CLBlast** | **Vulkan** | **Kompute** |
|---------------------------------|----------------|------------|-------------|-----------|-------------|------------|-------------|
| **Legacy quants** | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ (SLOW) |
| **K-quants** | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | 🚫 |
| **I-quants** | ✅ (SLOW) | ✅ | ✅ | ✅ | ❓ | 🚫 | 🚫 |
| **Multi-GPU** | N/A | ✅ | 🚫 | N/A | ❓ | ✅ | ❓ |
| **Llama, Mistral** architecture | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| **Mixtral** architecture | ✅ | ✅ | ✅ | ✅ | ❓ | 🚫 | ❓ |

# KL-divergence statistics for Mistral-7B

* Last updated 2024-02-26.
- Artefact2 revised this gist Feb 26, 2024. 1 changed file with 1 addition and 1 deletion.

@@ -1,4 +1,4 @@

# Which GGUF is right for me? (Opinionated)

* **I am partially offloading (running on CPU+GPU): use Q4_K_S.** The IQ stuff is slower on CPU and generally not worth the speed penalty.
- Artefact2 created this gist Feb 26, 2024.

@@ -0,0 +1,41 @@

# Which one is right for me? (Opinionated)

* **I am partially offloading (running on CPU+GPU): use Q4_K_S.** The IQ stuff is slower on CPU and generally not worth the speed penalty. You can go higher (Q5_K_S, Q6_K) but there are diminishing returns for a considerable size increase. I consider Q4_K_S to be transparent, that is, indistinguishable from f16 under a blind test. (Before you disagree with me based on biased and anecdotal evidence, have you tried running a proper blind test?)

* **I am fully offloading (running on GPU): use the largest one that fits.** If you can comfortably fit Q4_K_S with room to spare, consider using another model with more parameters instead.

# KL-divergence statistics for Mistral-7B

* Last updated 2024-02-26.
* imatrix from wiki.train, 200*512 tokens.
* KL-divergence measured on wiki.test.

| | **Bits per weight** | **KL-divergence median** | **KL-divergence q99** | **Top tokens differ** | **ln(PPL(Q)/PPL(base))** |
|-------------|---------------------|--------------------------|-----------------------|-----------------------|--------------------------|
| **IQ1_S** | 1.78 | 0.5495 | 5.5174 | 0.3840 | 0.9235 |
| **IQ2_XXS** | 2.20 | 0.1751 | 2.4983 | 0.2313 | 0.2988 |
| **IQ2_XS** | 2.43 | 0.1146 | 1.7693 | 0.1943 | 0.2046 |
| **IQ2_S** | 2.55 | 0.0949 | 1.6284 | 0.1806 | 0.1722 |
| **IQ2_M** | 2.76 | 0.0702 | 1.0935 | 0.1557 | 0.1223 |
| **Q2_K_S** | 2.79 | 0.0829 | 1.5111 | 0.1735 | 0.1600 |
| **Q2_K** | 3.00 | 0.0588 | 1.0337 | 0.1492 | 0.1103 |
| **IQ3_XXS** | 3.21 | 0.0330 | 0.5492 | 0.1137 | 0.0589 |
| **Q3_K_XS** | 3.32 | 0.0296 | 0.4550 | 0.1071 | 0.0458 |
| **Q3_K_S** | 3.50 | 0.0304 | 0.4481 | 0.1068 | 0.0511 |
| **IQ3_S** | 3.52 | 0.0205 | 0.3018 | 0.0895 | 0.0306 |
| **IQ3_M** | 3.63 | 0.0186 | 0.2740 | 0.0859 | 0.0268 |
| **Q3_K_M** | 3.89 | 0.0171 | 0.2546 | 0.0839 | 0.0258 |
| **Q3_K_L** | 4.22 | 0.0152 | 0.2202 | 0.0797 | 0.0205 |
| **IQ4_NL** | 4.56 | 0.0085 | 0.1077 | 0.0605 | 0.0074 |
| **Q4_K_S** | 4.57 | 0.0083 | 0.1012 | 0.0600 | 0.0081 |
| **Q4_K_M** | 4.83 | 0.0075 | 0.0885 | 0.0576 | 0.0060 |
| **Q5_K_S** | 5.52 | 0.0045 | 0.0393 | 0.0454 | 0.0005 |
| **Q5_K_M** | 5.67 | 0.0043 | 0.0368 | 0.0444 | 0.0005 |
| **Q6_K** | 6.57 | 0.0032 | 0.0222 | 0.0394 | −0.0008 |
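For readers unfamiliar with the column names, here is a minimal sketch of how such statistics could be computed from per-token logits of a base model and a quantized model. It assumes the columns mean what their names suggest (KL(base ‖ quant) per token position, its median and 99th percentile, the fraction of positions where the top-1 token changes, and the log ratio of perplexities). The gist does not spell out the exact definitions, so the function names, the nearest-rank quantile, and the input layout are all illustrative assumptions.

```python
import math
import statistics


def log_softmax(logits):
    """Numerically stable log-softmax over one vocabulary-sized list."""
    peak = max(logits)
    log_norm = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return [x - log_norm for x in logits]


def kl_divergence(base_logits, quant_logits):
    """KL(P_base || P_quant) for one token position, in nats."""
    log_p = log_softmax(base_logits)
    log_q = log_softmax(quant_logits)
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))


def nearest_rank_quantile(values, q):
    """Nearest-rank quantile (an assumed convention, not the gist's)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]


def argmax(xs):
    return max(range(len(xs)), key=xs.__getitem__)


def mean_nll(all_logits, targets):
    """Mean negative log-likelihood of the reference tokens."""
    return -sum(log_softmax(l)[t] for l, t in zip(all_logits, targets)) / len(targets)


def summarize(base, quant, targets):
    """base/quant: one logit list per position; targets: reference token ids."""
    kls = [kl_divergence(b, q) for b, q in zip(base, quant)]
    differ = sum(argmax(b) != argmax(q) for b, q in zip(base, quant)) / len(base)
    return {
        "kl_median": statistics.median(kls),
        "kl_q99": nearest_rank_quantile(kls, 0.99),
        "top_tokens_differ": differ,
        # PPL = exp(mean NLL), so ln(PPL(Q)/PPL(base)) is a difference of means.
        "ln_ppl_ratio": mean_nll(quant, targets) - mean_nll(base, targets),
    }
```

Under these definitions, all four statistics are zero when the quantized model reproduces the base logits exactly, and a slightly negative `ln_ppl_ratio` (as in the Q6_K row) just means the quantized model happened to score the test tokens marginally better.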