@ubergarm created this gist August 1, 2025 23:41 (testing-llamacpp-glm4.5.md).
Gotta go play some games now, but here's a quick test:

```bash
$ cd llama.cpp
$ git remote -v | grep sam
sammcj  git@github.com:sammcj/llama.cpp.git (fetch)
sammcj  git@github.com:sammcj/llama.cpp.git (push)
$ git checkout glm-4-5
$ git rev-parse --short HEAD
3d15c4a94
# built CPU-only
$ ./build/bin/llama-server --version
version: 6038 (3d15c4a94)
built with cc (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0 for x86_64-linux-gnu
```

The test script:

```bash
#!/usr/bin/env bash

#ulimit -n 9999

model=/mnt/data/models/Thireus/GLM-4.5-THIREUS-BF16-SPECIAL_SPLIT/GLM-4.5-Thireus-Q8_0.gguf

numactl -N 0 -m 0 \
./build/bin/llama-server \
    --model "$model" \
    --alias ubergarm/GLM-4.5-Q8_0 \
    --ctx-size 196608 \
    -fa \
    -ctk q8_0 -ctv q8_0 \
    --parallel 1 \
    --threads 128 \
    --threads-batch 192 \
    --numa numactl \
    --host 127.0.0.1 \
    --port 8080 \
    --no-mmap
```

Server output:

```
print_info: model type = 355B.A32B
print_info: model params = 358.34 B
print_info: general.name = GLM 4.5
print_info: vocab type = BPE
print_info: n_vocab = 151552
print_info: n_merges = 318088
print_info: BOS token = 151329 '<|endoftext|>'
print_info: EOS token = 151329 '<|endoftext|>'
print_info: EOT token = 151336 '<|user|>'
print_info: UNK token = 151329 '<|endoftext|>'
print_info: PAD token = 151329 '<|endoftext|>'
print_info: LF token = 198 'Ċ'
print_info: EOG token = 151329 '<|endoftext|>'
print_info: EOG token = 151336 '<|user|>'
print_info: max token length = 1024
load_tensors: loading model tensors, this can take a while... (mmap = false)
model has unused tensor blk.92.eh_proj (size = 209715200 bytes) -- ignoring
model has unused tensor blk.92.embed_tokens (size = 3103784960 bytes) -- ignoring
model has unused tensor blk.92.enorm (size = 20480 bytes) -- ignoring
model has unused tensor blk.92.hnorm (size = 20480 bytes) -- ignoring
model has unused tensor blk.92.shared_head.head (size = 3103784960 bytes) -- ignoring
model has unused tensor blk.92.shared_head.norm (size = 20480 bytes) -- ignoring
llama_model_load: error loading model: missing tensor 'blk.3.exp_probs_b'
llama_model_load_from_file_impl: failed to load model
common_init_from_params: failed to load model '/mnt/data/models/Thireus/GLM-4.5-THIREUS-BF16-SPECIAL_SPLIT/GLM-4.5-Thireus-Q8_0.gguf'
srv load_model: failed to load model, '/mnt/data/models/Thireus/GLM-4.5-THIREUS-BF16-SPECIAL_SPLIT/GLM-4.5-Thireus-Q8_0.gguf'
srv operator(): operator(): cleaning up before exit...
main: exiting due to model loading error
```
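So the branch expects a tensor (`blk.3.exp_probs_b`) that this GGUF doesn't contain. One way to confirm whether the file or the branch is at fault is to list the tensor names actually present in the GGUF. A minimal sketch, assuming the `gguf` package from llama.cpp's `gguf-py` is installed (`pip install gguf`) and using the model path from the script above; the `required` list here is just a hypothetical spot-check, not the branch's full tensor map:

```python
def missing_tensors(present, required):
    """Return the required tensor names not found among the present ones."""
    present = set(present)
    return [name for name in required if name not in present]

if __name__ == "__main__":
    # GGUFReader ships with llama.cpp's gguf-py; .tensors yields entries
    # with a .name attribute.
    from gguf import GGUFReader

    reader = GGUFReader(
        "/mnt/data/models/Thireus/GLM-4.5-THIREUS-BF16-SPECIAL_SPLIT/"
        "GLM-4.5-Thireus-Q8_0.gguf"
    )
    names = [t.name for t in reader.tensors]

    # The error above names blk.3.exp_probs_b; spot-check a few layers
    # for the expert-probability-bias tensor.
    required = [f"blk.{i}.exp_probs_b" for i in range(3, 6)]
    print(missing_tensors(names, required))
```

If the names come back missing, the GGUF was converted before the branch added that tensor and needs reconverting; if they're present, the loader on this checkout is looking them up wrong.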