gotta go play some games now but quick test:

```bash
$ cd llama.cpp
$ git remote -v | grep sam
sammcj	git@github.com:sammcj/llama.cpp.git (fetch)
sammcj	git@github.com:sammcj/llama.cpp.git (push)
$ git checkout glm-4-5
$ git rev-parse --short HEAD
3d15c4a94

# compile cpu only
$ ./build/bin/llama-server --version
version: 6038 (3d15c4a94)
built with cc (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0 for x86_64-linux-gnu

# test
#!/usr/bin/env bash
#ulimit -n 9999

model=/mnt/data/models/Thireus/GLM-4.5-THIREUS-BF16-SPECIAL_SPLIT/GLM-4.5-Thireus-Q8_0.gguf

numactl -N 0 -m 0 \
./build/bin/llama-server \
    --model "$model" \
    --alias ubergarm/GLM-4.5-Q8_0 \
    --ctx-size 196608 \
    -fa \
    -ctk q8_0 -ctv q8_0 \
    --parallel 1 \
    --threads 128 \
    --threads-batch 192 \
    --numa numactl \
    --host 127.0.0.1 \
    --port 8080 \
    --no-mmap

print_info: model type = 355B.A32B
print_info: model params = 358.34 B
print_info: general.name = GLM 4.5
print_info: vocab type = BPE
print_info: n_vocab = 151552
print_info: n_merges = 318088
print_info: BOS token = 151329 '<|endoftext|>'
print_info: EOS token = 151329 '<|endoftext|>'
print_info: EOT token = 151336 '<|user|>'
print_info: UNK token = 151329 '<|endoftext|>'
print_info: PAD token = 151329 '<|endoftext|>'
print_info: LF token = 198 'Ċ'
print_info: EOG token = 151329 '<|endoftext|>'
print_info: EOG token = 151336 '<|user|>'
print_info: max token length = 1024
load_tensors: loading model tensors, this can take a while... (mmap = false)
model has unused tensor blk.92.eh_proj (size = 209715200 bytes) -- ignoring
model has unused tensor blk.92.embed_tokens (size = 3103784960 bytes) -- ignoring
model has unused tensor blk.92.enorm (size = 20480 bytes) -- ignoring
model has unused tensor blk.92.hnorm (size = 20480 bytes) -- ignoring
model has unused tensor blk.92.shared_head.head (size = 3103784960 bytes) -- ignoring
model has unused tensor blk.92.shared_head.norm (size = 20480 bytes) -- ignoring
llama_model_load: error loading model: missing tensor 'blk.3.exp_probs_b'
llama_model_load_from_file_impl: failed to load model
common_init_from_params: failed to load model '/mnt/data/models/Thireus/GLM-4.5-THIREUS-BF16-SPECIAL_SPLIT/GLM-4.5-Thireus-Q8_0.gguf'
srv  load_model: failed to load model, '/mnt/data/models/Thireus/GLM-4.5-THIREUS-BF16-SPECIAL_SPLIT/GLM-4.5-Thireus-Q8_0.gguf'
srv  operator(): operator(): cleaning up before exit...
main: exiting due to model loading error
```
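
The load dies on `blk.3.exp_probs_b`, so a reasonable next step would be to confirm whether those expert-bias tensors are actually in the GGUF or whether the branch's loader is asking for something this split/requant never had. Below is only a sketch of that follow-up check, not part of the run above; it assumes gguf-py is installed (`pip install gguf`) and reuses the same model path.

```bash
# Sketch of a follow-up check (not run above): list the expert-bias tensors
# the GGUF actually contains, using gguf-py's gguf-dump script.
model=/mnt/data/models/Thireus/GLM-4.5-THIREUS-BF16-SPECIAL_SPLIT/GLM-4.5-Thireus-Q8_0.gguf

gguf-dump "$model" | grep exp_probs_b   # no output would mean the tensor really is absent
gguf-dump "$model" | grep 'blk\.3\.'    # see what block 3 does contain
```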