Last active: February 10, 2025 22:09
## Revisions
**w32zhong revised this gist** (Feb 10, 2025; 1 changed file with 5 additions and 0 deletions), appending the following after the training command:

After training, use [a simple test script](https://github.com/w32zhong/EAGLE/blob/eagle-v1-save/application/test_v1.py) to evaluate the speed based on the saved model. For example, for the 10-epoch checkpoint, change `ea_model_path` to

```py
ea_model_path = '../EAGLE-v1/eagle/train/ckpt/model_9'
```
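The wall-clock speedup measured by such a test script ultimately comes from the base model committing more than one token per forward pass. As a rough way to interpret the numbers, here is a hypothetical helper (not part of the script) expressing that metric:

```python
def tokens_per_forward(total_new_tokens: int, target_forward_passes: int) -> float:
    """Average tokens committed per base-model forward pass.

    Plain autoregressive decoding yields exactly 1.0; speculative
    decoding with accepted draft tokens yields > 1.0, which (ignoring
    the small draft-model cost) roughly bounds the wall-clock speedup.
    """
    return total_new_tokens / target_forward_passes
```

For instance, committing 300 new tokens over 100 base-model forward passes gives 3.0 tokens per forward pass.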
**w32zhong renamed this gist** (Feb 6, 2025). File renamed without changes.
**w32zhong created this gist** (Feb 6, 2025) with the content below.
## EAGLE v1 Replication

Set up the environment and run an inference test:

```sh
git clone --branch v1 --depth 1 https://github.com/SafeAILab/EAGLE.git EAGLE-v1
cd EAGLE-v1
wget https://raw.githubusercontent.com/w32zhong/EAGLE/refs/heads/eagle-v1-save/application/test_v1.py -O eagle/application/test_v1.py
pip install -e .
pip install transformers==4.36.2
pip install accelerate==0.21.0
pip install datasets==3.2.0
cd eagle
CUDA_VISIBLE_DEVICES=0 python application/test_v1.py
```

Go to `eagle/ge_data/allocation.py` and change the GPU allocation for training-data generation. For example, in my case:

```py
gpus = [[0, 1], [2, 3]]
```

Then replace `ge_data_all_vicuna.py` with `ge_data_all_llama2chat.py` in `allocation.py` to use the Llama-2 chat base model.

Go to `eagle/ge_data/ge_data_all_llama2chat.py` and change the following:

```py
# bigname = "/home/hongyanz/scratch/weights/llama2chat/13B"
bigname = "meta-llama/Llama-2-7b-chat-hf"
...
# ds = load_dataset('json', data_files="/home/hongyanz/scratch/data/ShareGPT_V4.3_unfiltered_cleaned_split.json")
ds = load_dataset(
    path="Aeala/ShareGPT_Vicuna_unfiltered",
    data_files=["ShareGPT_V4.3_unfiltered_cleaned_split.json"],
    revision='8b0048ad6ae8c22f46a78c15559dec98feef5539'
)
```

Run the following to generate training data (`/mnt/wd_ssd` is my data storage directory):

```sh
cd ge_data
python -m eagle.ge_data.allocation --outdir /mnt/wd_ssd/
```

This will take a few hours and consume 756 GiB of disk space.
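With `gpus = [[0, 1], [2, 3]]`, `allocation.py` spawns one data-generation process per GPU group, each covering a contiguous slice of the ShareGPT samples (the output directory name `sharegpt_0_67999_mufp16` reflects the overall 0–67999 range). A rough sketch of such a split, under my assumptions about how the sharding works and not the actual `allocation.py` code:

```python
def split_range(start: int, end: int, num_groups: int):
    """Split the sample index range [start, end) into one contiguous
    slice per GPU group; the last group absorbs any remainder."""
    per = (end - start) // num_groups
    return [
        (start + i * per,
         start + (i + 1) * per if i < num_groups - 1 else end)
        for i in range(num_groups)
    ]
```

With two GPU groups and 68,000 samples, each group would process 34,000 samples.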
Change directory to `../train` and modify the wandb settings in `main.py`:

```py
# wandb.init(project="ess", entity="yuhui-li", config=train_config)
wandb.init(project="beagle", config=train_config)
```

Importantly, change the `list_files` function to filter out empty training files (in my experience, about 0.5% of the inputs are empty), and skip all in-training tests due to potential divide-by-zero errors. Check out [this patch](https://github.com/w32zhong/EAGLE/blob/eagle-v1-save/patch_v1.diff) for all detailed changes.

Now train the speculative decoder model:

```sh
CUDA_VISIBLE_DEVICES=0,1,2,3 accelerate launch -m --mixed_precision=bf16 eagle.train.main \
    --tmpdir /mnt/wd_ssd/sharegpt_0_67999_mufp16/ --cpdir ./ckpt --configpath ./llama_2_chat_7B_config.json \
    --basepath ~/.cache/huggingface/hub/models--meta-llama--Llama-2-7b-chat-hf/snapshots/f5db02db724555f92da89c216ac04704f23d4590 \
    --gradient-accumulation-steps 4 --bs 1
```
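The linked patch contains the actual changes; purely as an illustration, a filtering `list_files` could look like the hypothetical sketch below (assuming each training sample is stored as one file under `--tmpdir`):

```python
import os

def list_files(path: str):
    """Collect training data files under `path`, skipping empty ones.

    Dropping zero-byte files here keeps the ~0.5% of empty inputs
    from ever reaching the dataloader and crashing training.
    """
    datapath = []
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            if os.path.getsize(file_path) > 0:  # skip empty training files
                datapath.append(file_path)
    return sorted(datapath)
```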