Last active
February 10, 2025 22:09
## EAGLE v1 Replication

Set up the environment and run an inference test:

```sh
git clone --branch v1 --depth 1 https://github.com/SafeAILab/EAGLE.git EAGLE-v1
cd EAGLE-v1
wget https://raw.githubusercontent.com/w32zhong/EAGLE/refs/heads/eagle-v1-save/application/test_v1.py -O eagle/application/test_v1.py
pip install -e .
pip install transformers==4.36.2
pip install accelerate==0.21.0
pip install datasets==3.2.0
cd eagle
CUDA_VISIBLE_DEVICES=0 python application/test_v1.py
```
Go to `eagle/ge_data/allocation.py` and change the training GPU allocation. For example, in my case:

```py
gpus = [[0, 1], [2, 3]]
```
Then, to train a Llama base model, replace `ge_data_all_vicuna.py` with `ge_data_all_llama2chat.py` in `allocation.py`.
Go to `eagle/ge_data/ge_data_all_llama2chat.py` and change the following:
```py
# bigname="/home/hongyanz/scratch/weights/llama2chat/13B"
bigname="meta-llama/Llama-2-7b-chat-hf"
...
# ds = load_dataset('json', data_files="/home/hongyanz/scratch/data/ShareGPT_V4.3_unfiltered_cleaned_split.json")
ds = load_dataset(
    path="Aeala/ShareGPT_Vicuna_unfiltered",
    data_files=["ShareGPT_V4.3_unfiltered_cleaned_split.json"],
    revision='8b0048ad6ae8c22f46a78c15559dec98feef5539'
)
```
Run the following to generate the training data:

```sh
cd ge_data
python -m eagle.ge_data.allocation --outdir /mnt/wd_ssd/
```

(`/mnt/wd_ssd` is my data storage directory.)
This will take a few hours and consume 756 GiB of disk space.
Change directory to `../train` and modify the wandb settings in `main.py`:

```py
#wandb.init(project="ess", entity="yuhui-li", config=train_config)
wandb.init(project="beagle", config=train_config)
```
Importantly, change the `list_files` function to filter out empty training files (in my experience, about 0.5% of the generated inputs are empty), and skip all in-training tests to avoid potential division-by-zero errors.
Check out [this patch](https://github.com/w32zhong/EAGLE/blob/eagle-v1-save/patch_v1.diff) for all the detailed changes.
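A minimal sketch of such a filter (the function body here is an assumption for illustration; the upstream `list_files` simply walks the data directory and returns every file path it finds):

```python
import os

def list_files(path):
    # Walk `path` and collect training data files, skipping
    # zero-byte ones (roughly 0.5% of generated inputs in my runs).
    datapath = []
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            if os.path.getsize(file_path) == 0:
                continue  # drop empty training files
            datapath.append(file_path)
    return sorted(datapath)
```

Filtering here is cheaper than guarding the training loop, since the dataloader never sees the bad files at all.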
Now train the speculative decoder model:

```sh
CUDA_VISIBLE_DEVICES=0,1,2,3 accelerate launch -m --mixed_precision=bf16 eagle.train.main \
    --tmpdir /mnt/wd_ssd/sharegpt_0_67999_mufp16/ --cpdir ./ckpt --configpath ./llama_2_chat_7B_config.json \
    --basepath ~/.cache/huggingface/hub/models--meta-llama--Llama-2-7b-chat-hf/snapshots/f5db02db724555f92da89c216ac04704f23d4590 \
    --gradient-accumulation-steps 4 --bs 1
```