@w32zhong
Last active February 10, 2025 22:09
## EAGLE v1 Replication
Set up environment and run an inference test:
```sh
git clone --branch v1 --depth 1 https://github.com/SafeAILab/EAGLE.git EAGLE-v1
cd EAGLE-v1
wget https://raw.githubusercontent.com/w32zhong/EAGLE/refs/heads/eagle-v1-save/application/test_v1.py -O eagle/application/test_v1.py
pip install -e .
pip install transformers==4.36.2
pip install accelerate==0.21.0
pip install datasets==3.2.0
cd eagle
CUDA_VISIBLE_DEVICES=0 python application/test_v1.py
```
Go to `eagle/ge_data/allocation.py` and adjust the GPU allocation used for training-data generation. For example, in my case:
```py
gpus=[[0, 1],[2, 3]]
```
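Each inner list is one data-generation worker and the GPUs it uses; `allocation.py` splits the sample index range across those workers. The splitting logic can be sketched roughly as follows (a hypothetical simplification, not the actual script):

```python
# Sketch: split a data index range across GPU groups, roughly what
# eagle/ge_data/allocation.py does (simplified, hypothetical).
gpus = [[0, 1], [2, 3]]  # two workers, two GPUs each

def split_range(start, end, num_workers):
    """Divide [start, end) into num_workers contiguous chunks."""
    per_worker = (end - start) // num_workers
    chunks = []
    for i in range(num_workers):
        lo = start + i * per_worker
        hi = end if i == num_workers - 1 else lo + per_worker
        chunks.append((lo, hi))
    return chunks

# ~68k ShareGPT samples split over len(gpus) workers
print(split_range(0, 68000, len(gpus)))  # → [(0, 34000), (34000, 68000)]
```

With two workers, the first handles samples 0–33999 on GPUs 0 and 1 while the second handles 34000–67999 on GPUs 2 and 3, which matches the `sharegpt_0_67999` output directory name used later.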
Then, in `allocation.py`, replace `ge_data_all_vicuna.py` with `ge_data_all_llama2chat.py` to generate data from the Llama-2 base model.
Go to `eagle/ge_data/ge_data_all_llama2chat.py` and change the following:
```py
# bigname="/home/hongyanz/scratch/weights/llama2chat/13B"
bigname="meta-llama/Llama-2-7b-chat-hf"
...
# ds = load_dataset('json', data_files="/home/hongyanz/scratch/data/ShareGPT_V4.3_unfiltered_cleaned_split.json")
ds = load_dataset(
    path="Aeala/ShareGPT_Vicuna_unfiltered",
    data_files=["ShareGPT_V4.3_unfiltered_cleaned_split.json"],
    revision='8b0048ad6ae8c22f46a78c15559dec98feef5539',
)
```
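For orientation, each record in this ShareGPT dump is a JSON object with an `id` and a `conversations` list of alternating turns; the field names below are assumed from the `ShareGPT_Vicuna_unfiltered` dataset, so verify them against your download:

```python
import json

# Minimal sketch of the ShareGPT record layout the data-generation
# script consumes (field names assumed, not taken from the EAGLE code).
sample = json.loads("""
{
  "id": "demo_0",
  "conversations": [
    {"from": "human", "value": "Hello!"},
    {"from": "gpt", "value": "Hi, how can I help?"}
  ]
}
""")

# Extract the (speaker, text) turns that the script pairs up
turns = [(t["from"], t["value"]) for t in sample["conversations"]]
print(turns)
```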
Run the following to generate training data:
```sh
cd ge_data
python -m eagle.ge_data.allocation --outdir /mnt/wd_ssd/
```
(`/mnt/wd_ssd` is my data storage directory.)
This will take a few hours and consume about 756 GiB of disk space.
Change directory to `../train` and modify the wandb settings in `main.py`:
```py
#wandb.init(project="ess", entity="yuhui-li", config=train_config)
wandb.init(project="beagle", config=train_config)
```
Importantly, change the `list_files` function to filter out empty training files (in my experience about 0.5% of the generated inputs are empty), and skip all in-training tests to avoid potential division-by-zero errors.
Check out [this patch](https://github.com/w32zhong/EAGLE/blob/eagle-v1-save/patch_v1.diff) for all detailed changes.
Now train the speculative decoder model:
```sh
CUDA_VISIBLE_DEVICES=0,1,2,3 accelerate launch -m --mixed_precision=bf16 eagle.train.main \
--tmpdir /mnt/wd_ssd/sharegpt_0_67999_mufp16/ --cpdir ./ckpt --configpath ./llama_2_chat_7B_config.json \
--basepath ~/.cache/huggingface/hub/models--meta-llama--Llama-2-7b-chat-hf/snapshots/f5db02db724555f92da89c216ac04704f23d4590 \
--gradient-accumulation-steps 4 --bs 1
```
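For reference, the effective global batch size implied by these flags is the per-device batch size times the gradient-accumulation steps times the number of visible GPUs:

```python
# Effective global batch size for the command above
num_gpus = 4          # CUDA_VISIBLE_DEVICES=0,1,2,3
per_device_bs = 1     # --bs 1
grad_accum = 4        # --gradient-accumulation-steps 4

effective_bs = num_gpus * per_device_bs * grad_accum
print(effective_bs)  # → 16
```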