@w32zhong
Last active February 10, 2025 22:09
Update (Feb 10, 2025): after completing the training steps below, use [a simple test script](https://github.com/w32zhong/EAGLE/blob/eagle-v1-save/application/test_v1.py) to evaluate the speed of the saved model. For example, for the 10-epoch checkpoint, change `ea_model_path` to
```py
ea_model_path='../EAGLE-v1/eagle/train/ckpt/model_9'
```
## EAGLE v1 Replication
Set up the environment and run an inference test:
```sh
git clone --branch v1 --depth 1 https://github.com/SafeAILab/EAGLE.git EAGLE-v1
cd EAGLE-v1
wget https://raw.githubusercontent.com/w32zhong/EAGLE/refs/heads/eagle-v1-save/application/test_v1.py -O eagle/application/test_v1.py
pip install -e .
pip install transformers==4.36.2
pip install accelerate==0.21.0
pip install datasets==3.2.0
cd eagle
CUDA_VISIBLE_DEVICES=0 python application/test_v1.py
```

Go to `eagle/ge_data/allocation.py` and change your training GPU allocations. For example, in my case,
```py
gpus=[[0, 1],[2, 3]]
```
Then replace `ge_data_all_vicuna.py` with `ge_data_all_llama2chat.py` in `allocation.py` to generate data for the Llama 2 chat base model.
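`allocation.py` launches one data-generation process per GPU group, each handling a contiguous slice of the ~68,000 ShareGPT samples. The exact logic lives in the repo; the idea is roughly the following sketch (the `split_range` helper is hypothetical, not the repo's code):

```python
def split_range(start, end, num_groups):
    """Partition the sample index range [start, end) into contiguous slices,
    one per GPU group; the last group absorbs any remainder."""
    per_group = (end - start) // num_groups
    slices = []
    for i in range(num_groups):
        s = start + i * per_group
        e = end if i == num_groups - 1 else s + per_group
        slices.append((s, e))
    return slices

gpus = [[0, 1], [2, 3]]                   # two groups of two GPUs each
print(split_range(0, 68000, len(gpus)))   # → [(0, 34000), (34000, 68000)]
```

With the two-group allocation above, each pair of GPUs processes half of the dataset in parallel.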

Go to `eagle/ge_data/ge_data_all_llama2chat.py` and change the following:
```py
# bigname="/home/hongyanz/scratch/weights/llama2chat/13B"
bigname="meta-llama/Llama-2-7b-chat-hf"
...
# ds = load_dataset('json', data_files="/home/hongyanz/scratch/data/ShareGPT_V4.3_unfiltered_cleaned_split.json")
ds = load_dataset(
    path="Aeala/ShareGPT_Vicuna_unfiltered",
    data_files=["ShareGPT_V4.3_unfiltered_cleaned_split.json"],
    revision='8b0048ad6ae8c22f46a78c15559dec98feef5539'
)
```
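For reference, each record in `ShareGPT_V4.3_unfiltered_cleaned_split.json` stores a conversation as a list of `{"from": "human"|"gpt", "value": ...}` turns, which the data-generation script walks to build training sequences. A minimal sketch of pairing turns under that schema (`to_pairs` is a hypothetical helper, not from the repo):

```python
def to_pairs(record):
    """Group a ShareGPT record's turns into (prompt, response) pairs,
    keeping only well-formed human→gpt exchanges."""
    turns = record["conversations"]
    pairs = []
    for i in range(0, len(turns) - 1, 2):
        if turns[i]["from"] == "human" and turns[i + 1]["from"] == "gpt":
            pairs.append((turns[i]["value"], turns[i + 1]["value"]))
    return pairs

example = {"conversations": [
    {"from": "human", "value": "Hi"},
    {"from": "gpt", "value": "Hello!"},
]}
print(to_pairs(example))  # → [('Hi', 'Hello!')]
```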

Run the following to generate training data:
```sh
cd ge_data
python -m eagle.ge_data.allocation --outdir /mnt/wd_ssd/
```
(`/mnt/wd_ssd` is my data storage directory.)

This will take a few hours and consume 756 GiB of disk space.
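Most of that space goes to the cached hidden states. A back-of-envelope check, assuming roughly one 4096-dimensional bf16 hidden state (Llama-2-7B's hidden size) per token dominates the storage:

```python
# Estimate the average conversation length implied by the observed disk usage.
samples = 68_000           # ShareGPT conversations processed
hidden_dim = 4096          # Llama-2-7B hidden size (assumption: one state per token)
bytes_per_value = 2        # bf16
total_bytes = 756 * 2**30  # observed 756 GiB

avg_tokens = total_bytes / (samples * hidden_dim * bytes_per_value)
print(round(avg_tokens))   # → 1457 tokens per conversation, a plausible average
```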

Change directory to `../train` and modify the wandb settings in `main.py`:
```py
#wandb.init(project="ess", entity="yuhui-li", config=train_config)
wandb.init(project="beagle", config=train_config)
```
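Alternatively, if you would rather disable logging entirely instead of repointing the project, wandb honors a standard environment variable (this is stock wandb behavior, not an EAGLE-specific setting):

```sh
# Turn every wandb.init()/wandb.log() call into a no-op for this shell session
export WANDB_MODE=disabled
```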

Importantly, change the `list_files` function to filter out empty training files (in my experience, about 0.5% of the generated inputs are empty), and skip all in-training tests to avoid potential divide-by-zero errors.
Check out [this patch](https://github.com/w32zhong/EAGLE/blob/eagle-v1-save/patch_v1.diff) for all detailed changes.
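The empty-file filter boils down to a size check when listing data files. A minimal sketch, assuming the generated files sit under the `--tmpdir` directory (see the linked patch for the actual change):

```python
import os

def list_files(path):
    """Collect training data files under `path`, skipping zero-byte ones.

    Empty files (~0.5% of the generated data in my runs) would otherwise
    crash the data loader, hence the getsize() guard.
    """
    datapath = []
    for root, _, files in os.walk(path):
        for fname in files:
            fpath = os.path.join(root, fname)
            if os.path.getsize(fpath) > 0:
                datapath.append(fpath)
    return sorted(datapath)
```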

Now train the speculative decoder model:
```sh
CUDA_VISIBLE_DEVICES=0,1,2,3 accelerate launch -m --mixed_precision=bf16 eagle.train.main \
--tmpdir /mnt/wd_ssd/sharegpt_0_67999_mufp16/ --cpdir ./ckpt --configpath ./llama_2_chat_7B_config.json \
--basepath ~/.cache/huggingface/hub/models--meta-llama--Llama-2-7b-chat-hf/snapshots/f5db02db724555f92da89c216ac04704f23d4590 \
--gradient-accumulation-steps 4 --bs 1
```
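With these flags, the effective global batch size works out to 16 (a quick sanity check, assuming the usual meaning of per-device batch size and gradient accumulation under `accelerate`):

```python
# Effective global batch size of the training command above.
num_gpus = 4        # CUDA_VISIBLE_DEVICES=0,1,2,3
per_device_bs = 1   # --bs 1
grad_accum = 4      # --gradient-accumulation-steps 4

effective_bs = num_gpus * per_device_bs * grad_accum
print(effective_bs)  # → 16
```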