@fearnworks
Last active June 2, 2023 17:54
    Notebook: workspace/llm-playground/notebooks/axolotl/runpod/axolotl-falcon-7b-qlora-gsm8k.ipynb

    Steps to reproduce:
    1) Copy the config from #4 run-16: 40*2 + xformer into examples/falcon/qlora.yml (the full config is reproduced at the bottom of this report).
    2) Run cells 1 & 2 of the notebook.
    3) Run !accelerate launch scripts/finetune.py examples/falcon/qlora.yml
    4) Kaboom: the launch fails with the ValueError shown in the stacktrace below.

    Runpod config
    ![image](https://user-images.githubusercontent.com/120260158/242956630-cac84d95-7b6c-4a21-b1b5-853836492100.png)
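
    The screenshot above is the only record of the pod's GPU layout. For anyone triaging this in text form, the same information can be captured with a few lines of plain PyTorch (nothing axolotl-specific; this is just a hedged convenience snippet, not part of the repro):

    ```python
    import torch

    # Print the pod's CUDA layout so the device-placement error below can be
    # matched against the actual hardware.
    print("CUDA available:", torch.cuda.is_available())
    print("Device count  :", torch.cuda.device_count())
    for i in range(torch.cuda.device_count()):
        props = torch.cuda.get_device_properties(i)
        print(f"cuda:{i} -> {props.name}, {props.total_memory / 1024**3:.1f} GiB")
    print("Current device:", torch.cuda.current_device())
    ```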


    Stacktrace:
    ```
    Loading checkpoint shards: 100%|██████████████████| 2/2 [00:17<00:00, 8.71s/it]
    Downloading (…)neration_config.json: 100%|█████| 111/111 [00:00<00:00, 18.8kB/s]
    INFO:root:converting PEFT model w/ prepare_model_for_int8_training
    /root/miniconda3/envs/py3.9/lib/python3.9/site-packages/peft/utils/other.py:76: FutureWarning: prepare_model_for_int8_training is deprecated and will be removed in a future version. Use prepare_model_for_kbit_training instead.
    warnings.warn(
    INFO:root:found linear modules: ['query_key_value', 'dense', 'dense_4h_to_h', 'dense_h_to_4h']
    trainable params: 130547712 || all params: 3739292544 || trainable%: 3.4912409356543783
    INFO:root:Compiling torch model
    INFO:root:Pre-saving adapter config to ./qlora-out
    INFO:root:Starting trainer...
    Traceback (most recent call last):
    File "/workspace/axolotl/scripts/finetune.py", line 294, in <module>
    fire.Fire(train)
    File "/root/miniconda3/envs/py3.9/lib/python3.9/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
    File "/root/miniconda3/envs/py3.9/lib/python3.9/site-packages/fire/core.py", line 475, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
    File "/root/miniconda3/envs/py3.9/lib/python3.9/site-packages/fire/core.py", line 691, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
    File "/workspace/axolotl/scripts/finetune.py", line 281, in train
    trainer.train(resume_from_checkpoint=resume_from_checkpoint)
    File "/root/miniconda3/envs/py3.9/lib/python3.9/site-packages/transformers/trainer.py", line 1661, in train
    return inner_training_loop(
    File "/root/miniconda3/envs/py3.9/lib/python3.9/site-packages/transformers/trainer.py", line 1767, in _inner_training_loop
    model, self.optimizer, self.lr_scheduler = self.accelerator.prepare(
    File "/root/miniconda3/envs/py3.9/lib/python3.9/site-packages/accelerate/accelerator.py", line 1192, in prepare
    result = tuple(
    File "/root/miniconda3/envs/py3.9/lib/python3.9/site-packages/accelerate/accelerator.py", line 1193, in <genexpr>
    self._prepare_one(obj, first_pass=True, device_placement=d) for obj, d in zip(args, device_placement)
    File "/root/miniconda3/envs/py3.9/lib/python3.9/site-packages/accelerate/accelerator.py", line 1042, in _prepare_one
    return self.prepare_model(obj, device_placement=device_placement)
    File "/root/miniconda3/envs/py3.9/lib/python3.9/site-packages/accelerate/accelerator.py", line 1260, in prepare_model
    raise ValueError(
    ValueError: You can't train a model that has been loaded in 8-bit precision on a different device than the one you're training on. Make sure you loaded the model on the correct device using for example `device_map={'':torch.cuda.current_device() or device_map={'':torch.xpu.current_device()}
    Traceback (most recent call last):
    File "/root/miniconda3/envs/py3.9/bin/accelerate", line 8, in <module>
    sys.exit(main())
    File "/root/miniconda3/envs/py3.9/lib/python3.9/site-packages/accelerate/commands/accelerate_cli.py", line 45, in main
    args.func(args)
    File "/root/miniconda3/envs/py3.9/lib/python3.9/site-packages/accelerate/commands/launch.py", line 934, in launch_command
    simple_launcher(args)
    File "/root/miniconda3/envs/py3.9/lib/python3.9/site-packages/accelerate/commands/launch.py", line 594, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
    subprocess.CalledProcessError: Command '['/root/miniconda3/envs/py3.9/bin/python3', 'scripts/finetune.py', 'examples/falcon/qlora.yml']' returned non-zero exit status 1.
    ```
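
    The ValueError is raised inside accelerate's Accelerator.prepare_model: the quantized base model ends up mapped to a device other than the one the trainer is running on, and the message asks for the whole model to be pinned to the current CUDA device via device_map. The sketch below is not axolotl's loading code, just a minimal illustration of the loading pattern the error message points at; the model name comes from the config below, while the nf4 quant type and bfloat16 compute dtype are assumptions (typical QLoRA settings, mirroring bf16: true):

    ```python
    import torch
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig

    # Minimal sketch only -- not axolotl's loader. It shows the loading call the
    # ValueError points at: pin the 4-bit model to the device this process trains on.
    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,                      # matches load_in_4bit: true in qlora.yml
        bnb_4bit_quant_type="nf4",              # assumption: standard QLoRA quant type
        bnb_4bit_compute_dtype=torch.bfloat16,  # assumption, mirroring bf16: true
    )

    model = AutoModelForCausalLM.from_pretrained(
        "tiiuae/falcon-7b",
        trust_remote_code=True,                 # falcon-7b ships custom modeling code
        quantization_config=bnb_config,
        # The key line: map every module ("") onto the current training device
        # instead of letting an automatic device map spread the model around.
        device_map={"": torch.cuda.current_device()},
    )
    ```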

    Using this config:
    ```
    base_model: tiiuae/falcon-7b
    base_model_config: tiiuae/falcon-7b
    trust_remote_code: true
    model_type: AutoModelForCausalLM
    tokenizer_type: AutoTokenizer
    load_in_8bit: false
    load_in_4bit: true
    gptq: false
    strict: false
    push_dataset_to_hub:
    datasets:
      - path: QingyiSi/Alpaca-CoT
        data_files:
          - Chain-of-Thought/formatted_cot_data/gsm8k_train.json
        type: "alpaca:chat"
    dataset_prepared_path: last_run_prepared
    val_set_size: 0.01
    adapter: qlora
    lora_model_dir:
    sequence_len: 2048
    max_packed_sequence_len:
    lora_r: 64
    lora_alpha: 16
    lora_dropout: 0.05
    lora_target_modules:
    lora_target_linear: true
    lora_fan_in_fan_out:
    wandb_project: falcon-qlora
    wandb_watch:
    wandb_run_id:
    wandb_log_model:
    output_dir: ./qlora-out
    micro_batch_size: 40
    gradient_accumulation_steps: 2
    num_epochs: 3
    optimizer: paged_adamw_32bit
    torchdistx_path:
    lr_scheduler: cosine
    learning_rate: 0.0002
    train_on_inputs: false
    group_by_length: false
    bf16: true
    fp16: false
    tf32: true
    gradient_checkpointing: true
    # stop training after this many evaluation losses have increased in a row
    # https://huggingface.co/transformers/v4.2.2/_modules/transformers/trainer_callback.html#EarlyStoppingCallback
    early_stopping_patience: 3
    resume_from_checkpoint:
    auto_resume_from_checkpoints: true
    local_rank:
    logging_steps: 1
    xformers_attention: true
    flash_attention:
    gptq_groupsize:
    gptq_model_v1:
    warmup_steps: 10
    eval_steps: 5
    save_steps: 10
    debug:
    deepspeed:
    weight_decay: 0.000001
    fsdp:
    fsdp_config:
    special_tokens:
      pad_token: "<|endoftext|>"
      bos_token: ">>ABSTRACT<<"
      eos_token: "<|endoftext|>"
    ```
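
    For reference, the lora_* fields above correspond to a peft LoraConfig roughly like the sketch below. This is only an illustration of the adapter hyperparameters, not axolotl's internal code; target_modules is taken from the "found linear modules" line in the log (discovered because lora_target_linear: true is set), and bias/task_type are assumptions about the defaults:

    ```python
    from peft import LoraConfig

    # Hedged illustration of how the lora_* settings above map onto peft.
    lora_config = LoraConfig(
        r=64,               # lora_r
        lora_alpha=16,      # lora_alpha
        lora_dropout=0.05,  # lora_dropout
        bias="none",        # assumption
        task_type="CAUSAL_LM",
        # From the log: "found linear modules: ['query_key_value', 'dense',
        # 'dense_4h_to_h', 'dense_h_to_4h']"
        target_modules=["query_key_value", "dense", "dense_4h_to_h", "dense_h_to_4h"],
    )
    ```

    The trainable-parameter line in the log is consistent with these settings: 130,547,712 of 3,739,292,544 parameters is roughly 3.49%, matching the reported trainable%.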