Last active: June 2, 2023 17:54
Revisions
fearnworks revised this gist
Jun 2, 2023 · 1 changed file with 5 additions and 1 deletion.
fearnworks created this gist
Jun 2, 2023
workspace/llm-playground/notebooks/axolotl/runpod/axolotl-falcon-7b-qlora-gsm8k.ipynb

Steps to reproduce:
1) Copy config from #4 run-16: 40*2 + xformer into examples/falcon/qlora.yml
2) Run cells 1 & 2
3) Run !accelerate launch scripts/finetune.py examples/falcon/qlora.yml
4) kaboom

Runpod config

Stacktrace:
```
Loading checkpoint shards: 100%|██████████████████| 2/2 [00:17<00:00,  8.71s/it]
Downloading (…)neration_config.json: 100%|█████| 111/111 [00:00<00:00, 18.8kB/s]
INFO:root:converting PEFT model w/ prepare_model_for_int8_training
/root/miniconda3/envs/py3.9/lib/python3.9/site-packages/peft/utils/other.py:76: FutureWarning: prepare_model_for_int8_training is deprecated and will be removed in a future version. Use prepare_model_for_kbit_training instead.
  warnings.warn(
INFO:root:found linear modules: ['query_key_value', 'dense', 'dense_4h_to_h', 'dense_h_to_4h']
trainable params: 130547712 || all params: 3739292544 || trainable%: 3.4912409356543783
INFO:root:Compiling torch model
INFO:root:Pre-saving adapter config to ./qlora-out
INFO:root:Starting trainer...
Traceback (most recent call last):
  File "/workspace/axolotl/scripts/finetune.py", line 294, in <module>
    fire.Fire(train)
  File "/root/miniconda3/envs/py3.9/lib/python3.9/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/root/miniconda3/envs/py3.9/lib/python3.9/site-packages/fire/core.py", line 475, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/root/miniconda3/envs/py3.9/lib/python3.9/site-packages/fire/core.py", line 691, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "/workspace/axolotl/scripts/finetune.py", line 281, in train
    trainer.train(resume_from_checkpoint=resume_from_checkpoint)
  File "/root/miniconda3/envs/py3.9/lib/python3.9/site-packages/transformers/trainer.py", line 1661, in train
    return inner_training_loop(
  File "/root/miniconda3/envs/py3.9/lib/python3.9/site-packages/transformers/trainer.py", line 1767, in _inner_training_loop
    model, self.optimizer, self.lr_scheduler = self.accelerator.prepare(
  File "/root/miniconda3/envs/py3.9/lib/python3.9/site-packages/accelerate/accelerator.py", line 1192, in prepare
    result = tuple(
  File "/root/miniconda3/envs/py3.9/lib/python3.9/site-packages/accelerate/accelerator.py", line 1193, in <genexpr>
    self._prepare_one(obj, first_pass=True, device_placement=d) for obj, d in zip(args, device_placement)
  File "/root/miniconda3/envs/py3.9/lib/python3.9/site-packages/accelerate/accelerator.py", line 1042, in _prepare_one
    return self.prepare_model(obj, device_placement=device_placement)
  File "/root/miniconda3/envs/py3.9/lib/python3.9/site-packages/accelerate/accelerator.py", line 1260, in prepare_model
    raise ValueError(
ValueError: You can't train a model that has been loaded in 8-bit precision on a different device than the one you're training on. Make sure you loaded the model on the correct device using for example `device_map={'':torch.cuda.current_device()}` or `device_map={'':torch.xpu.current_device()}`
Traceback (most recent call last):
  File "/root/miniconda3/envs/py3.9/bin/accelerate", line 8, in <module>
    sys.exit(main())
  File "/root/miniconda3/envs/py3.9/lib/python3.9/site-packages/accelerate/commands/accelerate_cli.py", line 45, in main
    args.func(args)
  File "/root/miniconda3/envs/py3.9/lib/python3.9/site-packages/accelerate/commands/launch.py", line 934, in launch_command
    simple_launcher(args)
  File "/root/miniconda3/envs/py3.9/lib/python3.9/site-packages/accelerate/commands/launch.py", line 594, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/root/miniconda3/envs/py3.9/bin/python3', 'scripts/finetune.py', 'examples/falcon/qlora.yml']' returned non-zero exit status 1.
```
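The ValueError at the end of the first traceback suggests pinning the quantized model to a single device. Below is a minimal sketch of that device placement using plain transformers + bitsandbytes; it is not axolotl's own loading code, only an illustration of the `device_map={'': torch.cuda.current_device()}` hint from the error message, with settings mirrored from the config further down (the compute dtype is an assumption based on `bf16: true`).

```python
# Sketch only: illustrates the device_map fix suggested by the ValueError,
# assuming transformers and bitsandbytes are installed. Not axolotl internals.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # load_in_4bit: true in the config below
    bnb_4bit_compute_dtype=torch.bfloat16,  # assumed from bf16: true
)

model = AutoModelForCausalLM.from_pretrained(
    "tiiuae/falcon-7b",
    trust_remote_code=True,                 # Falcon ships custom modeling code
    quantization_config=bnb_config,
    # Pin every module to the current CUDA device instead of letting accelerate
    # spread the quantized weights across devices, which triggers the ValueError.
    device_map={"": torch.cuda.current_device()},
)
tokenizer = AutoTokenizer.from_pretrained("tiiuae/falcon-7b", trust_remote_code=True)
```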
Using this config:
```
base_model: tiiuae/falcon-7b
base_model_config: tiiuae/falcon-7b
trust_remote_code: true
model_type: AutoModelForCausalLM
tokenizer_type: AutoTokenizer
load_in_8bit: false
load_in_4bit: true
gptq: false
strict: false
push_dataset_to_hub:
datasets:
  - path: QingyiSi/Alpaca-CoT
    data_files:
      - Chain-of-Thought/formatted_cot_data/gsm8k_train.json
    type: "alpaca:chat"
dataset_prepared_path: last_run_prepared
val_set_size: 0.01
adapter: qlora
lora_model_dir:
sequence_len: 2048
max_packed_sequence_len:
lora_r: 64
lora_alpha: 16
lora_dropout: 0.05
lora_target_modules:
lora_target_linear: true
lora_fan_in_fan_out:
wandb_project: falcon-qlora
wandb_watch:
wandb_run_id:
wandb_log_model:
output_dir: ./qlora-out
micro_batch_size: 40
gradient_accumulation_steps: 2
num_epochs: 3
optimizer: paged_adamw_32bit
torchdistx_path:
lr_scheduler: cosine
learning_rate: 0.0002
train_on_inputs: false
group_by_length: false
bf16: true
fp16: false
tf32: true
gradient_checkpointing: true
# stop training after this many evaluation losses have increased in a row
# https://huggingface.co/transformers/v4.2.2/_modules/transformers/trainer_callback.html#EarlyStoppingCallback
early_stopping_patience: 3
resume_from_checkpoint:
auto_resume_from_checkpoints: true
local_rank:
logging_steps: 1
xformers_attention: true
flash_attention:
gptq_groupsize:
gptq_model_v1:
warmup_steps: 10
eval_steps: 5
save_steps: 10
debug:
deepspeed:
weight_decay: 0.000001
fsdp:
fsdp_config:
special_tokens:
  pad_token: "<|endoftext|>"
  bos_token: ">>ABSTRACT<<"
  eos_token: "<|endoftext|>"
```
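For reference, the QLoRA-related values in this YAML (`lora_r: 64`, `lora_alpha: 16`, `lora_dropout: 0.05`, `lora_target_linear: true`) roughly map onto the peft calls sketched below, continuing from the loading sketch above. This is a sketch of the standard peft API, not axolotl's actual code path; the target module list is copied from the "found linear modules" line in the log, and `bias`/`task_type` are the usual causal-LM choices rather than values taken from the config.

```python
# Rough sketch of the LoRA adapter described by the YAML above, using plain peft.
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# The FutureWarning in the log recommends this over prepare_model_for_int8_training.
model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(
    r=64,              # lora_r
    lora_alpha=16,     # lora_alpha
    lora_dropout=0.05, # lora_dropout
    # from the "found linear modules" log line (lora_target_linear: true)
    target_modules=["query_key_value", "dense", "dense_4h_to_h", "dense_h_to_4h"],
    bias="none",             # assumption: typical causal-LM setting
    task_type="CAUSAL_LM",   # assumption: typical causal-LM setting
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # the log reports ~130.5M trainable params (3.49%)
```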