This command-line tool avoids the complexity and frequent disruptions often encountered with `snapshot_download` and `git clone` when fetching large models such as LLMs. It uses `wget -c` (which supports resuming) for Git LFS files and `git clone` for the rest.
- 🚀 Resume Support: Press Ctrl+C anytime and re-run to continue from the breakpoint.
- 🚫 File Exclusion: Use `--exclude` to skip specific files, saving time for models available in duplicate formats (e.g., `.bin` and `.safetensors`).
- 🔐 Auth Support: For gated models that require a Hugging Face login, use `--hf_username` and `--hf_token` to authenticate.
- 🌍 Proxy Support: Set the `HTTPS_PROXY` environment variable.
- 📦 Simple: No dependencies and no installation required.
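For example, to route downloads through a local proxy, export the variable before running the script (the address below is an assumption; substitute your own). Exporting both spellings covers tools that only read one form:

```shell
# Hypothetical local proxy address; replace with your own.
export HTTPS_PROXY=http://127.0.0.1:7890
# Some tools only read the lowercase form, so set it too.
export https_proxy=$HTTPS_PROXY
```

With these set, the `wget` and `git` calls the script issues will go through the proxy.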
First, download `hfd.sh` from this repo.
```shell
$ ./hfd.sh -h
Usage:
  hfd <model_id> [--exclude exclude_pattern] [--hf_username username] [--hf_token token]
```
Download a model:

```shell
./hfd.sh bigscience/bloom-560m
```
Download a model that requires login:

Get a Hugging Face token from https://huggingface.co/settings/tokens, then:

```shell
./hfd.sh meta-llama/Llama-2-7b --hf_username YOUR_HF_USERNAME --hf_token YOUR_HF_TOKEN
```

Download a model and exclude certain files (e.g., `.safetensors`):

```shell
./hfd.sh bigscience/bloom-560m --exclude safetensors
```

Output: during the download, the file URLs are displayed:
```shell
$ ./hfd.sh bigscience/bloom-560m --exclude safetensors
...
Start Downloading lfs files, bash script:
wget -c https://huggingface.co/bigscience/bloom-560m/resolve/main/flax_model.msgpack
# wget -c https://huggingface.co/bigscience/bloom-560m/resolve/main/model.safetensors
wget -c https://huggingface.co/bigscience/bloom-560m/resolve/main/onnx/decoder_model.onnx
...
```

For easier access, you can create an alias for the script:
```shell
alias hfd="$PWD/hfd.sh"
```
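To keep the alias across shell sessions, append it to your shell's startup file (the sketch below assumes bash; adapt the path for zsh or other shells):

```shell
# Persist the alias for future sessions (~/.bashrc assumes bash).
echo "alias hfd=\"$PWD/hfd.sh\"" >> ~/.bashrc
```

Then open a new shell, or `source ~/.bashrc`, and run `hfd <model_id>` from anywhere.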