Created
October 18, 2024 21:06
xezpeleta created this gist
Oct 18, 2024
# Itzune NMT models training

## Prepare the environment

### Requirements

- Ubuntu 20.04 or 22.04
- CUDA 11

```
docker run --gpus all -it -v <host_dir>:<container_dir> nvcr.io/nvidia/tensorflow:22.10-tf2-py3
```

Replace `<host_dir>:<container_dir>` with the volume you want to mount in the container.

Check other containers in the [Nvidia NGC catalog](https://catalog.ngc.nvidia.com/orgs/nvidia/containers/cuda/tags).

## Download

```
git clone https://github.com/xezpeleta/nmt-models.git
cd nmt-models/
```

## Installation

```
apt-get update
apt-get install libcudart10.1 python3-pip zip
```

```
cd ./install-scripts
pip3 uninstall cudf onnx
pip3 install -r requirements.txt
```

Check that everything is installed correctly:

```
./versions.sh
```

If the check reports the following error:

```
Could not load dynamic library 'libnvinfer_plugin.so.7'
```

(is `pip install tensorrt` sometimes needed?)

link the version 8 libraries under the version 7 names and add the directory to the library path:

```
cd /usr/lib/x86_64-linux-gnu/
ln -s libnvinfer_plugin.so.8 libnvinfer_plugin.so.7
ln -s libnvinfer.so.8 libnvinfer.so.7
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/lib/x86_64-linux-gnu/
```

## Get corpus

```
apt install git-lfs    # TODO: move into the install script?
cd languages
git lfs install        # TODO: move into the get-corpuses.sh script?
./get-corpus.sh
```

## Download evaluation dataset

```
bash get_flores.sh
```

## Preprocess

```
./preprocess-all.sh
```

## Train

```
cd languages/eng-eus
voc.sh
train.sh
```

Multi-GPU: add `--num_gpus 2`.

### Wandb integration

```
pip install wandb
wandb login

# Initial upload of the run directory
wandb sync -p OpenNMT-engeus --id v1 nmt-models/languages/eng-eus/run/

# Append new data to the same run
wandb sync -p OpenNMT-engeus --id v1 --append nmt-models/languages/eng-eus/run/

# Re-sync every 10 minutes
watch -n 600 wandb sync -p OpenNMT-engeus --id v1 --append nmt-models/languages/eng-eus/run/
```
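
The `watch -n 600 wandb sync ...` command keeps re-syncing until it is interrupted by hand. As a minimal sketch of an alternative, assuming you would rather stop syncing automatically once training has finished, the helper below repeats a command while a given process is alive and then runs it one last time. The function name `sync_while_running` and its arguments are hypothetical and not part of the nmt-models repository; the `wandb sync` flags in the usage comment are the ones already used in this gist.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of nmt-models): repeat a command every
# INTERVAL seconds while process PID is alive, then run it one final time.
#
# Usage: sync_while_running PID INTERVAL COMMAND [ARGS...]
sync_while_running() {
    local pid="$1" interval="$2"
    shift 2
    # kill -0 sends no signal; it only checks that the process still exists.
    while kill -0 "$pid" 2>/dev/null; do
        "$@" || echo "sync failed, will retry on the next pass" >&2
        sleep "$interval"
    done
    # Final pass so that the last logs written by training are uploaded too.
    "$@"
}

# Example, assuming training is started in the background from this shell:
#   train.sh &
#   sync_while_running "$!" 600 \
#       wandb sync -p OpenNMT-engeus --id v1 --append nmt-models/languages/eng-eus/run/
```

Tying the loop to the training process ID avoids leaving a `watch` session running after training ends, and the final pass covers anything written between the last periodic sync and the end of training.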