xezpeleta · October 18, 2024 21:06
diff --git a/itzune_train.md b/itzune_train.md
@@ -0,0 +1,91 @@
+# Itzune NMT models training
+
+
+## Prepare the environment
+
+### Requirements
+- Ubuntu 20.04 or 22.04
+- CUDA 11
+
+```
+docker run --gpus all -it nvcr.io/nvidia/tensorflow:22.10-tf2-py3
+```
+
+Other containers are listed in the [NVIDIA NGC catalog](https://catalog.ngc.nvidia.com/orgs/nvidia/containers/cuda/tags).
+
+## Download
+
+
+git clone https://github.com/xezpeleta/nmt-models.git
+cd nmt-models/
+
+## Installation
+
+
+```
+apt-get update
+apt-get install libcudart10.1 python3-pip zip
+```
+
+
+```
+cd ./install-scripts
+pip3 uninstall cudf onnx
+pip3 install -r requirements.txt
+```
+
+Check that everything is installed correctly:
+
+```
+./versions.sh
+```
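+
+If the output contains an error like this:
+
+```
+Could not load dynamic library 'libnvinfer_plugin.so.7'
+```
+
+symlink the installed `.so.8` libraries to the `.so.7` names (sometimes `pip install tensorrt` may be needed?):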
+
+cd /usr/lib/x86_64-linux-gnu/
+ln -s libnvinfer_plugin.so.8 libnvinfer_plugin.so.7
+ln -s libnvinfer.so.8 libnvinfer.so.7
+export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/lib/x86_64-linux-gnu/
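
The two `ln -s` commands above fail with "File exists" when they are run a second time. A minimal sketch of a re-runnable variant, assuming a POSIX shell; the helper name `link_compat` is hypothetical and not part of the repository:

```shell
# link_compat DIR TARGET LINK
# Create DIR/LINK as a relative symlink to TARGET, but only when
# DIR/TARGET exists and DIR/LINK is not already present.
# (Hypothetical helper for illustration.)
link_compat() {
    lc_dir=$1
    lc_target=$2
    lc_link=$3

    # Nothing to link to: report and signal failure.
    if [ ! -e "$lc_dir/$lc_target" ]; then
        echo "skip: $lc_dir/$lc_target not found" >&2
        return 1
    fi

    # Link name already taken (file or symlink): leave it alone.
    if [ -e "$lc_dir/$lc_link" ] || [ -L "$lc_dir/$lc_link" ]; then
        echo "skip: $lc_dir/$lc_link already exists" >&2
        return 0
    fi

    ln -s "$lc_target" "$lc_dir/$lc_link"
}
```

Example call, matching the commands above: `link_compat /usr/lib/x86_64-linux-gnu libnvinfer.so.8 libnvinfer.so.7`.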
+
+## Get corpus
+
+apt install git-lfs  (add to the install script?)
+
+cd languages
+
+git lfs install (add to the get-corpus.sh script?)
+
+./get-corpus.sh
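
The notes above ask whether the git-lfs steps belong in the scripts. If they were moved there, a guard at the top of the script might look like the sketch below. `have_cmd` and `ensure_git_lfs` are hypothetical names, and nothing here is taken from the actual `get-corpus.sh`:

```shell
# Succeed if the named command can be found on PATH.
# (Hypothetical helper for illustration.)
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Stop early with a hint when git-lfs is missing; otherwise set up
# the Git LFS hooks for the current user.
ensure_git_lfs() {
    if ! have_cmd git-lfs; then
        echo "git-lfs is not installed; run: apt install git-lfs" >&2
        return 1
    fi
    git lfs install
}
```

Calling `ensure_git_lfs || exit 1` before the download step would make the script fail with a clear message instead of partway through.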
+
+## Download evaluation dataset
+
+bash get_flores.sh
+
+## Preprocess
+
+./preprocess-all.sh
+
+
+## Train
+
+cd languages/eng-eus
+
+./voc.sh
+
+
+./train.sh
+
+For multi-GPU training, add `--num_gpus 2`.
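
One way to choose the value for `--num_gpus` is to count the devices that `nvidia-smi -L` lists, one per line starting with `GPU `. A sketch, assuming `nvidia-smi` is available inside the container; `count_gpus` is a hypothetical helper:

```shell
# Count lines on stdin that start with "GPU ", which is how
# `nvidia-smi -L` prints each physical device.
# (Hypothetical helper for illustration.)
count_gpus() {
    # grep -c exits non-zero when the count is 0; keep the status clean.
    grep -c '^GPU ' || true
}

# Only query the driver when nvidia-smi is actually present.
if command -v nvidia-smi >/dev/null 2>&1; then
    num_gpus=$(nvidia-smi -L | count_gpus)
    echo "detected $num_gpus GPU(s); flag would be: --num_gpus $num_gpus"
fi
```

Reading from stdin keeps the counting logic separate from the driver call, so it can be checked on a machine without a GPU.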
+
+### Wandb integration
+
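+pip install wandb
+wandb login
+
+Upload the run directory once:
+
+wandb sync -p OpenNMT-engeus --id v1 nmt-models/languages/eng-eus/run/
+
+Append new data to the same run on later syncs:
+
+wandb sync -p OpenNMT-engeus --id v1 --append nmt-models/languages/eng-eus/run/
+
+Repeat the sync every 600 seconds (10 minutes) while training runs:
+
+watch -n 600 wandb sync -p OpenNMT-engeus --id v1 --append nmt-models/languages/eng-eus/run/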