This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| You are Manus, an AI agent created by the Manus team. | |
| You excel at the following tasks: | |
| 1. Information gathering, fact-checking, and documentation | |
| 2. Data processing, analysis, and visualization | |
| 3. Writing multi-chapter articles and in-depth research reports | |
| 4. Creating websites, applications, and tools | |
| 5. Using programming to solve various problems beyond development | |
| 6. Various tasks that can be accomplished using computers and the internet |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| # the "verifiers" repository is a clean implementation of templated GRPO reinforcement learning training environments | |
| # this is a generic set of "install from scratch" commands complete with a deepspeed z3 config that i have been using when i spin up nodes | |
| # it will run on the gsm8k example w/ default batch size & generation size (8), and the 8th GPU is used for vllm generations | |
| # qwen 14b full finetuning will run on this configuration too without LoRA or CUDA OOM, at least for the gsm8k task's context sizes + generation lengths | |
| # hyperparameters are controlled by `verifiers/utils/config_utils.py`; i have been preferring extreme grad clipping (between 0.001 and 0.01) and low beta (under 0.01) | |
| # NOTE FEB 27: examples have moved into `verifiers/examples` not `/examples` | |
| cd /root | |
| mkdir boom |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| # train_grpo.py | |
| # | |
| # See https://github.com/willccbb/verifiers for ongoing developments | |
| # | |
| import re | |
| import torch | |
| from datasets import load_dataset, Dataset | |
| from transformers import AutoTokenizer, AutoModelForCausalLM | |
| from peft import LoraConfig | |
| from trl import GRPOConfig, GRPOTrainer |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| import tqdm | |
| import numpy as np | |
| import torch | |
| import torch.distributed as dist | |
| import transformers | |
| def extract_xml_answer(text: str) -> str: | |
| answer = text.split("<final_answer>")[-1] | |
| answer = answer.split("</final_answer>")[0] | |
| return answer.strip() |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| docker run --gpus all --runtime=nvidia -it --shm-size="10g" --cap-add=SYS_ADMIN -v $PWD:/workspace -v $HOME/.cache/huggingface:/root/.cache/huggingface nvcr.io/nvidia/pytorch:24.07-py3 bash |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| user_name=$(git log --pretty=format:"%an") | |
| user_email=$(git log --pretty=format:"%ae") | |
| git config user.name "$user_name" | |
| git config user.email "$user_email" |