Saurabh Rajaram Yadav RoyalMamba

RoyalMamba / grpo_demo.py
Created January 31, 2025 06:26 — forked from willccbb/grpo_demo.py
GRPO Llama-1B
# train_grpo.py
import re
import torch
from datasets import load_dataset, Dataset
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import LoraConfig
from trl import GRPOConfig, GRPOTrainer
# Load and prep dataset
RoyalMamba / output.json
Created July 23, 2024 19:39
Llama 3.1 8B response.
This file has been truncated, but you can view the full file.
{
"Afghanistan": "Afghanistan, a country located in South Asia, has been a focal point of international attention for decades due to its tumultuous history, geographical significance, and ongoing conflict. With a rich cultural heritage and a strategic location at the crossroads of Asia, Afghanistan has been a coveted prize for various empires and powers throughout history. From the ancient Silk Road to the modern-day struggle against terrorism, Afghanistan's story is one of resilience, turmoil, and transformation.\n\nGeography and Climate\n\nAfghanistan is a landlocked country bordered by Pakistan to the east and south, Iran to the west, Turkmenistan, Uzbekistan, and Tajikistan to the north, and China to the northeast. The country's terrain varies greatly, with towering mountain ranges, vast deserts, and fertile valleys. The Hindu Kush mountain range runs through the center of the country, dividing it into three main regions: the north, the central highlands, and the south. The climate in Afghanistan is g