Goals: Add links that are reasonable and give good explanations of how stuff works. No hype and no vendor content if possible. Practical first-hand accounts of models in prod are eagerly sought.
# train_grpo.py
#
# See https://github.com/willccbb/verifiers for ongoing developments
#
"""
citation:
@misc{brown2025grpodemo,
    title={Granular Format Rewards for Eliciting Mathematical Reasoning Capabilities in Small Language Models},
    author={Brown, William},
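This excerpt is from the single-file GRPO demo script; the citation block is cut off here in the source. To make the idea concrete, here is a minimal sketch of a format-reward function in the style TRL's GRPOTrainer expects. The function name, regex, placeholder model id, and the commented-out trainer wiring are my own illustrative assumptions, not code from the script.

import re
from trl import GRPOConfig, GRPOTrainer  # assumes a trl release that ships GRPO support

def format_reward(completions, **kwargs):
    # Hypothetical format reward: 1.0 when the completion wraps its work in
    # <think>...</think><answer>...</answer> tags, 0.0 otherwise. The paper title
    # suggests the real script layers several, more granular rewards like this.
    # Assumes completions arrive as plain strings (standard, non-conversational dataset).
    pattern = r"<think>.*?</think>\s*<answer>.*?</answer>"
    return [1.0 if re.search(pattern, c, re.DOTALL) else 0.0 for c in completions]

# trainer = GRPOTrainer(
#     model="Qwen/Qwen2.5-0.5B-Instruct",        # placeholder small model
#     reward_funcs=[format_reward],
#     args=GRPOConfig(output_dir="grpo-demo"),
#     train_dataset=train_dataset,               # prompts loaded elsewhere
# )
# trainer.train()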
from typing import List, Dict, Literal, Union
from transformers import AutoTokenizer

class MistralAICtx:
    def __init__(self, model_name: str):
        assert "mistral" in model_name, "MistralCtx only available for Mistral models"
        self.tokenizer = AutoTokenizer.from_pretrained(
            "mistralai/Mistral-7B-Instruct-v0.2")
import torch
from unsloth import FastLlamaModel
from transformers import TrainingArguments
from datasets import load_dataset
from trl import DPOTrainer

model_name = "teknium/OpenHermes-2.5-Mistral-7B"
# model_name = "./OpenHermes-2.5-Mistral-7B"  # alternative: point at a local copy of the weights
new_model = "NeuralHermes-2.5-Mistral-7B"
import torch
from typing import Optional
from flash_attn import flash_attn_func, flash_attn_qkvpacked_func
from diffusers.models.attention import Attention

class FlashAttnProcessor:
    r"""
    Processor for implementing memory efficient attention using flash_attn.
    """
import torch
from pipe import StableDiffusionXLPipelineNoWatermark
from pipei2i import StableDiffusionXLImg2ImgPipelineNoWatermark
from diffusers import DiffusionPipeline
from PIL import Image
import os
import gc
import pandas as pd
import random
import sys
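pipe and pipei2i are local modules holding SDXL pipeline subclasses with the invisible watermark removed. For orientation, a sketch of how an SDXL pipeline is typically loaded and called, and why gc is imported (model id, prompt, and step count are illustrative, not from the script):

pipeline = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
    variant="fp16",
    use_safetensors=True,
).to("cuda")

image = pipeline(prompt="a photo of an astronaut riding a horse",
                 num_inference_steps=30).images[0]
image.save("out.png")

# Release VRAM between pipelines; this is the usual reason gc shows up in these scripts.
del pipeline
gc.collect()
torch.cuda.empty_cache()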
import os
import sys
from typing import List

import fire
import torch
import transformers
from datasets import load_dataset, DatasetDict
from transformers import Seq2SeqTrainer, TrainerCallback, TrainingArguments, TrainerState, TrainerControl
from transformers.trainer_utils import PREFIX_CHECKPOINT_DIR
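The TrainerCallback / TrainerState / PREFIX_CHECKPOINT_DIR imports are the usual ingredients for a callback that writes adapter weights alongside each checkpoint, as in the qlora training script. A sketch of that pattern (class and directory names are mine):

class SavePeftModelCallback(TrainerCallback):
    """At each checkpoint, save just the PEFT adapter into the checkpoint folder
    rather than relying on the full model state dict."""

    def on_save(self, args: TrainingArguments, state: TrainerState,
                control: TrainerControl, **kwargs) -> TrainerControl:
        checkpoint_dir = os.path.join(
            args.output_dir, f"{PREFIX_CHECKPOINT_DIR}-{state.global_step}")
        kwargs["model"].save_pretrained(os.path.join(checkpoint_dir, "adapter_model"))
        return control

# trainer = Seq2SeqTrainer(..., callbacks=[SavePeftModelCallback()])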
This is a rough draft and may be updated with more examples.
GitHub was kind enough to grant me swift access to the Copilot test phase despite my @'ing them several hundred times about ICE. I would like to examine it not in terms of productivity, but of security: how risky is it to allow an AI to write some or all of your code?
Ultimately, a human being must take responsibility for every line of code that is committed. AI should not be used for "responsibility washing." However, Copilot is a tool, and workers need their tools to be reliable. A carpenter doesn't have to
add_executable(infervideo infervideo.cpp)
target_link_libraries(infervideo PRIVATE retinanet ${OpenCV_LIBS} cuda ${CUDA_LIBRARIES})