
@data2json
data2json / _myatari.py
Created September 23, 2025 04:51 — forked from qpwo/_myatari.py
atari realtime rl runner
#!/usr/bin/env python3
import torch, gymnasium as gym, numpy as np, time, sys, threading, os, random
import torch.multiprocessing as mp
from torch import Tensor
from bg_record import log_step, bind_logger, log_close
# torch.set_num_threads(1)
NUM_PROCS = 16
@data2json
data2json / t.py
Last active January 12, 2025 14:00
T - The missing LLM Unix Token Tool
#!/usr/bin/env python
# t - The missing LLM token counting and splitting tool for UNIX
import argparse
import sys
from typing import Optional, List
import math
import os
import tiktoken
@data2json
data2json / Better & Faster Large Language Models via Multi-Token Prediction.md
Created June 19, 2024 13:54
Better & Faster Large Language Models via Multi-Token Prediction

A recent paper titled "Better & Faster Large Language Models via Multi-token Prediction" (arXiv:2404.19737v1) introduces a simple but effective modification to the standard language modeling training loss that significantly improves performance, inference speed, and reasoning capabilities of large language models, especially for code-related tasks.

Key Findings

The authors propose training language models to predict multiple future tokens at once, using a shared model trunk and independent output heads for each future token position. This multi-token prediction approach is compared to the standard next-token prediction loss through comprehensive experiments on both synthetic and natural datasets. The key findings are summarized in the following fact table:

| Fact | Details/Context | Results/Metrics |
| --- | --- | --- |
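The shared-trunk-plus-independent-heads architecture described above can be sketched in PyTorch. This is a minimal illustration, not the paper's implementation: the model name, layer sizes, and the use of a single `TransformerEncoderLayer` as a stand-in for the full decoder stack are all assumptions made for brevity.

```python
import torch
import torch.nn as nn

class MultiTokenPredictor(nn.Module):
    """Illustrative sketch: one shared trunk, one output head per
    future token position (t+1 .. t+n_future). Sizes are toy values."""

    def __init__(self, vocab_size=100, d_model=32, n_future=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        # Shared trunk: a single encoder layer stands in for the
        # full transformer stack used in the paper.
        self.trunk = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=4, batch_first=True
        )
        # Independent unembedding head for each future offset.
        self.heads = nn.ModuleList(
            nn.Linear(d_model, vocab_size) for _ in range(n_future)
        )

    def forward(self, tokens):
        h = self.trunk(self.embed(tokens))       # (batch, seq, d_model)
        return [head(h) for head in self.heads]  # n_future logit tensors

model = MultiTokenPredictor()
x = torch.randint(0, 100, (2, 8))  # (batch=2, seq_len=8)
logits = model(x)
print(len(logits), logits[0].shape)  # 4 heads, each (2, 8, 100)
```

At training time each head gets its own cross-entropy loss against the token at its offset, and the losses are summed; at inference the extra heads can be dropped or reused for speculative decoding, which is where the reported inference speedup comes from.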

import os
import asyncio
import aiohttp
import json
import logging
from threading import Lock
# Logging setup (for better debugging)
logging.basicConfig(
    level=logging.INFO, format="%(asctime)s - %(levelname)s - %(message)s"
)
@bilalmughal
bilalmughal / ec2_graviton_dl_bootstrap.sh
Last active April 2, 2025 15:29
This script automates the setup of Amazon EC2 Graviton ARM-based instances for deep learning tasks. It installs essential utilities, sets up the latest NVIDIA drivers, the CUDA 12.2 toolkit, and the cuDNN library, and builds PyTorch from source. The step-by-step guide can be found here: https://jumpshare.com/blog/deep-learning-on-a…
#!/bin/bash
set -e  # Exit on any error

# Check that the required arguments are provided
if [ -z "$REGION" ] || [ -z "$SECURITY_GROUPS" ] || [ -z "$KEY_PAIR" ] || [ -z "$SUBNET" ]; then
    echo "Error: You must provide REGION, SECURITY_GROUPS, KEY_PAIR, and SUBNET as environment variables."
    echo "Example:"
    echo "  export REGION=us-east-1"
    echo "  export SECURITY_GROUPS=sg-12345678,sg-87654321"
    echo "  export KEY_PAIR=my-key-pair"