@do-me
Created October 21, 2025 12:04
Benchmark for a Mac M3 Max (128 GB) running mlx-community/gemma-3-270m-it-4bit with mlx-lm
import pandas as pd
from mlx_lm import batch_generate, load

# Load the 4-bit Gemma 3 270M instruction-tuned model and its tokenizer
model, tokenizer = load("mlx-community/gemma-3-270m-it-4bit")

# Load the benchmark texts; the DataFrame has a "text" column
df = pd.read_parquet("2000_benchmark_texts_BAAI.parquet")

# Append the summarization instruction to each article
prompts = [i + "--------\nSummarize this article in one sentence" for i in df.text.to_list()]

# Apply the chat template and encode to tokens
prompts = [
    tokenizer.apply_chat_template(
        [{"role": "user", "content": p}],
        add_generation_prompt=True,
    )
    for p in prompts[:2000]
]

# Generate all completions in a single batched call
result = batch_generate(model, tokenizer, prompts, verbose=False)
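
The timings reported in the results could be collected with a small loop like the one below. This is a minimal sketch, not the harness actually used for the benchmark: `time_batches` and `generate_fn` are hypothetical names, and `generate_fn` stands in for a call such as `lambda p: batch_generate(model, tokenizer, p, verbose=False)`.

```python
import time
from typing import Callable, Sequence


def time_batches(
    generate_fn: Callable[[Sequence], object],
    prompts: Sequence,
    sizes: Sequence[int] = (10, 50, 100, 300, 500, 1000, 2000),
) -> list[tuple[int, float, float]]:
    """Time generate_fn on the first n prompts for each n in sizes.

    Returns (n, seconds, prompts_per_second) tuples. Sizes larger than
    the number of available prompts are skipped.
    """
    rows = []
    for n in sizes:
        if n > len(prompts):
            continue
        start = time.perf_counter()
        generate_fn(prompts[:n])  # hypothetical stand-in for the batched call
        elapsed = time.perf_counter() - start
        rate = n / elapsed if elapsed > 0 else float("inf")
        rows.append((n, elapsed, rate))
    return rows


if __name__ == "__main__":
    # Dummy workload so the sketch runs without a model
    def dummy(batch):
        return [len(str(p)) for p in batch]

    for n, seconds, rate in time_batches(dummy, ["text"] * 100):
        print(f"{n:>5} prompts  {seconds:8.4f} s  {rate:10.1f} prompts/s")
```

Passing the generation call in as a function keeps the timing logic independent of the model, so the same loop can be reused for other models or backends.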

do-me commented Oct 21, 2025

Results

1 mlx-community/gemma-3-270m-it-4bit

Prompts   Time (s)   Rate (prompts/s)
     10        2           5.00
     50        8.4         5.95
    100       13.1         7.63
    300       34.5         8.70
    500       57           8.77
   1000      100          10.00
   2000      194          10.31

The larger the batch, the higher the throughput. Beyond 2000 prompts the gains are marginal, if any.
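
The rate column is simply the number of prompts divided by the elapsed time. As a quick check, the sketch below recomputes it from the table above and shows the relative throughput gain at each step up in batch size. The variable names are illustrative only.

```python
# (prompts, seconds) pairs copied from the results table
results = [
    (10, 2), (50, 8.4), (100, 13.1), (300, 34.5),
    (500, 57), (1000, 100), (2000, 194),
]

# Throughput in prompts per second for each batch size
rates = [n / t for n, t in results]

# Relative gain in throughput from one batch size to the next
gains = [(b - a) / a for a, b in zip(rates, rates[1:])]

for (n, t), rate in zip(results, rates):
    print(f"{n:>5} prompts  {t:>6} s  {rate:5.2f} prompts/s")

for (n_prev, _), (n_next, _), gain in zip(results, results[1:], gains):
    print(f"{n_prev:>5} -> {n_next:<5} {gain:+.1%}")
```

Going from 1000 to 2000 prompts raises throughput by only about 3%, which is consistent with the flattening described above.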

Graph

Copy and paste this code into JupyterLite to reproduce the graph.

Graph Code
import matplotlib.pyplot as plt
import numpy as np
from scipy.interpolate import make_interp_spline

# --- Your Data ---
# Number of prompts
x_data = np.array([10, 50, 100, 300, 500, 1000, 2000])
# Time in seconds (values from the results table)
y_data = np.array([2, 8.4, 13.1, 34.5, 57, 100, 194])

# --- Create a smooth curve ---
# Generate a spline for smooth interpolation.
spline = make_interp_spline(x_data, y_data, k=3)  # k=3 for a cubic spline

# Create a new set of x-values for a smoother line (e.g., 400 points).
x_smooth = np.linspace(x_data.min(), x_data.max(), 400)
y_smooth = spline(x_smooth)

# --- Styling and Plotting ---

# Apply a dark, tech-style theme.
plt.style.use('dark_background')

# Create the figure and axes
fig, ax = plt.subplots(figsize=(10, 6))

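# Accent color for the curve and the marker outlines.
# Assumption: glow_color is not defined in the original snippet; this cyan is a placeholder.
glow_color = '#08F7FE'

# Plot the main, solid line on top
ax.plot(x_smooth, y_smooth, linewidth=2, color=glow_color)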

# Plot the original data points
ax.scatter(x_data, y_data, color='#FE53BB', marker='o', s=100, zorder=10, 
           edgecolors=glow_color, linewidth=1.5, label='Original Data Points')


# --- Aesthetics and Labels ---

# Set title and labels with a futuristic font style
title_font = {'family': 'sans-serif', 'color':  'white', 'weight': 'bold', 'size': 18}
label_font = {'family': 'sans-serif', 'color':  '#c7c7c7', 'weight': 'normal', 'size': 12}

ax.set_title('Performance Benchmark', fontdict=title_font, pad=20)
ax.set_xlabel('Number of Prompts', fontdict=label_font, labelpad=10)
ax.set_ylabel('Time (seconds)', fontdict=label_font, labelpad=10)

# Customize the grid for a tech look
ax.grid(True, which='both', linestyle='--', linewidth=0.5, color='#343434')

# Customize spines (the plot border)
ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)
ax.spines['left'].set_color('#565656')
ax.spines['bottom'].set_color('#565656')

# Customize tick parameters
ax.tick_params(axis='x', colors='#c7c7c7')
ax.tick_params(axis='y', colors='#c7c7c7')

# Add a legend
legend = ax.legend()
for text in legend.get_texts():
    text.set_color("white")

# Ensure a tight layout to prevent labels from being cut off
plt.tight_layout()

# --- Display the Plot ---
plt.show()
[Image: "Performance Benchmark" plot of time (seconds) against number of prompts]

Test Data

2000 full-text newspaper articles from BAAI's high-quality news dataset: https://huggingface.co/datasets/BAAI/IndustryCorpus2

Sorry, the file upload in this gist seems to be broken; I could not upload any kind of file (tried csv, csv.gz, 7z, zip, parquet...). Please reach out to me if you need the data.
