| Model Configuration | Avg. Latency (s) | Avg. Peak Memory (MB) | Avg. LLM Judge Score |
|---|---|---|---|
| W4A16 + SDPA | 1.103 | 1003.81 | 0.421 |
| W4A16 + SDPA Paged | 1.303 | 1041.80 | 0.391 |
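
To make the gap between the two rows easier to read, the relative change of the paged configuration against the non-paged one can be worked out directly from the table. The snippet below is a minimal sketch: the dictionary keys and the `relative_change` helper are illustrative names, not part of the original benchmark code, and the numbers are copied verbatim from the two rows above.

```python
# Values copied from the table above. Key names are illustrative assumptions.
baseline = {"latency_s": 1.103, "peak_memory_mb": 1003.81, "judge_score": 0.421}  # W4A16 + SDPA
paged = {"latency_s": 1.303, "peak_memory_mb": 1041.80, "judge_score": 0.391}     # W4A16 + SDPA Paged


def relative_change(new: float, old: float) -> float:
    """Return the percentage change from old to new."""
    return (new - old) / old * 100.0


# Percentage change of the paged configuration relative to the baseline.
changes = {key: relative_change(paged[key], baseline[key]) for key in baseline}

for key, pct in changes.items():
    print(f"{key}: {pct:+.1f}%")
```

On these figures, the paged variant is about 18.1% slower, uses about 3.8% more peak memory, and scores about 7.1% lower on the LLM judge metric. The table does not say how many runs were averaged, so these deltas should be read as indicative rather than statistically confirmed.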
Created July 4, 2025 07:36