M5 Pro — LLM benchmarks
No M5 Pro benchmarks yet.
Run this on your hardware to populate this page:

$ pipx install llm-speed && llm-speed bench
Community folklore on M5 Pro
Unverified claims extracted from Reddit/HN comments. These carry lower trust than signed benchmark runs; every row links to its source.
- community confidence: 50%
  25.00 tok/s on M5 Pro via mlx
  "…bit benchmarks on M5 Pro 64GB (SwiftLM/MLX, measured):

  | Config | Prefill | Decode | GPU RAM active |
  |:-|:-|:-|:-|
  | SSD streaming, 4K context | **25 t/s** | ~0.4 t/s | **2.7 GB** |

  Note: At 4,262-token context depth with a 122B MoE, each decode step streams the full active expert set (~1…"
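The ~0.4 t/s decode figure in the quoted claim is plausible if SSD bandwidth is the bottleneck: with the experts streamed from disk, each decode step must re-read the active expert weights. A minimal back-of-the-envelope sketch, with all numbers being illustrative assumptions rather than values from the quoted run:

```python
def ssd_bound_decode_tps(active_expert_bytes: float, ssd_read_bps: float) -> float:
    """Upper bound on decode tokens/s when each decode step must stream
    the active experts' weights from SSD (nothing resident in RAM)."""
    return ssd_read_bps / active_expert_bytes

# Illustrative assumptions (hypothetical, not measured):
# ~12B active params at 4-bit quantization -> ~6 GB read per decode step,
# ~3 GB/s sustained SSD read bandwidth.
print(ssd_bound_decode_tps(active_expert_bytes=6e9, ssd_read_bps=3e9))  # 0.5
```

Under these assumed numbers the ceiling lands near 0.5 tok/s, the same order of magnitude as the reported ~0.4 t/s, while prefill (which batches many tokens per weight read) stays fast.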
Common questions about M5 Pro
Direct Q&A drawn from the runs above: the fastest LLM, supported model classes, backend rankings, and quantization guidance.