M5 Pro — LLM benchmarks
No M5 Pro benchmarks yet.
Run this on your hardware to populate this page:

$ pipx install llm-speed && llm-speed bench
Community folklore on M5 Pro
Unverified claims extracted from Reddit/HN comments. These carry lower trust than signed benchmark runs; every row links to its source.
- community confidence: 50%
  25.00 tok/s on M5 Pro via mlx
  "…bit benchmarks on M5 Pro 64GB (SwiftLM/MLX, measured):

  | Config | Prefill | Decode | GPU RAM active |
  |:-|:-|:-|:-|
  | SSD streaming, 4K context | **25 t/s** | ~0.4 t/s | **2.7 GB** |

  Note: At 4,262-token context depth with a 122B MoE, each decode step streams the full active expert set (~1…"
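The ~0.4 t/s decode figure in the quoted claim is plausible if SSD bandwidth is the bottleneck: with the experts streamed from disk, each decode step must re-read the active expert weights. A minimal back-of-the-envelope sketch, with all numbers being illustrative assumptions rather than values from the quoted run:

```python
def ssd_bound_decode_tps(active_expert_bytes: float, ssd_read_bps: float) -> float:
    """Upper bound on decode tokens/s when each decode step must stream
    the active experts' weights from SSD (nothing resident in RAM)."""
    return ssd_read_bps / active_expert_bytes

# Illustrative assumptions (hypothetical, not measured):
# ~12B active params at 4-bit quantization -> ~6 GB read per decode step,
# ~3 GB/s sustained SSD read bandwidth.
print(ssd_bound_decode_tps(active_expert_bytes=6e9, ssd_read_bps=3e9))  # 0.5
```

Under these assumed numbers the ceiling lands near 0.5 tok/s, the same order of magnitude as the reported ~0.4 t/s, while prefill (which batches many tokens per weight read) stays fast.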
Common questions about M5 Pro
Direct Q&A drawn from the runs above: the fastest LLM, supported model classes, backend rankings, and quantization guidance.