About llm-speed

llm-speed is the canonical, crowdsourced source of truth for how fast LLMs actually run, across hosted APIs, consumer GPUs, Apple Silicon, and prosumer rigs. One reproducible CLI, one methodology, every backend.
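To make "one methodology, every backend" concrete, here is a minimal sketch of the kind of measurement such a protocol standardizes: time-to-first-token (TTFT) and decode tokens/sec from a streaming completion. This is not the project's actual CLI or implementation; the endpoint URL, model id, and the one-token-per-chunk approximation are all illustrative assumptions, and it assumes an OpenAI-compatible streaming API.

```python
# Hedged sketch, NOT the llm-speed implementation: one way to measure
# TTFT and decode speed against an OpenAI-compatible streaming endpoint.
# The URL, model id, and "one SSE chunk ~= one token" are assumptions.
import time

import requests

API_URL = "http://localhost:8000/v1/completions"  # hypothetical local server
payload = {
    "model": "example-model",  # placeholder model id
    "prompt": "Explain KV caching in one paragraph.",
    "max_tokens": 256,
    "stream": True,
}

start = time.perf_counter()
first_token_at = None
n_chunks = 0

with requests.post(API_URL, json=payload, stream=True, timeout=120) as resp:
    resp.raise_for_status()
    for line in resp.iter_lines():
        if not line.startswith(b"data: "):
            continue  # skip SSE keep-alives and blank separator lines
        if line == b"data: [DONE]":
            break
        if first_token_at is None:
            first_token_at = time.perf_counter()  # first streamed token
        n_chunks += 1  # many servers send roughly one token per chunk

end = time.perf_counter()
if first_token_at is not None and n_chunks > 1:
    print(f"TTFT: {first_token_at - start:.3f} s")
    print(f"decode: {(n_chunks - 1) / (end - first_token_at):.1f} tok/s "
          f"({n_chunks} chunks)")
```

Reporting TTFT and decode rate separately matters because prompt processing (prefill) and token generation (decode) stress hardware differently, so a single aggregate tokens/sec number can hide where a backend is actually slow.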

Why this exists

Existing answers to "how fast will this model run on my hardware?" are inadequate. Enterprise leaderboards focus on H100/B200 racks. Reddit folklore is unstructured. Vendor blogs are SEO-driven, not methodology-driven. Nobody owns the union of consumer-local plus hosted-API benchmarks under one consistent protocol. That's the gap we fill.

How it works

Peer sources

We complement, not replace, the great work others have done:

Contact

Issues, ideas, disputes: GitHub issues. The project is Apache-2.0 licensed; results belong to their submitters.