llm-speed

RTX 5080 — LLM benchmarks

No benchmarks on RTX 5080 yet.

Run on YOUR hardware to populate this page:

$ pipx install llm-speed && llm-speed bench

Community folklore on RTX 5080

39 unverified claims extracted from Reddit/HN comments. These are lower-trust than signed benchmark runs; every row links to its source.

  • community confidence: 85%

    66.00 tok/s · qwen3-coder on RTX 5080 via llama.cpp (Q4_K_M)

    I get around ~66 t/s (16k/32k context, Q4_K_M) with very similar but notebook hardware: AMD Ryzen 9955HX3D, 64GB DDR5-5600, Nvidia RTX 5080 Mobile 16

    source: Reddit · u/Danmoreng · 2026-02-25

  • community confidence: 70%

    25.00 tok/s · Qwen3-Coder on RTX 5080 via lm-studio

    Running Qwen3-Coder 30B at Full 256K Context: 25 tok/s with 96GB RAM + RTX 5080 Hello, I come to share with you my happiness running Qwen3-Coder 30B at its maximum unstretched context (256K).

    source: Reddit · u/ajmusic15 · 2025-08-02

  • community confidence: 65%

    47.00 tok/s · Qwen3-Coder-Next on RTX 5080 (IQ3_XXS)

    …del is the biggest one I can run on my 5080. Edit: actually just tried it and I could run Qwen3-Coder-Next UD-IQ3_XXS with 262k context at ~47 t/s which isn't bad!

    source: Reddit · u/grumd · 2026-03-06

See all 39 claims for RTX 5080
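The tok/s figures in the claims above are plain decode throughput: generated tokens divided by wall-clock seconds. A minimal sketch of that calculation, for sanity-checking a community number against your own run (the helper name and the example figures are illustrative, not part of llm-speed's methodology):

```python
import time

def tokens_per_second(n_tokens: int, elapsed_s: float) -> float:
    """Decode throughput: generated tokens divided by wall-clock seconds."""
    if elapsed_s <= 0:
        raise ValueError("elapsed time must be positive")
    return n_tokens / elapsed_s

# In practice you would time the generation loop itself:
start = time.perf_counter()
tokens = [f"tok{i}" for i in range(1024)]   # stand-in for a model's decode loop
elapsed = time.perf_counter() - start

# Plugging in numbers consistent with the ~66 t/s llama.cpp claim:
print(round(tokens_per_second(1024, 15.5), 1))  # 66.1
```

Note this measures end-to-end decode rate only; it does not separate prompt-processing (prefill) speed from generation speed, which community claims often conflate.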

Common questions about RTX 5080

Direct Q&A drawn from the claims above: fastest LLM, supported model classes, backend rankings, and quantization guidance.

Read the RTX 5080 FAQ →