Cost vs. Accuracy of GitHub Copilot LLM Models


Model Overview with Pricing and Capabilities

| Model | Primary Task Area | Key Strengths | Additional Capabilities | HumanEval Accuracy | SWE-bench Accuracy | Premium Request Multiplier (Paid Plans) | Premium Request Multiplier (Free Plan) | Effective Cost per Request* |
|---|---|---|---|---|---|---|---|---|
| GPT-4.1 | General-purpose coding and writing | Fast, accurate code completions and explanations | Agent mode, visual | ~84% | ~55% | 0× (included) | 1× | $0.00 |
| GPT-4o | General-purpose coding and writing | Fast completions and visual input understanding | Agent mode, visual | ~90% | ~33% | 0× (included) | 1× | $0.00 |
| o3 | Deep reasoning and debugging | Multi-step problem solving and architecture-level code analysis | Reasoning | ~92% | ~72% | 1× | N/A | $0.04 |
| o4-mini | Fast help with simple tasks | Fast, reliable answers to lightweight coding questions | Lower latency | ~90% | ~49% | 0.33× | N/A | $0.013 |
| Claude Opus 4 | Deep reasoning and debugging | Complex problem-solving challenges, sophisticated reasoning | Reasoning, vision | ~94% | 72.5% | 10× | N/A | $0.40 |
| Claude Sonnet 3.5 | Fast help with simple tasks | Quick responses for code, syntax, and documentation | Agent mode | 93.7% | ~60% | 1× | 1× | $0.04 |
| Claude Sonnet 3.7 | Deep reasoning and debugging | Structured reasoning across large, complex codebases | Agent mode | ~91% | ~62% | 1× | N/A | $0.04 |
| Claude Sonnet 3.7 Thinking | Deep reasoning and debugging | Enhanced reasoning with explicit thought processes | Agent mode, reasoning chains | ~92% | ~64% | 1.25× | N/A | $0.05 |
| Claude Sonnet 4 | Deep reasoning and debugging | Performance and practicality, perfectly balanced for coding workflows | Agent mode, vision | ~92% | 72.7% | 1× | N/A | $0.04 |
| Gemini 2.5 Pro | Deep reasoning and debugging | Complex code generation, debugging, and research workflows | Reasoning | ~88% | 63.8% | 1× | N/A | $0.04 |
| Gemini 2.0 Flash | Working with visuals | Real-time responses and visual reasoning for UI and diagram-based tasks | Visual, low latency | ~85% | ~45% | 0.25× | 1× | $0.01 |
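
The last column can be derived from the paid-plan multiplier. Below is a minimal Python sketch, assuming the $0.04 base rate stated in the notes at the end of this article. The function name `effective_cost` and the rounding to three decimal places are illustrative choices, not part of any GitHub API.

```python
from decimal import Decimal, ROUND_HALF_UP

# USD per premium request, as stated in the notes of this article.
BASE_RATE = Decimal("0.04")

def effective_cost(multiplier: str) -> Decimal:
    """Return the USD cost of one request for a premium request multiplier."""
    cost = BASE_RATE * Decimal(multiplier)
    # Round to a tenth of a cent so 0.33x shows up as 0.013.
    return cost.quantize(Decimal("0.001"), rounding=ROUND_HALF_UP)

# Multipliers copied from the table above.
for model, mult in [("GPT-4.1", "0"), ("o4-mini", "0.33"), ("Claude Opus 4", "10")]:
    print(f"{model}: ${effective_cost(mult)}")
```

Using `Decimal` instead of floats keeps the currency arithmetic exact, which matters for fractional multipliers such as 0.33×.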

Plan Allowances

Free Plan (Copilot Free)

  • Code completions: Up to 2,000 per month
  • Premium requests: Up to 50 per month
  • Available models: GPT-4.1, GPT-4o, Claude Sonnet 3.5, Gemini 2.0 Flash
  • All interactions count as premium requests

Paid Plans

  • Code completions: Unlimited (with included models)
  • Chat interactions: Unlimited (with included models)
  • Premium request allowances:
    • Copilot Pro: 300 premium requests/month
    • Copilot Pro+: 1,500 premium requests/month
    • Copilot Business: 300 premium requests/month
    • Copilot Enterprise: 1,000 premium requests/month
  • Overage pricing: $0.04 per additional premium request
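
To see how the allowance and the overage rate interact, here is a rough Python sketch. It assumes each request consumes as many premium requests as its model's multiplier, and that everything beyond the allowance is billed at $0.04. The function name `overage_cost` and the usage figures are hypothetical; the multipliers come from the overview table.

```python
BASE_RATE = 0.04  # USD per additional premium request (from the list above)

def overage_cost(allowance: int, usage: dict[str, int],
                 multipliers: dict[str, float]) -> float:
    """Estimate a monthly overage bill in USD.

    Each request consumes `multiplier` premium requests; anything
    beyond `allowance` is billed at BASE_RATE.
    """
    consumed = sum(count * multipliers[model] for model, count in usage.items())
    extra = max(0.0, consumed - allowance)
    return round(extra * BASE_RATE, 2)

# Multipliers from the overview table; request counts are made up.
multipliers = {"GPT-4.1": 0.0, "Claude Sonnet 4": 1.0, "Claude Opus 4": 10.0}
usage = {"GPT-4.1": 900, "Claude Sonnet 4": 200, "Claude Opus 4": 25}

print(overage_cost(300, usage, multipliers))   # Copilot Business allowance
print(overage_cost(1000, usage, multipliers))  # Copilot Enterprise allowance
```

In this example the 25 Claude Opus 4 requests consume 250 premium requests, more than the 200 Claude Sonnet 4 requests, while the 900 GPT-4.1 requests consume none.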

Accuracy Rankings

Top Performers by Benchmark

HumanEval (Code Generation)

  1. Claude Opus 4: ~94% – Highest HumanEval score in this comparison
  2. Claude Sonnet 3.5: 93.7% – Excellent for code generation
  3. o3: ~92% – Strong reasoning-based coding
  4. Claude Sonnet 4: ~92% – Balanced performance
  5. Claude Sonnet 3.7 Thinking: ~92% – Enhanced reasoning

SWE-bench (Real-world Software Engineering)

  1. Claude Sonnet 4: 72.7% – Highest SWE-bench score in this comparison
  2. Claude Opus 4: 72.5% – Nearly tied for first
  3. o3: ~72% – Strong on complex problems
  4. Claude Sonnet 3.7 Thinking: ~64% – Good reasoning approach
  5. Gemini 2.5 Pro: 63.8% – Solid on engineering tasks, effectively level with fourth place

Model Selection Guide

For Low-Cost Tasks

  1. Gemini 2.0 Flash (0.25× multiplier) – Best value for quick tasks
  2. o4-mini (0.33× multiplier) – Fast, lightweight responses
  3. GPT-4.1/GPT-4o (0× multiplier) – Consume no premium requests on paid plans

For Complex Reasoning

  1. Claude Opus 4 (10× multiplier) – Most powerful, highest cost
  2. o3 (1× multiplier) – Strong reasoning at standard cost
  3. Claude Sonnet 4 (1× multiplier) – Balanced performance and cost

For Visual Tasks

  1. Gemini 2.0 Flash – Optimized for visual input, low cost
  2. GPT-4o – Strong visual capabilities, free for paid users
  3. Claude Sonnet 4 – Vision support with reasoning
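
The trade-off behind these recommendations can be sketched as a small lookup: choose the lowest-multiplier model that still clears a target SWE-bench score. This is only an illustration under stated assumptions. The helper `cheapest_model` is hypothetical, the approximate ("~") scores are treated as exact, and only a subset of the models is included.

```python
# (model, SWE-bench %, paid-plan multiplier), copied from this article.
MODELS = [
    ("GPT-4.1", 55.0, 0.0),
    ("o4-mini", 49.0, 0.33),
    ("Gemini 2.0 Flash", 45.0, 0.25),
    ("o3", 72.0, 1.0),
    ("Claude Sonnet 4", 72.7, 1.0),
    ("Claude Opus 4", 72.5, 10.0),
]

def cheapest_model(min_swe_bench: float):
    """Return the lowest-multiplier model meeting the score, or None.

    Ties on multiplier go to the model with the higher SWE-bench score.
    """
    candidates = [m for m in MODELS if m[1] >= min_swe_bench]
    if not candidates:
        return None
    return min(candidates, key=lambda m: (m[2], -m[1]))[0]

print(cheapest_model(50))  # a 0x model already clears this bar
print(cheapest_model(70))  # only the 1x and 10x models qualify
```

A single benchmark is a coarse proxy for real-world usefulness, so a filter like this is better suited to shortlisting models than to making the final choice.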

Notes

  • *Effective cost per request based on $0.04 base rate × multiplier
  • Premium request counters reset monthly on the 1st at 00:00:00 UTC
  • Unused requests don’t carry over to the next month
  • Rate limiting applies during high-demand periods
  • Model availability varies by plan type
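
Because the counter resets at a fixed UTC instant rather than at local midnight, it can help to compute the next reset explicitly. Below is a minimal Python sketch based on the reset rule in the notes above. The function name `next_reset` is hypothetical, and it expects a timezone-aware `datetime`.

```python
from datetime import datetime, timezone

def next_reset(now: datetime) -> datetime:
    """Return the next 1st-of-month 00:00:00 UTC strictly after `now`.

    `now` should be timezone-aware; it is converted to UTC first.
    """
    now = now.astimezone(timezone.utc)
    if now.month == 12:
        year, month = now.year + 1, 1
    else:
        year, month = now.year, now.month + 1
    return datetime(year, month, 1, tzinfo=timezone.utc)

current = datetime.now(timezone.utc)
print("Next premium request reset:", next_reset(current).isoformat())
print("Time remaining:", next_reset(current) - current)
```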
