Claude Opus 4.6
Waqar's Verdict: Opus 4.5 Was a Beast. Then 4.6 Showed Up.
Anthropic's most capable model. Record-breaking benchmarks.
1M token context. Adaptive thinking. Built for the hardest problems.
Record-Breaking Benchmarks
Opus 4.6 leads every major frontier benchmark. On ARC AGI 2 — problems easy for humans, hard for AI — it nearly doubles its predecessor's score.

- ARC AGI 2 (68.8%): +83% improvement over Opus 4.5. Problems easy for humans, hard for AI.
- Terminal-Bench 2.0 (65.4%): Agentic coding evaluation. Real-world software engineering tasks.
- OSWorld (72.7%): Agentic computer use benchmark. GUI automation at scale.
- MRCR v2 (76% at 1M): Needle-in-a-haystack retrieval at 1M tokens. 4x improvement.
- GDPval-AA (~70% win rate vs GPT-5.2): Economically valuable knowledge work tasks across finance, legal, and enterprise domains.
- BigLaw Bench (90.2%): 40% perfect scores. 84% scoring above 0.8. Highest Claude score ever.
- Life Sciences (~2x): ~2x improvement over Opus 4.5 across life science disciplines.
What's New in Opus 4.6
Four major upgrades that change how you work with AI. Each designed for the hardest, most valuable tasks.
Adaptive Thinking
Claude now dynamically decides when and how deeply to reason. No more manual `budget_tokens`. Four effort levels — low, medium, high, max — let you balance intelligence, speed, and cost.
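As a sketch of what selecting an effort level could look like, the snippet below builds a request body with a hypothetical `effort` field; the field name, its position in the payload, and the validation are illustrative assumptions, not the confirmed API schema.

```python
# Sketch of a Messages API request body selecting an adaptive-thinking
# effort level. The "effort" field name and allowed values are assumptions
# based on the four levels described above, not a confirmed API schema.

EFFORT_LEVELS = ("low", "medium", "high", "max")

def build_request(prompt: str, effort: str = "high") -> dict:
    """Build a request body; raises on an unknown effort level."""
    if effort not in EFFORT_LEVELS:
        raise ValueError(f"effort must be one of {EFFORT_LEVELS}")
    return {
        "model": "claude-opus-4-6",
        "max_tokens": 4096,
        "effort": effort,  # replaces a manually tuned thinking budget
        "messages": [{"role": "user", "content": prompt}],
    }

req = build_request("Prove the claim step by step.", effort="max")
print(req["effort"])  # -> max
```

The point of the shape: where Extended Thinking required picking a token budget per call, a single categorical knob is the whole tuning surface.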
1M Token Context
Process entire codebases, legal documents, or research papers in a single prompt. 5x the previous 200K limit, with 76% accuracy on needle-in-a-haystack retrieval.
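Before sending a whole codebase in one prompt, it helps to sanity-check that it fits. This sketch uses the rough ~4-characters-per-token heuristic (an assumption, not a real tokenizer) to estimate whether a set of documents plus an output budget fits in the 1M window.

```python
# Rough feasibility check before sending a whole codebase in one prompt.
# The ~4 characters-per-token ratio is a common heuristic, not an exact
# tokenizer; use the API's token-counting support for real numbers.

CONTEXT_LIMIT = 1_000_000  # Opus 4.6 1M-token window (beta)

def estimate_tokens(text: str) -> int:
    """Crude token estimate: roughly 4 characters per token."""
    return max(1, len(text) // 4)

def fits_in_context(docs: list[str], reserve_output: int = 128_000) -> bool:
    """True if the concatenated docs plus an output budget fit in 1M tokens."""
    total = sum(estimate_tokens(d) for d in docs)
    return total + reserve_output <= CONTEXT_LIMIT

# ~750K estimated input tokens + 128K reserved output still fits:
print(fits_in_context(["x" * 3_000_000]))  # -> True
```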
Context Compaction
Automatic summarization of older conversational tokens. Long-running tasks no longer hit context limits — Claude compresses what it no longer needs in detail.
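The feature itself runs server-side; as a rough client-side analogue, the sketch below collapses all but the most recent turns into a placeholder summary stub. The `summarize` helper is hypothetical and would, in practice, ask the model for a real summary.

```python
# Client-side approximation of context compaction: keep the newest turns
# verbatim and collapse older ones into a short summary stub. This only
# shows the shape of the idea; the real feature compacts server-side.

def summarize(turns: list[dict]) -> str:
    # Placeholder: a real implementation would have the model summarize.
    return f"[summary of {len(turns)} earlier turns]"

def compact(history: list[dict], keep_last: int = 4) -> list[dict]:
    """Compress all but the last `keep_last` messages into one summary."""
    if len(history) <= keep_last:
        return history
    older, recent = history[:-keep_last], history[-keep_last:]
    stub = {"role": "user", "content": summarize(older)}
    return [stub] + recent

history = [{"role": "user", "content": f"msg {i}"} for i in range(10)]
print(len(compact(history)))  # -> 5 (one stub + four recent turns)
```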
Agent Teams
Multiple AI agents work simultaneously on different aspects of a coding project, coordinating autonomously. Ship features faster with parallel agentic workflows.
Opus 4.5 vs Opus 4.6
A side-by-side look at what changed. Same price, dramatically more capability.
| Specification | Opus 4.5 (Nov 2025) | Opus 4.6 (Feb 2026) |
|---|---|---|
| Context Window | 200K tokens | 1M tokens (5x) |
| Max Output | 128K tokens | 128K tokens |
| Thinking Mode | Extended Thinking | Adaptive Thinking (new) |
| ARC AGI 2 | 37.6% | 68.8% (+83%) |
| Terminal-Bench 2.0 | 59.8% | 65.4% (+9.4%) |
| OSWorld | 66.3% | 72.7% (+9.7%) |
| BigLaw Bench | — | 90.2% (new) |
| MRCR v2 (1M) | — | 76% (new) |
| SWE-bench Verified | 80.9% | 80.8% |
| Life Sciences | Baseline | ~2x improvement |
| Agent Teams | No | Yes (new) |
| Context Compaction | No | Yes, beta (new) |
| Input Pricing | $5 / 1M tokens | $5 / 1M tokens |
| Output Pricing | $25 / 1M tokens | $25 / 1M tokens |
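At these rates, per-request cost is simple arithmetic. The sketch below prices a single request at the list rates above; if the premium 1M-context tier carries different rates, they are not reflected here.

```python
# Cost of a single request at Opus 4.6 list prices:
# $5 per million input tokens, $25 per million output tokens.

INPUT_PER_M = 5.00
OUTPUT_PER_M = 25.00

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request at standard list rates."""
    return (input_tokens * INPUT_PER_M
            + output_tokens * OUTPUT_PER_M) / 1_000_000

# A full 200K-token context in, an 8K-token answer out:
print(f"${request_cost(200_000, 8_000):.2f}")  # -> $1.20
```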
More Capability. Same Price.
Dramatically improved performance with no price increase. Premium 1M context available for long-form tasks.
Standard
Up to 200K context
- 200K token context window
- 128K max output tokens
- Adaptive thinking included
- All standard features
1M Context
Beta — up to 1M tokens
- 1M token context window
- 128K max output tokens
- 76% MRCR v2 accuracy
- Context compaction (beta)
- Ideal for codebases & legal docs
Built for the Hardest Work
Enterprise Knowledge Work
190 Elo points above Opus 4.5 on GDPval-AA. Finance, legal analysis, and complex business reasoning at scale.
Agentic Coding
Highest Terminal-Bench 2.0 score in the industry. Build, debug, and ship production code with agent teams working in parallel.
Scientific Research
2x improvement in computational biology, structural biology, and organic chemistry. Process entire research papers in a single context.
Legal Analysis
90.2% on BigLaw Bench. 40% perfect scores. Review contracts, case law, and regulatory documents with unmatched precision.
Computer Use
72.7% on OSWorld — the best computer-using model available. Automate GUI workflows, test applications, and interact with desktop environments.
Deep Research
Industry-leading BrowseComp and DeepSearchQA scores. Multi-step agentic search for hard-to-find information across the web.
Start Building with Opus 4.6
Available now on claude.ai, the API, Amazon Bedrock, Google Cloud Vertex AI, and Microsoft Azure Foundry.
Model ID: `claude-opus-4-6`
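A minimal first request, assuming an `ANTHROPIC_API_KEY` environment variable; the endpoint, headers, and version string follow Anthropic's public Messages API, and the model id is the one above.

```shell
# Minimal Messages API request via curl. Requires ANTHROPIC_API_KEY
# to be set in the environment.
curl https://api.anthropic.com/v1/messages \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{
    "model": "claude-opus-4-6",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "Hello, Opus 4.6"}]
  }'
```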