On benchmarks, Opus 4.8 is a step up rather than a leap. It scores 88.6% on SWE-bench Verified (vs. 87.6% for Opus 4.7), 69.2% on the harder SWE-bench Pro (vs. 64.3%), and 74.6% on Terminal-Bench 2.1 ...
Claude Opus 4.8 promises more honest AI answers. Dynamic workflows can run hundreds of Claude subagents. Fast mode gets cheaper, while regular Opus pricing stays put. Diogenes was a fourth-century B.C ...
Anthropic describes Claude Opus 4.8 as having “sharper judgement, more honesty about its progress, and the ability to work independently for longer than its predecessors.” “Early testers report that ...