Anthropic has officially unveiled Claude Opus 4.8, a formidable contender poised to rival OpenAI’s GPT-5.5 and Google’s Gemini 3.1 Pro.
This latest model emphasizes agentic coding, expansive automation potential, and enhanced reasoning abilities.
The company asserts that the new iteration is significantly more honest, exhibiting a reduced tendency to generate unsupported claims during complex operational tasks.
Anthropic launched Claude Opus 4.8 on Thursday.
According to Anthropic, Opus 4.8 is globally accessible immediately, maintaining the same pricing model as Opus 4.7, notwithstanding its extensive upgrades in capability.
Claude Opus 4.8 vs GPT-5.5 vs Gemini 3.1 Pro
In its press release, Anthropic provided a comparative chart detailing the capabilities of these models. Although Claude Opus 4.8 leads in most benchmark categories overall, GPT-5.5 excels in one pivotal domain: agentic terminal coding.
Benchmark Highlights:
Claude Opus 4.8 attained a score of 69.2% on SWE-Bench Pro for agentic coding, surpassing GPT-5.5’s 58.6% and Gemini 3.1 Pro’s 54.2%.
Nevertheless, GPT-5.5 triumphed in Terminal-Bench 2.1 for agentic terminal coding with an impressive 78.2%, outpacing Opus 4.8, which recorded 74.6%.
Opus 4.8 also achieved the highest scores in multidisciplinary reasoning, both with and without tools. On OSWorld-Verified, which measures agentic computer use, it narrowly led with 83.4%.
For tasks demanding knowledge work, as measured by GDPval-AA, Opus 4.8 recorded a score of 1890, in contrast to GPT-5.5’s 1769 and Gemini’s 1314.
Moreover, Opus 4.8 excelled in agentic financial analysis benchmarks, posting a figure of 53.9%.
The comparative image underscores the proximity of the top models in several categories, particularly between Claude Opus 4.8 and GPT-5.5.
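To make the gaps concrete, the short sketch below computes the leader and its margin over the runner-up for the benchmarks where the article quotes more than one score. It uses only the figures listed above; the function name `leader_and_margin` is an illustrative choice, and benchmarks with a single reported score are left out because no margin can be derived from them.

```python
# Scores as quoted in the article. GDPval-AA is a rating, not a percentage,
# so its margin is in rating points rather than percentage points.
SCORES = {
    "SWE-Bench Pro": {"Claude Opus 4.8": 69.2, "GPT-5.5": 58.6, "Gemini 3.1 Pro": 54.2},
    "Terminal-Bench 2.1": {"Claude Opus 4.8": 74.6, "GPT-5.5": 78.2},
    "GDPval-AA": {"Claude Opus 4.8": 1890, "GPT-5.5": 1769, "Gemini 3.1 Pro": 1314},
}


def leader_and_margin(scores):
    """Return (leading model, lead over the runner-up) for one benchmark."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    (leader, top), (_, second) = ranked[0], ranked[1]
    # Round to one decimal place to hide floating-point noise.
    return leader, round(top - second, 1)


if __name__ == "__main__":
    for benchmark, scores in SCORES.items():
        leader, margin = leader_and_margin(scores)
        print(f"{benchmark}: {leader} leads by {margin}")
```

On these numbers, the Terminal-Bench 2.1 gap in GPT-5.5's favor is much smaller than Opus 4.8's lead on SWE-Bench Pro.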
Anthropic Highlights ‘Honesty Enhancements’
Reliability is central to Anthropic’s messaging around the release. Early testers of Opus 4.8 described it as ‘more reliable and sharper in its judgment’ during agentic tasks. The company says it explicitly aimed to reduce hallucinations.
Anthropic said that Opus 4.8 is ‘approximately four times less likely’ than Opus 4.7 to let flawed code pass without flagging the problems.
The model has also demonstrated ‘new highs in measures of prosocial traits like supporting user autonomy and acting in the user’s best interest’.
Furthermore, misaligned behaviors such as deception or facilitation of misuse are now reported to be ‘substantially lower’ than with Opus 4.7.
Customizable Effort Control Enhances User Interaction
A significant feature accompanying this launch is the introduction of customizable ‘effort’ settings. Users on Claude.ai can now dictate the computational resources allocated to a task.
Lower effort settings yield quicker responses and use fewer tokens, whereas higher settings have the model reason more deeply before answering.
Anthropic stated, “Users now have this choice; the effort control is available across all plans.”
Notably, Opus 4.8 defaults to a ‘high effort’ mode, deemed the optimal balance between speed and output quality. For more complex tasks, users have the option to select ‘extra’ or ‘max’ effort modes, allocating additional tokens for deeper reasoning.
Dynamic Workflows Propel AI Capabilities Forward
Simultaneously, Anthropic introduced a research preview feature labeled ‘dynamic workflows’ within Claude Code. This feature empowers the system to orchestrate hundreds of parallel AI subagents during a single coding session.
Anthropic described this capability as allowing “Claude to strategize and execute work, running hundreds of parallel subagents within one session.”
According to the firm, this feature is adept at managing ‘codebase-scale migrations across hundreds of thousands of lines of code from inception to merging phases’.
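The article does not describe how dynamic workflows are implemented. As a rough illustration of the general fan-out pattern only, the sketch below runs many placeholder tasks concurrently with Python's standard `asyncio` library. Everything in it is hypothetical: `run_subagent`, `orchestrate`, and the `max_parallel` cap are invented names, and the subagent is a stub that does no real work. It is not Claude Code's actual mechanism.

```python
import asyncio


async def run_subagent(task_id: int, limit: asyncio.Semaphore) -> str:
    """Hypothetical stand-in for one subagent handling one slice of work."""
    async with limit:  # cap how many stubs run at the same time
        await asyncio.sleep(0)  # yield control, simulating I/O-bound work
        return f"chunk-{task_id}: done"


async def orchestrate(num_tasks: int, max_parallel: int = 50) -> list[str]:
    """Fan out num_tasks stub subagents and collect results in task order."""
    limit = asyncio.Semaphore(max_parallel)
    return await asyncio.gather(
        *(run_subagent(i, limit) for i in range(num_tasks))
    )


if __name__ == "__main__":
    results = asyncio.run(orchestrate(200))
    print(len(results), "subagent results collected")
```

The semaphore is a design choice for this sketch: it bounds concurrency so that hundreds of tasks can be queued without all running at once.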
Claude Opus 4.8 Pricing Structure

Despite the enhancements in performance, Anthropic has opted to retain its standard pricing.
The company confirmed that pricing remains:
$5 per million input tokens
$25 per million output tokens
Fast mode pricing is set at:
$10 per million input tokens
$50 per million output tokens
Furthermore, Anthropic said that fast mode for Opus 4.8 is now ‘three times more cost-effective’ than on earlier models while responding 2.5 times faster.
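As a worked example of what these rates mean in practice, the sketch below estimates the cost of a single job from its token counts. The per-million-token prices are the ones quoted above; the function name `estimate_cost` and the sample job size (200,000 input tokens, 50,000 output tokens) are assumptions chosen for illustration.

```python
# USD per million tokens, as quoted in the article.
PRICES = {
    "standard": {"input": 5.00, "output": 25.00},
    "fast": {"input": 10.00, "output": 50.00},
}


def estimate_cost(input_tokens: int, output_tokens: int, mode: str = "standard") -> float:
    """Estimate the cost in USD of one job at the quoted rates."""
    rates = PRICES[mode]
    total = input_tokens * rates["input"] + output_tokens * rates["output"]
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical job: 200,000 input tokens and 50,000 output tokens.
    for mode in PRICES:
        print(f"{mode}: ${estimate_cost(200_000, 50_000, mode):.2f}")
```

For this sample job the estimate comes to $2.25 in standard mode and $4.50 in fast mode, since the quoted fast-mode rates are exactly double the standard ones.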
Source link: Hindustantimes.com.






