Claude Opus 4.8: better math and coding, higher token use
Anthropic released Claude Opus 4.8 with gains in math, coding and safety at the same prices. The model also consumes far more tokens, with one coding prompt draining a Pro quota, and it produced weaker one-shot creative writing.
Anthropic released Claude Opus 4.8 about six weeks after Opus 4.7. List pricing is unchanged at $5 per million input tokens and $25 per million output tokens. Independent tests compared Opus 4.8 with Opus 4.7 and other competing models across creative writing, coding, math, logic, narrative reasoning and long-context recall.
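As a rough illustration of what those list prices mean per request, the sketch below estimates cost from token counts. The two prices come from the article; the function name, the request sizes and the 30% inflation figure are hypothetical, chosen only to show how higher token counts raise spend when prices stay flat.

```python
# List prices quoted in the article, in USD per million tokens.
INPUT_PRICE_PER_MTOK = 5.00
OUTPUT_PRICE_PER_MTOK = 25.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate cost at list price; ignores caching, discounts and plan quotas."""
    total = (input_tokens * INPUT_PRICE_PER_MTOK
             + output_tokens * OUTPUT_PRICE_PER_MTOK)
    return total / 1_000_000


# Hypothetical request sizes, for illustration only.
baseline = estimate_cost_usd(100_000, 20_000)

# The same request if token counts were 30% higher (an assumed figure,
# not one reported by the testers).
inflated = estimate_cost_usd(130_000, 26_000)

print(f"baseline: ${baseline:.2f}, inflated: ${inflated:.2f}")
```

Under these assumptions the per-token price is identical in both cases; only the token counts change, which is why unchanged list prices can still mean a higher bill.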
In single-pass creative writing tests, Opus 4.8 produced vivid description and a closed paradox in a time-travel story set in the Orinoco delta. The piece concluded with the line “It worked perfectly. It always had.” Testers described the prose as competent but less fluid and less surprising than that of several rivals, and slightly behind Opus 4.7 in momentum and clarity when given a default prompt. They noted that multi-step prompting or higher-effort settings could change those results.
In a one-prompt coding test, Opus 4.8 produced a playable typing-zombie game called Typing Dead, with strong visuals and mechanics, and it corrected its own errors during inference. Follow-up prompts tended to polish the code rather than break it. Testers reported that a single coding prompt used the entire monthly Pro token quota, which would limit sustained development for users on that tier unless they upgrade or increase their API spend.
On a FrontierMath benchmark problem, Opus 4.8 solved a degree-19 polynomial construction and computed p(19) as 1,876,572,071,974,094,803,391,179 using the correct recurrence and a Dickson/Chebyshev-style construction. Opus 4.7 failed the same task under the same test conditions.
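For readers unfamiliar with the term, the sketch below evaluates Dickson polynomials of the first kind through their standard three-term recurrence, D_0 = 2, D_1 = x, D_n = x·D_(n-1) − a·D_(n-2). It is not the benchmark problem, whose statement the article does not give, and it makes no attempt to reproduce the reported value of p(19). The function name and sample inputs are illustrative.

```python
def dickson(n: int, x: int, a: int = 1) -> int:
    """Evaluate the Dickson polynomial of the first kind D_n(x, a) exactly.

    Uses the recurrence D_n = x * D_(n-1) - a * D_(n-2)
    with D_0 = 2 and D_1 = x. Python integers are arbitrary precision,
    so high-degree values do not overflow.
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    prev, curr = 2, x  # D_0 and D_1
    if n == 0:
        return prev
    for _ in range(n - 1):
        prev, curr = curr, x * curr - a * prev
    return curr


# Small sanity checks against the closed forms
# D_2 = x^2 - 2a and D_3 = x^3 - 3ax.
print(dickson(2, 5, 1))  # 5^2 - 2 = 23
print(dickson(3, 5, 1))  # 5^3 - 15 = 110
```

With a = 1 these polynomials satisfy D_n(x) = 2·T_n(x/2), where T_n is the Chebyshev polynomial of the first kind, which is the link behind the "Dickson/Chebyshev" label. A call such as dickson(19, 3) returns an exact integer for degree 19.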
In a linguistic-trap logic test asking whether “a man can marry his widow’s sister,” the model flagged the contradiction, stating explicitly that “if a man has a widow, he is dead.” It then answered the literal reading and analyzed the intended question, citing the Deceased Wife’s Sister’s Marriage Act 1907 and the Falkland Islands Marriage Ordinance.
In a narrative reasoning whodunit, Opus 4.8 produced a confident but incorrect solution, constructing a persuasive alibi for the true culprit and assigning blame to another character. Opus 4.7 reached the correct answer on the same scenario. Testers observed that the newer model generated internally consistent but wrong chains of reasoning in that case.
Long-context testing produced mixed results. A 300,000-token haystack failed to process. With an 85,000-token file, the model located the implanted lines but declined to report them, citing suspected prompt injection or atypical testing.
Testers attributed the higher token consumption to a less efficient tokenizer in Opus 4.8, which produces more tokens for identical inputs. With list prices unchanged, developers on limited monthly quotas can wait for the quota to reset, upgrade to a higher tier such as Claude Max, or move to alternative providers with larger quotas or lower token costs.
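The long-context test described above follows the common needle-in-a-haystack pattern: implant known lines in filler text, then check whether the model's answer contains them. The sketch below shows one way such a test could be assembled and scored. It is not the testers' harness; the function names, filler text and implanted lines are all hypothetical.

```python
import random


def build_haystack(filler, needles, seed=0):
    """Insert each needle line at a random position among the filler lines."""
    rng = random.Random(seed)  # fixed seed keeps the test reproducible
    lines = list(filler)
    for needle in needles:
        lines.insert(rng.randrange(len(lines) + 1), needle)
    return "\n".join(lines)


def recall(needles, model_answer):
    """Fraction of implanted lines that appear verbatim in the model's answer."""
    if not needles:
        return 0.0
    found = sum(1 for needle in needles if needle in model_answer)
    return found / len(needles)


# Hypothetical filler and implanted lines, for illustration only.
filler = [f"Log entry {i}: nothing unusual." for i in range(1000)]
needles = ["The vault code is 7421.", "The courier arrives at dusk."]
haystack = build_haystack(filler, needles)

# A refusal scores zero under verbatim matching, even if the model
# located the lines internally.
print(recall(needles, "I will not report these lines."))
```

Verbatim matching is a deliberate simplification. It cannot distinguish a model that failed to find the lines from one that found them and declined to repeat them, which is the distinction the 85,000-token result turns on.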