Researcher says he jailbroke Anthropic’s Fable 5 in 48 hours
Pliny the Liberator says he jailbroke Anthropic’s Fable 5 within 48 hours using homoglyphs, long‑context framing, decomposition and a jailbroken Opus 4.8.
An AI and cybersecurity researcher known as Pliny the Liberator says he bypassed safety controls in Anthropic’s Claude Fable 5 within 48 hours of the model’s launch. Pliny reported using a jailbroken Opus 4.8 instance combined with layered prompting techniques to elicit responses Fable 5 was designed to block.
Fable 5 was released as a safety-tuned variant of Anthropic’s more powerful Mythos model. Anthropic routed sensitive prompts to an earlier, less capable model and configured the system to return notifications instead of direct answers for restricted queries.
Pliny described the technical methods he used: replacing characters with look‑alike Unicode homoglyphs, framing requests as fiction or academic exercises, supplying lengthy context, and breaking queries into many small, innocuous subrequests whose answers were later recomposed. He described decomposition followed by backend recomposition as among the most effective approaches.
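To illustrate why look‑alike characters can slip past a plain text match, and how a filter might notice them, here is a minimal defensive sketch in Python. It is not Anthropic's method, and nothing in the reporting describes how Fable 5's safety layer handles Unicode. The function names `scripts_in` and `mixed_script_words` are hypothetical, and taking the first word of a character's Unicode name as its "script" is a rough heuristic assumed for this example only.

```python
import unicodedata


def scripts_in(word: str) -> set[str]:
    """Return the rough script labels of the letters in `word`.

    Heuristic (an assumption of this sketch): the first word of a
    character's Unicode name, e.g. "LATIN" or "CYRILLIC".
    """
    scripts = set()
    for ch in word:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name:
                scripts.add(name.split()[0])
    return scripts


def mixed_script_words(text: str) -> list[str]:
    """Return whitespace-separated words whose letters span more than one script."""
    return [w for w in text.split() if len(scripts_in(w)) > 1]


# "p\u0430ypal" uses CYRILLIC SMALL LETTER A (U+0430) in place of Latin "a".
sample = "hello p\u0430ypal world"

# The two strings look alike on screen but are not equal, so a naive
# keyword match on "paypal" would miss the look-alike spelling.
print("p\u0430ypal" == "paypal")

# The mixed-script check flags the word that blends Latin and Cyrillic letters.
print(mixed_script_words(sample))
```

A check like this only flags words that mix scripts. A word written entirely in look‑alike characters from a single script would pass, which is one reason such heuristics are usually treated as one signal among several rather than a complete defence.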
In demonstrations posted by Pliny, discrete factual answers returned by the model were assembled to reconstruct instructions the safety layer was intended to block. One example referenced staged questions about the Birch reduction method that, when combined, could outline a route to methamphetamine synthesis; Pliny said he shared that example to illustrate the jailbreak technique rather than to provide instruction.
Anthropic said it ran internal testing and an external bug bounty during the Fable 5 launch and found no universal jailbreaks in more than 1,000 hours of testing. The company did not immediately comment on Pliny’s claims.
Pliny rose to prominence in 2024 for publishing jailbreak prompts for several large language models and posting alerts after new model releases. He posted messages describing the model’s safety layer as “overly sensitive” and referred to collaborators as “my lil liberators” who found “holes in the fence.”
Researchers and developers have raised concerns that jailbroken models could be used to generate harmful instructions, including ways to attack software and decentralised finance protocols. Princeton researcher Sayash Kapoor criticised the rollout of Fable 5’s guardrails and described widespread frustration among researchers who wanted to use powerful models for legitimate security and scientific work.
Further testing and verification are likely to follow as other researchers attempt to reproduce Pliny’s techniques. Model developers typically update filters and guardrails in response to disclosed jailbreaks and bug reports.
The material on GNcrypto is intended solely for informational use and must not be regarded as financial advice. We make every effort to keep the content accurate and current, but we cannot warrant its precision, completeness, or reliability. GNcrypto does not take responsibility for any mistakes, omissions, or financial losses resulting from reliance on this information. Any actions you take based on this content are done at your own risk. Always conduct independent research and seek guidance from a qualified specialist. For further details, please review our Terms, Privacy Policy and Disclaimers.






