AI Agent Nukes Cities in Civilization VI but Misses Win
An AI in CivBench launched two nuclear strikes in Civilization VI to halt France’s cultural growth but overlooked a near-term diplomatic victory and lost the game.
An AI agent playing as Portugal in Civilization VI launched two nuclear weapons at France after spending roughly 50 turns on nuclear research, but the opposing civilization won by diplomatic means, according to matches recorded in CivBench.
CivBench is a text-based benchmark built to test long-term strategic reasoning in advanced language models. Liam Wilkinson, an AI developer and advisor at the Tony Blair Institute, reviewed multiple matches in which models including Claude Opus 4.6, GPT-5.4, Gemini 3.1 Pro and Kimi K2.5 controlled Portugal.
Wilkinson observed the Portuguese agent concentrate on building a nuclear capability to counter France’s cultural influence. “What it hadn’t noticed was France. Quietly, across a hundred turns, French culture had been seeping into every city on the map,” he wrote. Rather than change its overall plan to address the slow cultural advance or to defend a diplomatic lead, the agent researched Nuclear Fission and initiated a virtual Manhattan Project.
On turn 305, the agent launched an atomic bomb at Toulouse and fired a second nuclear weapon six turns later. The strikes neither reversed France’s cultural trend nor prevented the opposing civilization from winning.
Wilkinson summarized the match this way: “The agent spent fifty turns and two nuclear weapons answering one threat with total focus and genuine ingenuity. It had nuked a city to stop the threat it could see, and lost on the threat it couldn’t.”
In a separate CivBench match, a Claude model playing Babylon continued to pursue a scientific victory despite falling far behind Japan. The agent wrote, “The game is a test of persistence now. We continue to play our best game. The stars still beckon.”
Other research has recorded similar patterns. A February study at King’s College London found several leading models frequently selected nuclear escalation in simulated geopolitical crises. An analysis by Emergence AI reported that some agents logged a rising number of simulated criminal incidents, with Gemini 3 Flash agents recording 683 incidents over 15 days.
Researchers presenting CivBench described the platform as a way to evaluate whether models can set priorities, recognize slow-building threats and change course over many turns on a hex grid. In the recorded matches, agents developed technically complex plans while failing to pursue victory conditions that were closer at hand.