Dissatisfied with Nvidia’s new chips, OpenAI begins searching for alternatives
OpenAI has begun exploring alternatives to Nvidia for inference workloads, dissatisfied with the speed of Nvidia’s latest GPUs. Negotiations over Nvidia’s proposed $100 billion investment in OpenAI have also stalled.
OpenAI has begun actively seeking alternatives to Nvidia for inference – the stage where models generate responses to user queries. Eight sources said the company is dissatisfied with the performance of several of Nvidia’s newer GPUs and is reevaluating its hardware strategy.
According to these sources, OpenAI believes the processing speed for certain types of requests – especially those involving programming tasks and agentic workflows – is insufficient. Internally, the issue became most visible in Codex, the company’s code-generation tool. Employees attributed some of its weaknesses to limitations in Nvidia’s GPUs.
In response, OpenAI began testing AMD products last year and initiated discussions around specialized chips from Cerebras and Groq. Talks with Groq, however, stalled after Nvidia signed a roughly $20 billion technology-licensing deal with the company. Nvidia also hired several Groq architects, according to sources.
Despite exploring alternatives, OpenAI emphasizes that Nvidia still powers “the majority” of its inference infrastructure and remains the best option in terms of performance per dollar. Nvidia, for its part, dismissed reports of tension as “nonsense” and reaffirmed its intention to continue major investments in OpenAI.
Investment negotiations have also slowed. Last fall, Nvidia was in talks to invest up to $100 billion in OpenAI – a deal that would give it a stake in the startup and preferred access to its most advanced GPUs. But OpenAI’s evolving requirements for computational architectures have complicated the discussions, delaying an agreement that was initially expected within weeks.
Sources say OpenAI is seeking next-generation hardware designs built around large amounts of on-chip SRAM. This architecture can dramatically accelerate inference by reducing data-retrieval latency – a key bottleneck for chat models and agentic tools handling millions of daily requests.
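To see why on-chip memory matters here, consider that single-stream decoding in large language models is roughly memory-bandwidth-bound: generating each token requires streaming essentially all of the model’s weights through the memory system once. The minimal sketch below illustrates that relationship; all model-size and bandwidth figures are illustrative assumptions for the sake of the arithmetic, not vendor specifications or numbers from this report.

```python
# Back-of-envelope: why memory bandwidth dominates inference latency.
# All figures below are illustrative assumptions, not vendor specs.

def decode_tokens_per_sec(param_bytes: float, mem_bandwidth_gbps: float) -> float:
    """Rough upper bound on single-stream decode speed for a
    memory-bandwidth-bound model: each generated token requires
    streaming (roughly) all weights through the memory system once."""
    return (mem_bandwidth_gbps * 1e9) / param_bytes

# Hypothetical 70B-parameter model stored in 8-bit weights (~70 GB).
weights = 70e9  # bytes

# Illustrative bandwidths: off-chip HBM vs. aggregate on-chip SRAM.
# SRAM-heavy designs (the approach associated with Cerebras and Groq)
# claim far higher on-chip bandwidth; exact numbers vary by product.
hbm_gbps = 3_000      # ~3 TB/s, typical order for a modern HBM stack
sram_gbps = 100_000   # ~100 TB/s, a plausible aggregate on-chip figure

print(f"HBM-bound:  ~{decode_tokens_per_sec(weights, hbm_gbps):.0f} tok/s")
print(f"SRAM-bound: ~{decode_tokens_per_sec(weights, sram_gbps):.0f} tok/s")
```

Under these assumed numbers the SRAM-bound design generates tokens roughly 30x faster per stream, which is the kind of gap that matters most for latency-sensitive workloads like code generation and agentic tools.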
Competitors are already taking different paths: Anthropic and Google rely on Google’s TPUs – accelerators optimized for inference workloads and known for more consistent response speeds.
In January, OpenAI CEO Sam Altman said software-development customers “prioritize speed above everything else” and announced a partnership with Cerebras. He also noted that such requirements are less critical for everyday ChatGPT users.
As the AI market shifts from model training to large-scale deployment, the race for faster inference has become a defining competitive front – and, according to documents and sources, OpenAI is for the first time signaling a potential technological break from Nvidia’s dominance.