DeepSeek R1 AI model trained for $294K

Chinese AI developer DeepSeek published detailed training costs for its R1 reasoning model in a Nature journal article on September 18. The company spent $294,000 to train the model using 512 Nvidia H800 GPUs over 80 hours.
The Hangzhou-based firm provided these figures in supplementary materials accompanying the peer-reviewed paper. The H800 is a deliberately scaled-down variant of Nvidia's flagship chips, built specifically for the Chinese market after U.S. export controls restricted access to the H100 and A100 processors.
DeepSeek revealed it also owns A100 chips, which were used during preparatory stages before the main R1 training began on the H800 cluster. U.S. officials have previously questioned how the company accessed restricted hardware. Nvidia has stated that DeepSeek acquired H800 chips through legal channels.
The Nature paper addresses the ongoing debate about "distillation," a technique where models learn from other AI systems. DeepSeek acknowledged that training data for its V3 model included web pages containing "a significant number of OpenAI-model-generated answers." The company described this inclusion as incidental rather than intentional.
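To make the technique concrete: in its simplest form, distillation trains a "student" model to match a "teacher" model's full output distribution rather than learning only from hard labels. A minimal sketch of the standard distillation objective (a temperature-softened KL divergence; the numbers below are illustrative, not from the paper):

```python
import math

def softmax(logits, temperature=1.0):
    """Convert raw logits to a probability distribution, softened by temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence between softened teacher and student distributions.

    Minimizing this trains the student to mimic the teacher's whole
    output distribution, not just its top answer."""
    p = softmax(teacher_logits, temperature)  # teacher "soft targets"
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# A student whose distribution tracks the teacher's incurs a smaller
# loss than one that diverges from it.
teacher = [3.0, 1.0, 0.2]
close_student = [2.8, 1.1, 0.3]
far_student = [0.2, 1.0, 3.0]
print(distillation_loss(teacher, close_student) <
      distillation_loss(teacher, far_student))  # True
```

The debate in DeepSeek's case concerns a looser sense of the term, in which one model's outputs end up in another's training corpus, rather than this explicit teacher-student setup.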
"Regarding our research on DeepSeek-R1, we utilized the A100 GPUs to prepare for the experiments with a smaller model," the authors wrote in supplementary materials. The main R1 training then ran "for a total of 80 hours" on the H800 cluster.
The $294,000 training cost contrasts sharply with figures from U.S. competitors. OpenAI CEO Sam Altman has stated that training frontier AI systems costs "much more" than $100 million, though his company has not released specific budget breakdowns.
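The headline figure can be sanity-checked with simple arithmetic. Assuming the 512 GPUs ran concurrently for the full 80 hours and the $294,000 covers only that compute, the disclosure implies a per-GPU-hour rate:

```python
# Figures from DeepSeek's Nature supplementary materials.
gpus = 512            # Nvidia H800s in the training cluster
hours = 80            # reported wall-clock training time for R1
total_cost = 294_000  # reported training cost in USD

gpu_hours = gpus * hours       # total GPU-hours consumed
rate = total_cost / gpu_hours  # implied cost per GPU-hour

print(f"{gpu_hours:,} GPU-hours at ~${rate:.2f}/GPU-hour")
```

This works out to roughly $7 per H800-hour, in line with commercial GPU rental pricing; note the figure excludes the preparatory A100 experiments and any earlier-stage or base-model training.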
R1's release in January triggered a sharp sell-off in AI-related stocks, as investors worried that lower-cost competitors could challenge established AI companies. The cost disclosure provides the first official glimpse into how DeepSeek achieved competitive performance at a fraction of reported U.S. training budgets.
