DeepSeek reveals R1 training cost just $294,000
Chinese AI firm DeepSeek disclosed in Nature that training its R1 model cost just $294,000, intensifying debate over China's challenge to US rivals such as OpenAI, which report far higher expenses.
FILE - The DeepSeek app page is seen on a smartphone screen in Beijing, Tuesday, January 28, 2025. (AP Photo/Andy Wong, File)
Chinese artificial intelligence developer DeepSeek has revealed that training its latest reasoning-focused model, R1, cost only $294,000, a fraction of what US competitors are reported to spend. The disclosure appeared in a peer-reviewed article published Wednesday in the journal Nature.
The company, based in Hangzhou, said R1 was trained on 512 Nvidia H800 chips for 80 hours. Founder Liang Wenfeng, listed as a co-author, has been largely absent from public view since January, when the firm's release of lower-cost AI systems triggered a sell-off in global tech stocks amid fears that such systems could challenge industry leaders like Nvidia.
Training large language models typically requires extensive computing power and vast datasets, making costs a focal point in the race to develop advanced AI. In 2023, OpenAI chief executive Sam Altman said that building foundational models had cost his company "much more" than $100 million, though no detailed figures were provided. DeepSeek's figures, if accurate, represent a direct challenge to the notion that only firms with nine-figure budgets can compete at the cutting edge of AI.
Chip access and export controls
DeepSeek acknowledged in supplementary material that it owns A100 chips, which it used in early development stages to run smaller experiments before switching to H800 processors for the full training run. US officials told Reuters in June that the firm had access to "large volumes" of Nvidia's more advanced H100 chips despite export restrictions imposed in 2022. Nvidia, however, maintained that DeepSeek used H800s obtained through legal channels. The ambiguity has sharpened concerns in Washington about whether export controls can effectively limit China's AI progress.
Distillation debate resurfaces
The Nature article also touched on a controversy surrounding model distillation, a process in which one AI system is trained on the outputs of another. A White House adviser and several US AI experts previously suggested that DeepSeek had relied heavily on distillation from OpenAI's models. The company has defended the method, saying it improves efficiency and lowers costs, allowing broader access to powerful AI systems.
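For readers unfamiliar with the technique, the sketch below shows distillation in its simplest form: a small "student" model is trained to match the softened output distribution of a larger "teacher." The toy networks, random data, and hyperparameters here are illustrative assumptions only and bear no relation to DeepSeek's or OpenAI's actual pipelines.

```python
# Minimal sketch of knowledge distillation (assumed, illustrative setup):
# a small "student" network learns to mimic a larger "teacher" by matching
# its temperature-softened output probabilities.
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy stand-ins for a large pretrained model and a smaller one to train.
teacher = nn.Sequential(nn.Linear(128, 512), nn.ReLU(), nn.Linear(512, 10))
student = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))

optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)
temperature = 2.0  # softens distributions so the student sees richer signal

for step in range(100):
    x = torch.randn(32, 128)          # stand-in for real input data
    with torch.no_grad():
        teacher_logits = teacher(x)   # teacher outputs serve as targets
    student_logits = student(x)
    # KL divergence between softened student and teacher distributions,
    # scaled by temperature**2 as is conventional for distillation losses.
    loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature**2
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

The appeal of the approach, as DeepSeek's defense suggests, is that the expensive model runs only in inference mode while the cheap one does the learning, which is why distillation can cut training costs so sharply.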
In January, DeepSeek confirmed that some of its models were distilled from Meta's Llama system. In its latest paper, the firm noted that training data for its V3 model included web pages containing "a significant number of OpenAI-model-generated answers," which may have indirectly shaped its development. The researchers added, however, that this was "not intentional but rather incidental."
Competition with OpenAI
The disclosure underscores the intensifying rivalry between DeepSeek and OpenAI, two firms that embody the larger US-China contest over artificial intelligence. While OpenAI emphasizes safety alignment and has kept most of its training processes proprietary, DeepSeek has pursued efficiency and transparency by publishing technical details that position it as a viable low-cost alternative.
The contrast between OpenAI's billion-dollar closed ecosystem and DeepSeek's claim of training a competitive reasoning model for under $300,000 raises questions about whether compute scale remains the key moat for US AI firms. Investors and policymakers alike are watching whether DeepSeek's approach marks the beginning of a new phase in global AI competition, one in which efficiency, not just raw scale, determines leadership.