INTELBRIEF
February 6, 2025
The Geopolitics of DeepSeek: Narratives, Perception, and the AI Race
Bottom Line Up Front
- Through a series of advanced training techniques and architectural choices, PRC-based DeepSeek has reportedly developed its high-performing reasoning AI model with limited financial and computing resources.
- Skepticism about the computing power utilized, DeepSeek’s access to chips, and its training costs and techniques is warranted.
- By making its models largely open-source, DeepSeek democratizes access to advanced AI, challenging the dominant proprietary AI industry, while remaining strategically silent on its training data and some aspects of the training process.
- DeepSeek’s cloud-hosted models could serve as a powerful intelligence collection tool for the CCP, while also facilitating the dissemination of its narratives through stringent censorship.
DeepSeek's R1 model, released on January 20 as an open-source large language model (LLM), has ignited concerns and skepticism amid the ongoing U.S.-China race for technological supremacy. Developed by DeepSeek, which is backed by the Chinese hedge fund High-Flyer Capital Management, the model's launch triggered major volatility in U.S. stock markets, with Nvidia's market value dropping by $590 billion in a single day. The widely circulated claim that DeepSeek's V3 model cost only $5.6 million to train, one of the major drivers of the panic, is misleading: the figure covers only the final training run and excludes the components that typically account for the bulk of costs, including research and development for prior iterations, hardware, and other operational expenses. While the model has been praised for its efficiency, with claims of low computational cost and strong performance on various benchmarks comparable to OpenAI's closed-source o1 model, these assertions warrant careful scrutiny to assess their accuracy and actual implications.
From a technical point of view, DeepSeek’s latest R1 model appears to be a remarkable feat. Through a series of advanced training techniques and architectural choices, DeepSeek has allegedly developed its models relatively cheaply and with limited computing resources, despite U.S. export controls restricting sales of advanced chips to the PRC. Concretely, this involved a range of optimizations, including a Mixture of Experts architecture, model distillation, and reduced-precision training techniques that minimize the need for advanced chips and massive computing power. Perhaps most notably, DeepSeek used Group Relative Policy Optimization, a more computationally efficient reinforcement learning method that evaluates answers from the AI model in groups, providing feedback based on how each answer compares to the others in its group.
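To illustrate the idea, the sketch below shows the group-relative scoring step that gives the method its name. It is a simplified, hypothetical rendering for intuition only; DeepSeek's actual training code has not been released, and the function and variable names are illustrative.

```python
# Minimal sketch of the group-relative advantage computation behind Group
# Relative Policy Optimization (GRPO). Instead of training a separate value
# ("critic") model, each sampled answer is scored against the average of its
# own group, which cuts the compute needed for reinforcement learning.
import numpy as np

def group_relative_advantages(rewards, eps=1e-8):
    """Return each answer's advantage relative to its group's mean reward."""
    r = np.asarray(rewards, dtype=np.float64)
    return (r - r.mean()) / (r.std() + eps)

# Example: four candidate answers to the same prompt, scored by a rule-based
# reward (e.g., 1.0 if the final answer is correct, 0.0 otherwise).
# Above-average answers receive positive advantages and are reinforced;
# below-average answers are pushed down.
print(group_relative_advantages([1.0, 0.0, 0.5, 1.0]))
```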
Yet claims about computing power and training methods should not be taken at face value. While DeepSeek says it trained on Nvidia H800 chips, a less powerful processor designed to comply with U.S. export restrictions, the U.S. Department of Commerce is investigating whether the company may have gained access to restricted Nvidia chips through indirect channels. There are also multiple signals pointing toward violations of OpenAI’s Terms of Service. DeepSeek may have engaged in inappropriate model distillation, a process in which a smaller, simpler model is trained to mimic the behavior of a larger, more complex model, capturing its answers while requiring far less compute. Microsoft believes that in 2024 individuals linked to DeepSeek pulled substantial data from OpenAI’s Application Programming Interface. Moreover, DeepSeek’s V3 model, released last December, occasionally referred to itself as ChatGPT, a potential though not conclusive indication that inappropriate distillation took place, a pattern consistent with the CCP’s widespread industrial espionage and IP theft.
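For readers unfamiliar with the technique, the sketch below illustrates black-box (sequence-level) distillation in its simplest form: harvesting a stronger model's answers and turning them into supervised training data for a smaller student. The teacher function, dataset, and prompts are hypothetical placeholders, not a reconstruction of anything DeepSeek is alleged to have done.

```python
# Illustrative sketch of black-box model distillation: a student model is
# fine-tuned on answers produced by a stronger teacher, inheriting much of its
# behavior at a fraction of the compute. The teacher here is a stand-in
# function; no real API is called and no real pipeline is implied.
from dataclasses import dataclass

@dataclass
class Example:
    prompt: str
    teacher_answer: str

def query_teacher(prompt: str) -> str:
    """Placeholder for a call to a large proprietary model's API."""
    return f"[teacher's answer to: {prompt}]"

def build_distillation_set(prompts: list[str]) -> list[Example]:
    """Collect (prompt, answer) pairs to use as supervised fine-tuning data."""
    return [Example(p, query_teacher(p)) for p in prompts]

dataset = build_distillation_set(["Briefly explain mixture-of-experts models."])
# A student would then be trained with ordinary next-token cross-entropy on
# "prompt + teacher_answer", which is why terms of service for hosted models
# typically prohibit using their outputs to train competing systems.
for ex in dataset:
    print(ex.prompt, "->", ex.teacher_answer)
```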
One of the most interesting components of DeepSeek’s strategy in the current geopolitical landscape is the release of its models as largely open source under the permissive MIT license, at a time when proprietary AI dominates the field. By making its models largely open source, DeepSeek effectively democratizes access to advanced AI technology and makes restrictions on its use all but unenforceable: anyone can self-host DeepSeek’s models on their own hardware, as sketched below. This renewed appreciation for open-source models is also a vindication for companies like Meta and IBM, which have been among a handful of U.S.-based actors releasing their models openly. Yet DeepSeek’s R1 release does not fully meet open-source requirements: according to the Open Source Initiative, users must be able to use the model without seeking permission, study how the system works and inspect its components, modify the system for any purpose, and share the model with others with or without modifications. While DeepSeek has released the model weights for R1, it has not provided documentation of its training data or the complete source code used for training.
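A minimal sketch of that self-hosting path follows, assuming one of the distilled R1 checkpoints DeepSeek published on Hugging Face and the standard transformers library; the model ID and generation settings are illustrative assumptions rather than a recommended configuration.

```python
# Minimal local-inference sketch: loading an openly released R1 checkpoint
# with the Hugging Face transformers library. The checkpoint name below is an
# assumption; adjust the model ID and settings for available hardware.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"  # assumed repository name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "Summarize the MIT license in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# The weights run entirely on local hardware, so no prompt or response ever
# reaches DeepSeek's cloud servers; this is what makes downstream restrictions
# on openly released weights so difficult to enforce.
output = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```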
DeepSeek's rapid ascent to become the most downloaded application on Apple's App Store raises significant security concerns, similar to those surrounding other PRC-owned applications like TikTok and RedNote. According to its privacy policy, DeepSeek stores user data from its cloud-hosted models on servers within the PRC. The collected data encompasses device information, keystroke patterns, IP addresses, chat histories, system language, and performance metrics. This extensive data collection could provide significant intelligence-gathering opportunities for the CCP, which has passed multiple laws requiring private companies to cooperate for national security purposes. These concerns have prompted governments and organizations to restrict, or ban outright, use of the application.
In addition to intelligence and privacy concerns, blatant security vulnerabilities have already emerged. Successful cyberattacks have compromised the application, and cloud security firm Wiz discovered an exposed database containing chat histories and sensitive information. The model also demonstrates clear censorship of topics sensitive to the CCP. When users inquire about subjects such as Taiwan independence, the 1989 Tiananmen Square massacre, or other topics typically blocked by China's Great Firewall, the model systematically deflects to other topics. This highlights both the limitations of LLMs as knowledge hubs and their potential as vehicles for interference and propaganda, a particularly useful tool for authoritarian governments seeking to reach and influence foreign audiences.
AI is, alongside quantum computing, a key component in the U.S.-China race for technological supremacy. Both AI, particularly machine learning and predictive analytics, and quantum computing could enable China to offset its relative lack of military experience compared to the United States by enhancing the real-time decision-making integrated into its command-and-control structures. Part of the PRC’s goal of achieving “military modernization” by 2035 is for the PLA to attain the military-technological development stage of “intelligentization,” the integration of AI, quantum computing, and other emerging technologies. While DeepSeek's R1 model may seem like a remarkable achievement in optimization and a win for the open-source AI community, it is crucial to remain skeptical about both its technical and security implications. As more details unfold about its development and censorship practices, it will become clear that the AI race is as much about shaping narrative and perception as it is about actual scientific progress. The CCP, adept at leveraging media and information campaigns, is likely to prioritize crafting an image of technological supremacy alongside innovation and scientific development. Research from Graphika reveals how PRC state-linked social media accounts amplified narratives celebrating DeepSeek’s AI models before news of the R1 launch caused a dramatic drop in U.S. tech stocks. This underscores the need to approach DeepSeek's rise with caution, recognizing the role of strategic storytelling in the broader geopolitical tech race.