What makes DeepSeek great? How will DeepSeek’s moment impact the trajectory of AI in China and America?
To find out, ChinaTalk interviewed Kevin Xu, formerly of GitHub and Obama’s Press Office. Kevin is also the founder of Interconnected and Interconnected Capital.
In this knockout roundup, we explore…
What DeepSeek does and doesn’t illustrate about Chinese innovation,
Tensions between open-source cosmopolitanism and nationalism in DeepSeek and the broader Chinese tech community,
DeepSeek’s organizational and talent management strategy, parallels to OpenAI, and what the fame will mean for the firm and Chinese AI policy,
What DeepSeek should and may mean for the future of export controls and broader US innovation policy.
Have a listen on Spotify, iTunes, or your favorite podcast app.
Organizing Greatness
Jordan Schneider: Kevin, what makes DeepSeek great?
Kevin Xu: My one-liner description of DeepSeek for anyone catching up is that DeepSeek’s newest model is very capable, very affordable, and very open-sourced. DeepSeek as a company, startup, or what I would actually call more of a university-like AI research lab, is also very idiosyncratic.
They’re idiosyncratic because what they have accomplished and how they’re set up does not represent the norm of how most Chinese tech companies, US companies, or really any big tech companies around the world are set up. This comes out in three different dimensions that I wrote about in my newsletter right after New Year, after reading the V3 technical paper.
First, the lab has no pressure to commercialize whatsoever. While they have an API that charges some money, it’s probably at cost or slightly above cost just to recoup what they put into training the model. They are not OpenAI — they have no revenue expectations. They probably don’t have a sales team and likely never will. This has given the entire team significant freedom to quickly iterate and improve their model based on the latest open research.
Second, they run their own data center. They’ve been running their own data center since probably 2019, before ChatGPT and before export controls. This setup stems from their roots as a quant fund, as all quant funds probably run their own on-prem data centers. This allows them to maximize the speed and efficiency of trading algorithms with their purchased hardware to get the most return for their LPs or fund. That level of hardware-to-software expertise from running their own data center has led to much of the mind-blowing innovation, discovery, or new paradigm that commentators have been discussing recently.
There is significant software-to-hardware optimization around network bandwidth and traffic load balancing. Recently, in the R1 paper, the engineers are literally going a layer below CUDA to work at the assembly language layer to control different parts of their GPUs, squeezing the most performance out of them to provide a model that is both very capable and affordable.
Because they have no commercial goals, strategy, or pressure, they’re willing to open source as much knowledge as they feel like. Theirs is probably the most open model we have on the market to date. All of this makes DeepSeek both very impressive and interesting, but also a unique example of what is happening in AI. Most people should pause before extrapolating a larger pattern from what DeepSeek has done.
Jordan Schneider: Let’s talk about hiring practices and labor design. This builds off a piece we just ran on the ChinaTalk newsletter from JS Tan, a PhD candidate at MIT who used to work in the cloud industry. There are at least three parallels between OpenAI and DeepSeek.
We have a very young CEO who is doing this out of the goodness of his heart, curiosity, and super long-run vision about how this is going to pay off — parallel number one to OpenAI. I remember in 2019, when a journalist asked Sam Altman how they were going to make money, he essentially said, “I have no idea. We’ll figure it out at some point.”
Parallel number two to OpenAI — DeepSeek is super small. OpenAI in the pre-ChatGPT era was in the hundreds. In 2020, it had literally 150 people, which is roughly the number of engineers DeepSeek currently runs with.
They both have a bias towards young and truly exceptional talent that they’re willing to pay as much as needed to get, outbidding the likes of ByteDance, Tencent, and Alibaba. There is all this lore from the OpenAI era where you had a 25-year-old intern or a 26-year-old hire basically coming up with key fundamental innovations that ended up unlocking GPT-3.
We’re at a point in AI engineering where having decades of experience is as much of a hindrance as it is an advantage in really trying to push the frontier of knowledge. To quote from Liang Wenfeng 梁文锋, the CEO of DeepSeek, “We need people who are extremely passionate about technology, not people who are used to using experience to find answers. Real innovation often comes from people who don’t have baggage.” A headhunter said they look for people with three to five years of work experience at most — any more than eight and you’re just a pass.
Generally, you hear stories about Chinese technology media and the Chinese tech ecosystem not hiring anyone over 30 or over 40, not because they’re inexperienced and can’t do the job, but because they can’t work a hundred hours a week, have kids, or might have a heart attack if pushed too hard. This is really different.
The other thing they’re doing — they have no KPIs, no hard organizational structure, no silos, no bake-offs between different teams competing against each other. That whole energy and ethos is, as you said Kevin, much more like a happy academic lab of young engineers super psyched to create the future. This contrasts with what’s presumably happening at Tencent, Alibaba, and even Google, where there is intense outside pressure to prove a return on investment in model making, which may not necessarily be the best way to deliver breakthrough innovations the way AI is set up today.
Kevin Xu: There are many ways we can break down this unique cultural or organizational structure. Big shout out to JS Tan for a really great piece that I found myself nodding along to because it’s very spot on.
It’s important to understand how DeepSeek could pull this off. Coming back to what I said about no commercial strategy, they happen to be able to benefit from a quant fund that works right next door, which can basically fund whatever they need so far. Their approach to talent, if I can summarize what you said, Jordan, is that it minimizes politics and maximizes potential.
In a field like AI and AGI development, none of us really knows how to get there; we all just have inklings. They reference OpenAI, look at what Llama is doing, probably look at what some other Chinese labs are doing, and have hunches of their own. It’s a killer to progress if you have a hunch in the middle of the night and then have to wait two weeks to get your GPUs before you can validate those hunches — based on my observation, that doesn’t exist at DeepSeek.
You have to minimize the organizational structure so you can get the resources you need almost immediately to validate these experimental hunches. You need young talent that doesn’t have the baggage of experience, prestige, or previous accomplishments, and that is willing to try random ideas that may or may not succeed, with no ramifications for failure.
Sam gave an interview to Bloomberg around New Year’s where he reminisced about that era of OpenAI. When they were trying to staff up OpenAI back in 2016 or so, they wanted to optimize for younger talent. They actually couldn’t get experienced talent to come work for them because building AGI at the time in Silicon Valley was a bit of a taboo. People didn’t want to be associated with it, preferring to work on something more utilitarian like self-driving, SaaS, or traditional machine learning applications.
DeepSeek has replicated that, maybe intentionally or not, in a way that is completely unencumbered from commercial need. Liang did try to raise venture capital, which goes into another nuance of the current state of Chinese VC — there’s zero appetite for the kind of idealistic vision or dream that DeepSeek is designed to pursue. He couldn’t raise any money even though he tried, but they had the people and some GPUs, so they gave it a shot.
You can contrast that with other well-known or well-funded AI labs in China, whether it’s Zhipu 智谱, Moonshot, or obviously all the big models coming out of Alibaba, Baidu, and Tencent. They have either internal ROI or commercialization goals to be accountable to, or they have state-backed or financial VCs from China to be accountable to. The way you build those companies is you have to hire well-known names in AI to attract talent and convince investors to give you money. DeepSeek just didn’t need to do that. Maybe they wanted to at a point in time, but clearly, they’ve given up on that path, and it has served them very well.
Jordan Schneider: What’s interesting is the contrast between this playbook and other big Chinese industrial policy efforts. Semiconductors is probably the most obvious example: SMIC was famously founded by former TSMC engineers, and CXMT followed the same playbook. Huawei poached enormous amounts of talent from Taiwan and South Korea to get its chip design and engineering up to speed.
There’s something really special and magical about AI today where decades of experience just happens to be more of a detriment than helpful. We are living in this incredible technological paradigm shift where people are discovering new laws, principles, and algorithms every day. Having the muscle memory of dozens of years of doing this back in a prior paradigm is actually more harmful than helpful, which is just the coolest thing.
Kevin Xu: Many folks are trying to fit what DeepSeek has done into some broader AI strategy out of China, another building block in the industrial policy conversation you’re talking about. What SMIC has done makes total sense because qualitatively they’re trying to catch up to something that’s known — semiconductor manufacturing. TSMC has been way ahead of anything in China for a very long time, and Samsung has been ahead too.
They poached Liang Mong-song 梁孟松, a former CTO-ish figure [excellent Asianometry video on him here]. While he didn’t get the CTO job at TSMC, he went to Samsung, helped Samsung catch up to TSMC, and then tried to play the same role at SMIC to help them catch up to Samsung and TSMC. All you need is one or two people who already know how this is done to get you to something that already exists as quickly as possible. That means getting the experienced person, having their halo effect get you the talent and equipment, and then proceeding with state subsidies.

This is qualitatively different from AI research or AGI development, where we’re all just maybe six months, nine months ahead of each other into the unknown. None of the halo effect of experience — well, I shouldn’t say none of it matters, but it matters so much less than in other well-established areas of industry and technology, like chip manufacturing.
Jordan Schneider: Liang clearly, at least given his interviews, is the most AGI-pilled of all the Chinese AI lab leaders, as well as the most open-source pilled. Kevin, before launching your fund, you worked at GitHub for a long time and had significant interaction with the global open source community as well as the community in China. Let’s discuss how you see their energy baked into the team and motivation at DeepSeek.
Kevin Xu: I’ll explain this in two parts. Part one is open source itself, which at the basic level is a way to develop and build technology in the open, letting anyone find, download, share, and change the code built in the open. The assumption is that code itself has very little value in isolation — there’s no need to keep it behind a license or closed doors. Building in the open is the best way to build technology. This has been a 40-plus year movement.
The movement really took root in China in the mid-2000s. The Linux Foundation, the nonprofit steward of Linux, the largest open source operating system in the world, has been doing events and hosting conferences in China since the mid-2000s. Looking at Liang’s personal history, that’s when he started college. He was probably a star high school student, very technically gifted.
There’s a generation of Chinese entrepreneurs around that age group — late 30s, early 40s now — who are very much open source enthusiasts. They believe it’s one of the best things that happened to their own discovery and self-development. Now that they’ve come of age, they’re very pro-open source because of its benefits and positive-sum elements.
Coming back to DeepSeek’s idiosyncrasies, Liang could pull this off because he has a fund that could fund this research project with no need to directly commercialize the open source model, which is very different from other competitors.
From a cultural perspective, inside the Chinese open source community, there’s something I loosely describe as open source zeal or calling — “kai yuan qinghuai” (开源情怀). This manifests in a strong corps of engineers in China who love open source. They used to use mostly Western open source software to do their work, help their companies catch up, and learn more about technology.
About six or seven years ago, there was a turning point where many of these engineers, likely including Liang, wanted to produce open source themselves. They wanted to contribute, make their own open source projects, and have the world use them. They get incredibly excited when Western firms, especially Silicon Valley firms with brand recognition, use their database, microservices, sidecar, or whatever infrastructure package.
They provide free support, fix bugs in the middle of the night, and answer questions in forums. For them, it’s a source of validation and approval. They understand how little respect Chinese engineering and technology gets — whenever China develops something, it’s usually labeled as cheating or stealing. It’s somewhat nationalistic but also engineering pride — they want to prove their open source products can be used and loved by Western firms too. They care less about money than that validation.
This “开源情怀” (open source ethic) is evident in what DeepSeek is doing and hopefully will continue to do, though it’s an open question if they can continue to open source in perpetuity.
Jordan Schneider: Kevin, I’m going to hit you with another one — “崇洋媚外” (to revere foreign things and pander to foreigners). There’s this really interesting dichotomy. Looking closely at Liang’s two extensive interviews, the first one in 2023 was very much in that open source zeal and ethos — “I’m doing this thing, there’s no profit motive, it’s going to be good for humanity. It’s kind of crazy, but look how much we’re going to give to the world through our models and algorithms.”
Then in the July 2024 interview, the focus shifted more toward showing the world and Chinese engineers that they can achieve hardcore innovation (“硬核创新”). As we discussed at the beginning of this podcast, Chinese tech firms have this chip on their shoulder, poor organizational structure, poor incentives, and they believe all they can do is commercialize Western technology — just do consumer software stuff, not really push the frontier of innovation from a software perspective. Liang is saying, “Look, we’re doing this and it’s going to be awesome.”
A few weeks ago, Feng Ji 冯骥, the creator of Black Myth: Wukong, talked about DeepSeek as something that shapes China’s national destiny (国运). This tension between the open source, global surplus good that AI can bring and which country will do it first and make the most of it is really going to define the future of DeepSeek and the future of AI in general — where the world ends up falling on that divide.
Kevin Xu: The tension there is very palpable. They’re not irreconcilable, but there is a tension. Open source, as I described, is borderless by definition. Code speaks for itself. Anyone can leverage it, use it as this public good, which has its own drawbacks as well.
If you go to any open source conferences, which I’ve been to many up and down the stack, engineers all speak variants of English in different accents from around the world. Their community isn’t defined by nationality — it’s defined by whether you’re a C guy or a Rust person, a Kubernetes person or an OpenStack person. These are different camps of technology and open source that divide themselves into their own little tribes. This is completely counter to national identity as a whole.
When you come to the uniqueness of the Chinese engineering and Chinese open source community, there is this nationalistic pride or chip on their shoulder. They want to prove to the world that Chinese open source technology — which by definition becomes Chinese technology as a general matter — is good enough for the world to use. At least at this point, there’s no expectation for the world to pay them back for this contribution.
This goes into the idiosyncratic nature of DeepSeek’s team, which by all estimates is entirely folks who trained and studied at Chinese universities. Some might have done stints at foreign companies, but none for very long — none have spent 10 years at Google in Silicon Valley or Meta before getting poached back to work on stuff in China. This is a very unique identity.
It almost feels like a recruiting pitch. Every time Liang gives an interview, which is very rare, there’s always a recruiting element. He wants to attract the best and brightest because he has to fight against all the other big shops who probably offer more money than DeepSeek can match. What DeepSeek can provide is freedom to explore and that open source pride that’s very attractive to many engineers. Many engineers will work for less salary if the company is more open to sharing and more open source friendly.
Taking that into different directions regarding how he’s building the lab makes a lot of sense, but the tension is certainly there.
Jordan Schneider: Regarding the salary aspect, it’s easier to literally pay top dollar when you’re paying 25-year-olds, not 40-year-olds. The reporting actually works the other way where companies say they pay the most, but it’s much easier to pay the most for junior talent, which is less proven, versus the Stanford PhD who spent 10 years at Meta that everyone is bidding on.
Kevin Xu: They’re willing to pay high dollar to take a chance on younger, unproven talent, which many bigger companies either don’t do, or they push young talent into the system without much attention.
Jordan Schneider: Let’s talk about the future for DeepSeek. Continuing our OpenAI analogy — OpenAI didn’t have a hedge fund attached to them, but they had donations from Elon Musk, Sam Altman, and others.
Kevin Xu: They’re both nonprofit, in a sense.
Jordan Schneider: At a certain point, they realized that to scale both their product and research, they needed to partner with a hyperscaler, which turned out to be Microsoft — and now it’s Oracle and Abu Dhabi money. How do you think this plays out for DeepSeek from a corporate organization perspective after experiencing this moment?
Kevin Xu: My best guess is that DeepSeek has a good chance of maintaining its current structure for at least another year or two, which in AI is practically an eternity. They’re very well set up to do what they do, almost accidentally. I don’t know if Liang ever thought about doing DeepSeek when he started a hedge fund, but being a quant hedge fund allows for secrecy.
Recently, I learned that Liang worships Jim Simons, the founder of Renaissance Technologies, one of the most successful and secretive quant hedge funds ever. Liang wrote the foreword to the Chinese edition of the definitive biography of Jim Simons a few years ago — The Man Who Solved the Market by Gregory Zuckerman of the Wall Street Journal, which is a fantastic read for anyone interested in financial history. Liang’s media shyness mirrors Simons’ approach. As long as his hedge fund continues performing well, they could maintain this structure for a long time. They don’t even have a PR department or typical big company infrastructure.
What might counterintuitively change this situation is if US export controls are loosened due to their development — which relates to the discussion about export controls being ineffective because of DeepSeek. If NVIDIA products become more accessible again, DeepSeek and other Chinese firms will immediately seek to acquire them. This scenario would require them to secure substantial funding quickly, either through partnering with a hyperscaler or taking outside investment. The dynamic could shift dramatically — even if you gave DeepSeek $100 billion tomorrow to build AI infrastructure, their deployment options would be limited, partially due to export controls.
Jordan Schneider: Even if export controls don’t change or get tighter, there’s still lots of compute available. We have a story coming out in a few days about China’s weird chip surplus, where apparently there are tons of chips floating around. Random local governments have them; ByteDance, Huawei, Alibaba, and Tencent — these guys are no slouches. The opportunity to scale inference and their experiments if DeepSeek gets a partnership is really remarkable.
I can only imagine the conversations going on in the hyperscalers who have their own AI labs. There’s this famous quote from Satya Nadella after ChatGPT dropped where he brings Peter Lee, who ran Microsoft Research, into the office and interrupts him mid-presentation to say, “OpenAI built this with 250 people. Why do we have Microsoft Research at all?”
Leaders at the hyperscalers are looking at their teams thinking, “We should just sign some partnership like Microsoft did with OpenAI because this is the golden goose. These guys and girls are just better than what we have.” If someone comes knocking on Liang’s door and says, “Look, export controls or not, we can give you 10x the compute budget you have now” — that’s a very interesting scenario.
Kevin Xu: I’ll put another company on people’s radar from an AI monitoring perspective on the Chinese side — Xiaomi. From what I understand, one of the key authors of the DeepSeek V3 technical paper got poached by Xiaomi very recently to lead their AI division.
Xiaomi has been making a lot of noise about doing more with AI, going all-in on AI. They have all these EVs that are really good, they’re going to do self-driving, and they already have devices everywhere. They have a lot of channels to diffuse gen AI. That’s another company that doesn’t get talked about frequently when discussing Chinese AI.
To your point, Jordan, there will be many interesting forks in the road for DeepSeek that they didn’t anticipate. There’s a lot of speculation about whether DeepSeek timed its release to coincide with Trump’s inauguration, whether it was meant to push back against it, or whether it was meant to tank NVIDIA because their quant fund supposedly holds an NVIDIA short position — all conspiracy-theory mumbo jumbo that’s fun to read but shouldn’t be taken too seriously.
Looking at the timeline of the most important releases, DeepSeek V3, the base large language model — not the reasoning model — was released the day after Christmas. If you want to maximize your press impact in the Western world, you’ve chosen the worst time to release anything. Your PR person should be fired immediately.
Regarding R1’s release coinciding with Trump’s inauguration day — again, if you want to maximize impact, no one in the West is paying attention to anything that isn’t about President Trump. It took a week for us to really grasp what R1 meant after its release.
The Stargate announcement of $500 billion caught many people in the AI world by surprise, with consensus skepticism about whether they really need this or have the money. The R1 paper just sat there until people realized, “Wait, there’s this other thing we can use that costs so much less to train.” The numbers are obviously cherry-picked, but Stargate probably elevated DeepSeek in a way they never thought possible.
The only real forcing function for the DeepSeek release schedule is Chinese Lunar New Year. Just like American companies want to finish shipping before Christmas, Chinese companies want to complete their shipping before Chinese Lunar New Year so they can enjoy the holidays and return for another year of hustling.
Jordan Schneider: I was just thinking about other cultural changes about to hit them. The OpenAI attrition was interesting — there was all this Sam Altman skepticism, people didn’t believe him or thought he was a liar. They had that moment in November 2023 with this sort of false unity. Then all these people who used to be 25 years old, now in their early 30s, were like, “We’re done with this place and this guy.” OpenAI had a lot of senior research turnover over the past year.
DeepSeek is at the beginning of that cycle. Not to say Liang has any integrity issues, but just the sheer amount of money that other AI labs offer... things are going to change, perhaps not for the better, given all this attention.
US-China Relations and the Tragedy of Fame 偶像包袱 (an idol’s baggage)
Jordan Schneider: Let’s talk about the political angle. Right before Chinese New Year, Liang gets a little glow-up with Li Qiang sitting down as the only AI representative in this work report preview. Now that the eye of Sauron is turning towards open source AI in China, what do you think that means for the firm and for open source AI in general?
Kevin Xu: I honestly have no idea. The Li Qiang meeting was very interesting to observe. For context, the setting was a work report where they regularly do these “top leader learn from industry leader” setups — what’s happening, what you’re doing, and in the most benign sense, how can the government help?
The most reliable leak from the meeting was that Liang predictably told the country’s leader they need more chips, they’re hardware constrained, export controls really hurt, and it would be great to have more chips. The other participants weren’t from tech — they were from different industries like sports, science, and robotics. Robotics people were actually overrepresented, and Liang was probably the only AGI model builder.
The fact that he’s now on the country’s top leaders’ radar is significant. To what extent? I honestly have no idea. Will this become part of some negotiation between President Trump and President Xi? If their last call mentioned TikTok, will their next call mention DeepSeek? That’s interesting to monitor.
There’s a chance the state could do a little too much to mess with DeepSeek’s little world that’s working really well, or they could let it be. Frankly, I have no idea how this will turn out.
Jordan Schneider: Before we get to the US angle, let’s look at the ledger of helpful versus harmful things. More chips? DeepSeek can get more chips by partnering with hyperscalers. That’s probably way less headache than trying to work with the Beijing municipal cloud, whose interconnects are probably already eaten away by rats.
Do they need tax incentives? No. R&D credits? Not really. Annoying things that could happen include golden shares, forcing the team to hang out in Yan’an 延安 for 10 days to see the caves and be more red, party cells, party meetings — a little annoying, but not the most annoying. It’s just something you have to deal with.
The bigger question to me is the long-term question of if and when open source models get too powerful for comfort, what does the Chinese government think about its firms pushing the frontier? Right now these models are fun, they’re fine — you can use them to write poetry, maybe book a restaurant for you, but you can’t just hack your city’s hospital system or read police reports by having a DeepSeek model on your laptop.
We’re on the trajectory where that sort of thing might happen. My base case is that once the models get more powerful, you’ll start to see more controls on what is and isn’t allowed to be released to the world. Now that folks on the Central Committee are going to be following this more closely than before, this open versus closed source dynamic with AI software — I see downside scenarios for a company that really prides itself on putting everything it does out into the public.
Kevin Xu: There are definitely many downside scenarios. Regarding how open source fits into the government perspective, at least from what I’ve seen in Chinese government releases, they’ve been interestingly very pro-open source so far in how they project their thinking about technology abroad, almost as a source of soft power, whether knowingly or not.
The Ministry of Industry and Information Technology (MIIT) released part of their four-year plan stating they want to see two to three very well-known, recognized Chinese open source projects by 2025. If they were to do the progress report today, DeepSeek would be on it — a great KPI moment for everybody in that ministry, whether they had a role in it or not.
More recently, Foreign Minister Wang Yi, speaking at the UN, talked about open source AI as something China wants to use to support the global south and help them develop their AI. At least from the outside perspective, they’re “very open source friendly.” From a projection perspective — how does that work when one of your labs is at the frontier or very competitive? These are just a few data points folks should keep in mind as we think about the distribution of outcomes for DeepSeek, from worst to best.
If any Chinese government leaders are listening and want helpful advice, a useful thing would be to unlock higher-quality data for training. There’s always going to be a data problem as far as I can see. Reinforcement learning has been put on the map again because of R1’s progress, so maybe synthetic data and models training each other can get us farther. But there’s always more quality data that could be unlocked, and the government has quite a lot of it.
Jordan Schneider: I want everyone to be fully into the AGI. We need that ancient Shang dynasty wisdom for our future AI overlords. To demonstrate part of the big freak out, I’m going to refer to this quote from Mark Zuckerberg on Joe Rogan:
Mark Zuckerberg: “If there’s going to be an open source model that everyone uses, like, we should want it to be an American model, right? There’s this great Chinese model that just came out — this company, DeepSeek, they’re doing really good work. It’s a very advanced model, and if you ask it for any negative opinions about Xi Jinping, it will not give you anything. If you ask it if Tiananmen Square happened, it will deny it.”
Jordan Schneider: My contention to you, Kevin, is this seems kind of overblown. The percentage of global queries that will be about Tiananmen or Xi Jinping thought or Xinjiang is basically zero. How would you conceptualize what companies, and countries more broadly, gain and lose from their model being the open source one that the world thinks is the coolest and wants to build on?
Kevin Xu: What Zuck said was very interesting. I wrote a post on my Interconnected newsletter about this too. Regarding whether Zuck was right or not, this has been part of his core strategic message for a few years — the “Xi or me” argument. Because Meta has been under US regulatory pressure for many years, one of his more effective arguments is that you can drive him down, but the world will just use TikTok, which they do, or use less Instagram, and that’s actually worse for US competitiveness.
When it comes to censorship in the model, I did some testing on this and need to update my post. With an open source model, it really depends on where you run it, not whether the model itself is censored. Every single model coming out of DeepSeek and the open source ones from Alibaba are probably trained from the world’s Internet knowledge as a starting point. Different things get fine-tuned out or fine-tuned in during post-training.
If you use these models on your own laptop by downloading them because they’re open source — I use Ollama — the behavior is very different than if you use it as a chatbot on their officially hosted website. The censoring or business logic that prevents your chatbot from saying potentially problematic things happens more in the cloud layer than the models themselves, though some occurs in the model as well.
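To make the local-versus-cloud distinction concrete, here is a minimal sketch of running a model locally with Ollama, as mentioned above. It assumes Ollama is installed and uses a distilled DeepSeek R1 tag from the Ollama model library; the model tag and prompts are illustrative, so check what is actually available before running.

```shell
# Assumes Ollama is installed (https://ollama.com).
# Pull a small distilled DeepSeek R1 variant and chat with it entirely
# on your own machine — no hosted chatbot, no cloud-layer filtering:
ollama pull deepseek-r1:7b
ollama run deepseek-r1:7b "Summarize this customer-service transcript: ..."

# Ollama also serves a local REST API on port 11434, so the same
# weights can back an internal business app without any round-trip
# to an official hosted endpoint:
curl http://localhost:11434/api/generate \
  -d '{"model": "deepseek-r1:7b", "prompt": "Hello", "stream": false}'
```

The point of the sketch is that everything here runs against local weights, which is why behavior can differ from the officially hosted chatbot, where additional business logic sits in the cloud layer.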
There’s a third element — if you put DeepSeek on a third-party, non-official DeepSeek hosting environment, it also behaves differently. Perplexity, a leading AI search startup that’s really hot, is killing it with their DeepSeek deployment to get more users because they’re deploying it on their own US cloud server, promising no censorship, while still leveraging their best open source model, DeepSeek R1.
Regarding the broader global AI diffusion conversation — probably not many people would ask a chatbot about Tiananmen or other politically taboo topics in China. If they’re not getting the information they should, that’s problematic. Right now, these open source models really shine in business adoption settings. If you’re a large company wanting to use AI to summarize customer service scripts, help salespeople be more efficient, or summarize internal knowledge bases, OpenAI APIs are likely too expensive.
Having this super affordable option that you can deploy on a third-party cloud — it’s now on Azure, a very interesting recent development. Azure has embraced DeepSeek as one of their first-party models you can just click and use. Microsoft’s pragmatic CEO is being shareholder-friendly — people want DeepSeek, so let’s put them on Azure.
Microsoft isn’t worried about the supposedly censorious characteristics of DeepSeek or any Chinese open source model because you don’t have that effect if you serve them in the right cloud setting for the right business use case. Most businesses just want to get AI going with a cost structure that makes economic sense. Nobody wants a $200 per month pro subscription account that doesn’t last long. People want cheaper AI, which is what open source brings to the industry.
This validates Meta’s open source strategy — pushing out Llama without concern for profit because they don’t need Llama to make money, really eroding the closed source model moat that OpenAI and Anthropic have been erecting to protect themselves. The big question going forward is how long that could last.
Jordan Schneider: Another argument beyond spreading Xi Jinping Thought is the slightly more nuanced one that DeepSeek can now run on Ascend 910s, Huawei’s AI chip. The argument goes that if people standardize on DeepSeek, and DeepSeek partners with Huawei to optimize for their chips, this could make Huawei chips — which are currently not all that competitive with NVIDIA’s — a much more exciting buying prospect. This applies both domestically in China and for global cloud providers who, thanks to the AI diffusion rule, will have a harder time getting Western chips. It’s analogous to how Huawei was able to outcompete Ericsson and Nokia.
Kevin Xu: Beyond the hardware, from a US competitiveness perspective, what should really concern people about the Huawei ecosystem benefiting from this is DeepSeek engineers’ ability to work below CUDA to really maximize their NVIDIA GPU. That level of low-level engineering could potentially help Huawei’s software ecosystem.
The bigger struggle for people to adopt anything that’s not NVIDIA is that CUDA is still dominant. Huawei has been trying to promote their own equivalent of CUDA, working closely with PyTorch, the open source training framework. To really mold that ecosystem and push developers away from CUDA — not necessarily NVIDIA GPUs — there needs to be more sharing of how DeepSeek has optimized this to make the software layer of the Ascend series more competitive.
This could erode the CUDA moat that, by extension, is sort of the American moat — something many people still don’t fully grasp. Even if Ascend hardware has better FLOPS and better performance at all scales, people still won’t use it extensively because of CUDA. This could really flip that situation. That’s another interesting unknown where DeepSeek comes in.
Jordan Schneider: Here’s our transition to export controls, because Huawei’s limitation is on the manufacturing side, which comes back to their inability to import semiconductor manufacturing equipment — that’s why we’re in this whole situation. Let me quote this clip from President Trump:
President Trump: “Today and over the last couple of days, I’ve been reading about China and some of the companies in China, one in particular coming up with a faster method of AI and much less expensive method. And that’s good because you don’t have to spend as much money. I view that as a positive, as an asset. So, I really think if it’s fact and if it’s true — and nobody really knows if it is — but I view that as a positive because you’ll be doing that too, so you won’t be spending as much and you’ll get the same result, hopefully.
The release of DeepSeek AI from a Chinese company should be a wake-up call for our industries that we need to be laser-focused on competing to win because we have the greatest scientists in the world. Even Chinese leadership told me that. They said you have the most brilliant scientists in the world in Seattle and various places, but Silicon Valley — they said, there’s nobody like those people. This is very unusual. When you hear about DeepSeek, when you hear somebody come up with something — we always have the ideas, we’re always first. So I would say, that’s a positive. That could be very much a positive development. Instead of spending billions and billions, you’ll spend less and you’ll come up with hopefully the same solution. Under the Trump administration, we’re going to unleash our tech companies and we’re going to dominate the future like never before.”
Jordan Schneider: Whoever wrote this I am sure is a ChinaTalk listener. DM me, we should chat.
Kevin Xu: They must be! I was very surprised and mildly impressed by how quickly DeepSeek made it into President Trump’s speech to a group of mostly Republican congressional leaders. This was a very US domestic event. He was talking to the Republican conference about his legislative agenda — tax cuts, deporting illegal immigrants, and so on. Then he just worked DeepSeek in there on the day of the big market crash that DeepSeek triggered.
What he said at the time was very complimentary, almost like a tough love message. “US Industry, you got to get your act together. Look at this DeepSeek thing — they’re kicking our butt.” But then it’s paired with, “The Chinese leaders tell us we have the best people, the best scientists, so we shouldn’t rest on our laurels.” That messaging has changed a little already over time, so we’ll see where this really ends up from a policy perspective.
I have to begrudgingly give kudos to the Trump White House speechwriting team for working that in so quickly. As a former communications person myself, they’re on top of it. Now regarding export controls, we’ve seen rumors of them tightening further under the Trump administration.
Jordan Schneider: Let’s roll another clip from Howard Lutnick’s confirmation hearing.
Howard Lutnick: “We’ve got to find a way to back our export controls with a tariff model, so that we tell China, ‘You think we are your most important trading partner — when we say no, the answer is no.’ It’s a respect thing. They’ve disrespected us, they’ve figured out ways around it. I do not believe that DeepSeek was done all above board — that’s nonsense. Okay? They stole things, they broke in, they’ve taken our IP. It’s gotta end. I’m going to be rigorous in our pursuit of restrictions and enforcing those restrictions to keep us in the lead, because we must stay in the lead.
…
I take a very jaundiced view of China. I think they only care about themselves and seek to harm us. And so we need to protect ourselves. We need to drive our innovation.
And we need to stop helping them. Open platforms, Meta’s open platform — let DeepSeek rely on it. Nvidia’s chips, which they bought tons of and found their ways around it, drive their DeepSeek model. It’s got to end. If they’re going to compete with us, let them compete, but stop using our tools to compete with us. I’m going to be very strong on that. I’m thrilled to oversee BIS, and I’m thrilled to coordinate and empower BIS with tariffs that will improve the strength. When we say no, the answer’s got to be no.”
Jordan Schneider: In the first news cycle of this, when I did my podcast last week with Miles Brundage, there was concern that this would be a turning point in the export control debate. People, myself included, were already worried about Trump turning export controls into a package deal he’ll use to get them to buy more soybeans. This is at least an initial interesting marker by the Trump administration that no, this stuff isn’t necessarily on the table when it comes to America striking a deal.
Let me take a step back, Kevin, and ask you more broadly. There’s something really sad about this. What DeepSeek has done and the team it has is something really special. People talk about what the US and China can collaborate on, and the first thing they point to is medical collaboration — wouldn’t it be great if we worked together to cure cancer? There’s going to be so much positive social good generated from artificial intelligence.
If the politics were different in China, if the government was different than it is today, that one piece of the Trump comment where he says “This is awesome, everyone’s gonna save a lot of money, this is gonna be cheaper and better for the world” — if it was almost any other country on the planet, that would be the reaction. But it’s not, because of all the reasons we’ve talked about in 500 other ChinaTalk episodes we don’t need to get into right now. I’m curious to what extent, Kevin, you think there is any sort of third path when it comes to US-China relations, strategic competition and artificial intelligence.
Kevin Xu: President Trump’s comment sounds more like a Silicon Valley investor, while Lutnick, who was supposed to be the Wall Street free market guy, sounds more like a hardcore DC export control hawk on China. From a tactical perspective, it’s important for people to understand that export controls have worked reasonably well to achieve their goal of slowing Chinese AI progress. Without export controls, Chinese AI labs would have gone much further than where DeepSeek or Alibaba has gone right now.
That’s a tactical question. The tactics arise from the strategy of “small yard, high fence.” People conflate strategy with tactics. “Small yard, high fence” is the why question. Export control is the how question. The how flows from the why. The big question mark now is whether the Trump administration challenges or changes the why of our relationship with China when it comes to AI competition. That will change export control one way or another in its entirety and could change the whole game. Many folks who have been hyperventilating about export control failing miss the difference between what is tactical and what is strategic.
Jordan Schneider: Let’s tease out some of those futures. On one hand, we have Dario Amodei saying we need to cut them off from the H100s, squeeze them as much as we can because we’re about to run into the singularity. We want liberal democracies to be able to decide how that works, not the CCP.
Then we have VC investor Trump saying this is great for the world — American companies need to put up or shut up, step up their game. It’s not my job to bail you out, and if you lose, whatever — it’s a global good, this is open source, things will work out, you all get great AI tutors. What other visions can you conjure, Kevin, of how this could potentially play out?
Kevin Xu: Dario’s essay is very much worth a read. There’s the benign humanity benefit path, and then there’s the strictly military competition, head-to-head path. One of the justifications in Dario’s essay for his argument is the military perspective — if China gets to AGI before the democratic or Western world does, they will apply it to military settings.
Coming back to the cancer-curing setting, nobody would argue or care whether the cancer solution from AGI is a democratic or non-democratic solution. If there’s a third way to parse this difficult path forward, can we disaggregate the use cases? There are obvious use cases where the nature is conflict — that’s military, that’s war, there’s no question about that. Then there’s business productivity, efficiency, all the enterprise AI stuff I mentioned. Finally, there’s the humanity benefit aspect — health.
Watching the Stargate press conference at the White House, they justified this not through military terms, but healthcare. That’s what Larry Ellison said. It’s potentially finding a cure for cancer, which Sam Altman mentioned. That’s the benign message justifying the billions of dollars of AI investment they hope to have in the United States.
There’s this interesting dichotomy that only the West can find the cure to cancer first, then license it to the rest of the world. Hopefully, this gets diluted over time when open source is more widely accepted from a positive-sum perspective. But we must be mindful that AI could be used for conflict as well, or for nuclear development.
Anthropic deserves kudos — based on what I know, they work with the US government’s National Nuclear Security Administration to red-team their new models, ensuring they don’t accidentally leak important and harmful information related to nuclear power or nuclear weapons, not just to adversaries but more importantly to non-state actors and terrorist organizations.
This represents the middle-of-the-road AI safety lens that deserves more discussion without preventing the world from finding a cure for cancer earlier, just because of this paradigm of conflict we can’t escape.
Jordan Schneider: Reading Dario’s export control essay in parallel with his October 2024 piece “Machines of Loving Grace” is instructive. With his biomedical background, he painted a clear picture of how medical research will be accelerated. But on the other hand, you have missile systems, drone swarms, and all the nastiness that can come from it.
The base case might be that it’s probably impossible to stop the PLA from benefiting from AI. They’re going to get the chips first, they’ll be able to squeeze all the secrets out of the best engineers and impress talent. They might not do it efficiently, but what the US, China, and the rest of the world can try to minimize is the crazy leakage — like instead of one kid going crazy and shooting up a school, they go crazy and release a bioweapon that kills half of humanity.
Kevin Xu: The latter case concerns me much more — rogue actors or non-state actors, as opposed to reasonably rational state actors, literally blowing things up for unknown reasons. Not being deeply versed in military matters, I’m skeptical of what the latest AI models can really do for making weapons more harmful. The cyber attack angle sounds more plausible to me. From a physical weapons perspective, it’s hard for me to assess.
Even from the United States perspective, the Department of Defense has been deploying AI using Palantir software and similar tools. The biggest benefit people discuss is better, more efficient supply chain management — that’s an enterprise AI application, not an offensive one. It makes your back office and supply chain, which are very complicated in the DoD military setting, much more efficient. That’s no different from IBM wanting to use AI to make their organization leaner.
Jordan Schneider: The bull case is AI-driven scientific advancements which can then be weaponized. There are many knotty technical problems where solutions could be revolutionary — underwater radar, laser weapons — lots of sci-fi possibilities. With the equivalent of a thousand von Neumanns thinking at a thousand times human speed, you can imagine developing some pretty crazy capabilities that could give you a dramatic military advantage.
AI isn’t there yet, but Dario believes it might be soon. Given the pace of advancement, it’s not crazy to consider that a double-digit possibility. With that premise, you have to take all this export control stuff really seriously. That’s where I come out on the issue. It’s unfortunate because there will be global social costs to restricting who can work on this technology and pursue its upside potential. But hopefully we’ll get there too.
Kevin Xu: Open source makes that diffusion answer very straightforward — it will just be out there. With open source, if you want to make money from it, most of the time you don’t even know who’s using it if they don’t tell you. That’s both the beauty and curse of building open source software. It’s always been the case and always will be. Hopefully the benign version wins, but we shouldn’t be complacent about the possibility of the non-benign version becoming reality.