James Wang is a general partner at Creative Ventures, a venture capital firm that invests in early-stage deep tech. He’s also the sponsor of today’s interview!
James dialed in from Taipei to join ChinaTalk for a discussion on national AI competition, hard tech investing, and the glories of automated pizza making.
Highlights from our convo include:
How China may struggle to move on from NVIDIA’s software stack
China’s AI catch-up game, and why it’s all about the data…and regulatory uncertainty.
How pizza robots explain whether AI will take your job or not
How deep tech startups turn obscure research into world-changing tech when they move slow and build things
Ain’t That Just Huawei?
Jordan Schneider: Let’s start with Huawei’s breakthrough — a seven-nanometer chip manufactured at SMIC.
In a previous interview on ChinaTalk, Dylan Patel of SemiAnalysis and Doug O’Laughlin of Fabricated Knowledge had a rather bullish take on the broader implications of this.
One nit to pick with the bull thesis on Chinese chip manufacturing and AI hardware is that it suggests replicating what Nvidia did with CUDA would be relatively straightforward.
Explain what CUDA is, why it’s so important, and why folks use it to train frontier models. What would it mean from a software perspective if China starts relying on Huawei or Biren AI accelerators to make their own GPTs and clouds?
James Wang: The bull thesis sounds a lot like a hardware person talking about how software is easy…
CUDA is basically Nvidia’s proprietary platform for GPU programming. Its language extensions and APIs make it a much easier way to write parallel code, but that code targets only Nvidia GPUs. This has been one of Jensen Huang’s big initiatives.
Everything’s pretty much been built on this — a lot of the low-level libraries and AI research. If you’re going to move away from it, you have to rebuild a lot of infrastructure and architecture. You would need to start from scratch for a lot of it.
The argument that it’s going to be easy to move away from CUDA is belied by AMD’s experience. AMD tried to replicate CUDA with its own version, ROCm, which is supposed to be CUDA-compatible, but it’s buggy and doesn’t work particularly well.
CUDA is a big part of NVIDIA’s moat. NVIDIA uses CUDA to lock in researchers, who build their models on it, and that makes it hard to move away from. Machine learning on GPUs still almost exclusively means NVIDIA GPUs.
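The switching cost being described can be sketched as a toy per-backend kernel registry. This is an illustrative sketch only: the backend names and the single `matmul` op are hypothetical, not any real framework’s API, but real frameworks maintain thousands of such ops, most tuned first (or only) for CUDA.

```python
# Toy sketch of per-backend kernel registries, illustrating ecosystem lock-in:
# every op missing from a new accelerator's table is a porting-and-tuning
# project someone has to do by hand. All names here are hypothetical.

KERNELS = {
    "cuda": {},    # mature backend: years of hand-tuned kernels
    "ascend": {},  # new accelerator: starts empty
}

def register(backend, op):
    """Decorator that files an implementation under a backend's registry."""
    def wrap(fn):
        KERNELS[backend][op] = fn
        return fn
    return wrap

@register("cuda", "matmul")
def matmul_cuda(a, b):
    # Stand-in for a heavily optimized vendor kernel.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def run(backend, op, *args):
    impl = KERNELS[backend].get(op)
    if impl is None:
        raise NotImplementedError(f"{op!r} not yet ported to {backend!r}")
    return impl(*args)

a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
print(run("cuda", "matmul", a, b))   # prints [[19, 22], [43, 50]]
try:
    run("ascend", "matmul", a, b)    # the missing op is someone's porting job
except NotImplementedError as e:
    print(e)
```

The point of the sketch: the dispatch mechanism itself is trivial to copy; the moat is the accumulated contents of the mature backend’s table.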
Jordan Schneider: Is there a potential playbook where China could cobble together open-source products to get around a foreign software roadblock?
James Wang: Yes, in the sense that if you can throw infinite money and infinite time at a problem, you can replicate pretty much anything.
From the perspective of a VC, you can always poke holes in anyone’s business model by basically saying, “Definitely Intel or whoever will be able to replicate this.” The question is, “Will they?”
China has a history of doing this better because they’ve needed to. Huawei has its own OS now, but there’s a reason they were trying to use Android for the longest time. You have to basically build up something that matches everyone else’s capabilities from scratch, which is a pain in the ass. Then you also have to maintain it yourself, which is annoying.
From the perspective of the entire global machine-learning community, every optimization you touch as you play around and prototype would have to be translated into an optimized version for whatever stack you’re using instead.
It’s a massive slowdown. It’s not impossible, but it’s a far step back if you have to start from scratch.
It’s not always like the AI equivalent of the Manhattan Project — brilliant minds working together to develop open-source AI. In reality, it’s a bunch of statisticians throwing stuff together. It sticks. It barely works. It’s poorly documented. It runs and no one’s quite sure why it works.
It’s the Data Economy, Stupid
Jordan Schneider: We now have firms like Scale AI valued in the tens of billions of dollars. Ostensibly they exist just to create better data. Give us a sense of today’s data economy — who needs data and why?
James Wang: The received wisdom is that China is going to dominate AI because China has all this data. They have a ton of people and fewer data-privacy laws. Suddenly, that’s going to mean their AI will be much better. That didn’t turn out to be the case with OpenAI and others.
OpenAI and other LLMs have true societal value. But value to society and economic value overall don’t necessarily translate to captured value for a company. For a lot of these models, the data underlying them is just stuff scraped from the web.
How much defensibility do you have for an open-source model? There’s not much defensibility. How are you going to have that ecosystem evolve over time?
Transformers and LLMs didn’t get better because we were able to throw more data at them. They got better because the representation of data and how we use it fundamentally got better.
Machine learning models around 2014 to 2015 had trouble telling the difference between pictures of cats and non-cats as two classes of things. They didn’t get better because we got more pictures of cats or more compute power. They got better because our models got better.
In terms of data economy, it’s not just about quantity — it’s about how proprietary it is. It’s also about the models.
When you combine all of this, China didn’t end up having as much of an advantage because the development has been within the models.
It’s not like the company with the best model suddenly wins. Everyone tends to have these models. It’s hard to capture value if you’re using one of these open-source models with data that’s easily available, even if it’s a high volume.
It’s going to be companies that have proprietary data that are going to be able to ride the wave of AI. It’s extremely hard for people to replicate. Usually, that’s data within the physical world.
Anything that’s within the digital world, anyone can either replicate it or just scrape it from the internet.
Everyone has the models. But the interesting things tend to be from physical data or data that’s just hard to create off the bat.
Jordan Schneider: How does proprietary data collection scale, and how does the cost of that compare to scaling compute?
James Wang: It cost millions of dollars to train ChatGPT. That work looks more like a semiconductor company than a traditional software company, where you can pretty quickly and pretty cheaply iterate and get products out there. You have to basically get the product right. You can’t just retrain the model — that’s expensive.
Proprietary data collection can be expensive — materials design, drug discovery, and other physical applications. The data doesn’t exist yet. It’s not just conveniently on the internet waiting for you to scrape it, potentially against terms of service.
Collecting this data is a physical activity that you can’t crowdsource to the entire internet.
A drug discovery company that uses AI isn’t going to look like a software company. It’s going to look like a pipeline company with AI on one end and a robot arm on the other, forming a feedback loop.
When a startup is in its early stages, you have a robot arm that basically does the physical experiment — physically creating the drugs, proteins, or biologics. That’s super expensive to do. A lot of the companies that are going to be interesting are going to be the ones doing that painful step and incorporating that data well.
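That closed loop can be sketched in miniature, with a simulated “robot arm” experiment standing in for the expensive physical step. Every function and parameter here is hypothetical, meant only to show the shape of the loop: the model proposes candidates, the physical experiment measures them, and the measurements become the proprietary dataset for the next round.

```python
import random

# Toy sketch of the design-build-test loop: a model proposes candidates,
# a (here simulated) robot arm runs the physical experiment, and the
# measured results become proprietary training data for the next round.
random.seed(0)

def run_experiment(candidate):
    # Stand-in for the expensive physical step (robot arm, wet lab):
    # a hidden "true" response, peaked at 0.7, plus measurement noise.
    return -(candidate - 0.7) ** 2 + random.gauss(0, 0.01)

def propose(dataset, n=5):
    # Stand-in model: explore randomly at first, then near the best result.
    if not dataset:
        return [random.random() for _ in range(n)]
    best, _ = max(dataset, key=lambda pair: pair[1])
    return [min(1.0, max(0.0, best + random.gauss(0, 0.1))) for _ in range(n)]

dataset = []  # the proprietary data: (candidate, measured result) pairs
for _ in range(10):
    for candidate in propose(dataset):
        dataset.append((candidate, run_experiment(candidate)))

best, score = max(dataset, key=lambda pair: pair[1])
print(f"best candidate after {len(dataset)} experiments: {best:.2f}")
```

The expensive part in real life is `run_experiment`, which is exactly why the resulting dataset is hard for anyone else to replicate.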
Model Competition
Jordan Schneider: Why are Chinese GPT models, like ERNIE by Baidu, lagging so far behind Western models despite all the research papers and data being publicly available? Why aren’t Chinese firms catching up faster and competing with better models?
James Wang: From talking to different folks within the Chinese and US AI ecosystems, there’s a particular reason why ChatGPT is better than Bard. A lot of it is hand-tuning common queries.
There’s a difference between something that’s hand-tuned that consumers find to be nifty and something that can work in any generalized circumstance.
Now, why has the Chinese AI ecosystem as a whole, not just in terms of LLMs, not been progressing as fast? This is speculation, but I’ve heard that China has much more stringent regulations on what you’re supposed to check and what the model is supposed to do.
Putting that much time and energy into compliance is a big burden. A lot of Chinese tech companies and entrepreneurs don’t seem the most motivated to create huge, impressive, globally dominant models.
A lot of the entrepreneurs are trying to get out of China right now. That’s partly because of the economic circumstances, but also because Chinese regulators have stomped on the tech ecosystem before. Everyone’s still wary about saying everything’s okay and we can go back to normal.
A lot of the energy has been drained from the Chinese ecosystem as well, especially from the entrepreneurial and exploratory frontiers.
A lot of this is self-inflicted, but they have a lot of room to do fiscal stimulus. They have a lot of room to do the right things to make their economy turn around pretty quickly. It’s just a question of willingness. Are they going to give domestic entrepreneurs free rein to do whatever they want?
The Chinese tech ecosystem was extremely inwardly focused — almost myopically in certain cases — creating domestic consumption for gaming and other stuff. That was a great rebalance for their economy, but they stomped on it.
Jordan Schneider: America can hire a lot of physics PhDs to evaluate queries. Chinese firms can do the same thing. AI’s diffusion as a general-purpose technology is arguably more important than bleeding-edge research when it comes to long-term productivity.
James Wang: When you’re moving quickly, industrial clusters are even more important, because every time someone tries to take up something that’s trailing-edge, it’s already quite far behind. Each new jump opens up significantly more capabilities.
Semiconductors are the same way. If you look at TSMC founder Morris Chang 張忠謀, the germinating seed of the semiconductor industry still came from the US.
In terms of IP technology, there’s definitely something to be said about extremely fast-moving technologies and being in industrial clusters and having that cluster within your economy. It’s the exchange of ideas, the ability to test and play around with stuff.
Every barrier you face on each new iteration slows things down enormously when you’re running a huge number of iterations. From a prototyping perspective — the ability to have comfort and ease around trying new things — it matters a lot if you’re in the right place.
AI’s Slice of the Pie
Jordan Schneider: Tell me about your work investing in AI for the pizza market.
James Wang: Pizza is huge … but burgers are even bigger, and some other food-service segments are bigger than pizza too. The reason we think pizza is the right place is the market structure. With burgers, McDonald’s has most of the market share; everyone else is tiny.
Pizza is not only a large market, it’s an oligopoly. The biggest players — Pizza Hut, Domino’s, and others — are all similar in size. A startup can enter this market, have multiple shots at getting into one of these big players, and play them against each other.
If you were a food-service or robotics company going after burgers first because you saw it was the biggest market, you’d be ignoring the market structure. You’d have one customer you need to sell to, and that customer would have all the leverage in the world against you.
If you go after the wrong market in deep-tech investing, you’ll basically end up executing and then realizing it’s the wrong thing but not be able to pivot anywhere.
Jordan Schneider: One of your portfolio companies, Picnic, has robots that make pizza here in New York City. Why wouldn’t restaurants just pay people more to overcome the service labor shortage? Why can’t restaurants increase their margin to keep workers around?
James Wang: Online commentators and politicians say it’s corporate greed — that’s why companies aren’t raising wages, and that’s why we can’t get people into these jobs. But businesses need people to make the business work.
When you’re talking about a pizza shop, food services, or logistics, they aren’t paying more because the business can’t sustain more. The pizza shop wants to retain workers to sell more pizzas, because otherwise the pizzas aren’t going to get made.
But how do you pay the workers more if your margins are already compressed down to ten or seven percent? You can raise the price of the pizza, but customers might stop buying as much. You can try to compress the margins even more, but at that point you might just go out of business, especially given how unstable the business already is.
You can’t ask your landlord to just bring down your rent. You don’t have room to increase wages. Most business owners would be happy to increase wages; if they could, maybe they’d get good workers who stay longer, without having to deal with as much turnover.
However, if a magic robotic tool suddenly drops from the sky and makes a single worker ten times more productive, you can then sell more pizzas or sell them for cheaper. You increase your margins.
If you want to talk about corporate greed — yes, you’re not going to pay the worker ten times as much. But you’d be happy to pay them twice as much if they’re suddenly ten times more productive and you have higher margins.
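The arithmetic behind that claim can be made concrete with made-up numbers. Nothing below reflects Picnic’s or any real shop’s actual economics; the figures are chosen only to show how a productivity multiple can fund both higher wages and higher margins.

```python
# Toy pizza-shop unit economics with made-up numbers: a tool that makes one
# worker far more productive leaves room to raise wages AND margins.

def shop(pizzas_per_worker, wage, price=15.0, other_cost_per_pizza=10.0,
         workers=4):
    """Daily revenue and margin for a hypothetical shop."""
    pizzas = pizzas_per_worker * workers
    revenue = pizzas * price
    costs = workers * wage + pizzas * other_cost_per_pizza
    return revenue, (revenue - costs) / revenue

# Before automation: 30 pizzas per worker per day at $120/day wages.
rev0, m0 = shop(pizzas_per_worker=30, wage=120)
# After: 10x productivity, wages doubled (assuming demand absorbs the volume).
rev1, m1 = shop(pizzas_per_worker=300, wage=240)

print(f"before: margin {m0:.0%}, after: margin {m1:.0%}")
# prints: before: margin 7%, after: margin 28%
```

With these assumed numbers, the workers earn twice as much while the shop’s margin quadruples, because labor shrinks from the binding cost to a small share of a much larger revenue base.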
Jordan Schneider: How far away are we from robot pizza makers?
James Wang: Some of the general models can help, but the problem is that it’s easier to use AI to significantly enhance or outright replace certain tasks in the knowledge-worker sector. It’s much harder, but much more necessary, to do that in the labor and physical-activity sectors.
We’ve had millions of years of evolution to be able to manipulate and move things around. I can pick something up. I can hold it without crushing it. I can move it around. I can throw it. I can do all sorts of complex tasks that are complicated for models. Humans do these things natively.
It’s not the same as some of the large language models that just scrape the data from the internet. You have to generate the data, create the physical models, or create the models that can learn from physical activity in a better way.
If you mess up something at the digital level, you can just rerun it. The physical world doesn’t tolerate exceptions as well.
The ultimate example is probably self-driving cars. If you have an accident there, that’s a pretty bad outcome versus an accident in the digital world.
Jordan Schneider: How should people think about AI displacing and creating jobs generally?
James Wang: Even economists often say AI is different, that it’s going to create mass job losses. That’s particularly strange because right now we’re sitting on a huge gap between open jobs and the people available to fill them. We’re not on the verge of mass unemployment.
AI is a tool. Inventing shovels didn’t suddenly put construction workers out of a job. The more productive you are, the more willing you are to use workers and technology to do the thing that you’re trying to do, which is generate some sort of output. The output of the construction industry is a building, not the number of workers that you require to make the building.
Deep-Tech Investing, Public & Private
Jordan Schneider: What’s the role of incubators and accelerators in helping VCs bring deep tech to market?
James Wang: Accelerators or incubators like the NSF’s Innovation Corps, UC-Berkeley’s SkyDeck, and Stanford University’s StartX help the ecosystem by helping new tech take the first step.
They take a PhD researcher who is not used to thinking about markets at all, who’s just thinking about the basic research and what hasn’t been discovered. They’re making them think about how the technology can be used. They’re saying, “Let’s talk to customers. Let’s try to figure out if there’s any money to be made here.”
The VC ecosystem takes over after that first step. How do we get this technology into the right state to sell it to the right customers? What’s the right pricing model? How do we sequence the steps for creating and financing the project?
There’s a huge amount of complexity and difficulty in becoming a large company. Especially for deep tech, you don’t have much room for mistakes.
Jordan Schneider: How can government spending make leveraged investments in deep tech?
James Wang: Basic research is the place where general government spending can do the most good. Basic research isn’t commercializable yet.
With R&D, we don’t know what we don’t know. We need to find out and spend money to get to a commercializable point.
Government spending after that point is different. A lot of Asian countries try to pick winners by spending government money to get something to commercialization. That tends to not work particularly well because of the dynamic we were talking about earlier. You have to pick the right market and think about market structure.
Move Slow and Build Things
Jordan Schneider: What heuristics do you use to decide what tech is worth investing in?
James Wang: With software you can invest in people, because software is easily changed. That’s oversimplifying, but you can basically change your code, change your Google AdWords target, and suddenly, magically be in a new market the next week.
Hard tech or deep tech doesn’t work that way. Even the AI we just talked about takes quite a bit of work and quite a bit of money to change what it’s doing. You need to retrain models, to say nothing of the work required in synthetic biology to make yeast or bacteria do something completely different. You can’t do that in a rapid fashion.
With hard tech, you not only have to look at the team, which is still important; you have to understand the market you’re going after. If you don’t go after the right market to start with, then you might not have a company by the end of it. It takes too much money and time to change.
People get fooled because a lot of deep tech is super general-purpose. You can use AI or synthetic biology for a lot of things. The question is, “What is the right market to go after first?”
You can’t let yourself get deluded into thinking a technology is just a breakthrough or two away from being world-changing. This is another trap a lot of people fall into.
There’s a difference between what we would call engineering risk and R&D risk. Investors may have a ten-year time horizon — but that ten-year horizon is for getting the technology out there and scaling. It’s not a ten-year horizon for sitting around waiting for the technology to mature enough to go to market.
We’ve seen this multiple times with gene therapy and AI, where something is supposedly just a breakthrough or two away from viability. We look for something that’s already ready for the market, where it will take ten years to get the engineering and scaling right. Ideally, it’s less than ten years.
Jordan Schneider: What models work when trying to help hard-tech founders get the right business sense?
James Wang: The bigger successes in our portfolio come when the VC is much more active in the company. It isn’t just giving bite-size fortune cookie advice.
It’s taking a PhD who doesn’t necessarily have the interest or the inclination toward this and giving them practical advice — how to transition toward a CEO role, for example. Or it may be giving honest feedback: you’re either not good at this, or you don’t seem to like it.
We need to bring in other people, too. That’s where you need investors. You do need a board of directors to make that happen.
We’ve had a popular trend of founder-led, founder-friendly companies where investors always defer to the founder. That works for software, or for founders who were already deeply oriented toward their marketplaces and customers. It doesn’t necessarily work for deep-tech companies.
Talking to some of the old-timers in the VC space, it used to be more popular to be hands-on, take board of director seats, and sometimes change the management of companies.
That became unpopular, however, during the software era. What we’re finding now is that there’s a real reason why VCs used to do that. Companies in deep tech are different.
The researchers who are motivated to launch deep-tech startups are often academics who believe their technology will otherwise never see the light of day. No one will take it and run with it because no one else even knows it exists.
Researchers have different skill sets and motivations. These don’t necessarily translate to a market orientation of wanting to make money, which is what is required to sustain the company and push it forward.
Lessons from…Bridgewater?
Jordan Schneider: You now run a VC firm, which is a long way from Bridgewater, where we both started our careers looking at macroeconomics. How did you end up in tech?
James Wang: I ended up at Bridgewater after working in West African microfinance. I liked being able to look at the macroeconomy in all these different ways. It’s still useful in my current profession.
But the thing about global macro is that you can abstract away a lot of stuff. You don’t have to worry about the messiness of people and technological change. You’re playing at the 50,000-foot level, looking down at economies, comparing equities, exchange rates, and interest rates.
It’s probably the closest thing in finance to abstract math versus applied math. I wanted to get closer to the ground, closer to technology, especially after I saw a lot of interesting things happening in compute and AI or synthetic biology and advanced materials.
Hedge funds aren’t supposed to affect the economy. So you predict things, you trade on things, and you make money. But you’re not supposed to affect markets because if you do, you’re losing money from it.
With a VC, time horizons are much longer. We’re not explicitly an impact fund, but we’re investing in some of the biggest problems facing humanity. We’re looking at aging populations, labor shortages, and rising health care costs. With climate change, we’re looking at tech for the energy transition and for sustained decreases in carbon emissions.
If you scale these things, not only will we make money, but it will help the planet a lot. That’s not something you can say in the hedge fund industry. It is completely counter to what you would want to do if you’re at a hedge fund.
Jordan Schneider: I worked in client service, and a lot of our clients were sovereign wealth funds. Part of what Bridgewater would do was sell a return stream, as well as have Jordan Schneider tell them about their economy. Something really struck me there: fiscal policy and monetary policy are important, and you don’t want to screw them up, but at the end of the day, if you’re thinking over a decadal horizon, productivity growth is what’s going to get you there.
And what is productivity growth? It’s human capital, education, and health care. But it’s also technology, and figuring out how to get your country to continuously innovate. That just seemed to be the real juice. The whole Bridgewater-like vision of the world is a very Newtonian physics in which everything is static, but technology just isn’t static at all. We’ve all seen this in a beautiful way over the past twenty-four months with the large language model revolution.
It just seemed much more exciting to me than watching Argentina blow up for the fourteenth time and figuring out how to find some spread between it and the Brazilian real.
So anyways, I’m glad we’re both here. I think there’s more alpha for humanity in us doing this than in playing around in bond markets. But anyways, revealed preference…
James tweets and has an AI-focused Substack, Weighty Thoughts.
Love the robot images!