Biotech. What is it? Why should you care? Does biotech really matter for national security? What are China’s biotech ambitions?
To find out, ChinaTalk interviewed the Chair and Vice Chair of the National Security Commission on Emerging Biotechnology. Jason Kelly, the Chair, is the Co-Founder and CEO of Ginkgo Bioworks, a publicly traded firm that provides a horizontal platform for cell programming. Michelle Rozo, the Vice Chair, also currently serves as Vice President of Technical Capabilities at In-Q-Tel, and she previously held positions in Biden’s NSC, the Department of Defense, and on the Hill.
Co-hosting today is Chris “CRISPR” Miller, author of Chip War.
We get into:
The powerful science behind genetic engineering;
How the US government turned biotechnology into a $1 trillion industry over the course of the last fifty years;
Why generative AI is destined to revolutionize synthetic biology;
And whether China’s national biotech champions can leapfrog the US.
Programming with Proteins
Jason Kelly: Biology is code. It’s code written in As, Ts, Cs, and Gs, not 0s and 1s like in a computer. But I swear, inside a bacterial cell there are 3 million letters in a line, and that cell reads that code in order to do all the things that a bacterium does. It grows, it eats things, it produces stuff. Your human genome is 3 billion letters long. Plant genomes are actually bigger than human genomes; they are 5 billion or even 10 billion letters long in some cases.
That genome is the programming code that defines what the organism is made out of, how it builds itself, and all the things it does. We’ve learned over the years the power of having a programmable substrate in the form of computers. You can put a different code into this machine and then magically it does something it couldn’t do before.
You download an app to your phone and suddenly it can call you a cab.
That’s what biotechnology is. It’s designing new code to put inside of cells, and the next day, they do something they couldn’t the day before. The goal is to be able to program them, because DNA is fundamentally code.
Jordan Schneider: So does this story begin with Watson and Crick? Does it begin in 1976 with the founding of Genentech?
Jason Kelly: It was only mid-century that we even figured out that DNA was code. At the beginning of the twentieth century, we didn’t know what material inside cells passed information from generation to generation. We only figured that out in the 1950s.
What’s special about biotechnology is that it’s not really about understanding how biology works. Now, biological scientists are hugely important to biotech — but biotech is about programming the cell by putting in new DNA code to make it do something new.
That started basically in the late 1970s with the invention of what was called “recombinant DNA.” Genentech was the first biotech company, which started in South San Francisco. But we needed three technologies to make that possible:
The first is called a polymerase chain reaction, or PCR. PCR lets you take a fragment of a genome and make lots of copies of it. The reason you need that is if you want to work with DNA in the lab, you need a lot of copies. Having one little molecule of DNA, like you would have inside a cell, is not enough to work with. You need millions of copies. So PCR does this little thing where it amplifies DNA.
The second tool is recombinant DNA (also called restriction enzymes). People have heard about CRISPR, which is the latest version of this — but the 1976 version was a technique using restriction enzymes. Those are like little molecular scissors, which would cut DNA at a certain location.
Then the third technology was actually an understanding of how biology worked. It was a thing called the operon. Researchers were figuring out how a bacterial cell switches a gene on and off.
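The first two tools can be sketched in a few lines of Python. This is a toy illustration, not a lab protocol: it assumes every PCR cycle doubles the DNA perfectly, and it models a restriction enzyme as a cut on a single strand of text. The function names are hypothetical. The default recognition site is that of EcoRI, a well-known restriction enzyme that cuts GAATTC after the G; the real enzyme cuts both strands in a staggered way, which is what leaves the "sticky" ends.

```python
def pcr_copies(start_copies: int, cycles: int) -> int:
    """Idealized PCR: every cycle doubles the number of DNA copies."""
    return start_copies * 2 ** cycles


def cycles_needed(start_copies: int, target_copies: int) -> int:
    """Smallest number of perfect doubling cycles that reaches the target."""
    cycles = 0
    copies = start_copies
    while copies < target_copies:
        copies *= 2
        cycles += 1
    return cycles


def digest(dna: str, site: str = "GAATTC", cut_offset: int = 1) -> list:
    """Cut dna at every occurrence of site, cut_offset letters into the site.

    Simplification: one strand only, so the staggered cut is not modeled.
    """
    fragments = []
    start = 0
    pos = dna.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(dna[start:cut])
        start = cut
        pos = dna.find(site, pos + 1)
    fragments.append(dna[start:])
    return fragments


if __name__ == "__main__":
    # One molecule needs 20 perfect doublings to pass a million copies.
    print(cycles_needed(1, 1_000_000))
    # The made-up sequence below contains the site twice, so we get 3 pieces.
    print(digest("AAGAATTCTTGAATTCGG"))
```

Under these assumptions, a single molecule passes a million copies after 20 cycles, which is why amplification makes DNA workable at the bench.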
Genentech began the biotech industry with a new way of producing insulin for diabetics. Prior to 1976, we got insulin from pigs, porcine insulin, and diabetics would take it as shots. It wasn’t great. People had all kinds of allergic reactions. If only we could just make some human insulin, right?
Genentech took a polymerase chain reaction to amplify copies of the human insulin gene from a human genome in the lab. They then took that tube full of human insulin genes and cut the ends off of the DNA with restriction enzymes, so that each piece had little sticky bits hanging off of it. Then, using our knowledge of the operon, which is how a bacterium turns a gene on and off, they made a cut right downstream of an on-switch inside a bacterial cell and pasted in those human insulin genes from the lab right behind the on-switch.
In a tank in South San Francisco, they start growing vats of these bacteria that have been reprogrammed to crank out human insulin. It’s just like installing an app.
Reprogrammed E. coli bacteria made human insulin to give to juvenile diabetics. That was the birth of the biotechnology industry, made possible by those three underlying technologies — PCR, restriction enzymes, and biological knowledge about the operon.
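The "paste it right behind the on-switch" step described above can be pictured as a string operation. The sketch below is only an analogy: the function name is hypothetical, and the host, switch, and gene strings are short placeholders chosen for illustration, not real bacterial or insulin sequences.

```python
def paste_after_switch(host: str, switch: str, gene: str) -> str:
    """Return host DNA with gene inserted directly downstream of the first switch."""
    pos = host.find(switch)
    if pos == -1:
        raise ValueError("on-switch not found in host DNA")
    cut = pos + len(switch)  # cut right after the on-switch
    return host[:cut] + gene + host[cut:]


# Placeholder sequences for illustration only (assumptions, not real data).
HOST = "CCCC" + "TTGACA" + "GGGG"
SWITCH = "TTGACA"
GENE = "ATGAAATAA"

engineered = paste_after_switch(HOST, SWITCH, GENE)
print(engineered)
```

The point of placing the gene directly behind the switch is that whatever turns the switch on now also turns the new gene on.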
Jordan Schneider: How did the US government make Genentech possible?
Jason Kelly: Herbert Boyer and Stanley Cohen were two professors who were studying these restriction enzymes and invented this process of cloning. There was absolutely mind-boggling government-funded work going on to create that initial breakthrough.
PCR was done in an industrial lab by this guy, Kary Mullis. He was kind of a weird guy. The understanding of the biology behind the operon was also government-funded research. Funding was given for no reason other than understanding the bacteria.
This is the biggest difference between computers and biology: both are programmable, but humans invented computers. We did not invent biology. So in order for us to program these incredibly powerful biological computers, we first have to understand them. That’s what the basic research of biology does.
AI models are leading in this data generation, which will be very important for how we compete with China and others. But basic biology research is absolutely driven by the US government. The US government is the only source of funding for this research.
Genomics and the Gene Gun
Jordan Schneider: After Genentech starts making insulin, what else happens over the subsequent decades to expand the applications and market size of biotech?
Jason Kelly: A whole bunch of companies start to copy Genentech. You have Biogen, Genzyme, and other companies that started to make human proteins as drugs.
But then you have a company out in St. Louis, Missouri, that is working on the problem of how to deliver DNA into a plant. Because in this case, restriction enzymes don’t really work.
That company is called Monsanto. The 1990s brought the birth of plant biotechnology. The fastest-adopted technological innovation in the history of agriculture was GM crops for corn and soybeans in the United States. In the 1990s, they invented the low-level technology to deliver DNA into a plant cell. Interestingly, until CRISPR came along recently, it wasn’t as good as the methods pharma had for putting DNA into a bacterial cell.
But Monsanto did figure out how to get DNA into a plant cell — and then you had the birth of agricultural biotechnology. Thus, the two big markets of biotech were born: pharmaceuticals and agriculture.
Then in the 2000s, you had a big push in genomics. Genomics is the reading of DNA code. You can take a cell, grind it up, put it into a machine about the size of a washing machine and you can read out the DNA code of the cell. This took off in the 2000s because of a public-private effort.
Remember Bill Clinton up on stage with Francis Collins, and Craig Venter from Celera announcing the sequencing of the human genome? That US effort spurred a private effort and a big race, which begat the genomics industry. Today, a company called Illumina in San Diego is the undisputed leader there. We are here as a result of the US government committing to the Human Genome Project in the late 1990s.
Jordan Schneider: What is genomics, and what does it do exactly?
Jason Kelly: Remember, DNA is code. When you look out in nature, you see all these plants and animals and microbes — well you can’t see the microbes — but inside the cells of every one of those organisms is an operating system defined in code. If you want to be a biotechnologist, if you want to be a designer of DNA, those are the code libraries. That’s all the code that nature has been testing through evolution over the last four billion years.
Put your technology glasses on and look at a seed — you plant a seed in the soil, you add air, water, and sunlight, and you’re not going to believe this, but something starts to just self-assemble out of the ground. With no manufacturing facility, it starts to build solar panels, also known as “leaves,” to start harvesting sunlight energy to continue self-manufacturing.
Where is it getting the building blocks? From CO₂ — it is pulling atmospheric CO₂ out of the air to manufacture itself and then once it’s built enough infrastructure, it starts producing products. Corn, fruit, whatever we might call it — it just starts making things with this unbelievable self-assembling manufacturing technology.
If we didn’t have seeds and I walked in with one, you’d think I was Steve Jobs. It’s worth remembering that this technology is very cheap and self-repairing. If you cut your skin, it fixes itself. Do we have other materials like that?
These things called “cells” have magical powers, and inside of those cells is the code that defines the molecular machines. What is genomics? It’s the technology to go read all those code books.
Genomics lets us see how nature does all this stuff so that as biotechnologists, we can then repurpose it where we want. If you want to have self-healing clothing or self-repairing roads, you need genomics to go get that code.
Michelle Rozo: The relevant analogy here, by the way, is that we’ve sequenced the equivalent of one drop of the entire ocean of genomic data.
There’s so much we don’t understand and so many places in which biology operates. From the Arctic to the ocean to the desert — there are functions encoded in microbes, plants, and organisms that allow them to withstand all of the different environments on Earth.
We don’t understand enough about any of it yet. We don’t know enough to really leverage it in the biotech applications that Jason’s already explained. It’s a huge opportunity for industry, for scientists, and for the United States.
Chris Miller: Between 1976 and today, what kinds of changes have we seen in those fundamental biotech tools? In particular, how have their mechanisms changed and what are their costs to operate?
Jason Kelly: Between the 1970s and the 1990s, we essentially had the tools of cut-and-paste biotech. Take a gene, and move it from one species into another. That might be putting the human gene for insulin production into a microbe, it might be taking a gene from a bacteria and putting it in corn to make it insect-resistant without pesticides — it was cut and paste. But even the pasting was kind of poor.
You couldn’t paste genes wherever you wanted. You could only paste them in certain spots. With plants, I swear to God you had to use a machine called a “gene gun,” where you would put DNA on a gold nanoparticle, fire it through the plant cell wall, and it would just land somewhere random in the genome. For our computer science example, imagine if, when you compiled your code to the processor, the code just landed somewhere random in memory. That’s literally how we programmed in the 1970s, 1980s, and 1990s.
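To make the "code landing somewhere random" analogy concrete, here is a small sketch that contrasts a random insertion with a targeted one. It treats the genome as a plain string, and both function names are hypothetical. Real insertion biology is far messier than a single uniform random position.

```python
import random


def gene_gun_insert(genome: str, payload: str, rng: random.Random):
    """Insert payload at a uniformly random position; return (new_genome, position)."""
    pos = rng.randint(0, len(genome))  # randint includes both endpoints
    return genome[:pos] + payload + genome[pos:], pos


def targeted_insert(genome: str, payload: str, pos: int) -> str:
    """Insert payload at a position the designer chooses."""
    if not 0 <= pos <= len(genome):
        raise ValueError("position outside the genome")
    return genome[:pos] + payload + genome[pos:]


if __name__ == "__main__":
    rng = random.Random()  # unseeded on purpose: the landing spot varies per run
    genome = "A" * 20      # placeholder genome, not real sequence data
    for _ in range(3):
        edited, where = gene_gun_insert(genome, "GGG", rng)
        print(where, edited)
    print(targeted_insert(genome, "GGG", 10))
```

In the random case, the designer only learns where the payload landed after the fact. In the targeted case, the position is an input.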
Huge, billion-dollar industries were built on installing the code at random. I want people to understand this because your question about the tools, Chris, is so critical — because our tools have been horrible. Yet industries worth hundreds of billions of dollars exist in biotech. The reason is that the substrate is freakishly powerful. Even though all we can do is barely change a bacteria or barely change corn — the mechanisms we are working with are intrinsically robust. Nothing else can engage with the human body like a protein can. Nothing. There is no nanotechnology that can replace insulin.
The ChatACGT Moment
Jordan Schneider: Why are people so excited about the confluence of AI and biotech?
Jason Kelly: AI has the potential to have a bigger impact in biotechnology than it has in everyday human life. Remember, the foundational structures of the human language models like ChatGPT are neural nets. Those neural nets were trained on big corpora of human-written sentences — but the actual neural nets were not specifically designed for human language. It’s a generic machine learning technique, and very specifically how it works: you’ve got three nodes, two up at the top connected by a line to a third one below it, and on the connectors are little weights, little numbers.
For the biologists, this ends up being kind of funny, but this is computer science’s attempt to mimic a brain. (Neurons are quite different and more complicated and actually do general intelligence, by the way.)
You have a whole network of these nodes, and you train the network with a human-written sentence. How do you do that? Let’s say the sentence has 10 words in it. You leave out a word — say, you leave out the fifth word. Then you feed the other 9 words into the top of this neural net, and it goes beep-boop and sends little signals through the weights to the nodes below it. If those weights cross a threshold, then a special one fires. Then at the bottom, out pops the missing word.
If it’s right, you’re like, “Good job, neural net!” If it’s wrong, you change some of the weights, and then you do it again. Then you do it again but you leave out the second word. Then you do that for billions and billions and billions of sentences, over and over again. If you do it enough, this little thing — which is just nodes and lines and weights — eventually learns English grammar.
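The leave-one-word-out setup described above can be sketched as a data-preparation step. This example only builds the training examples, meaning the sentence with one word hidden paired with the word to predict. It does not implement the network or the weight updates. The `[MASK]` placeholder and the function name are assumptions made for illustration.

```python
def masked_examples(sentence: str, mask: str = "[MASK]"):
    """Yield (masked_words, missing_word) pairs, one per position in the sentence."""
    words = sentence.split()
    for i, target in enumerate(words):
        # Hide the i-th word and keep every other word as context.
        context = words[:i] + [mask] + words[i + 1:]
        yield context, target


for context, target in masked_examples("the cell reads the code"):
    print(" ".join(context), "->", target)
```

A training loop would feed each masked sentence to the network, compare its guess with the hidden word, and adjust the weights when the guess is wrong.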
It learns to write poems because it’s seen so many of them before; it learns our language without requiring any intentional rule-inputs by the programmers. That’s very important — because now I’m going to explain the biological translation.
A bacterial gene might be 800 As, Ts, Cs, and Gs long. It’s read from start to finish. It’s like a paragraph. I can feed it into a neural net, I can leave parts out, and I can say, “Hey, predict what’s missing” — just like you did when you gave it a human sentence. (By the way, I know the gene because of genomics, so we give it all these genes to read and strategically leave out pieces to train the net.) And eventually, it learns to speak DNA.
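The same idea carries over to DNA: hide a stretch of letters and ask the model to fill it in. The sketch below only prepares such examples. The function names, the span length, and the use of "N" as the mask character are assumptions for illustration, and the short sequence in the example is a placeholder rather than a real gene.

```python
def mask_span(gene: str, start: int, length: int, mask_char: str = "N"):
    """Hide gene[start:start+length]; return (masked_gene, hidden_letters)."""
    if start < 0 or length < 1 or start + length > len(gene):
        raise ValueError("span must lie inside the gene")
    hidden = gene[start:start + length]
    masked = gene[:start] + mask_char * length + gene[start + length:]
    return masked, hidden


def training_pairs(gene: str, span: int = 3):
    """Slide a non-overlapping window along the gene, hiding one span at a time."""
    for start in range(0, len(gene) - span + 1, span):
        yield mask_span(gene, start, span)


for masked, hidden in training_pairs("ATGCCGTAA"):
    print(masked, "->", hidden)
```

Each pair plays the role of one "predict what's missing" exercise. A real training set would draw such pairs from very large numbers of sequenced genes.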
The key here is that, in order to create ChatGPT, you have to have enough books written in actual human language. You can’t just train it with nonsense words. Same thing in the biology case: we need actual, coherent DNA language. Where do we get that? Nature. Remember I mentioned earlier that that’s our code base. We have all these genes in nature. We feed them in, and I swear to God, this thing is learning to speak DNA.
Here’s why it could be more impactful than in human language: we invented human language. These computer brains are having to compete with us on our own terrain. You want to have a GPT lawyer? Well, it’s competing in the field of law invented by humans, competing in contracts invented by humans, using the English language invented by humans — and we’re asking it to be as good as our best.
Meanwhile, over here in DNA land, I don’t speak DNA. Do you speak DNA? We look at this stuff, and it’s incomprehensible, alien gibberish. But why couldn’t the neural net learn to speak it? It learned to speak English. It learned to speak Chinese. GPT speaks Chinese better than I do, and it will speak DNA better than I do — except no human on the planet speaks DNA. AI neural nets are going to get dramatically superhuman capabilities in the design of DNA long before they get superhuman in the design of contracts.
A lot of what we’re finding with the AI stuff is that it’s giving us efficiency gains and everybody’s hoping for general intelligence and yada yada — we’ll see. But for sure it’s going to do what we do now worse than us. That’s true in language, but not in biotech. It will absolutely outdo us in biotech.
Chris Miller: So where are we right now in the development of ChatGPT-DNA?
Jason Kelly: Actually, this is a very important point for US strategy, because…