OpenAI: How Do They Do It? Lessons from Jiu-Jitsu Innovation and for S&T Policy
What jiu-jitsu teaches us about OpenAI, and what OpenAI teaches us about optimizing grant-making funds and FROs
Satya Nadella didn’t want to hear it.
Last December, Peter Lee, who oversees Microsoft’s sprawling research efforts, was briefing Nadella, Microsoft’s CEO, and his deputies about a series of tests Microsoft had conducted of GPT-4, the then-unreleased new artificial intelligence large-language model built by OpenAI. Lee told Nadella Microsoft’s researchers were blown away by the model’s ability to understand conversational language and generate humanlike answers, and they believed it showed sparks of artificial general intelligence—capabilities on par with those of a human mind.
But Nadella abruptly cut off Lee midsentence, demanding to know how OpenAI had managed to surpass the capabilities of the AI project Microsoft’s 1,500-person research team had been working on for decades. “OpenAI built this with 250 people,” Nadella said, according to Lee, who is executive vice president and head of Microsoft Research. “Why do we have Microsoft Research at all?” (The Information)
For all the episodes we’ve done on ChinaTalk about abstract US-vs.-China advantages and disadvantages in AI (shows about data policy, export controls, and semiconductor industrial policy), it was a team of 250 at OpenAI that ushered in this new AI age. As far as I can tell, there was no hard structural barrier that would have stopped another team in China — or anywhere else, for that matter — from making the breakthroughs OpenAI has.
My own take on why an organization like OpenAI hasn’t emerged out of China does not come down directly to the data, compute, and talent AI triad. Rather, what set OpenAI apart has been a boatload of patient capital combined with leadership secure enough in their research direction to not rush things but still sharp and grounded enough to make the right big bets on breakthroughs like transformers.
In 2019, an interviewer asked Sam Altman how OpenAI intended to make money. His response is worth reflecting on:
The honest answer is we have no idea. We have never made any revenue. We have no current plans to make revenue. We have no idea how we may one day generate revenue. We have made a soft promise to investors that, “Once we’ve built a generally intelligent system, basically we will ask it to figure out a way to generate an investment return for you.” [crowd laughs] It sounds like an episode of Silicon Valley — you can laugh, it’s all right — but it really is what I actually believe is going to happen.
Even with a track record like Altman’s, this answer would not fly in the Chinese VC space.
Chinese firms can certainly innovate, and Chinese-educated researchers were core members of the OpenAI team. The Chinese private sector VC space, however, has been too oriented to near-term results — and the government, setting aside the important exception of the space sector, has not shown an ability to fund long-term S&T projects that push beyond the research frontier. The only Chinese firm I’ve come across willing to make billion-dollar, decadal bets on R&D is Huawei (more here), and it was Ren Zhengfei’s personal vision rather than any structural government support that made his firm such a remarkable exception.
That said, billionaires go on resource-unconstrained technological goose chases all the time (see: Bezos and Blue Origin). So there’s more going on than just that. To continue exploring this question, I’m featuring two guest essays. The first, from Sam Hammond of the Foundation for American Innovation, argues that OpenAI’s institutional structure uniquely allowed it to explore and exploit new breakthroughs in a way that established firms like Microsoft and Baidu, as well as universities around the world, couldn’t. Next, Eric Gilliam of FreakTakes — reasoning from recent innovations in, of all things, jiu-jitsu — contends that OpenAI is more likely than any other firm to continue to lead.
Before we get into these pieces, I wanted to let you know that the annual ChinaTalk Student Essay Contest is launching in two weeks, and we’re on the hunt for a sponsor!
The contest draws in research papers and theses from undergrad and master’s students — and winners have the opportunity to appear on the ChinaTalk podcast to talk with me about their work. Sponsoring the essay contest is a perfect opportunity for a grad school program, research company, investment firm, or even an individual who wants to support students doing rigorous work to understand modern China.
The latest winner was Maggie Baughman: you can read her essay here and listen to the podcast here.
Reach out to me directly to discuss! jordan@chinatalk.media
Sam Hammond—OpenAI’s Big Lesson for Science Policy
Why did it take OpenAI — an independent nonprofit turned capped-profit company — to push the field forward, and not, say, the National Science Foundation, one of America’s world-class research universities, or an established tech giant?
To be sure, most of the intellectual breakthroughs behind our current AI summer originated in academic research, in some cases many decades ago. The first multi-layer perceptron trained by stochastic gradient descent dates back to 1967, and a close cousin of transformer models was first described in the early 1990s, along with a methodology for unsupervised pre-training.
Yet as any graying AI researcher will tell you, neural networks were a backwater of the field until relatively recently. The tide began to turn only as hardware capabilities caught up and new private research labs, such as Google Brain in 2011, began experimenting with neural networks trained with many hidden layers.
It’s hard to see how America’s scientific grant-making agencies could have catalyzed this progress on their own. Grant-funded science rewards novelty, while the core architectures behind recent progress in AI are anything but. Imagine a research proposal that said something like the following:
We are requesting $1 billion to train a really big neural network. We don’t anticipate making any major theoretical advances in algorithm design, and we don’t yet know the exact details of our plan, much less if it will work. But if there’s even a small chance that scaling existing models up will reveal a path to AGI, we think it’s worth a shot.
Such a proposal would be dead on arrival — yet it is essentially what OpenAI led with when they launched in 2015 with a billion dollars in backing. After experimenting in robotics and a few other cul-de-sacs, OpenAI went all-in on transformer architectures following the release of the famous “Attention Is All You Need” paper in 2017. They released “Improving Language Understanding by Generative Pre-Training” a year later, giving birth to the GPT class of language models. The rest is history.
OpenAI as an FRO
OpenAI was founded with the specific mission of creating artificial general intelligence. With a hefty budget, it built up a high-caliber, dedicated team and used the flexibility of an independent research organization to explore different ideas before settling on a core technology. This makes OpenAI an accidental example of a Focused Research Organization.
Focused Research Organizations (FROs) are science and engineering programs aimed at “well-defined challenges that require scale and coordination but that are not immediately profitable.” The case for FROs in the context of federal funding for science and R&D was made succinctly by Sam Rodriques and Adam Marblestone:
The U.S. government is ill-equipped to fund R&D projects that require tight coordination and teamwork to create public goods. The majority of government-funded research outside of the defense sphere—including research funded through the National Institutes of Health (NIH), the National Science Foundation (NSF), the Defense Advanced Research Projects Agency (DARPA), and the Advanced Research Projects Agency–Energy (ARPA-E)—is outsourced to externalized collaborations of university labs and/or commercial organizations. However, the academic reward structure favors individual credit and discourages systematic teamwork. Commercial incentives encourage teamwork but discourage the production of public goods. As a result, the United States is falling behind in key areas like microfabrication and human genomics to countries with greater abilities to centralize and accelerate focused research.
The solution is to enable the U.S. government to fund centralized research programs, termed Focused Research Organizations (FROs), to address well-defined challenges that require scale and coordination but that are not immediately profitable. FROs would be stand-alone “moonshot organizations” insulated from both academic and commercial incentive structures. FROs would be organized like startups, but they would pursue well-defined R&D goals in the public interest and would be accountable to their funding organizations rather than to shareholders. Each FRO would strive to accelerate a key R&D area via “multiplier effects” (such as dramatically reducing the cost of collecting critical scientific data), provide the United States with a decisive competitive advantage in that area, and de-risk substantial follow-on investment from the private and/or public sectors.
Yet if anything, Rodriques and Marblestone understate the challenge of moving breakthrough science through traditional funding channels, and thus the room for improvement. Funding rates for grant applications at the NSF and NIH have steadily declined since the 1970s, back when it was common for one in every two grants to be approved. Today, approval rates run as low as ten to twenty percent, and the grants that do get funded tend to go to principal investigators (PIs) and research teams that are older and more established, harming scientific diversity and creativity. Just two percent of NIH-supported institutions receive fifty-three percent of all research project grants, for example.
Meanwhile, PIs of federally sponsored research report spending over forty percent of their time on administrative tasks. Yet the time cost of compliance and bureaucracy is arguably less important than the sheer inflexibility it imposes on research agendas — a fact Patrick Collison discovered after surveying a cross-section of biomedical researchers through the Fast Grants program. As he explained on the Ezra Klein podcast last year,
And we just asked them, as a general matter in your regular research, if you could spend your grant money however you want, how much would you change your research agenda?
So not an increase in the funding level, which tends to be what we discuss in as much as we’re discussing science policy across society. But much more specifically and narrowly, if you had complete autonomy in how you spend whatever grant money you’re getting, how much of your research agenda would change? And our intuition was that maybe a third of people would like to be doing something meaningfully different to what they actually are.
But of these scientists, and these are really good scientists, four out of five told us that they would change their research agendas, quote, “a lot.” [Jordan: I’m sure the same goes for researchers in corporations as well!]
This cries out for structural innovation in how the federal government funds science and R&D. FROs are one such innovation and, while not a panacea, they come with many advantages. Three stand out in particular:
FROs are mission-oriented: By orienting around a “well-defined tool or technology, a key scientific dataset, or a refined process or resource,” FROs allow researchers to flexibly allocate their time and resources toward the higher-level goal, thus avoiding the confines of a narrow project grant. To give a hypothetical example, where the NSF might fund a series of proposals to study a particular material with applications to battery technology, an FRO would instead set an ambitious but achievable high-level mission, something like “improve battery efficiency by twenty percent relative to the state of the art.” The organization would be spun up with a large initial investment and otherwise given wide autonomy in how it achieves its goal. A researcher within the FRO might decide to study that very same material, but they could also decide to abandon the idea if a more promising approach presented itself. That is, instead of a grant agency dictating new lines of inquiry, the project mission guides researchers through the most relevant search space of intermediate discoveries.
FROs are vertically integrated teams: Science is increasingly a team-led enterprise. FROs acknowledge that reality from the get-go, allowing researchers from across disciplines and institutional backgrounds to collaborate under one roof. Such collaborations are often infeasible in academic settings, as issues of financing, credit-taking, and intellectual property abound. Different teams within an FRO can further benefit by sharing human resources and equipment, or by finding new synergies at the proverbial water cooler.
FROs collapse the false dichotomy between science and engineering: Basic research and engineering often go hand in hand. Unfortunately, our modern grant-making institutions tend to divide the two, as if engineering and technology were the mere “applications” of antecedent discoveries. In the real world, science and engineering feed off each other in a virtuous cycle. The Apollo Program produced myriad scientific discoveries in the quest to put a man on the moon. Or take solar energy costs, which continue to decline year over year not merely thanks to the work of academic scientists, but also because solar manufacturers are deploying their technology at scale. In turn, engineers and chemists are driven to find new efficiencies and process innovations through a classic case of “learning by doing.”
In short, FROs aspire to work like NASA did in the 1960s or SpaceX does today. It’s a model that can work on ideas big and small, just as long as the output is a public good. Existing FROs include:
EvE Bio: On a five-year mission to map the “pharmome” by building a public domain knowledge map of FDA-approved drugs’ functional protein binding partners across thousands of human gene products.
Rejuvenome: Established to conduct the largest and most systematic study of the biological effects of putative anti-aging interventions.
Cultivarium: Creating open-source tools for life scientists to expand access to novel microorganisms for biological research and development.
E11 Bio: Building the foundation for a full-stack, hundred-billion-neuron scale mapping of the human brain.
Science as gradient descent
The tinkering culture of engineers and technologists is the experimental method by any other name. Only instead of receiving feedback from an anonymous reviewer, engineers deal with the uncompromising feedback of reality. The Wright Brothers come to mind — two bicycle mechanics who tinkered their way to the first motor-powered airplane, but only after overcoming the painful feedback of gravity.
The success of OpenAI is likewise a testament to the power of tinkering. As Sam Altman has indicated in interviews, the leap from GPT-3 to GPT-4 wasn’t due to some deep breakthrough so much as the accumulation of many small tweaks. Yet such tweaks were discoverable only because of the real-world feedback OpenAI received by deploying its models at scale.
It’s time these lessons were brought to bear in how the federal government funds science and R&D more broadly. Only a handful of FROs have been created to date, and mostly through private dollars. With robust public support, their potential use-cases could easily multiply.
After all, science advances through a process not dissimilar to gradient descent. Scientists poke at the frontiers of human knowledge through experiments and backpropagate their findings into an updated picture of reality. The errors in our model of the world steadily decrease until we’re able to generalize our understanding into all-new settings. Science at its worst, in contrast, mimics the overly symbolic approaches that led to the last AI winter, resulting in shallow models of reality that overfit the data and fail to reproduce.
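To make the analogy concrete, here’s a minimal sketch of gradient descent in Python. The one-parameter model, toy data, and learning rate are illustrative assumptions of mine, not anything drawn from OpenAI’s work; the point is just the loop Hammond describes: run an experiment, measure the error, and update your model of the world.

```python
# Gradient descent in miniature: evaluate the "experiment" (the loss),
# compute the error signal (the gradient), and nudge the model toward
# a better picture of reality. Toy model and data are illustrative.

def loss(theta, data):
    # Mean squared error of a one-parameter model y = theta * x.
    return sum((theta * x - y) ** 2 for x, y in data) / len(data)

def gradient(theta, data):
    # Derivative of the loss with respect to theta.
    return sum(2 * x * (theta * x - y) for x, y in data) / len(data)

data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2)]  # noisy observations of y ~ 2x
theta = 0.0           # initial guess about how the world works
learning_rate = 0.05

for step in range(100):
    theta -= learning_rate * gradient(theta, data)  # update on feedback

print(f"learned theta = {theta:.2f}")  # settles near 2.0, the true slope
```

Each pass through the loop shrinks the model’s error a little, just as each well-designed experiment shrinks the gap between our theories and reality.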
This all makes FROs the transformer architecture of science funding: a way to focus attention. By letting researchers work in parallel toward a common, well-defined objective, FROs are able to grok the dependencies between different disciplines. The few we have promise to blow past every normal benchmark of scientific productivity.
Scale is all they need.
Eric Gilliam—How Recent Innovation in Jiu-Jitsu Helps Us Think About OpenAI’s Future
Pedro Domingos, a computer science writer and researcher, recently tweeted about the (possible) impending decline of OpenAI as we know it:
OpenAI’s dirty little secret is that GPTs are reaching their limits and it doesn’t know what to do next.
To be sure, I’m not expert enough to say whether GPTs are approaching their limits — but even if they are, I wouldn’t be so sure it spells doom for the future of OpenAI. In fact, even if GPTs are tapped out, I still think our best bet is to assume that the next AI breakthrough will be produced by OpenAI or DeepMind.
To explain why, I’d like to draw on recent innovations in the sport of no-gi jiu-jitsu. In the past ten years, jiu-jitsu experienced a sudden burst of innovation — in particular, the development of a system of leg attacks — that brought massive success to its innovators; for a while, anybody familiar with the dark art of “leg locking” had a decisive advantage. Four years passed, and the rest of the field learned how to defend the new positions.
But importantly: even after the rest of the field caught up, those original “leg locking” innovators didn’t fade into obscurity. Instead, these competitors remained atop the heap — not merely by further developing leg attacks, but by redeploying their well-developed systems of research to new problem areas.
Their continued success exemplifies the advantages that firms like OpenAI and DeepMind have — even after their initial breakthroughs reach the point of diminishing returns. The strength of these firms stems not from their particular innovations, but from their organizational systems which produced their breakthrough innovations in the first place.
Jiu-Jitsu’s Recent Evolution
At its core, jiu-jitsu is the art of one person subduing another without striking them. The sport is thousands of years old, and in fields this old, one doesn’t often see an explosive burst of progress. Yet roughly nine years ago, a major innovation took the world of competitive jiu-jitsu by storm: the heel hook.
Before the heel hook, it was generally accepted that there were two ways to submit people: joint locks and chokes.
Joint locks attempt to break or tear limbs. The most common example is the armbar, below:
Chokes work by cutting off air or blood flow to the brain. Below is the well-established rear naked choke:
Meanwhile, attacking people’s legs was rare — it wasn’t taken seriously by jiu-jitsu pros. Oftentimes, competitors would attempt leg attacks on their training partners as a joke.
Enter John Danaher, a Columbia philosophy PhD dropout who took up coaching jiu-jitsu in the 1990s. One serendipitous day, Danaher had dinner with one of those rare pros who employed leg attacks, and his dinner companion asked him, “Why would you ignore fifty percent of the human body?”
Danaher became obsessed with leg attacks. He studied their minutiae for years — troubleshooting and experimenting, problem by problem. He eventually began teaching leg attacks in earnest to his best students — and it worked better than anybody could have imagined. His top-tier students soon adopted a new moniker: the Danaher Death Squad, or DDS. For several years, Danaher’s gym was by far the most successful in the world.
Soon enough, though, the high-level competitors caught up. So what happened to the “DDS leg lock specialists”? Did they fade into obscurity?
Not at all. On the contrary — they remain at the top of the heap. Their secret lies in their persistence in improving and evolving attack systems beyond leg attacks. The system of research their gym deployed in evolving the leg-attack game, it turns out, generalizes far beyond leg attacks. For example, among other areas, former DDS team members have since done much of the work of innovating and developing a system called “wrestling up.”
The crucial point here: although their first major intervention — leg attacks — reached a point of diminishing returns, their well-developed system of research and innovation did not.
The Future of OpenAI
To believe that a place like OpenAI is screwed moving forward, then, is to believe:
ChatGPT’s development was a fluke and entirely unrelated to OpenAI’s organizational and research infrastructure;
OpenAI developed a system overfit to producing GPT-related developments;
OpenAI’s organizational and research infrastructure will degrade moving forward.
To be clear, any of those could be the case. I’m just saying that I don’t believe GPTs reaching a point of diminishing returns per se bodes ill for OpenAI’s future.
OpenAI and DeepMind have results on their side. As much as anyone, they have reason to trust that their systems can keep delivering. To be sure, there is a fine line between fiddling around and doing serious research — but as far as I can tell, it’s hard to write off ChatGPT’s success as a lone Eureka! moment. OpenAI’s processes — whatever they are — seem to have been a primary driver.
Historically, systems-driven innovation was the norm in many great industrial R&D labs. For example, in 1927 Karl Compton — who would eventually become the head of Princeton’s physics department and the president of MIT — wrote a Science article dedicated to all the things a university physics department could learn from industry R&D labs:
Much can … be done to promote cooperation and coordination through actual methods of organization. This has been strikingly demonstrated in some of the big industrial research laboratories, from which the output has greatly exceeded the individual capacities of the research workers and has been achieved only by coordination of effort.
To Compton, the project coordination of places like Bell Labs or GE Research was super-additive. Their systems of hiring, coordination, resource allocation, and problem selection were generating outcomes far greater than the sum of their parts.
The descendants of the Danaher Death Squad have successfully deployed their system of research far beyond their first great area of innovation — and even after their gym split. DeepMind was able to make a massive discovery in the life sciences — AlphaFold — after its groundbreaking reinforcement-learning work on the strategy board game Go.
So, why would we expect that OpenAI won’t follow suit? If OpenAI’s processes are as effective as they seem, then it should be exciting, not disappointing, for their researchers to redeploy their efforts to new problems. An OpenAI model developing new jiu-jitsu positions ten years from now may not be out of the question. OpenAI could be a one-trick pony — but it could also be the next great process innovator.