The Future of AI Diplomacy: Can The State Department Grok AI?
Will AI boost diplomatic decision-making or strain state secrecy?
How will AI change diplomacy?
To discuss the State Department’s options for AI integration, we interviewed the State Department's Deputy Chief Data and AI Officer, Garrett Berntsen. He served as an officer during two tours in Afghanistan and recently rotated off the NSC. He believes diplomacy can be more effective with comprehensive, timely, and accurate data-driven analysis, and that AI will be part of achieving that mission.
We get into:
How AI can streamline bureaucratic busy work
The value of data-driven negotiation prep in diplomatic contexts
The benefits of transparency in a democratic society
What level of risk is appropriate for the civil service
How close State is to getting FSOs access to GPT-4
The balance between transparency and secrecy, how the Snowden leaks changed the State Department’s relationship with technology, and why that balance is changing now
What the State Department can and can't import from the private sector
Thanks to the Andrew Marshall Foundation and Hudson Institute for sponsoring this episode.
Replacement vs Enhancement
Jordan Schneider: Pitch me, Garrett. What is the vision for how data can inform and enhance diplomacy?
Garrett Berntsen: Fundamentally the tools of diplomacy are mostly words and information. That has been the currency of diplomacy for hundreds of years and remains the currency that our hard-working staff uses overseas and domestically.
The advances in large language models, and generative artificial intelligence, are tools to help manage, store, and analyze data and information through the written word.
At the State Department, our vision for these technologies is to use them to augment everything that our staff and workforce do globally, whether it's negotiations, management functions, reporting, open source analysis, or other types of analysis. All of those things can be made faster, easier, and better through data and AI.
Jordan Schneider: There is a broad understanding among military thinkers, at least since the 20th century, that technological change is something you need to lean into, understand, and adapt to if you're going to outcompete your adversaries.
In the military context, you could argue that this imperative is a relatively recent one. After all, the pace of technological change in earlier centuries was so slow. Prior to the Industrial Revolution, you didn't necessarily need one generation to learn two or three different technological paradigms.
But that's not necessarily the case with diplomacy. The notion that diplomacy is an art, not a science, is widespread. After all, diplomacy is such a human, interpersonal endeavor. The people who signed up for this career path were presumably excited about that part of the game. How do you get those people to buy into your vision?
Garrett Berntsen: We're never going to automate that magic diplomatic moment where you're negotiating across the table from someone. AI will never be able to go out into the field, talk with a local counterpart, or build a relationship.
What we want to do is reduce burdens. If you talk to foreign service officers overseas, they wish they could spend less time doing the same repetitive stuff. They want to go out and do more of what they joined the foreign service to do. Telling them that we can automate repetitive tasks or make those tasks easier is the easiest pitch.
The second pitch is more to the heart of the mission of diplomacy. Diplomats deserve the fastest, most accurate analysis to help them in those moments of negotiation. Look at baseball—the game is well over a century old, and people said it wasn’t a data-driven sport. If you’ve seen the movie Moneyball, you know that we now understand that there's deep data analysis that can help prepare teams to win.
Data science tools cannot hit the home run for you, but they can put you in the position to have the best team and the best information when you are doing your job. That is the harder pitch, I will say.
Here’s a situation that’s happened before: we bring an analysis to someone who's been in the department for a long time, and they say, “This is counter to what I believe to be true.” They might say, “I've been in the foreign service for 30 years, and this is not correct.” That's where the rubber meets the road.
When we have those moments, we say, “This is an analysis. All models are wrong, but some are useful. If your experience and intuition run counter to this analysis, let's discuss the reasons. We can all be smarter and have a better outcome from that back-and-forth.”
Jordan Schneider: You led with, “I'm not taking away anyone's job,” but I can imagine a GPT-6 George-Kennan-plus-Henry-Kissinger-plus-ethics model that performs better than a replacement-level foreign service officer.
To be clear, this is coming from someone who did not pass the Foreign Service Officer Exam.
I'm sure you don’t want to scare people away from the vision, but in a world where AI becomes marginally or substantively more powerful than it is today, how far do you think this can go?
Garrett Berntsen: Those are tough questions to answer. I'm not a futurist. I don't know what the world will look like in 20 or 30 years. I like to read science fiction, but I don't know how close we are to AGI.
What I know now, is that we are in a world where American diplomacy is needed, and we need our foreign service officers, our diplomats, and our civil servants doing this work. Many of them are burdened with things that they, candidly, could be doing more efficiently.
The more we can push them forward to negotiation, engagement with partners, and soft-skill-based activities, the better. Maybe in my grandkids' time or my great-grandkids' time, we'll have AI bots negotiating in the UN. I don't know. That seems crazy to me.
But in the world we live in, our allies need us out in their capitals working with them hand in glove, and I want to buy more hours every day and every week for foreign service officers to do that work. I'll let people smarter than me talk about replacement and how we change the workforce.
Jordan Schneider: Let’s take a UN Security Council negotiation as an example. I don't think we are that far away from simulating models of the 15 countries negotiating with each other. If you run that simulation 10,000 times starting with two, three, or twenty different strategies, I would think you could figure out the optimal strategy that gets the highest number of countries on board.
Garrett Berntsen: The Bureau of International Organization Affairs (IO) is a very forward-leaning organization on using data and analysis to inform its operations. Our team started a workforce program helping bureaus hire Bureau Chief Data Officers (BCDOs).
IO was one of the first, and they actually hired someone from my team, Paula Osborn, who’s phenomenal.
Those employees look at modeling, evaluation, and simulations, but at the end of the day, the goal is to inform our negotiators—to give them the best tools to think through the second-, third-, and fourth-order effects of a negotiation.
To your broader point on an agent-based simulation, that could happen. I just don't know if our society would be ready for that. I don't think I would be.
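In toy form, the strategy search Jordan sketches above is a Monte Carlo simulation: score each candidate strategy by its average vote count over many simulated council votes. Everything below is hypothetical for illustration—the strategy names, the per-country support probabilities, and the independence assumption are all invented, not anything the State Department actually runs.

```python
import random

# Hypothetical Monte Carlo sketch of searching over negotiation strategies.
# Each strategy maps to per-country probabilities of a "yes" vote on a
# 15-member council; all numbers here are made up for illustration.

COUNCIL_SIZE = 15
STRATEGIES = {
    "broad_concessions": [0.7] * COUNCIL_SIZE,
    "bilateral_deals":   [0.9] * 5 + [0.5] * 10,
    "public_pressure":   [0.6] * COUNCIL_SIZE,
}

def simulate(probs, trials=10_000, seed=0):
    """Average number of supporting countries across simulated votes."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        total += sum(rng.random() < p for p in probs)
    return total / trials

# Pick the strategy with the highest expected support.
best = max(STRATEGIES, key=lambda s: simulate(STRATEGIES[s]))
print(best, round(simulate(STRATEGIES[best]), 2))
```

A real agent-based version would let countries react to each other round by round rather than voting independently, which is what makes the full problem hard—and why, as Garrett notes, the output should inform negotiators rather than replace them.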
Risk management
Jordan Schneider: In your documents, you talk about this tension between trying to run as fast as a Fortune 500 company and managing risk given the high stakes and large number of dimensions involved. You want industry-leading practices, but the extent to which the State Department can push the envelope is limited because there is so much on the line compared to, say, a private equity (PE) firm.
Garrett Berntsen: We are never going to keep pace completely with a PE firm because our incentives are just completely different. It is apples and oranges. We are making decisions on behalf of the American people. We are negotiating things that really impact people's lives.
Buying and selling or trading stocks automatically with an AI bot, that's just totally different. That impacts people's lives, too, but we have a sort of special trust given to us as public officials, civil servants, and federal employees.
In terms of the timeline, we always have to do more homework. We are not willing to take as much risk as a private equity firm. We should go as fast as we can under the law, considering the difficulty of changing the way our workforce works.
Jordan Schneider: There are different buckets of tasks—for some it's okay to run fast, but for others, it’s not. Maybe you can break down which tasks go into each bucket.
Garrett Berntsen: A lot of the work we do in the management area is more analogous to the work of the private sector, for example, HR services, IT services, and performance management. For all of these things, candidly, I do think we need to be much closer to delivering quality software that executes in the way the private sector does.
There is effort going into that. There's modernization. Those are the areas where we want to be as close to the private sector as possible.
When it comes to the things that are uniquely public activities, we have a much higher burden for using AI. Examples include implementing the law, rulemaking, and providing foreign assistance.
In the State Department, we administer foreign assistance funds, passport adjudications, and consular decisions. No one else decides who gets a passport. That is a U.S. government activity.
Those areas are rights-impacting, so we have a higher burden. A lot of this is outlined in the executive order on the use of AI. As the sole provider of a public good, we have a higher burden to do it right. If that means we do have to go a bit slower and involve more lawyers, then that is what we will do. We need to be cautious and test it.
No private equity firm has to give out passports. If a company was producing passports—and thus could face the litigation risks for getting that wrong—I think they would be very cautious also.
The Biden Doctrine and Weaponizing Transparency
“Every government has its secrets, but some need them more than others.”
Jordan Schneider: Garrett, you wrote a piece a few years ago with Ryan Fedasiuk discussing the new Biden doctrine on transparency. We saw this doctrine illustrated in the run-up to the 2022 full-scale invasion of Ukraine, where the Biden administration strategically revealed intelligence demonstrating that the invasion was imminent.
In your article, you wrote, “Every government has its secrets, but some need them more than others.” How are you integrating these ideas about transparency into your current work?
Garrett Berntsen: Good research there. There's a spectrum of philosophical beliefs in the department. I would say I am probably an outlier regarding transparency—I'll bifurcate this again to internally and then externally. There are challenges in both areas.
Internally, I would really like us to be more aggressive with knowledge management and sharing information inside of the department. We have buy-in from our leadership to do that. Deputy Secretary Rich Verma has been looking to modernize the way we share information.
The Secretary has also said this— we need to be trying new things. We can take a little bit more risk internally sharing information more transparently. We can raise challenges and address policy differences in a transparent way. Again, that's internally.
External transparency— which was the primary issue discussed in my article— has actually been codified under the name “strategic disclosure.”
My argument is that, as a democratic republic, it is a strength that we can be as transparent as we are. It's powerful in our markets, it's powerful in terms of communicating to the world and the public about how we operate.
For some of our adversaries, transparency is a weakness. They really want to keep things secret. In the modern era, the government has tools to push transparency, but mechanisms exist outside of government too. You see the incredible work CSET does to produce data sets, and they're making them easy to access. It is to the advantage of the US to drive transparency.
As you said, the administration really believes in that doctrine. They have used it to great effect in the last couple of years.
Jordan Schneider: I'd like to think that America does less horrifically objectionable things generally. But historically, because our society has freedom of the press, any horrifically objectionable things end up being revealed, regardless of whether they are state secrets or not.
You saw that throughout the 2000s with Abu Ghraib and the secret detention sites. Authors can write and publish books like “From the Ashes of History” without being assassinated.
The ability to keep dark secrets is just not sustainable for this country, regardless of what level of classification you put on things. I like that you price this into your assumptions and try to leverage that as an asymmetric advantage compared to closed societies.
Strategic Simulations
The Department of State produces research on a variety of issues—national security issues, economic issues, and political issues—but no one had taken that entire corpus and evaluated it using large language models.
Jordan Schneider: Let's talk about China. One of your strategies for getting buy-in has been running six-month data campaigns—or “sprints”—that bring your analysts together to tackle a problem and show that data has value. Your team recently ran one such sprint focused on strategic competition with China. What are some interesting findings you can share from that work?
Garrett Berntsen: The State Department provides a lot of foreign assistance funding globally, and there are accounts in particular on countering the influence of the PRC. We wanted to optimize the method for allocating those funds.
This is a great problem for data. You look for indicators of impact, and you can build a cost and impact evaluation model. Again, that doesn't mean the process is simply to feed in inputs, take the model's outputs, and run with them. The result informs a discussion with policymakers, and it is one additional factor in the broader foreign assistance decision-making process.
That was groundbreaking, and it provided a way for decision-makers to evaluate all these projects in an apples-to-apples way, as best as we could, and make some portfolio decisions.
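The apples-to-apples comparison Garrett describes can be sketched as a simple ratio metric over a project portfolio. The project names, costs, and impact scores below are entirely invented, and a real evaluation model would weigh many more factors—this only shows the shape of the exercise.

```python
# Hypothetical sketch of a cost/impact comparison across assistance
# projects. All names and numbers are illustrative, not real data.

projects = [
    {"name": "media_literacy_grants",        "cost": 2.0, "impact": 7.0},
    {"name": "infrastructure_transparency",  "cost": 5.0, "impact": 9.0},
    {"name": "exchange_programs",            "cost": 1.0, "impact": 3.0},
]

# Compute a simple impact-per-dollar ratio for each project.
for p in projects:
    p["impact_per_dollar"] = p["impact"] / p["cost"]

# Rank the portfolio; the ranking informs discussion, it doesn't decide.
ranked = sorted(projects, key=lambda p: p["impact_per_dollar"], reverse=True)
for p in ranked:
    print(f'{p["name"]}: {p["impact_per_dollar"]:.2f} impact per $M')
```

As the interview stresses, the output is one input into a policymaker discussion, not a mechanical allocation rule.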
Another interesting use case was a study that we ran looking at the entirety of the cable traffic.
The Department of State’s Foreign Service officers overseas produce cable reporting on a variety of different issues—national security issues, economic issues, and political issues—but no one had taken that entire corpus and evaluated it using large language models.
I’m not talking about generative AI—just general AI and evaluative tools. Those can help us look at the entire corpus and say, “What are we reporting on in the aggregate?” Then we can go back to our posts with reporting priorities. It allows us to have a global perspective on what's being reported, as well as identify novel insights in partnership with other stakeholders.
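Aggregating what a reporting corpus covers can be sketched with a toy classifier. The real cable-analysis work used large-language-model tooling; this hypothetical version uses keyword matching on made-up snippets purely to show the shape of "what are we reporting on in the aggregate?"

```python
from collections import Counter

# Toy topic tagger: the topics, keywords, and cable snippets below are
# all invented for illustration, not real State Department categories.
TOPIC_KEYWORDS = {
    "economic":  {"trade", "tariff", "investment"},
    "security":  {"military", "defense", "sanctions"},
    "political": {"election", "parliament", "coalition"},
}

def classify(text):
    """Return every topic whose keywords appear in the text."""
    words = set(text.lower().split())
    return [t for t, kw in TOPIC_KEYWORDS.items() if words & kw] or ["other"]

cables = [
    "New tariff schedule affects bilateral trade talks",
    "Coalition government survives parliament vote",
    "Defense ministry responds to sanctions pressure",
]

# Aggregate topic counts across the whole corpus.
counts = Counter(t for c in cables for t in classify(c))
print(counts.most_common())
```

Scaled up with better classifiers, the same aggregation lets headquarters see gaps in coverage and send reporting priorities back to posts, as Garrett describes.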
This sounds rote, but this is how we operate—we have questions and hypotheses, we test them with data, and we provide insights to decision-makers and attempt to aid engagement with the field.
Battle of the Bureaucrats
Jordan Schneider: Garrett, in a tweet a while ago, you said that senior leaders are hungry for data. Junior staff are, too, but middle managers are the most skeptical. It's probably not a generational thing, but I'm curious—what are the institutional incentives making that level of seniority a little less excited to engage with what you and your team are bringing to the table?