#235 – Ajeya Cotra on whether it’s crazy that every AI company’s safety plan is ‘use AI to make AI safe’

Every major AI company has the same safety plan: when AI gets crazy powerful and really dangerous, they’ll use the AI itself to figure out how to make AI safe and beneficial. It sounds circular, almost satirical. But is it actually a bad plan?

Today’s guest, Ajeya Cotra, recently placed 3rd out of 413 participants forecasting AI developments and is among the most thoughtful and respected commentators on where the technology is going.

She thinks there’s a meaningful chance we’ll see as much change in the next 23 years as humanity faced in the last 10,000, thanks to the arrival of artificial general intelligence. Ajeya doesn’t reach this conclusion lightly: she’s had a ring-side seat to the growth of all the major AI companies for 10 years — first as a researcher and grantmaker for technical AI safety at Coefficient Giving (formerly known as Open Philanthropy), and now as a member of technical staff at METR.

So host Rob Wiblin asked her: is this plan to use AI to save us from AI a reasonable one?

Ajeya agrees that humanity has repeatedly used technologies that create new problems to help solve those problems. After all:

Cars enabled carjackings and drive-by shootings, but also faster police pursuits.
Microbiology enabled bioweapons, but also faster vaccine development.
The internet allowed lies to disseminate faster, but had exactly the same impact for fact checks.

But she also thinks this will be a much harder case. In her view, the window between AI automating AI research and the arrival of uncontrollably powerful superintelligence could be quite brief — perhaps a year or less. In that narrow window, we’d need to redirect enormous amounts of AI labour away from making AI smarter and towards alignment research, biodefence, cyberdefence, adapting our political structures, and improving our collective decision-making.

The plan might fail just because the idea is flawed at conception: it does sound a bit crazy to use an AI you don’t trust to make sure that same AI benefits humanity.

But if we find some clever technique to overcome that, we could still fail — because the companies simply don’t follow through on their promises. They say redirecting resources to alignment and security is their strategy for dealing with the risks generated by their research — but none have quantitative commitments about what fraction of AI labour they’ll redirect during crunch time. And the competitive pressures during a recursive self-improvement loop could be irresistible.

In today’s conversation, Ajeya and Rob discuss what assumptions this plan requires, the specific problems AI could help solve during crunch time, and why — even if we pull it off — we’ll be white-knuckling it the whole way through.

Links to learn more, video, and full transcript: https://80k.info/ac26

This episode was recorded on October 20, 2025.

Chapters:

Cold open (00:00:00)
Ajeya’s strong track record for identifying key AI issues (00:00:43)
The 1,000-fold disagreement about AI's effect on economic growth (00:02:30)
Could any evidence actually change people's minds? (00:22:48)
The most dangerous AI progress might remain secret (00:29:55)
White-knuckling the 12-month window after automated AI R&D (00:46:16)
AI help is most valuable right before things go crazy (01:10:36)
Foundations should go from paying researchers to paying for inference (01:23:08)
Will frontier AI even be for sale during the explosion? (01:30:21)
Pre-crunch prep: what we should do right now (01:42:10)
A grantmaking trial by fire at Coefficient Giving (01:45:12)
Sabbatical and reflections on effective altruism (02:05:32)
The mundane factors that drive career satisfaction (02:34:33)
EA as an incubator for avant-garde causes others won't touch (02:44:07)

Video and audio editing: Dominic Armstrong, Milo McGuire, Luke Monsour, and Simon Monsour
Music: CORBIT
Coordination, transcriptions, and web: Katy Moore

Oppdag Premium

Prøv 14 dager gratis

Kjøp Premium

Episoder(333)

#20 - Bruce Friedrich on inventing outstanding meat substitutes to end speciesism & factory farming

Before the US Civil War, it was easier for the North to morally oppose slavery. Why? Because unlike the South they weren’t profiting much from its existence. The fight for abolition was partly won bec...

19 Feb 20181h 18min

#19 - Samantha Pitts-Kiefer on working next to the White House trying to prevent nuclear war

Rogue elements within a state’s security forces enrich dozens of kilograms of uranium. It’s then assembled into a crude nuclear bomb. The bomb is transported on a civilian aircraft to Washington D.C, ...

14 Feb 20181h 4min

#18 - Ofir Reich on using data science to end poverty & the spurious action-inaction distinction

Ofir Reich started out doing math in the military, before spending 8 years in tech startups - but then made a sharp turn to become a data scientist focussed on helping the global poor. At UC Berkeley...

31 Jan 20181h 18min

#17 - Will MacAskill on moral uncertainty, utilitarianism & how to avoid being a moral monster

Immanuel Kant is a profoundly influential figure in modern philosophy, and was one of the earliest proponents for universal democracy and international cooperation. He also thought that women have no ...

19 Jan 20181h 52min

#16 - Michelle Hutchinson on global priorities research & shaping the ideas of intellectuals

In the 40s and 50s neoliberalism was a fringe movement within economics. But by the 80s it had become a dominant school of thought in public policy, and achieved major policy changes across the Englis...

22 Des 201755min

#15 - Phil Tetlock on how chimps beat Berkeley undergrads and when it’s wise to defer to the wise

Prof Philip Tetlock is a social science legend. Over forty years he has researched whose predictions we can trust, whose we can’t and why - and developed methods that allow all of us to be better at p...

20 Nov 20171h 24min

#14 - Sharon Nunez & Jose Valle on going undercover to expose animal abuse

What if you knew that ducks were being killed with pitchforks? Rabbits dumped alive into containers? Or pigs being strangled with forklifts? Would you be willing to go undercover to expose the crime? ...

13 Nov 20171h 25min

#13 - Claire Walsh on testing which policies work & how to get governments to listen to the results

In both rich and poor countries, government policy is often based on no evidence at all and many programs don’t work. This has particularly harsh effects on the global poor - in some countries governm...

31 Okt 201752min

Reklamefrie Premium-podkaster

Hør populære podkaster som Storefri med Mikkel og Herman, Ida med hjertet i hånden, Krimpodden og mye mye mer

Skap din egen podkastboble

I appen skaper du ditt eget bibliotek med favoritter, og vi gir deg også anbefalinger til podkaster du ikke kan gå glipp av.

Prøv 14 dager gratis

Dersom du er ny Podme-bruker får du 14 dager gratis prøveperiode når du oppretter abonnement

Premium

99 kr/ måned

Tilgang til alle våre Premium-podkaster
Alle podkaster fra VG, Aftenposten, BT og SA
Reklamefritt Premium-innhold
Ingen bindingstid. Avslutt når du ønsker

Prøv 14 dager gratis

Premium

129 kr/ måned

Tilgang til alle Premium-podkaster
Alle podkaster fra VG, Aftenposten, BT og SA
Reklamefritt Premium-innhold
Ingen bindingstid. Avslutt når du ønsker
En Ekstra bruker

Prøv 14 dager gratis

Populært innen Fakta

relasjonspodden-med-dora-thorhallsdottir-kjersti-idem

Historiene og stemmene du vil høre

Ubegrenset tilgang til alle dine favorittpodkaster og lydbøker

Les mer