#235 – Ajeya Cotra on whether it’s crazy that every AI company’s safety plan is ‘use AI to make AI safe’

#235 – Ajeya Cotra on whether it’s crazy that every AI company’s safety plan is ‘use AI to make AI safe’

Every major AI company has the same safety plan: when AI gets crazy powerful and really dangerous, they’ll use the AI itself to figure out how to make AI safe and beneficial. It sounds circular, almost satirical. But is it actually a bad plan?

Today’s guest, Ajeya Cotra, recently placed 3rd out of 413 participants forecasting AI developments and is among the most thoughtful and respected commentators on where the technology is going.

She thinks there’s a meaningful chance we’ll see as much change in the next 23 years as humanity faced in the last 10,000, thanks to the arrival of artificial general intelligence. Ajeya doesn’t reach this conclusion lightly: she’s had a ring-side seat to the growth of all the major AI companies for 10 years — first as a researcher and grantmaker for technical AI safety at Coefficient Giving (formerly known as Open Philanthropy), and now as a member of technical staff at METR.

So host Rob Wiblin asked her: is this plan to use AI to save us from AI a reasonable one?

Ajeya agrees that humanity has repeatedly used technologies that create new problems to help solve those problems. After all:

  • Cars enabled carjackings and drive-by shootings, but also faster police pursuits.
  • Microbiology enabled bioweapons, but also faster vaccine development.
  • The internet allowed lies to disseminate faster, but had exactly the same impact for fact checks.

But she also thinks this will be a much harder case. In her view, the window between AI automating AI research and the arrival of uncontrollably powerful superintelligence could be quite brief — perhaps a year or less. In that narrow window, we’d need to redirect enormous amounts of AI labour away from making AI smarter and towards alignment research, biodefence, cyberdefence, adapting our political structures, and improving our collective decision-making.

The plan might fail just because the idea is flawed at conception: it does sound a bit crazy to use an AI you don’t trust to make sure that same AI benefits humanity.

But if we find some clever technique to overcome that, we could still fail — because the companies simply don’t follow through on their promises. They say redirecting resources to alignment and security is their strategy for dealing with the risks generated by their research — but none have quantitative commitments about what fraction of AI labour they’ll redirect during crunch time. And the competitive pressures during a recursive self-improvement loop could be irresistible.

In today’s conversation, Ajeya and Rob discuss what assumptions this plan requires, the specific problems AI could help solve during crunch time, and why — even if we pull it off — we’ll be white-knuckling it the whole way through.


Links to learn more, video, and full transcript: https://80k.info/ac26

This episode was recorded on October 20, 2025.

Chapters:

  • Cold open (00:00:00)
  • Ajeya’s strong track record for identifying key AI issues (00:00:43)
  • The 1,000-fold disagreement about AI's effect on economic growth (00:02:30)
  • Could any evidence actually change people's minds? (00:22:48)
  • The most dangerous AI progress might remain secret (00:29:55)
  • White-knuckling the 12-month window after automated AI R&D (00:46:16)
  • AI help is most valuable right before things go crazy (01:10:36)
  • Foundations should go from paying researchers to paying for inference (01:23:08)
  • Will frontier AI even be for sale during the explosion? (01:30:21)
  • Pre-crunch prep: what we should do right now (01:42:10)
  • A grantmaking trial by fire at Coefficient Giving (01:45:12)
  • Sabbatical and reflections on effective altruism (02:05:32)
  • The mundane factors that drive career satisfaction (02:34:33)
  • EA as an incubator for avant-garde causes others won't touch (02:44:07)

Video and audio editing: Dominic Armstrong, Milo McGuire, Luke Monsour, and Simon Monsour
Music: CORBIT
Coordination, transcriptions, and web: Katy Moore

Episoder(333)

The 80,000 Hours Career Guide (2023)

The 80,000 Hours Career Guide (2023)

An audio version of the 2023 80,000 Hours career guide, also available on our website, on Amazon, and on Audible.If you know someone who might find our career guide helpful, you can get a free copy se...

4 Sep 20234h 41min

#162 – Mustafa Suleyman on getting Washington and Silicon Valley to tame AI

#162 – Mustafa Suleyman on getting Washington and Silicon Valley to tame AI

Mustafa Suleyman was part of the trio that founded DeepMind, and his new AI project is building one of the world's largest supercomputers to train a large language model on 10–100x the compute used to...

1 Sep 202359min

#161 – Michael Webb on whether AI will soon cause job loss, lower incomes, and higher inequality — or the opposite

#161 – Michael Webb on whether AI will soon cause job loss, lower incomes, and higher inequality — or the opposite

"Do you remember seeing these photographs of generally women sitting in front of these huge panels and connecting calls, plugging different calls between different numbers? The automated version of th...

23 Aug 20233h 30min

#160 – Hannah Ritchie on why it makes sense to be optimistic about the environment

#160 – Hannah Ritchie on why it makes sense to be optimistic about the environment

"There's no money to invest in education elsewhere, so they almost get trapped in the cycle where they don't get a lot from crop production, but everyone in the family has to work there to just stay a...

14 Aug 20232h 36min

#159 – Jan Leike on OpenAI's massive push to make superintelligence safe in 4 years or less

#159 – Jan Leike on OpenAI's massive push to make superintelligence safe in 4 years or less

In July, OpenAI announced a new team and project: Superalignment. The goal is to figure out how to make superintelligent AI systems aligned and safe to use within four years, and the lab is putting a ...

7 Aug 20232h 51min

We now offer shorter 'interview highlights' episodes

We now offer shorter 'interview highlights' episodes

Over on our other feed, 80k After Hours, you can now find 20-30 minute highlights episodes of our 80,000 Hours Podcast interviews. These aren’t necessarily the most important parts of the interview, a...

5 Aug 20236min

#158 – Holden Karnofsky on how AIs might take over even if they're no smarter than humans, and his 4-part playbook for AI risk

#158 – Holden Karnofsky on how AIs might take over even if they're no smarter than humans, and his 4-part playbook for AI risk

Back in 2007, Holden Karnofsky cofounded GiveWell, where he sought out the charities that most cost-effectively helped save lives. He then cofounded Open Philanthropy, where he oversaw a team making b...

31 Jul 20233h 13min

#157 – Ezra Klein on existential risk from AI and what DC could do about it

#157 – Ezra Klein on existential risk from AI and what DC could do about it

In Oppenheimer, scientists detonate a nuclear weapon despite thinking there's some 'near zero' chance it would ignite the atmosphere, putting an end to life on Earth. Today, scientists working on AI t...

24 Jul 20231h 18min

Populært innen Fakta

fastlegen
dine-penger-pengeradet
relasjonspodden-med-dora-thorhallsdottir-kjersti-idem
rss-strid-de-norske-borgerkrigene
foreldreradet
mikkels-paskenotter
treningspodden
rss-bisarr-historie
jakt-og-fiskepodden
rss-sunn-okonomi
sinnsyn
tomprat-med-gunnar-tjomlid
rss-kunsten-a-leve
hagespiren-podcast
rss-bak-luftfarten
ukast
fryktlos
hverdagspsyken
rss-mind-body-podden
gravid-uke-for-uke