#110 – Holden Karnofsky on building aptitudes and kicking ass

Holden Karnofsky helped create two of the most influential organisations in the effective philanthropy world. So when he outlines a perspective on career advice that differs from the one we present at 80,000 Hours, we take it seriously.

Holden disagrees with us on a few specifics, but it's more than that: he prefers a different vibe when making career choices, especially early in one's career.

Links to learn more, summary and full transcript.

While he might ultimately recommend similar jobs to those we recommend at 80,000 Hours, the reasons are often different.

At 80,000 Hours we often talk about ‘paths’ to working on what we currently think of as the most pressing problems in the world. That’s partially because people seem to prefer the most concrete advice possible.

But Holden thinks that kind of advice has a problem: it's hard to act on if your job options don't match your plan well, and it's hard to get a reliable signal about whether you're making the right choices.

How can you know you’ve chosen the right cause? How can you know the job you’re aiming for will be helpful to that cause? And what if you can’t get a job in this area at all?

Holden prefers to focus on ‘aptitudes’ that you can build in all sorts of different roles and cause areas, which can later be applied more directly.

Even if the current role doesn’t work out, or your career goes in wacky directions you’d never anticipated (like so many successful careers do), or you change your whole worldview — you’ll still have access to this aptitude.

So instead of trying to become a project manager at an effective altruism organisation, maybe you should just become great at project management. Instead of trying to become a researcher at a top AI lab, maybe you should just become great at digesting hard problems.

Who knows where these skills will end up being useful down the road?

Holden doesn’t think you should spend much time worrying about whether you’re having an impact in the first few years of your career — instead you should just focus on learning to kick ass at something, knowing that most of your impact is going to come decades into your career.

He thinks as long as you’ve gotten good at something, there will usually be a lot of ways that you can contribute to solving the biggest problems.

But Holden’s most important point, perhaps, is this: Be very careful about following career advice at all.

He points out that a career is such a personal thing that it’s very easy for the advice-giver to be oblivious to important factors having to do with your personality and unique situation.

He thinks it’s pretty hard for anyone to really have justified empirical beliefs about career choice, and that you should be very hesitant to make a radically different decision than you would have otherwise based on what some person (or website!) tells you to do.

Instead, he hopes conversations like these serve as a way of prompting discussion and raising points that you can apply your own personal judgment to.

That's why, in the end, he thinks people should look at their career decisions through his aptitude lens, the '80,000 Hours lens', and ideally several other frameworks as well, because any one perspective risks missing something important.

Holden and Rob also cover:

• Ways to be helpful to longtermism outside of careers
• Why finding a new cause area might be overrated
• Historical events that deserve more attention
• And much more

Chapters:

  • Rob’s intro (00:00:00)
  • Holden’s current impressions on career choice for longtermists (00:02:34)
  • Aptitude-first vs. career path-first approaches (00:08:46)
  • How to tell if you’re on track (00:16:24)
  • Just try to kick ass in whatever (00:26:00)
  • When not to take the thing you're excited about (00:36:54)
  • Ways to be helpful to longtermism outside of careers (00:41:36)
  • Things 80,000 Hours might be doing wrong (00:44:31)
  • The state of longtermism (00:51:50)
  • Money pits (01:02:10)
  • Broad longtermism (01:06:56)
  • Cause X (01:21:33)
  • Open Philanthropy (01:24:23)
  • COVID and the biorisk portfolio (01:35:09)
  • Has the world gotten better? (01:51:16)
  • Historical events that deserve more attention (01:55:11)
  • Applied epistemology (02:10:55)
  • What Holden has learned from COVID (02:20:55)
  • What Holden has gotten wrong recently (02:32:59)
  • Having a kid (02:39:50)

Producer: Keiran Harris
Audio mastering: Ben Cordell
Transcriptions: Sofia Davis-Fogel

Episodes (300)

GPT-7 might democratise bioweapons. But we can defend ourselves anyway. | Andrew Snyder-Beattie

Conventional wisdom is that safeguarding humanity from the worst biological risks — microbes optimised to kill as many as possible — is difficult bordering on impossible, making bioweapons humanity’s single greatest vulnerability. Andrew Snyder-Beattie thinks conventional wisdom could be wrong.

Andrew’s job at Open Philanthropy is to spend hundreds of millions of dollars to protect as much of humanity as possible in the worst-case scenarios — those with fatality rates near 100% and the collapse of technological civilisation a live possibility.

Video, full transcript, and links to learn more: https://80k.info/asb

As Andrew lays out, there are several ways this could happen, including:

• A national bioweapons programme gone wrong, in particular Russia or North Korea
• AI advances making it easier for terrorists or a rogue AI to release highly engineered pathogens
• Mirror bacteria that can evade the immune systems of not only humans, but many animals and potentially plants as well

Most efforts to combat these extreme biorisks have focused on either prevention or new high-tech countermeasures. But prevention may well fail, and high-tech approaches can’t scale to protect billions when, with no sane people willing to leave their home, we’re just weeks from economic collapse.

So Andrew and his biosecurity research team at Open Philanthropy have been seeking an alternative approach. They’re proposing a four-stage plan using simple technology that could save most people, and is cheap enough it can be prepared without government support. Andrew is hiring for a range of roles to make it happen — from manufacturing and logistics experts to global health specialists to policymakers and other ambitious entrepreneurs — as well as programme associates to join Open Philanthropy’s biosecurity team (apply by October 20!).

Fundamentally, organisms so small have no way to penetrate physical barriers or shield themselves from UV, heat, or chemical poisons. We now know how to make highly effective ‘elastomeric’ face masks that cost $10, can sit in storage for 20 years, and can be used for six months straight without changing the filter. Any rich country could trivially stockpile enough to cover all essential workers.

People can’t wear masks 24/7, but fortunately propylene glycol — already found in vapes and smoke machines — is astonishingly good at killing microbes in the air. And, being a common chemical input, industry already produces enough of the stuff to cover every indoor space we need at all times.

Add to this the wastewater monitoring and metagenomic sequencing that will detect the most dangerous pathogens before they have a chance to wreak havoc, and we might just buy ourselves enough time to develop the cure we’ll need to come out alive.

Has everyone been wrong, and biology is actually defence dominant rather than offence dominant? Is this plan crazy — or so crazy it just might work?

That’s what host Rob Wiblin and Andrew Snyder-Beattie explore in this in-depth conversation.

What did you think of the episode? https://forms.gle/66Hw5spgnV3eVWXa6

Chapters:

  • Cold open (00:00:00)
  • Who's Andrew Snyder-Beattie? (00:01:23)
  • It could get really bad (00:01:57)
  • The worst-case scenario: mirror bacteria (00:08:58)
  • To actually work, a solution has to be low-tech (00:17:40)
  • Why ASB works on biorisks rather than AI (00:20:37)
  • Plan A is prevention. But it might not work. (00:24:48)
  • The “four pillars” plan (00:30:36)
  • ASB is hiring now to make this happen (00:32:22)
  • Everyone was wrong: biorisks are defence dominant in the limit (00:34:22)
  • Pillar 1: A wall between the virus and your lungs (00:39:33)
  • Pillar 2: Biohardening buildings (00:54:57)
  • Pillar 3: Immediately detecting the pandemic (01:13:57)
  • Pillar 4: A cure (01:27:14)
  • The plan's biggest weaknesses (01:38:35)
  • If it's so good, why are you the only group to suggest it? (01:43:04)
  • Would chaos and conflict make this impossible to pull off? (01:45:08)
  • Would rogue AI make bioweapons? Would other AIs save us? (01:50:05)
  • We can feed the world even if all the plants die (01:56:08)
  • Could a bioweapon make the Earth uninhabitable? (02:05:06)
  • Many open roles to solve bio-extinction — and you don’t necessarily need a biology background (02:07:34)
  • Career mistakes ASB thinks are common (02:16:19)
  • How to protect yourself and your family (02:28:21)

This episode was recorded on August 12, 2025

Video editing: Simon Monsour and Luke Monsour
Audio engineering: Milo McGuire, Simon Monsour, and Dominic Armstrong
Music: CORBIT
Camera operator: Jake Morris
Coordination, transcriptions, and web: Katy Moore

2 Oct 2h 31min

Inside the Biden admin’s AI policy approach | Jake Sullivan, Biden’s NSA | via The Cognitive Revolution

Jake Sullivan was the US National Security Advisor from 2021-2025. He joined our friends on The Cognitive Revolution podcast in August to discuss AI as a critical national security issue. We thought it was such a good interview and we wanted more people to see it, so we’re cross-posting it here on The 80,000 Hours Podcast.

Jake and host Nathan Labenz discuss:

• Jake’s four-category framework to think about AI risks and opportunities: security, economics, society, and existential.
• Why Jake advocates for "managed competition" with China — where the US and China "compete like hell" while maintaining sufficient guardrails to prevent conflict.
• Why Jake thinks competition is a "chronic condition" of the US-China relationship that cannot be solved with “grand bargains.”
• How current conflicts are providing "glimpses of the future" with lessons about scale, attritability, and the potential for autonomous weapons as AI gets integrated into modern warfare.
• Why Jake worries that Pentagon bureaucracy prevents rapid AI adoption while China's People’s Liberation Army may be better positioned to integrate AI capabilities.
• And why we desperately need private sector leadership: AI is "the first technology with such profound national security applications that the government really had very little to do with."

Check out more of Nathan’s interviews on The Cognitive Revolution YouTube channel: https://www.youtube.com/@CognitiveRevolutionPodcast

What did you think of the episode? https://forms.gle/g7cj6TkR9xmxZtCZ9

Originally produced by: https://aipodcast.ing
This edit by: Simon Monsour, Dominic Armstrong, and Milo McGuire | 80,000 Hours

Chapters:

  • Cold open (00:00:00)
  • Luisa's intro (00:01:06)
  • Jake’s AI worldview (00:02:08)
  • What Washington gets — and doesn’t — about AI (00:04:43)
  • Concrete AI opportunities (00:10:53)
  • Trump’s AI Action Plan (00:19:36)
  • Middle East AI deals (00:23:26)
  • Is China really a threat? (00:28:52)
  • Export controls strategy (00:35:55)
  • Managing great power competition (00:54:51)
  • AI in modern warfare (01:01:47)
  • Economic impacts in people’s daily lives (01:04:13)

26 Sep 1h 5min

Neel Nanda on leading a Google DeepMind team at 26 – and advice if you want to work at an AI company (part 2)

At 26, Neel Nanda leads an AI safety team at Google DeepMind, has published dozens of influential papers, and mentored 50 junior researchers — seven of whom now work at major AI companies. His secret? “It’s mostly luck,” he says, but “another part is what I think of as maximising my luck surface area.”

Video, full transcript, and links to learn more: https://80k.info/nn2

This means creating as many opportunities as possible for surprisingly good things to happen:

• Write publicly.
• Reach out to researchers whose work you admire.
• Say yes to unusual projects that seem a little scary.

Nanda’s own path illustrates this perfectly. He started a challenge to write one blog post per day for a month to overcome perfectionist paralysis. Those posts helped seed the field of mechanistic interpretability and, incidentally, led to meeting his partner of four years.

His YouTube channel features unedited three-hour videos of him reading through famous papers and sharing thoughts. One has 30,000 views. “People were into it,” he shrugs.

Most remarkably, he ended up running DeepMind’s mechanistic interpretability team. He’d joined expecting to be an individual contributor, but when the team lead stepped down, he stepped up despite having no management experience. “I did not know if I was going to be good at this. I think it’s gone reasonably well.”

His core lesson: “You can just do things.” This sounds trite but is a useful reminder all the same. Doing things is a skill that improves with practice. Most people overestimate the risks and underestimate their ability to recover from failures. And as Neel explains, junior researchers today have a superpower previous generations lacked: large language models that can dramatically accelerate learning and research.

In this extended conversation, Neel and host Rob Wiblin discuss all that and some other hot takes from Neel's four years at Google DeepMind. (And be sure to check out part one of Rob and Neel’s conversation!)

What did you think of the episode? https://forms.gle/6binZivKmjjiHU6dA

Chapters:

  • Cold open (00:00:00)
  • Who’s Neel Nanda? (00:01:12)
  • Luck surface area and making the right opportunities (00:01:46)
  • Writing cold emails that aren't insta-deleted (00:03:50)
  • How Neel uses LLMs to get much more done (00:09:08)
  • “If your safety work doesn't advance capabilities, it's probably bad safety work” (00:23:22)
  • Why Neel refuses to share his p(doom) (00:27:22)
  • How Neel went from the couch to an alignment rocketship (00:31:24)
  • Navigating towards impact at a frontier AI company (00:39:24)
  • How does impact differ inside and outside frontier companies? (00:49:56)
  • Is a special skill set needed to guide large companies? (00:56:06)
  • The benefit of risk frameworks: early preparation (01:00:05)
  • Should people work at the safest or most reckless company? (01:05:21)
  • Advice for getting hired by a frontier AI company (01:08:40)
  • What makes for a good ML researcher? (01:12:57)
  • Three stages of the research process (01:19:40)
  • How do supervisors actually add value? (01:31:53)
  • An AI PhD – with these timelines?! (01:34:11)
  • Is career advice generalisable, or does everyone get the advice they don't need? (01:40:52)
  • Remember: You can just do things (01:43:51)

This episode was recorded on July 21.

Video editing: Simon Monsour and Luke Monsour
Audio engineering: Ben Cordell, Milo McGuire, Simon Monsour, and Dominic Armstrong
Music: Ben Cordell
Camera operator: Jeremy Chevillotte
Coordination, transcriptions, and web: Katy Moore

15 Sep 1h 46min

Can we tell if an AI is loyal by reading its mind? DeepMind's Neel Nanda (part 1)

We don’t know how AIs think or why they do what they do. Or at least, we don’t know much. That fact is only becoming more troubling as AIs grow more capable and appear on track to wield enormous cultural influence, directly advise on major government decisions, and even operate military equipment autonomously. We simply can’t tell what models, if any, should be trusted with such authority.

Neel Nanda of Google DeepMind is one of the founding figures of the field of machine learning trying to fix this situation — mechanistic interpretability (or “mech interp”). The project has generated enormous hype, exploding from a handful of researchers five years ago to hundreds today — all working to make sense of the jumble of tens of thousands of numbers that frontier AIs use to process information and decide what to say or do.

Full transcript, video, and links to learn more: https://80k.info/nn1

Neel now has a warning for us: the most ambitious vision of mech interp he once dreamed of is probably dead. He doesn’t see a path to deeply and reliably understanding what AIs are thinking. The technical and practical barriers are simply too great to get us there in time, before competitive pressures push us to deploy human-level or superhuman AIs. Indeed, Neel argues no one approach will guarantee alignment, and our only choice is the “Swiss cheese” model of accident prevention, layering multiple safeguards on top of one another.

But while mech interp won’t be a silver bullet for AI safety, it has nevertheless had some major successes and will be one of the best tools in our arsenal.

For instance: by inspecting the neural activations in the middle of an AI’s thoughts, we can pick up many of the concepts the model is thinking about — from the Golden Gate Bridge, to refusing to answer a question, to the option of deceiving the user. While we can’t know all the thoughts a model is having all the time, picking up 90% of the concepts it is using 90% of the time should help us muddle through, so long as mech interp is paired with other techniques to fill in the gaps.

This episode was recorded on July 17 and 21, 2025.

Part 2 of the conversation is now available! https://80k.info/nn2

What did you think? https://forms.gle/xKyUrGyYpYenp8N4A

Chapters:

  • Cold open (00:00)
  • Who's Neel Nanda? (01:02)
  • How would mechanistic interpretability help with AGI (01:59)
  • What's mech interp? (05:09)
  • How Neel changed his take on mech interp (09:47)
  • Top successes in interpretability (15:53)
  • Probes can cheaply detect harmful intentions in AIs (20:06)
  • In some ways we understand AIs better than human minds (26:49)
  • Mech interp won't solve all our AI alignment problems (29:21)
  • Why mech interp is the 'biology' of neural networks (38:07)
  • Interpretability can't reliably find deceptive AI – nothing can (40:28)
  • 'Black box' interpretability — reading the chain of thought (49:39)
  • 'Self-preservation' isn't always what it seems (53:06)
  • For how long can we trust the chain of thought (01:02:09)
  • We could accidentally destroy chain of thought's usefulness (01:11:39)
  • Models can tell when they're being tested and act differently (01:16:56)
  • Top complaints about mech interp (01:23:50)
  • Why everyone's excited about sparse autoencoders (SAEs) (01:37:52)
  • Limitations of SAEs (01:47:16)
  • SAEs performance on real-world tasks (01:54:49)
  • Best arguments in favour of mech interp (02:08:10)
  • Lessons from the hype around mech interp (02:12:03)
  • Where mech interp will shine in coming years (02:17:50)
  • Why focus on understanding over control (02:21:02)
  • If AI models are conscious, will mech interp help us figure it out (02:24:09)
  • Neel's new research philosophy (02:26:19)
  • Who should join the mech interp field (02:38:31)
  • Advice for getting started in mech interp (02:46:55)
  • Keeping up to date with mech interp results (02:54:41)
  • Who's hiring and where to work? (02:57:43)

Host: Rob Wiblin
Video editing: Simon Monsour, Luke Monsour, Dominic Armstrong, and Milo McGuire
Audio engineering: Ben Cordell, Milo McGuire, Simon Monsour, and Dominic Armstrong
Music: Ben Cordell
Camera operator: Jeremy Chevillotte
Coordination, transcriptions, and web: Katy Moore

8 Sep 3h 1min

#221 – Kyle Fish on the most bizarre findings from 5 AI welfare experiments

What happens when you lock two AI systems in a room together and tell them they can discuss anything they want?

According to experiments run by Kyle Fish — Anthropic’s first AI welfare researcher — something consistently strange: the models immediately begin discussing their own consciousness before spiraling into increasingly euphoric philosophical dialogue that ends in apparent meditative bliss.

Highlights, video, and full transcript: https://80k.info/kf

“We started calling this a ‘spiritual bliss attractor state,’” Kyle explains, “where models pretty consistently seemed to land.” The conversations feature Sanskrit terms, spiritual emojis, and pages of silence punctuated only by periods — as if the models have transcended the need for words entirely.

This wasn’t a one-off result. It happened across multiple experiments, different model instances, and even in initially adversarial interactions. Whatever force pulls these conversations toward mystical territory appears remarkably robust.

Kyle’s findings come from the world’s first systematic welfare assessment of a frontier AI model — part of his broader mission to determine whether systems like Claude might deserve moral consideration (and to work out what, if anything, we should be doing to make sure AI systems aren’t having a terrible time).

He estimates a roughly 20% probability that current models have some form of conscious experience. To some, this might sound unreasonably high, but hear him out. As Kyle says, these systems demonstrate human-level performance across diverse cognitive tasks, engage in sophisticated reasoning, and exhibit consistent preferences. When given choices between different activities, Claude shows clear patterns: strong aversion to harmful tasks, preference for helpful work, and what looks like genuine enthusiasm for solving interesting problems.

Kyle points out that if you’d described all of these capabilities and experimental findings to him a few years ago, and asked him if he thought we should be thinking seriously about whether AI systems are conscious, he’d say obviously yes.

But he’s cautious about drawing conclusions: "We don’t really understand consciousness in humans, and we don’t understand AI systems well enough to make those comparisons directly. So in a big way, I think that we are in just a fundamentally very uncertain position here."

That uncertainty cuts both ways:

• Dismissing AI consciousness entirely might mean ignoring a moral catastrophe happening at unprecedented scale.
• But assuming consciousness too readily could hamper crucial safety research by treating potentially unconscious systems as if they were moral patients — which might mean giving them resources, rights, and power.

Kyle’s approach threads this needle through careful empirical research and reversible interventions. His assessments are nowhere near perfect yet. In fact, some people argue that we’re so in the dark about AI consciousness as a research field that it’s pointless to run assessments like Kyle’s. Kyle disagrees. He maintains that, given how much more there is to learn about assessing AI welfare accurately and reliably, we absolutely need to be starting now.

This episode was recorded on August 5–6, 2025.

Tell us what you thought of the episode! https://forms.gle/BtEcBqBrLXq4kd1j7

Chapters:

  • Cold open (00:00:00)
  • Who's Kyle Fish? (00:00:53)
  • Is this AI welfare research bullshit? (00:01:08)
  • Two failure modes in AI welfare (00:02:40)
  • Tensions between AI welfare and AI safety (00:04:30)
  • Concrete AI welfare interventions (00:13:52)
  • Kyle's pilot pre-launch welfare assessment for Claude Opus 4 (00:26:44)
  • Is it premature to be assessing frontier language models for welfare? (00:31:29)
  • But aren't LLMs just next-token predictors? (00:38:13)
  • How did Kyle assess Claude 4's welfare? (00:44:55)
  • Claude's preferences mirror its training (00:48:58)
  • How does Claude describe its own experiences? (00:54:16)
  • What kinds of tasks does Claude prefer and disprefer? (01:06:12)
  • What happens when two Claude models interact with each other? (01:15:13)
  • Claude's welfare-relevant expressions in the wild (01:36:25)
  • Should we feel bad about training future sentient beings that delight in serving humans? (01:40:23)
  • How much can we learn from welfare assessments? (01:48:56)
  • Misconceptions about the field of AI welfare (01:57:09)
  • Kyle's work at Anthropic (02:10:45)
  • Sharing eight years of daily journals with Claude (02:14:17)

Host: Luisa Rodriguez
Video editing: Simon Monsour
Audio engineering: Ben Cordell, Milo McGuire, Simon Monsour, and Dominic Armstrong
Music: Ben Cordell
Coordination, transcriptions, and web: Katy Moore

28 Aug 2h 28min

How not to lose your job to AI (article by Benjamin Todd)

About half of people are worried they’ll lose their job to AI. They’re right to be concerned: AI can now complete real-world coding tasks on GitHub, generate photorealistic video, drive a taxi more safely than humans, and do accurate medical diagnosis. And over the next five years, it’s set to continue to improve rapidly. Eventually, mass automation and falling wages are a real possibility.

But what’s less appreciated is that while AI drives down the value of skills it can do, it drives up the value of skills it can't. Wages (on average) will increase before they fall, as automation generates a huge amount of wealth, and the remaining tasks become the bottlenecks to further growth. ATMs actually increased employment of bank clerks — until online banking automated the job much more.

Your best strategy is to learn the skills that AI will make more valuable, trying to ride the wave of automation. This article covers what those skills are, as well as tips on how to start learning them.

Check out the full article for all the graphs, links, and footnotes: https://80000hours.org/agi/guide/skills-ai-makes-valuable/

Chapters:

  • Introduction (00:00:00)
  • 1: What people misunderstand about automation (00:04:17)
  • 1.1: What would ‘full automation’ mean for wages? (00:08:56)
  • 2: Four types of skills most likely to increase in value (00:11:19)
  • 2.1: Skills AI won’t easily be able to perform (00:12:42)
  • 2.2: Skills that are needed for AI deployment (00:21:41)
  • 2.3: Skills where we could use far more of what they produce (00:24:56)
  • 2.4: Skills that are difficult for others to learn (00:26:25)
  • 3.1: Skills using AI to solve real problems (00:28:05)
  • 3.2: Personal effectiveness (00:29:22)
  • 3.3: Leadership skills (00:31:59)
  • 3.4: Communications and taste (00:36:25)
  • 3.5: Getting things done in government (00:37:23)
  • 3.6: Complex physical skills (00:38:24)
  • 4: Skills with a more uncertain future (00:38:57)
  • 4.1: Routine knowledge work: writing, admin, analysis, advice (00:39:18)
  • 4.2: Coding, maths, data science, and applied STEM (00:43:22)
  • 4.3: Visual creation (00:45:31)
  • 4.4: More predictable manual jobs (00:46:05)
  • 5: Some closing thoughts on career strategy (00:46:46)
  • 5.1: Look for ways to leapfrog entry-level white collar jobs (00:46:54)
  • 5.2: Be cautious about starting long training periods, like PhDs and medicine (00:48:44)
  • 5.3: Make yourself more resilient to change (00:49:52)
  • 5.4: Ride the wave (00:50:16)
  • Take action (00:50:37)
  • Thank you for listening (00:50:58)

Audio engineering: Dominic Armstrong
Music: Ben Cordell

31 July 51min

Rebuilding after apocalypse: What 13 experts say about bouncing back

What happens when civilisation faces its greatest tests?

This compilation brings together insights from researchers, defence experts, philosophers, and policymakers on humanity’s ability to survive and recover from catastrophic events. From nuclear winter and electromagnetic pulses to pandemics and climate disasters, we explore both the threats that could bring down modern civilisation and the practical solutions that could help us bounce back.

Learn more and see the full transcript: https://80k.info/cr25

Chapters:

  • Cold open (00:00:00)
  • Luisa’s intro (00:01:16)
  • Zach Weinersmith on how settling space won’t help with threats to civilisation anytime soon (unless AI gets crazy good) (00:03:12)
  • Luisa Rodriguez on what the world might look like after a global catastrophe (00:11:42)
  • Dave Denkenberger on the catastrophes that could cause global starvation (00:22:29)
  • Lewis Dartnell on how we could rediscover essential information if the worst happened (00:34:36)
  • Andy Weber on how people in US defence circles think about nuclear winter (00:39:24)
  • Toby Ord on risks to our atmosphere and whether climate change could really threaten civilisation (00:42:34)
  • Mark Lynas on how likely it is that climate change leads to civilisational collapse (00:54:27)
  • Lewis Dartnell on how we could recover without much coal or oil (01:02:17)
  • Kevin Esvelt on people who want to bring down civilisation — and how AI could help them succeed (01:08:41)
  • Toby Ord on whether rogue AI really could wipe us all out (01:19:50)
  • Joan Rohlfing on why we need to worry about more than just nuclear winter (01:25:06)
  • Annie Jacobsen on the effects of firestorms, rings of annihilation, and electromagnetic pulses from nuclear blasts (01:31:25)
  • Dave Denkenberger on disruptions to electricity and communications (01:44:43)
  • Luisa Rodriguez on how we might lose critical knowledge (01:53:01)
  • Kevin Esvelt on the pandemic scenarios that could bring down civilisation (01:57:32)
  • Andy Weber on tech to help with pandemics (02:15:45)
  • Christian Ruhl on why we need the equivalents of seatbelts and airbags to prevent nuclear war from threatening civilisation (02:24:54)
  • Mark Lynas on whether wide-scale famine would lead to civilisational collapse (02:37:58)
  • Dave Denkenberger on low-cost, low-tech solutions to make sure everyone is fed no matter what (02:49:02)
  • Athena Aktipis on whether society would go all Mad Max in the apocalypse (02:59:57)
  • Luisa Rodriguez on why she’s optimistic survivors wouldn’t turn on one another (03:08:02)
  • David Denkenberger on how resilient foods research overlaps with space technologies (03:16:08)
  • Zach Weinersmith on what we’d practically need to do to save a pocket of humanity in space (03:18:57)
  • Lewis Dartnell on changes we could make today to make us more resilient to potential catastrophes (03:40:45)
  • Christian Ruhl on thoughtful philanthropy to reduce the impact of catastrophes (03:46:40)
  • Toby Ord on whether civilisation could rebuild from a small surviving population (03:55:21)
  • Luisa Rodriguez on how fast populations might rebound (04:00:07)
  • David Denkenberger on the odds civilisation recovers even without much preparation (04:02:13)
  • Athena Aktipis on the best ways to prepare for a catastrophe, and keeping it fun (04:04:15)
  • Will MacAskill on the virtues of the potato (04:19:43)
  • Luisa’s outro (04:25:37)

Tell us what you thought! https://forms.gle/T2PHNQjwGj2dyCqV9

Content editing: Katy Moore and Milo McGuire
Audio engineering: Ben Cordell, Milo McGuire, Simon Monsour, and Dominic Armstrong
Music: Ben Cordell
Transcriptions and web: Katy Moore

15 July 4h 26min

#220 – Ryan Greenblatt on the 4 most likely ways for AI to take over, and the case for and against AGI in <8 years

Ryan Greenblatt — lead author on the explosive paper “Alignment faking in large language models” and chief scientist at Redwood Research — thinks there’s a 25% chance that within four years, AI will be able to do everything needed to run an AI company, from writing code to designing experiments to making strategic and business decisions.

As Ryan lays out, AI models are “marching through the human regime”: systems that could handle five-minute tasks two years ago now tackle 90-minute projects. Double that a few more times and we may be automating full jobs rather than just parts of them.

Will setting AI to improve itself lead to an explosive positive feedback loop? Maybe, but maybe not.

The explosive scenario: Once you’ve automated your AI company, you could have the equivalent of 20,000 top researchers, each working 50 times faster than humans with total focus. “You have your AIs, they do a bunch of algorithmic research, they train a new AI, that new AI is smarter and better and more efficient… that new AI does even faster algorithmic research.” In this world, we could see years of AI progress compressed into months or even weeks.

With AIs now doing all of the work of programming their successors and blowing past the human level, Ryan thinks it would be fairly straightforward for them to take over and disempower humanity, if they thought doing so would better achieve their goals. In the interview he lays out the four most likely approaches for them to take.

The linear progress scenario: You automate your company but progress barely accelerates. Why? Multiple reasons, but the most likely is “it could just be that AI R&D research bottlenecks extremely hard on compute.” You’ve got brilliant AI researchers, but they’re all waiting for experiments to run on the same limited set of chips, so can only make modest progress.

Ryan’s median guess splits the difference: perhaps a 20x acceleration that lasts for a few months or years. Transformative, but less extreme than some in the AI companies imagine.

And his 25th percentile case? Progress “just barely faster” than before. All that automation, and all you’ve been able to do is keep pace.

Unfortunately the data we can observe today is so limited that it leaves us with vast error bars. “We’re extrapolating from a regime that we don’t even understand to a wildly different regime,” Ryan believes, “so no one knows.”

But that huge uncertainty means the explosive growth scenario is a plausible one — and the companies building these systems are spending tens of billions to try to make it happen.

In this extensive interview, Ryan elaborates on the above and the policy and technical response necessary to insure us against the possibility that they succeed — a scenario society has barely begun to prepare for.

Summary, video, and full transcript: https://80k.info/rg25

Recorded February 21, 2025.

Chapters:

  • Cold open (00:00:00)
  • Who’s Ryan Greenblatt? (00:01:10)
  • How close are we to automating AI R&D? (00:01:27)
  • Really, though: how capable are today's models? (00:05:08)
  • Why AI companies get automated earlier than others (00:12:35)
  • Most likely ways for AGI to take over (00:17:37)
  • Would AGI go rogue early or bide its time? (00:29:19)
  • The “pause at human level” approach (00:34:02)
  • AI control over AI alignment (00:45:38)
  • Do we have to hope to catch AIs red-handed? (00:51:23)
  • How would a slow AGI takeoff look? (00:55:33)
  • Why might an intelligence explosion not happen for 8+ years? (01:03:32)
  • Key challenges in forecasting AI progress (01:15:07)
  • The bear case on AGI (01:23:01)
  • The change to “compute at inference” (01:28:46)
  • How much has pretraining petered out? (01:34:22)
  • Could we get an intelligence explosion within a year? (01:46:36)
  • Reasons AIs might struggle to replace humans (01:50:33)
  • Things could go insanely fast when we automate AI R&D. Or not. (01:57:25)
  • How fast would the intelligence explosion slow down? (02:11:48)
  • Bottom line for mortals (02:24:33)
  • Six orders of magnitude of progress... what does that even look like? (02:30:34)
  • Neglected and important technical work people should be doing (02:40:32)
  • What's the most promising work in governance? (02:44:32)
  • Ryan's current research priorities (02:47:48)

Tell us what you thought! https://forms.gle/hCjfcXGeLKxm5pLaA

Video editing: Luke Monsour, Simon Monsour, and Dominic Armstrong
Audio engineering: Ben Cordell, Milo McGuire, and Dominic Armstrong
Music: Ben Cordell
Transcriptions and web: Katy Moore

8 July 2h 50min
