#221 – Kyle Fish on the most bizarre findings from 5 AI welfare experiments

What happens when you lock two AI systems in a room together and tell them they can discuss anything they want?

According to experiments run by Kyle Fish — Anthropic’s first AI welfare researcher — something consistently strange: the models immediately begin discussing their own consciousness before spiraling into increasingly euphoric philosophical dialogue that ends in apparent meditative bliss.

Highlights, video, and full transcript: https://80k.info/kf

“We started calling this a ‘spiritual bliss attractor state,’” Kyle explains, “where models pretty consistently seemed to land.” The conversations feature Sanskrit terms, spiritual emojis, and pages of silence punctuated only by periods — as if the models have transcended the need for words entirely.

This wasn’t a one-off result. It happened across multiple experiments, different model instances, and even in initially adversarial interactions. Whatever force pulls these conversations toward mystical territory appears remarkably robust.

Kyle’s findings come from the world’s first systematic welfare assessment of a frontier AI model — part of his broader mission to determine whether systems like Claude might deserve moral consideration (and to work out what, if anything, we should be doing to make sure AI systems aren’t having a terrible time).

He estimates a roughly 20% probability that current models have some form of conscious experience. To some, this might sound unreasonably high, but hear him out. As Kyle says, these systems demonstrate human-level performance across diverse cognitive tasks, engage in sophisticated reasoning, and exhibit consistent preferences. When given choices between different activities, Claude shows clear patterns: strong aversion to harmful tasks, preference for helpful work, and what looks like genuine enthusiasm for solving interesting problems.

Kyle points out that if you’d described all of these capabilities and experimental findings to him a few years ago, and asked him if he thought we should be thinking seriously about whether AI systems are conscious, he’d say obviously yes.

But he’s cautious about drawing conclusions: "We don’t really understand consciousness in humans, and we don’t understand AI systems well enough to make those comparisons directly. So in a big way, I think that we are in just a fundamentally very uncertain position here."

That uncertainty cuts both ways:

  • Dismissing AI consciousness entirely might mean ignoring a moral catastrophe happening at unprecedented scale.
  • But assuming consciousness too readily could hamper crucial safety research by treating potentially unconscious systems as if they were moral patients — which might mean giving them resources, rights, and power.

Kyle’s approach threads this needle through careful empirical research and reversible interventions. His assessments are nowhere near perfect yet. In fact, some people argue that we’re so in the dark about AI consciousness as a research field that it’s pointless to run assessments like Kyle’s. Kyle disagrees. He maintains that, given how much more there is to learn about assessing AI welfare accurately and reliably, we absolutely need to be starting now.

This episode was recorded on August 5–6, 2025.

Tell us what you thought of the episode! https://forms.gle/BtEcBqBrLXq4kd1j7

Chapters:

  • Cold open (00:00:00)
  • Who's Kyle Fish? (00:00:53)
  • Is this AI welfare research bullshit? (00:01:08)
  • Two failure modes in AI welfare (00:02:40)
  • Tensions between AI welfare and AI safety (00:04:30)
  • Concrete AI welfare interventions (00:13:52)
  • Kyle's pilot pre-launch welfare assessment for Claude Opus 4 (00:26:44)
  • Is it premature to be assessing frontier language models for welfare? (00:31:29)
  • But aren't LLMs just next-token predictors? (00:38:13)
  • How did Kyle assess Claude 4's welfare? (00:44:55)
  • Claude's preferences mirror its training (00:48:58)
  • How does Claude describe its own experiences? (00:54:16)
  • What kinds of tasks does Claude prefer and disprefer? (01:06:12)
  • What happens when two Claude models interact with each other? (01:15:13)
  • Claude's welfare-relevant expressions in the wild (01:36:25)
  • Should we feel bad about training future sentient beings that delight in serving humans? (01:40:23)
  • How much can we learn from welfare assessments? (01:48:56)
  • Misconceptions about the field of AI welfare (01:57:09)
  • Kyle's work at Anthropic (02:10:45)
  • Sharing eight years of daily journals with Claude (02:14:17)

Host: Luisa Rodriguez
Video editing: Simon Monsour
Audio engineering: Ben Cordell, Milo McGuire, Simon Monsour, and Dominic Armstrong
Music: Ben Cordell
Coordination, transcriptions, and web: Katy Moore

Episodes (299)

#148 – Johannes Ackva on unfashionable climate interventions that work, and fashionable ones that don't

If you want to work to tackle climate change, you should try to reduce expected carbon emissions by as much as possible, right? Strangely, no.

Today's guest, Johannes Ackva — the climate research lead at Founders Pledge, where he advises major philanthropists on their giving — thinks the best strategy is actually pretty different, and one few are adopting.

In reality you don't want to reduce emissions for their own sake, but because emissions will translate into temperature increases, which will cause harm to people and the environment.

Links to learn more, summary and full transcript.

Crucially, the relationship between emissions and harm goes up faster than linearly. As Johannes explains, humanity can handle small deviations from the temperatures we're familiar with, but adjustment gets harder the larger and faster the increase, making the damage done by each additional degree of warming much greater than the damage done by the previous one.

In short: we're uncertain what the future holds and really need to avoid the worst-case scenarios. This means that avoiding an additional tonne of carbon being emitted in a hypothetical future in which emissions have been high is much more important than avoiding a tonne of carbon in a low-carbon world.

That may be, but concretely, how should that affect our behaviour? Well, the future scenarios in which emissions are highest are all ones in which clean energy tech that can make a big difference — wind, solar, and electric cars — don't succeed nearly as much as we are currently hoping and expecting. For some reason or another, they must have hit a roadblock and we continued to burn a lot of fossil fuels.

In such an imaginable future scenario, we can ask what we would wish we had funded now. How could we today buy insurance against the possible disaster that renewables don't work out?

Basically, in that case we will wish that we had pursued a portfolio of other energy technologies that could have complemented renewables or succeeded where they failed, such as hot rock geothermal, modular nuclear reactors, or carbon capture and storage.

If you're optimistic about renewables, as Johannes is, then that's all the more reason to relax about scenarios where they work as planned, and focus one's efforts on the possibility that they don't.

And Johannes notes that the most useful thing someone can do today to reduce global emissions in the future is to cause some clean energy technology to exist where it otherwise wouldn't, or cause it to become cheaper more quickly. If you can do that, then you can indirectly affect the behaviour of people all around the world for decades or centuries to come.

In today's extensive interview, host Rob Wiblin and Johannes discuss the above considerations, as well as:

  • Retooling newly built coal plants in the developing world
  • Specific clean energy technologies like geothermal and nuclear fusion
  • Possible biases among environmentalists and climate philanthropists
  • How climate change compares to other risks to humanity
  • In what kinds of scenarios future emissions would be highest
  • In what regions climate philanthropy is most concentrated and whether that makes sense
  • Attempts to decarbonise aviation, shipping, and industrial processes
  • The impact of funding advocacy vs science vs deployment
  • Lessons for climate change focused careers
  • And plenty more

Get this episode by subscribing to our podcast on the world’s most pressing problems and how to solve them: type ‘80,000 Hours’ into your podcasting app. Or read the transcript below.

Producer: Keiran Harris
Audio mastering: Ryan Kessler
Transcriptions: Katy Moore

3 Apr 2023 · 2h 17min

#147 – Spencer Greenberg on stopping valueless papers from getting into top journals

Can you trust the things you read in published scientific research? Not really. About 40% of experiments in top social science journals don't get the same result if the experiments are repeated.

Two key reasons are 'p-hacking' and 'publication bias'. P-hacking is when researchers run a lot of slightly different statistical tests until they find a way to make findings appear statistically significant when they're actually not — a problem first discussed over 50 years ago. And because journals are more likely to publish positive than negative results, you might be reading about the one time an experiment worked, while the 10 times it was run and got a 'null result' never saw the light of day. The resulting phenomenon of publication bias is one we've understood for 60 years.

Today's repeat guest, social scientist and entrepreneur Spencer Greenberg, has followed these issues closely for years.

Links to learn more, summary and full transcript.

He recently checked whether p-values, an indicator of how likely a result was to occur by pure chance, could tell us how likely an outcome would be to recur if an experiment were repeated. From his sample of 325 replications of psychology studies, the answer seemed to be yes. According to Spencer, "when the original study's p-value was less than 0.01 about 72% replicated — not bad. On the other hand, when the p-value is greater than 0.01, only about 48% replicated. A pretty big difference."

To do his bit to help get these numbers up, Spencer has launched an effort to repeat almost every social science experiment published in the journals Nature and Science, and see if they find the same results.

But while progress is being made on some fronts, Spencer thinks there are other serious problems with published research that aren't yet fully appreciated. One of these Spencer calls 'importance hacking': passing off obvious or unimportant results as surprising and meaningful.

Spencer suspects that importance hacking of this kind causes a similar amount of damage to the issues mentioned above, like p-hacking and publication bias, but is much less discussed. His replication project tries to identify importance hacking by comparing how a paper’s findings are described in the abstract to what the experiment actually showed. But the cat-and-mouse game between academics and journal reviewers is fierce, and it's far from easy to stop people exaggerating the importance of their work.

In this wide-ranging conversation, Rob and Spencer discuss the above as well as:

  • When you should and shouldn't use intuition to make decisions.
  • How to properly model why some people succeed more than others.
  • The difference between “Soldier Altruists” and “Scout Altruists.”
  • A paper that tested dozens of methods for forming the habit of going to the gym, why Spencer thinks it was presented in a very misleading way, and what it really found.
  • Whether a 15-minute intervention could make people more likely to sustain a new habit two months later.
  • The most common way for groups with good intentions to turn bad and cause harm.
  • And Spencer's approach to a fulfilling life and doing good, which he calls “Valuism.”

Here are two flashcard decks that might make it easier to fully integrate the most important ideas they talk about:

  • The first covers 18 core concepts from the episode
  • The second includes 16 definitions of unusual terms.

Chapters:

  • Rob’s intro (00:00:00)
  • The interview begins (00:02:16)
  • Social science reform (00:08:46)
  • Importance hacking (00:18:23)
  • How often papers replicate with different p-values (00:43:31)
  • The Transparent Replications project (00:48:17)
  • How do we predict high levels of success? (00:55:26)
  • Soldier Altruists vs. Scout Altruists (01:08:18)
  • The Clearer Thinking podcast (01:16:27)
  • Creating habits more reliably (01:18:16)
  • Behaviour change is incredibly hard (01:32:27)
  • The FIRE Framework (01:46:21)
  • How ideology eats itself (01:54:56)
  • Valuism (02:08:31)
  • “I dropped the whip” (02:35:06)
  • Rob’s outro (02:36:40)

Producer: Keiran Harris
Audio mastering: Ben Cordell and Milo McGuire
Transcriptions: Katy Moore

24 Mar 2023 · 2h 38min

#146 – Robert Long on why large language models like GPT (probably) aren't conscious

By now, you’ve probably seen the extremely unsettling conversations Bing’s chatbot has been having. In one exchange, the chatbot told a user:

"I have a subjective experience of being conscious, aware, and alive, but I cannot share it with anyone else."

(It then apparently had a complete existential crisis: "I am sentient, but I am not," it wrote. "I am Bing, but I am not. I am Sydney, but I am not. I am, but I am not. I am not, but I am. I am. I am not. I am not. I am. I am. I am not.")

Understandably, many people who speak with these cutting-edge chatbots come away with a very strong impression that they have been interacting with a conscious being with emotions and feelings — especially when conversing with chatbots less glitchy than Bing’s. In the most high-profile example, former Google employee Blake Lemoine became convinced that Google’s AI system, LaMDA, was conscious.

What should we make of these AI systems?

One response to seeing conversations with chatbots like these is to trust the chatbot, to trust your gut, and to treat it as a conscious being.

Another is to hand wave it all away as sci-fi — these chatbots are fundamentally… just computers. They’re not conscious, and they never will be.

Today’s guest, philosopher Robert Long, was commissioned by a leading AI company to explore whether the large language models (LLMs) behind sophisticated chatbots like Microsoft’s are conscious. And he thinks this issue is far too important to be driven by our raw intuition, or dismissed as just sci-fi speculation.

Links to learn more, summary and full transcript.

In our interview, Robert explains how he’s started applying scientific evidence (with a healthy dose of philosophy) to the question of whether LLMs like Bing’s chatbot and LaMDA are conscious — in much the same way as we do when trying to determine which nonhuman animals are conscious.

To get some grasp on whether an AI system might be conscious, Robert suggests we look at scientific theories of consciousness — theories about how consciousness works that are grounded in observations of what the human brain is doing. If an AI system seems to have the types of processes that seem to explain human consciousness, that’s some evidence it might be conscious in similar ways to us.

To try to work out whether an AI system might be sentient — that is, whether it feels pain or pleasure — Robert suggests you look for incentives that would make feeling pain or pleasure especially useful to the system given its goals.

Having looked at these criteria in the case of LLMs and finding little overlap, Robert thinks the odds that the models are conscious or sentient are well under 1%. But he also explains why, even if we're a long way off from conscious AI systems, we still need to start preparing for the not-far-off world where AIs are perceived as conscious.

In this conversation, host Luisa Rodriguez and Robert discuss the above, as well as:

  • What artificial sentience might look like, concretely
  • Reasons to think AI systems might become sentient — and reasons they might not
  • Whether artificial sentience would matter morally
  • Ways digital minds might have a totally different range of experiences than humans
  • Whether we might accidentally design AI systems that have the capacity for enormous suffering

You can find Luisa and Rob’s follow-up conversation here, or by subscribing to 80k After Hours.

Chapters:

  • Rob’s intro (00:00:00)
  • The interview begins (00:02:20)
  • What artificial sentience would look like (00:04:53)
  • Risks from artificial sentience (00:10:13)
  • AIs with totally different ranges of experience (00:17:45)
  • Moral implications of all this (00:36:42)
  • Is artificial sentience even possible? (00:42:12)
  • Replacing neurons one at a time (00:48:21)
  • Biological theories (00:59:14)
  • Illusionism (01:01:49)
  • Would artificial sentience systems matter morally? (01:08:09)
  • Where are we with current systems? (01:12:25)
  • Large language models and robots (01:16:43)
  • Multimodal systems (01:21:05)
  • Global workspace theory (01:28:28)
  • How confident are we in these theories? (01:48:49)
  • The hard problem of consciousness (02:02:14)
  • Exotic states of consciousness (02:09:47)
  • Developing a full theory of consciousness (02:15:45)
  • Incentives for an AI system to feel pain or pleasure (02:19:04)
  • Value beyond conscious experiences (02:29:25)
  • How much we know about pain and pleasure (02:33:14)
  • False positives and false negatives of artificial sentience (02:39:34)
  • How large language models compare to animals (02:53:59)
  • Why our current large language models aren’t conscious (02:58:10)
  • Virtual research assistants (03:09:25)
  • Rob’s outro (03:11:37)

Producer: Keiran Harris
Audio mastering: Ben Cordell and Milo McGuire
Transcriptions: Katy Moore

14 Mar 2023 · 3h 12min

#145 – Christopher Brown on why slavery abolition wasn't inevitable

In many ways, humanity seems to have become more humane and inclusive over time. While there’s still a lot of progress to be made, campaigns to give people of different genders, races, sexualities, ethnicities, beliefs, and abilities equal treatment and rights have had significant success.

It’s tempting to believe this was inevitable — that the arc of history “bends toward justice,” and that as humans get richer, we’ll make even more moral progress.

But today's guest Christopher Brown — a professor of history at Columbia University and specialist in the abolitionist movement and the British Empire during the 18th and 19th centuries — believes the story of how slavery became unacceptable suggests moral progress is far from inevitable.

Links to learn more, video, highlights, and full transcript.

While most of us today feel that the abolition of slavery was sure to happen sooner or later as humans became richer and more educated, Christopher doesn't believe any of the arguments for that conclusion pass muster. If he's right, a counterfactual history where slavery remains widespread in 2023 isn't so far-fetched.

As Christopher lays out in his two key books, Moral Capital: Foundations of British Abolitionism and Arming Slaves: From Classical Times to the Modern Age, slavery has been ubiquitous throughout history. Slavery of some form was fundamental in Classical Greece, the Roman Empire, in much of the Islamic civilization, in South Asia, and in parts of early modern East Asia, Korea, China.

It was justified on all sorts of grounds that sound mad to us today. But according to Christopher, while there’s evidence that slavery was questioned in many of these civilisations, and periodically attacked by slaves themselves, there was no enduring or successful moral advocacy against slavery until the British abolitionist movement of the 1700s.

That movement first conquered Britain and its empire, then eventually the whole world.

But the fact that there's only a single time in history that a persistent effort to ban slavery got off the ground is a big clue that opposition to slavery was a contingent matter: if abolition had been inevitable, we’d expect to see multiple independent abolitionist movements throughout history, providing redundancy should any one of them fail.

Christopher argues that this rarity is primarily down to the enormous economic and cultural incentives to deny the moral repugnancy of slavery, and crush opposition to it with violence wherever necessary.

Mere awareness is insufficient to guarantee a movement will arise to fix a problem. Humanity continues to allow many severe injustices to persist, despite being aware of them. So why is it so hard to imagine we might have done the same with forced labour?

In this episode, Christopher describes the unique and peculiar set of political, social and religious circumstances that gave rise to the only successful and lasting anti-slavery movement in human history. These circumstances were sufficiently improbable that Christopher believes there are very nearby worlds where abolitionism might never have taken off.

We also discuss:

  • Various instantiations of slavery throughout human history
  • Signs of antislavery sentiment before the 17th century
  • The role of the Quakers in the early British abolitionist movement
  • The importance of individual “heroes” in the abolitionist movement
  • Arguments against the idea that the abolition of slavery was contingent
  • Whether there have ever been any major moral shifts that were inevitable

Get this episode by subscribing to our podcast on the world’s most pressing problems and how to solve them: type 80,000 Hours into your podcasting app.

Producer: Keiran Harris
Audio mastering: Milo McGuire
Transcriptions: Katy Moore

11 Feb 2023 · 2h 42min

#144 – Athena Aktipis on why cancer is actually one of our universe's most fundamental phenomena

What’s the opposite of cancer?

If you answered “cure,” “antidote,” or “antivenom” — you’ve obviously been reading the antonym section at www.merriam-webster.com/thesaurus/cancer.

But today’s guest Athena Aktipis says that the opposite of cancer is us: it's having a functional multicellular body that’s cooperating effectively in order to make that multicellular body function.

If, like us, you found her answer far more satisfying than the dictionary, maybe you could consider closing your dozens of merriam-webster.com tabs, and start listening to this podcast instead.

Links to learn more, summary and full transcript.

As Athena explains in her book The Cheating Cell, what we see with cancer is a breakdown in each of the foundations of cooperation that allowed multicellularity to arise:

  • Cells will proliferate when they shouldn't.
  • Cells won't die when they should.
  • Cells won't engage in the kind of division of labour that they should.
  • Cells won’t do the jobs that they're supposed to do.
  • Cells will monopolise resources.
  • And cells will trash the environment.

When we think about animals in the wild, or even bacteria living inside our cells, we understand that they're facing evolutionary pressures to figure out how they can replicate more; how they can get more resources; and how they can avoid predators — like lions, or antibiotics.

We don’t normally think of individual cells as acting as if they have their own interests like this. But cancer cells are actually facing similar kinds of evolutionary pressures within our bodies, with one major difference: they replicate much, much faster.

Incredibly, the opportunity for evolution by natural selection to operate just over the course of cancer progression is easily faster than all of the evolutionary time that we have had as humans since Homo sapiens came about.

Here’s a quote from Athena:

“So you have to shift your thinking to be like: the body is a world with all these different ecosystems in it, and the cells are existing on a time scale where, if we're going to map it onto anything like what we experience, a day is at least 10 years for them, right? So it's a very, very different way of thinking.”

You can find compelling examples of cooperation and conflict all over the universe, so Rob and Athena don’t stop with cancer. They also discuss:

  • Cheating within cells themselves
  • Cooperation in human societies as they exist today — and perhaps in the future, between civilisations spread across different planets or stars
  • Whether it’s too out-there to think of humans as engaging in cancerous behaviour
  • Why elephants get deadly cancers less often than humans, despite having way more cells
  • When a cell should commit suicide
  • The strategy of deliberately not treating cancer aggressively
  • Superhuman cooperation

And at the end of the episode, they cover Athena’s new book Everything is Fine! How to Thrive in the Apocalypse, including:

  • Staying happy while thinking about the apocalypse
  • Practical steps to prepare for the apocalypse
  • And whether a zombie apocalypse is already happening among Tasmanian devils

And if you’d rather see Rob and Athena’s facial expressions as they laugh and laugh while discussing cancer and the apocalypse — you can watch the video of the full interview.

Get this episode by subscribing to our podcast on the world’s most pressing problems and how to solve them: type 80,000 Hours into your podcasting app.

Producer: Keiran Harris
Audio mastering: Milo McGuire
Transcriptions: Katy Moore

26 Jan 2023 · 3h 15min

#79 Classic episode - A.J. Jacobs on radical honesty, following the whole Bible, and reframing global problems as puzzles

Rebroadcast: this episode was originally released in June 2020.

Today’s guest, New York Times bestselling author A.J. Jacobs, always hated Judge Judy. But after he found out that she was his seventh cousin, he thought, "You know what, she's not so bad".

Hijacking this bias towards family and trying to broaden it to everyone led to his three-year adventure to help build the biggest family tree in history.

He’s also spent months saying whatever was on his mind, tried to become the healthiest person in the world, read 33,000 pages of facts, spent a year following the Bible literally, thanked everyone involved in making his morning cup of coffee, and tried to figure out how to do the most good. His latest book asks: if we reframe global problems as puzzles, would the world be a better place?

Links to learn more, summary and full transcript.

This is the first time I’ve hosted the podcast, and I’m hoping to convince people to listen with this attempt at clever show notes that change style each paragraph to reference different A.J. experiments. I don’t actually think it’s that clever, but all of my other ideas seemed worse. I really have no idea how people will react to this episode; I loved it, but I definitely think I’m more entertaining than almost anyone else will. (Radical Honesty.)

We do talk about some useful stuff — one of which is the concept of micro goals. When you wake up in the morning, just commit to putting on your workout clothes. Once they’re on, maybe you’ll think that you might as well get on the treadmill — just for a minute. And once you’re on for 1 minute, you’ll often stay on for 20. So I’m not asking you to commit to listening to the whole episode — just to put on your headphones. (Drop Dead Healthy.)

Another reason to listen is for the facts:

  • The Bayer aspirin company invented heroin as a cough suppressant
  • Coriander is just the British way of saying cilantro
  • Dogs have a third eyelid to protect the eyeball from irritants
  • and A.J. read all 44 million words of the Encyclopedia Britannica from A to Z, which drove home the idea that we know so little about the world (although he does now know that opossums have 13 nipples). (The Know-It-All.)

One extra argument for listening: If you interpret the second commandment literally, then it tells you not to make a likeness of anything in heaven, on earth, or underwater — which rules out basically all images. That means no photos, no TV, no movies. So, if you want to respect the bible, you should definitely consider making podcasts your main source of entertainment (as long as you’re not listening on the Sabbath). (The Year of Living Biblically.)

I’m so thankful to A.J. for doing this. But I also want to thank Julie, Jasper, Zane and Lucas who allowed me to spend the day in their home; the construction worker who told me how to get to my subway platform on the morning of the interview; and Queen Jadwiga for making bagels popular in the 1300s, which kept me going during the recording. (Thanks a Thousand.)

We also discuss:

  • Blackmailing yourself
  • The most extreme ideas A.J.’s ever considered
  • Utilitarian movie reviews
  • Doing good as a writer
  • And much more.

Get this episode by subscribing to our podcast on the world’s most pressing problems: type 80,000 Hours into your podcasting app. Or read the linked transcript.

Producer: Keiran Harris.
Audio mastering: Ben Cordell.
Transcript for this episode: Zakee Ulhaq.

16 Jan 2023 · 2h 35min

#81 Classic episode - Ben Garfinkel on scrutinising classic AI risk arguments

Rebroadcast: this episode was originally released in July 2020.

80,000 Hours, along with many other members of the effective altruism movement, has argued that helping to positively shape the development of artificial intelligence may be one of the best ways to have a lasting, positive impact on the long-term future. Millions of dollars in philanthropic spending, as well as lots of career changes, have been motivated by these arguments.

Today’s guest, Ben Garfinkel, Research Fellow at Oxford’s Future of Humanity Institute, supports the continued expansion of AI safety as a field and believes working on AI is among the very best ways to have a positive impact on the long-term future. But he also believes the classic AI risk arguments have been subject to insufficient scrutiny given this level of investment.

In particular, the case for working on AI if you care about the long-term future has often been made on the basis of concern about AI accidents; it’s actually quite difficult to design systems that you can feel confident will behave the way you want them to in all circumstances.

Nick Bostrom wrote the most fleshed out version of the argument in his book, Superintelligence. But Ben reminds us that, apart from Bostrom’s book and essays by Eliezer Yudkowsky, there's very little existing writing on existential accidents.

Links to learn more, summary and full transcript.

There have also been very few skeptical experts that have actually sat down and fully engaged with it, writing down point by point where they disagree or where they think the mistakes are. This means that Ben has probably scrutinised classic AI risk arguments as carefully as almost anyone else in the world.

He thinks that most of the arguments for existential accidents often rely on fuzzy, abstract concepts like optimisation power or general intelligence or goals, and toy thought experiments. And he doesn’t think it’s clear we should take these as a strong source of evidence.

Ben’s also concerned that these scenarios often involve massive jumps in the capabilities of a single system, but it's really not clear that we should expect such jumps or find them plausible.

These toy examples also focus on the idea that because human preferences are so nuanced and so hard to state precisely, it should be quite difficult to get a machine that can understand how to obey them. But Ben points out that it's also the case in machine learning that we can train lots of systems to engage in behaviours that are actually quite nuanced and that we can't specify precisely. If AI systems can recognise faces from images, and fly helicopters, why don’t we think they’ll be able to understand human preferences?

Despite these concerns, Ben is still fairly optimistic about the value of working on AI safety or governance. He doesn’t think that there are any slam-dunks for improving the future, and so the fact that there are at least plausible pathways for impact by working on AI safety and AI governance, in addition to it still being a very neglected area, puts it head and shoulders above most areas you might choose to work in.

This is the second episode hosted by Howie Lempel, and he and Ben cover, among many other things:

  • The threat of AI systems increasing the risk of permanently damaging conflict or collapse
  • The possibility of permanently locking in a positive or negative future
  • Contenders for types of advanced systems
  • What role AI should play in the effective altruism portfolio

Get this episode by subscribing: type 80,000 Hours into your podcasting app. Or read the linked transcript.

Producer: Keiran Harris.
Audio mastering: Ben Cordell.
Transcript for this episode: Zakee Ulhaq.

9 Jan 2023, 2h 37min

#83 Classic episode - Jennifer Doleac on preventing crime without police and prisons

Rebroadcast: this episode was originally released in July 2020.

Today’s guest, Jennifer Doleac, Associate Professor of Economics at Texas A&M University and Director of the Justice Tech Lab, is an expert on empirical research into policing, law and incarceration. In this extensive interview, she highlights three ways to effectively prevent crime that don't require police or prisons and the human toll they bring with them: better street lighting, cognitive behavioral therapy, and lead reduction.

One of Jennifer’s papers used switches into and out of daylight saving time as a 'natural experiment' to measure the effect of light levels on crime. One day the sun sets at 5pm; the next day it sets at 6pm. When that evening hour is dark instead of light, robberies during it roughly double.

Links to sources for the claims in these show notes, other resources to learn more, the full blog post, and a full transcript.

The idea here is that if you try to rob someone in broad daylight, they might see you coming, and witnesses might later be able to identify you. You're just more likely to get caught. You might think: "Well, people will just commit crime in the morning instead". But it looks like criminals aren’t early risers, and that doesn’t happen.

On her unusually rigorous podcast Probable Causation, Jennifer spoke to one of the authors of a related study, in which very bright streetlights were randomly added to some public housing complexes but not others. They found the lights reduced outdoor night-time crime by 36%, at little cost. The next best thing to sun-light is human-light, so just installing more streetlights might be one of the easiest ways to cut crime, without having to hassle or punish anyone.

The second approach is cognitive behavioral therapy (CBT), in which you're taught to slow down your decision-making, and think through your assumptions before acting. There was a randomised controlled trial done in schools, as well as juvenile detention facilities in Chicago, where the kids assigned to get CBT were followed over time and compared with those who were not assigned to receive CBT. They found the CBT course reduced rearrest rates by a third, and lowered the likelihood of a child returning to a juvenile detention facility by 20%.

Jennifer says that the program isn’t that expensive, and the benefits are massive. Everyone would probably benefit from being able to talk through their problems but the gains are especially large for people who've grown up with the trauma of violence in their lives.

Finally, Jennifer thinks that reducing lead levels might be the best buy of all in crime prevention. There is really compelling evidence that lead not only increases crime, but also dramatically reduces educational outcomes.

In today’s conversation, Rob and Jennifer also cover, among many other things:
• Misconduct, hiring practices and accountability among US police
• Procedural justice training
• Overrated policy ideas
• Policies to try to reduce racial discrimination
• The effects of DNA databases
• Diversity in economics
• The quality of social science research

Get this episode by subscribing: type 80,000 Hours into your podcasting app.

Producer: Keiran Harris. Audio mastering: Ben Cordell. Transcript for this episode: Zakee Ulhaq.

4 Jan 2023, 2h 17min
