#159 – Jan Leike on OpenAI's massive push to make superintelligence safe in 4 years or less

In July, OpenAI announced a new team and project: Superalignment. The goal is to figure out how to make superintelligent AI systems aligned and safe to use within four years, and the lab is putting a massive 20% of its computational resources behind the effort.

Today's guest, Jan Leike, is Head of Alignment at OpenAI and will be co-leading the project. As OpenAI puts it, "...the vast power of superintelligence could be very dangerous, and lead to the disempowerment of humanity or even human extinction. ... Currently, we don't have a solution for steering or controlling a potentially superintelligent AI, and preventing it from going rogue."

Links to learn more, summary and full transcript.

Given that OpenAI is in the business of developing superintelligent AI, it sees that as a scary problem that urgently has to be fixed. So it’s not just throwing compute at the problem -- it’s also hiring dozens of scientists and engineers to build out the Superalignment team.

Plenty of people are pessimistic that this can be done at all, let alone in four years. But Jan is guardedly optimistic. As he explains:

Honestly, it really feels like we have a real angle of attack on the problem that we can actually iterate on... and I think it's pretty likely going to work, actually. And that's really, really wild, and it's really exciting. It's like we have this hard problem that we've been talking about for years and years and years, and now we have a real shot at actually solving it. And that'd be so good if we did.


Jan thinks that this work is actually the most scientifically interesting part of machine learning. Rather than just throwing more chips and more data at a training run, this work requires actually understanding how these models work and how they think. The answers are likely to be breakthroughs on the level of solving the mysteries of the human brain.

The plan, in a nutshell, is to get AI to help us solve alignment. That might sound a bit crazy -- as one person described it, “like using one fire to put out another fire.”

But Jan’s thinking is this: the core problem is that AI capabilities will keep getting better and the challenge of monitoring cutting-edge models will keep getting harder, while human intelligence stays more or less the same. To have any hope of ensuring safety, we need our ability to monitor, understand, and design ML models to advance at the same pace as the complexity of the models themselves.

And there's an obvious way to do that: get AI to do most of the work, such that the sophistication of the AIs that need aligning, and the sophistication of the AIs doing the aligning, advance in lockstep.

Jan doesn't want to produce machine learning models capable of doing ML research. But such models are coming, whether we like it or not. And at that point Jan wants to make sure we turn them towards useful alignment and safety work, as much or more than we use them to advance AI capabilities.

Jan thinks it's so crazy it just might work. But some critics think it's simply crazy. They ask a wide range of difficult questions, including:

  • If you don't know how to solve alignment, how can you tell that your alignment assistant AIs are actually acting in your interest rather than working against you? Especially as they could just be pretending to care about what you care about.
  • How do you know that these technical problems can be solved at all, even in principle?
  • At the point that models are able to help with alignment, won't they also be so good at improving capabilities that we're in the middle of an explosion in what AI can do?


In today's interview, host Rob Wiblin puts these doubts to Jan to hear how he responds to each, and they also cover:

  • OpenAI's current plans to achieve 'superalignment' and the reasoning behind them
  • Why alignment work is the most fundamental and scientifically interesting research in ML
  • The kinds of people he’s excited to hire to join his team and maybe save the world
  • What most readers misunderstood about the OpenAI announcement
  • The three ways Jan expects AI to help solve alignment: mechanistic interpretability, generalization, and scalable oversight
  • What the standard should be for confirming whether Jan's team has succeeded
  • Whether OpenAI should (or will) commit to stop training more powerful general models if they don't think the alignment problem has been solved
  • Whether Jan thinks OpenAI has deployed models too quickly or too slowly
  • The many other actors who also have to do their jobs really well if we're going to have a good AI future
  • Plenty more


Get this episode by subscribing to our podcast on the world’s most pressing problems and how to solve them: type ‘80,000 Hours’ into your podcasting app. Or read the transcript.

Producer and editor: Keiran Harris
Audio Engineering Lead: Ben Cordell
Technical editing: Simon Monsour and Milo McGuire
Additional content editing: Katy Moore and Luisa Rodriguez
Transcriptions: Katy Moore

Episodes (294)

#13 - Claire Walsh on testing which policies work & how to get governments to listen to the results

In both rich and poor countries, government policy is often based on no evidence at all and many programs don’t work. This has particularly harsh effects on the global poor - in some countries governments only spend $100 on each citizen a year, so they can’t afford to waste a single dollar.

Enter MIT’s Poverty Action Lab (J-PAL). Since 2003 they’ve conducted experiments to figure out what policies actually help recipients, and then tried to get them implemented by governments and non-profits. Claire Walsh leads J-PAL’s Government Partnership Initiative, which works to evaluate policies and programs in collaboration with developing world governments, scale policies that have been shown to work, and generally promote a culture of evidence-based policymaking.

Summary, links to career opportunities and topics discussed in the show.

We discussed (her views only, not J-PAL’s):

  • How can they get evidence-backed policies adopted? Do politicians in the developing world even care whether their programs actually work? Is the norm evidence-based policy, or policy-based evidence?
  • Is evidence-based policy an evidence-based strategy itself?
  • Which policies does she think would have a particularly large impact on human welfare relative to their cost?
  • How did she come to lead one of J-PAL’s departments at 29?
  • How do you evaluate the effectiveness of energy and environment programs (Walsh’s area of expertise), and what are the standout approaches in that area?
  • 80,000 Hours has warned people about the downsides of starting your career in a non-profit. Walsh started her career in a non-profit and has thrived, so are we making a mistake?
  • Other than J-PAL, what are the best places to work in development? What are the best subjects to study? Where can you go to network to break into the sector?
  • Is living in poverty as bad as we think?

And plenty of other things besides. We haven’t run an RCT to test whether this episode will actually help your career, but I suggest you listen anyway. Trust my intuition on this one.

31 Oct 2017 · 52min

#12 - Beth Cameron works to stop you dying in a pandemic. Here’s what keeps her up at night.

“When you're in the middle of a crisis and you have to ask for money, you're already too late.”

That’s Dr Beth Cameron, who leads Global Biological Policy and Programs at the Nuclear Threat Initiative. Beth should know. She has years of experience preparing for and fighting the diseases of our nightmares, on the White House Ebola Taskforce, in the National Security Council staff, and as the Assistant Secretary of Defense for Nuclear, Chemical and Biological Defense Programs.

Summary, list of career opportunities, extra links to learn more and coaching application.

Unfortunately, the countries of the world aren’t prepared for a crisis - and like children crowded into daycare, there’s a good chance something will make us all sick at once. During past pandemics countries have dragged their feet over who will pay to contain them, or struggled to move people and supplies where they needed to be. At the same time advanced biotechnology threatens to make it possible for terrorists to bring back smallpox - or create something even worse.

In this interview we look at the current state of play in disease control, what needs to change, and how you can build the career capital necessary to make those changes yourself. That includes:

  • What and where to study, and where to begin a career in pandemic preparedness. Below you’ll find a lengthy list of people and places mentioned in the interview, and others we’ve had recommended to us.
  • How the Nuclear Threat Initiative, with just 50 people, collaborates with governments around the world to reduce the risk of nuclear or biological catastrophes, and whether they might want to hire you.
  • The best strategy for containing pandemics.
  • Why we lurch from panic, to neglect, to panic again when it comes to protecting ourselves from contagious diseases.
  • Current reform efforts within the World Health Organisation, and attempts to prepare partial vaccines ahead of time.
  • Which global health security groups most impress Beth, and what they’re doing.
  • What new technologies could be invented to make us safer.
  • Whether it’s possible to help solve the problem through mass advocacy.
  • Much more besides.

Get free, one-on-one career advice to improve biosecurity

Considering a relevant grad program like a biology PhD, medicine, or security studies? Able to apply for a relevant job already? We’ve helped dozens of people plan their careers to work on pandemic preparedness and put them in touch with mentors. If you want to work on the problem discussed in this episode, you should apply for coaching: Read more

25 Oct 2017 · 1h 45min

#11 - Spencer Greenberg on speeding up social science 10-fold & why plenty of startups cause harm

Do most meat eaters think it’s wrong to hurt animals? Do Americans think climate change is likely to cause human extinction? What is the best, state-of-the-art therapy for depression? How can we make academics more intellectually honest, so we can actually trust their findings? How can we speed up social science research ten-fold? Do most startups improve the world, or make it worse? If you’re interested in these questions, this interview is for you.

Click for a full transcript, links discussed in the show, etc.

A scientist, entrepreneur, writer and mathematician, Spencer Greenberg is constantly working to create tools to speed up and improve research and critical thinking. These include:

  • Rapid public opinion surveys to find out what most people actually think about animal consciousness, farm animal welfare, the impact of developing world charities and the likelihood of extinction by various different means;
  • Tools to enable social science research to be run en masse very cheaply;
  • ClearerThinking.org, a highly popular site for improving people’s judgement and decision-making;
  • Ways to transform data analysis methods to ensure that papers only show true findings;
  • Innovative research methods;
  • Ways to decide which research projects are actually worth pursuing.

In this interview, Spencer discusses all of these and more. If you don’t feel like listening, that just shows that you have poor judgement and need to benefit from his wisdom even more!

Get free, one-on-one career advice

We’ve helped hundreds of people compare their options, get introductions, and find high impact jobs. If you want to work on any of the problems discussed in this episode, find out if our coaching can help you.

17 Oct 2017 · 1h 29min

#10 - Nick Beckstead on how to spend billions of dollars preventing human extinction

What if you were in a position to give away billions of dollars to improve the world? What would you do with it? This is the problem facing Program Officers at the Open Philanthropy Project - people like Dr Nick Beckstead.

Following a PhD in philosophy, Nick works to figure out where money can do the most good. He’s been involved in major grants in a wide range of areas, including ending factory farming through technological innovation, safeguarding the world from advances in biotechnology and artificial intelligence, and spreading rational compassion.

Full transcript, coaching application form, overview of the conversation, and links to resources discussed in the episode:

This episode is a tour through some of the toughest questions ‘effective altruists’ face when figuring out how to best improve the world, including:

  • Should we mostly try to help people currently alive, or future generations? Nick studied this question for years in his PhD thesis, On the Overwhelming Importance of Shaping the Far Future. (The first 31 minutes is a snappier version of my conversation with Toby Ord.)
  • Is clean meat (aka in vitro meat) technologically feasible any time soon, or should we be looking for plant-based alternatives?
  • What are the greatest risks to human civilisation?
  • To stop malaria, is it more cost-effective to use technology to eliminate mosquitos or to distribute bed nets?
  • Should people who want to improve the future work for changes that will be very useful in a specific scenario, or just generally try to improve how well humanity makes decisions?
  • What specific jobs should our listeners take in order for Nick to be able to spend more money in useful ways to improve the world?
  • Should we expect the future to be better if the economy grows more quickly - or more slowly?

Get free, one-on-one career advice

We’ve helped dozens of people compare their options, get introductions, and find jobs important for the long-run future. If you want to work on any of the problems discussed in this episode, find out if our coaching can help you.

11 Oct 2017 · 1h 51min

#9 - Christine Peterson on how insecure computers could lead to global disaster, and how to fix it

Take a trip to Silicon Valley in the 70s and 80s, when going to space sounded like a good way to get around environmental limits, people started cryogenically freezing themselves, and nanotechnology looked like it might revolutionise industry – or turn us all into grey goo.

Full transcript, coaching application form, overview of the conversation, and extra resources to learn more:

In this episode of the 80,000 Hours Podcast Christine Peterson takes us back to her youth in the Bay Area, the ideas she encountered there, and what the dreamers she met did as they grew up. We also discuss how she came up with the term ‘open source software’ (and how she had to get someone else to propose it).

Today Christine helps run the Foresight Institute, which fills a gap left by for-profit technology companies – predicting how new revolutionary technologies could go wrong, and ensuring we steer clear of the downsides.

We dive into:

  • Whether the poor security of computer systems poses a catastrophic risk for the world. Could all our essential services be taken down at once? And if so, what can be done about it?
  • Can technology ‘move fast and break things’ without eventually breaking the world? Would it be better for technology to advance more quickly, or more slowly?
  • How Christine came up with the term ‘open source software’ (and why someone else had to propose it).
  • Will AIs designed for wide-scale automated hacking make computers more or less secure?
  • Would it be good to radically extend human lifespan? Is it sensible to cryogenically freeze yourself in the hope of being resurrected in the future?
  • Could atomically precise manufacturing (nanotechnology) really work? Why was it initially so controversial and why did people stop worrying about it?
  • Should people who try to do good in their careers work long hours and take low salaries? Or should they take care of themselves first of all?
  • How she thinks the effective altruism community resembles the scene she was involved with when she was young, and where it might be going wrong.

Get free, one-on-one career advice

We’ve helped dozens of people compare their options, get introductions, and find jobs important for the long-run future. If you want to work on any of the problems discussed in this episode, find out if our coaching can help you.

4 Oct 2017 · 1h 45min

#8 - Lewis Bollard on how to end factory farming in our lifetimes

Every year tens of billions of animals are raised in terrible conditions in factory farms before being killed for human consumption. Over the last two years Lewis Bollard – Project Officer for Farm Animal Welfare at the Open Philanthropy Project – has conducted extensive research into the best ways to eliminate animal suffering in farms as soon as possible. This has resulted in $30 million in grants to farm animal advocacy.

Full transcript, coaching application form, overview of the conversation, and extra resources to learn more:

We covered almost every approach being taken, which ones work, and how individuals can best contribute through their careers. We also had time to venture into a wide range of issues that are less often discussed, including:

  • Why Lewis thinks insect farming would be worse than the status quo, and whether we should look for ‘humane’ insecticides;
  • How young people can set themselves up to contribute to scientific research into meat alternatives;
  • How genetic manipulation of chickens has caused them to suffer much more than their ancestors, but could also be used to make them better off;
  • Why Lewis is skeptical of vegan advocacy;
  • Why he doubts that much can be done to tackle factory farming through legal advocacy or electoral politics;
  • Which species of farm animals is best to focus on first;
  • Whether fish and crustaceans are conscious, and if so what can be done for them;
  • Many other issues listed below in the Overview of the discussion.

Get free, one-on-one career advice

We’ve helped dozens of people compare their options, get introductions, and find jobs important for the long-run future. If you want to work on any of the problems discussed in this episode, find out if our coaching can help you.

Overview of the discussion

2m40s - What originally drew you to dedicate your career to helping animals and why did Open Philanthropy end up focusing on it?
5m40s - Do you have any concrete way of assessing the severity of animal suffering?
7m10s - Do you think the environmental gains are large compared to those that we might hope to get from animal welfare improvement?
7m55s - What grants have you made at Open Phil? How did you go about deciding which groups to fund and which ones not to fund?
9m50s - Why does Open Phil focus on chickens and fish? Is this the right call?

More...

27 Sep 2017 · 3h 16min

#7 - Julia Galef on making humanity more rational, what EA does wrong, and why Twitter isn’t all bad

The scientific revolution in the 16th century was one of the biggest societal shifts in human history, driven by the discovery of new and better methods of figuring out who was right and who was wrong.

Julia Galef - a well-known writer and researcher focused on improving human judgment, especially about high stakes questions - believes that if we could again develop new techniques to predict the future, resolve disagreements and make sound decisions together, it could dramatically improve the world across the board. We brought her in to talk about her ideas.

This interview complements a new detailed review of whether and how to follow Julia’s career path. Apply for personalised coaching, see what questions are asked when, and read extra resources to learn more.

Julia has been host of the Rationally Speaking podcast since 2010, co-founder of the Center for Applied Rationality in 2012, and is currently working for the Open Philanthropy Project on an investigation of expert disagreements.

In our conversation we ended up speaking about a wide range of topics, including:

  • Her research on how people can have productive intellectual disagreements.
  • Why she once planned to become an urban designer.
  • Why she doubts people are more rational than 200 years ago.
  • What makes her a fan of Twitter (while I think it’s dystopian).
  • Whether people should write more books.
  • Whether it’s a good idea to run a podcast, and how she grew her audience.
  • Why saying you don’t believe X often won’t convince people you don’t.
  • Why she started a PhD in economics but then stopped.
  • Whether she would recommend an unconventional career like her own.
  • Whether the incentives in the intelligence community actually support sound thinking.
  • Whether big institutions will actually pick up new tools for improving decision-making if they are developed.
  • How to start out pursuing a career in which you enhance human judgement and foresight.

Get free, one-on-one career advice to help you improve judgement and decision-making

We’ve helped dozens of people compare their options, get introductions, and find jobs important for the long-run future. If you want to work on any of the problems discussed in this episode, find out if our coaching can help you: APPLY FOR COACHING

Overview of the conversation

1m30s - So what projects are you working on at the moment?
3m50s - How are you working on the problem of expert disagreement?
6m0s - Is this the same method as the double crux process that was developed at the Center for Applied Rationality?
10m - Why did the Open Philanthropy Project decide this was a very valuable project to fund?
13m - Is the double crux process actually that effective?
14m50s - Is Facebook dangerous?
17m - What makes for a good life? Can you be mistaken about having a good life?
19m - Should more people write books?

Read more...

13 Sep 2017 · 1h 14min

#6 - Toby Ord on why the long-term future matters more than anything else & what to do about it

Of all the people whose well-being we should care about, only a small fraction are alive today. The rest are members of future generations who are yet to exist. Whether they’ll be born into a world that is flourishing or disintegrating – and indeed, whether they will ever be born at all – is in large part up to us. As such, the welfare of future generations should be our number one moral concern.

This conclusion holds true regardless of whether your moral framework is based on common sense, consequences, rules of ethical conduct, cooperating with others, virtuousness, keeping options open – or just a sense of wonder about the universe we find ourselves in.

That’s the view of Dr Toby Ord, a philosophy Fellow at the University of Oxford and co-founder of the effective altruism community. In this episode of the 80,000 Hours Podcast Dr Ord makes the case that aiming for a positive long-term future is likely the best way to improve the world.

Apply for personalised coaching, see what questions are asked when, and read extra resources to learn more.

We then discuss common objections to long-termism, such as the idea that benefits to future generations are less valuable than those to people alive now, or that we can’t meaningfully benefit future generations beyond taking the usual steps to improve the present.

Later the conversation turns to how individuals can and have changed the course of history, what could go wrong and why, and whether plans to colonise Mars would actually put humanity in a safer position than it is today.

This episode goes deep into the most distinctive features of our advice. It’s likely the most in-depth discussion of how 80,000 Hours and the effective altruism community think about the long-term future and why - and why we so often give it top priority.

It’s best to subscribe, so you can listen at leisure on your phone, speed up the conversation if you like, and get notified about future episodes. You can do so by searching ‘80,000 Hours’ wherever you get your podcasts.

Want to help ensure humanity has a positive future instead of destroying itself? We want to help. We’ve helped 100s of people compare their options, get introductions, and find jobs important for the long-run future. If you want to work on any of the problems discussed in this episode, such as artificial intelligence or biosecurity, find out if our coaching can help you.

Overview of the discussion

3m30s - Why is the long-term future of humanity such a big deal, and perhaps the most important issue for us to be thinking about?
9m05s - Five arguments that future generations matter
21m50s - How bad would it be if humanity went extinct or civilisation collapsed?
26m40s - Why do people start saying such strange things when this topic comes up?
30m30s - Are there any other reasons to prioritize thinking about the long-term future of humanity that you wanted to raise before we move to objections?
36m10s - What is this school of thought called?

Read more...

6 Sep 2017 · 2h 8min
