15 expert takes on infosec in the age of AI

"There’s almost no story of the future going well that doesn’t have a part that’s like '…and no evil person steals the AI weights and goes and does evil stuff.' So it has highlighted the importance of information security: 'You’re training a powerful AI system; you should make it hard for someone to steal' has popped out to me as a thing that just keeps coming up in these stories, keeps being present. It’s hard to tell a story where it’s not a factor. It’s easy to tell a story where it is a factor." — Holden Karnofsky

What happens when a USB cable can secretly control your system? Are we hurtling toward a security nightmare as critical infrastructure connects to the internet? Is it possible to secure AI model weights from sophisticated attackers? And might AI actually make computer security better rather than worse?

With AI security concerns becoming increasingly urgent, we bring you insights from 15 top experts across information security, AI safety, and governance, examining the challenges of protecting our most powerful AI models and digital infrastructure — including a sneak peek from a yet-to-be-released episode with Tom Davidson, where he explains why we should be more worried about “secret loyalties” in AI agents.

You’ll hear:

  • Holden Karnofsky on why every good future relies on strong infosec, and how hard it’s been to hire security experts (from episode #158)
  • Tantum Collins on why infosec might be the rare issue everyone agrees on (episode #166)
  • Nick Joseph on whether AI companies can develop frontier models safely with the current state of information security (episode #197)
  • Sella Nevo on why AI model weights are so valuable to steal, the weaknesses of air-gapped networks, and the risks of USBs (episode #195)
  • Kevin Esvelt on what cryptographers can teach biosecurity experts (episode #164)
  • Lennart Heim on Rob’s computer security nightmares (episode #155)
  • Zvi Mowshowitz on the insane lack of security mindset at some AI companies (episode #184)
  • Nova DasSarma on the best current defences against well-funded adversaries, politically motivated cyberattacks, and exciting progress in infosecurity (episode #132)
  • Bruce Schneier on whether AI could eliminate software bugs for good, and why it’s bad to hook everything up to the internet (episode #64)
  • Nita Farahany on the dystopian risks of hacked neurotech (episode #174)
  • Vitalik Buterin on how cybersecurity is the key to defence-dominant futures (episode #194)
  • Nathan Labenz on how even internal teams at AI companies may not know what they’re building (episode #176)
  • Allan Dafoe on backdooring your own AI to prevent theft (episode #212)
  • Tom Davidson on how dangerous “secret loyalties” in AI models could be (episode to be released!)
  • Carl Shulman on the challenge of trusting foreign AI models (episode #191, part 2)
  • Plus lots of concrete advice on how to get into this field and find your fit

Check out the full transcript on the 80,000 Hours website.

Chapters:

  • Cold open (00:00:00)
  • Rob's intro (00:00:49)
  • Holden Karnofsky on why infosec could be the issue on which the future of humanity pivots (00:03:21)
  • Tantum Collins on why infosec is a rare AI issue that unifies everyone (00:12:39)
  • Nick Joseph on whether the current state of information security makes it impossible to responsibly train AGI (00:16:23)
  • Nova DasSarma on the best available defences against well-funded adversaries (00:22:10)
  • Sella Nevo on why AI model weights are so valuable to steal (00:28:56)
  • Kevin Esvelt on what cryptographers can teach biosecurity experts (00:32:24)
  • Lennart Heim on the possibility of an autonomously replicating AI computer worm (00:34:56)
  • Zvi Mowshowitz on the absurd lack of security mindset at some AI companies (00:48:22)
  • Sella Nevo on the weaknesses of air-gapped networks and the risks of USB devices (00:49:54)
  • Bruce Schneier on why it’s bad to hook everything up to the internet (00:55:54)
  • Nita Farahany on the possibility of hacking neural implants (01:04:47)
  • Vitalik Buterin on how cybersecurity is the key to defence-dominant futures (01:10:48)
  • Nova DasSarma on exciting progress in information security (01:19:28)
  • Nathan Labenz on how even internal teams at AI companies may not know what they’re building (01:30:47)
  • Allan Dafoe on backdooring your own AI to prevent someone else from stealing it (01:33:51)
  • Tom Davidson on how dangerous “secret loyalties” in AI models could get (01:35:57)
  • Carl Shulman on whether we should be worried about backdoors as governments adopt AI technology (01:52:45)
  • Nova DasSarma on politically motivated cyberattacks (02:03:44)
  • Bruce Schneier on the day-to-day benefits of improved security and recognising that there’s never zero risk (02:07:27)
  • Holden Karnofsky on why it’s so hard to hire security people despite the massive need (02:13:59)
  • Nova DasSarma on practical steps to getting into this field (02:16:37)
  • Bruce Schneier on finding your personal fit in a range of security careers (02:24:42)
  • Rob's outro (02:34:46)

Audio engineering: Ben Cordell, Milo McGuire, Simon Monsour, and Dominic Armstrong
Content editing: Katy Moore and Milo McGuire
Transcriptions and web: Katy Moore

Episodes (293)

We just put up a new compilation of ten core episodes of the show

We recently launched a new podcast feed that might be useful to you and people you know. It's called Effective Altruism: Ten Global Problems, and it's a collection of ten top episodes of this show, selected to help listeners quickly get up to speed on ten pressing problems that the effective altruism community is working to solve. It's a companion to our other compilation Effective Altruism: An Introduction, which explores the big picture debates within the community and how to set priorities in order to have the greatest impact.

These ten episodes cover:

  • The cheapest ways to improve education in the developing world
  • How dangerous is climate change and what are the most effective ways to reduce it?
  • Using new technologies to prevent another disastrous pandemic
  • Ways to simultaneously reduce both police misconduct and crime
  • All the major approaches being taken to end factory farming
  • How advances in artificial intelligence could go very right or very wrong
  • Other big threats to the future of humanity — such as a nuclear war — and how we can make our species wiser and more resilient
  • One problem few even recognise as a problem at all

The selection is ideal for people who are completely new to the effective altruist way of thinking, as well as those who are familiar with effective altruism but new to The 80,000 Hours Podcast.

If someone in your life wants to get an understanding of what 80,000 Hours or effective altruism are all about, and prefers to listen to things rather than read, this is a great resource to direct them to.

You can find it by searching for effective altruism in whatever podcasting app you use, or by going to 80000hours.org/ten.

We'd love to hear how you go listening to it yourself, or sharing it with others in your life. Get in touch by emailing podcast@80000hours.org.

20 Oct 2021 · 3min

#113 – Varsha Venugopal on using gossip to help vaccinate every child in India

Our failure to make sure all kids globally get all of their basic vaccinations leads to 1.5 million child deaths every year.

According to today’s guest, Varsha Venugopal, for the great majority this has nothing to do with weird conspiracy theories or medical worries — in India 80% of undervaccinated children are already getting some shots. They just aren't getting all of them, for the tragically mundane reason that life can get in the way.

Links to learn more, summary and full transcript.

As Varsha says, we're all sometimes guilty of "valuing our present very differently from the way we value the future", leading to short-term thinking whether about getting vaccines or going to the gym. So who should we call on to help fix this universal problem? The government, extended family, or maybe village elders? Varsha says that research shows the most influential figures might actually be local gossips.

In 2018, Varsha heard about the ideas around effective altruism for the first time. By the end of 2019, she’d gone through Charity Entrepreneurship’s strategy incubation program, and quit her normal, stable job to co-found Suvita, a non-profit focused on improving the uptake of immunization in India, which works through two models:

1. Sending SMS reminders directly to parents and carers
2. Gossip

The first one is intuitive. You collect birth registers, digitize the paper records, process the data, and send out personalised SMS messages to hundreds of thousands of families. The effect size varies depending on the context, but these messages usually increase vaccination rates by 8-18%.

The second approach is less intuitive, and isn't yet entirely understood either. Here’s what happens: Suvita calls up random households and asks, “if there were an event in town, who would be most likely to tell you about it?” In over 90% of the cases, the households gave both the name and the phone number of a local ‘influencer’.

And when tracked down, more than 95% of the most frequently named 'influencers' agreed to become vaccination ambassadors. Those ambassadors then go on to share information about when and where to get vaccinations, in whatever way seems best to them. When tested by a team of top academics at the Poverty Action Lab (J-PAL), it raised vaccination rates by 10 percentage points, or about 27% (see the quick check below).

The advantage of SMS reminders is that they’re easier to scale up. But Varsha says the ambassador program isn’t actually that far from being a scalable model as well. A phone call to get a name, another call to ask the influencer to join, and boom — you might have just covered a whole village rather than just a single family.

Varsha says that Suvita has two major challenges on the horizon:

1. Maintaining the same degree of oversight of their surveyors as they attempt to scale up the program, in order to ensure the program continues to work just as well
2. Deciding between focusing on reaching a few additional districts now vs. making longer-term investments which could build up to a future exponential increase.
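A quick sanity check on how those two J-PAL figures fit together, as a minimal sketch (the roughly 37% baseline is inferred from the two published numbers, not stated in the episode):

```python
# Infer the baseline vaccination rate implied by the J-PAL result as reported.
# The ~37% baseline is an inference from the two figures, not from the episode.
absolute_gain_pp = 10     # percentage points
relative_gain = 0.27      # "about 27%"

implied_baseline = absolute_gain_pp / relative_gain
print(f"Implied baseline vaccination rate: ~{implied_baseline:.0f}%")   # ~37%
```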
In this episode, Varsha and Rob talk about making these kinds of high-stakes, high-stress decisions, as well as:

  • How Suvita got started, and their experience with Charity Entrepreneurship
  • Weaknesses of the J-PAL studies
  • The importance of co-founders
  • Deciding how broad a program should be
  • Varsha’s day-to-day experience
  • And much more

Chapters:

  • Rob’s intro (00:00:00)
  • The interview begins (00:01:47)
  • The problem of undervaccinated kids (00:03:16)
  • Suvita (00:12:47)
  • Evidence on SMS reminders (00:20:30)
  • Gossip intervention (00:28:43)
  • Why parents aren’t already prioritizing vaccinations (00:38:29)
  • Weaknesses of studies (00:43:01)
  • Biggest challenges for Suvita (00:46:05)
  • Staff location (01:06:57)
  • Charity Entrepreneurship (01:14:37)
  • The importance of co-founders (01:23:23)
  • Deciding how broad a program should be (01:28:29)
  • Careers at Suvita (01:34:11)
  • Varsha’s advice (01:42:30)
  • Varsha’s day-to-day experience (01:56:19)

Producer: Keiran Harris
Audio mastering: Ben Cordell
Transcriptions: Katy Moore

18 Oct 2021 · 2h 5min

#112 – Carl Shulman on the common-sense case for existential risk work and its practical implications

Preventing the apocalypse may sound like an idiosyncratic activity, and it sometimes is justified on exotic grounds, such as the potential for humanity to become a galaxy-spanning civilisation.

But the policy of US government agencies is already to spend up to $4 million to save the life of a citizen, making the death of all Americans a $1,300,000,000,000,000 disaster.

According to Carl Shulman, research associate at Oxford University's Future of Humanity Institute, that means you don’t need any fancy philosophical arguments about the value or size of the future to justify working to reduce existential risk — it passes a mundane cost-benefit analysis whether or not you place any value on the long-term future.

Links to learn more, summary and full transcript.

The key reason to make it a top priority is factual, not philosophical. That is, the risk of a disaster that kills billions of people alive today is alarmingly high, and it can be reduced at a reasonable cost.

A back-of-the-envelope version of the argument runs (written out as a quick calculation below):

  • The US government is willing to pay up to $4 million (depending on the agency) to save the life of an American.
  • So saving all US citizens at any given point in time would be worth $1,300 trillion.
  • If you believe that the risk of human extinction over the next century is something like one in six (as Toby Ord suggests is a reasonable figure in his book The Precipice), then it would be worth the US government spending up to $2.2 trillion to reduce that risk by just 1%, in terms of American lives saved alone.
  • Carl thinks it would cost a lot less than that to achieve a 1% risk reduction if the money were spent intelligently.

So it easily passes a government cost-benefit test, with a very big benefit-to-cost ratio — likely over 1000:1 today.

This argument helped NASA get funding to scan the sky for any asteroids that might be on a collision course with Earth, and it was directly promoted by famous economists like Richard Posner, Larry Summers, and Cass Sunstein.

If the case is clear enough, why hasn't it already motivated a lot more spending or regulations to limit existential risks — enough to drive down what any additional efforts would achieve?

Carl thinks that one key barrier is that infrequent disasters are rarely politically salient. Research indicates that extra money is spent on flood defences in the years immediately following a massive flood — but as memories fade, that spending quickly dries up. Of course the annual probability of a disaster was the same the whole time; all that changed is what voters had on their minds.

Carl expects that all the reasons we didn’t adequately prepare for or respond to COVID-19 — with excess mortality over 15 million and costs well over $10 trillion — bite even harder when it comes to threats we've never faced before, such as engineered pandemics, risks from advanced artificial intelligence, and so on.

Today’s episode is in part our way of trying to improve this situation.
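Here is that back-of-the-envelope calculation written out as a minimal sketch (the 330 million population figure is an assumption used to reproduce the episode's rounded numbers; everything else comes from the argument above):

```python
# Back-of-the-envelope cost-benefit case for existential risk reduction.
# The US population figure is an assumed round number, not from the episode.
value_per_life = 4e6        # USD: upper end of what US agencies pay to save one life
us_population = 330e6       # assumed approximate US population
extinction_risk = 1 / 6     # Toby Ord's illustrative figure for this century
risk_reduction = 0.01       # a 1% relative reduction in that risk

value_of_all_lives = value_per_life * us_population            # ~$1,320 trillion
worth_spending = value_of_all_lives * extinction_risk * risk_reduction

print(f"Value of all US lives: ~${value_of_all_lives / 1e12:,.0f} trillion")
print(f"Worth spending on a 1% risk reduction: ~${worth_spending / 1e12:.1f} trillion")
```

Running this reproduces the figures quoted in the episode: roughly $1,300 trillion for all US lives, and roughly $2.2 trillion as the amount worth spending to shave 1% off a one-in-six extinction risk.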
In today’s wide-ranging conversation, Carl and Rob also cover:

  • A few reasons Carl isn't excited by 'strong longtermism'
  • How x-risk reduction compares to GiveWell recommendations
  • Solutions for asteroids, comets, supervolcanoes, nuclear war, pandemics, and climate change
  • The history of bioweapons
  • Whether gain-of-function research is justifiable
  • Successes and failures around COVID-19
  • The history of existential risk
  • And much more

Chapters:

  • Rob’s intro (00:00:00)
  • The interview begins (00:01:34)
  • A few reasons Carl isn't excited by strong longtermism (00:03:47)
  • Longtermism isn’t necessary for wanting to reduce big x-risks (00:08:21)
  • Why we don’t adequately prepare for disasters (00:11:16)
  • International programs to stop asteroids and comets (00:18:55)
  • Costs and political incentives around COVID (00:23:52)
  • How x-risk reduction compares to GiveWell recommendations (00:34:34)
  • Solutions for asteroids, comets, and supervolcanoes (00:50:22)
  • Solutions for climate change (00:54:15)
  • Solutions for nuclear weapons (01:02:18)
  • The history of bioweapons (01:22:41)
  • Gain-of-function research (01:34:22)
  • Solutions for bioweapons and natural pandemics (01:45:31)
  • Successes and failures around COVID-19 (01:58:26)
  • Who to trust going forward (02:09:09)
  • The history of existential risk (02:15:07)
  • The most compelling risks (02:24:59)
  • False alarms about big risks in the past (02:34:22)
  • Suspicious convergence around x-risk reduction (02:49:31)
  • How hard it would be to convince governments (02:57:59)
  • Defensive epistemology (03:04:34)
  • Hinge of history debate (03:16:01)
  • Technological progress can’t keep up for long (03:21:51)
  • Strongest argument against this being a really pivotal time (03:37:29)
  • How Carl unwinds (03:45:30)

Producer: Keiran Harris
Audio mastering: Ben Cordell
Transcriptions: Katy Moore

5 Oct 2021 · 3h 48min

#111 – Mushtaq Khan on using institutional economics to predict effective government reforms

If you’re living in the Niger Delta in Nigeria, your best bet at a high-paying career is probably ‘artisanal refining’ — or, in plain language, stealing oil from pipelines.

The resulting oil spills damage the environment and cause severe health problems, but the Nigerian government has continually failed in its attempts to stop this theft. They send in the army, and the army gets corrupted. They send in enforcement agencies, and the enforcement agencies get corrupted. What’s happening here?

According to Mushtaq Khan, economics professor at SOAS University of London, this is a classic example of ‘networked corruption’. Everyone in the community is benefiting from the criminal enterprise — so much so that the locals would prefer civil war to following the law. It pays vastly better than other local jobs, hotels and restaurants have formed around it, and houses are even powered by the electricity generated from the oil.

Links to learn more, summary and full transcript.

In today's episode, Mushtaq elaborates on the models he uses to understand these problems and make predictions he can test in the real world.

Some of the most important factors shaping the fate of nations are their structures of power: who is powerful, how they are organized, which interest groups can pull in favours with the government, and the constant push and pull between the country's rulers and its ruled.

While traditional economic theory has relatively little to say about these topics, institutional economists like Mushtaq have a lot to say, and participate in lively debates about which of their competing ideas best explain the world around us.

The issues at stake are nothing less than why some countries are rich and others are poor, why some countries are mostly law abiding while others are not, and why some government programmes improve public welfare while others just enrich the well connected.

Mushtaq’s specialties are anti-corruption and industrial policy, where he believes mainstream theory and practice are largely misguided.

Mushtaq's rule of thumb is that when the locals most concerned with a specific issue are invested in preserving a status quo they're participating in, they almost always win out. To actually reduce corruption, countries like his native Bangladesh have to follow the same gradual path the U.K. once did: find organizations that benefit from rule-abiding behaviour and are selfishly motivated to promote it, and help them police their peers. Trying to impose a new way of doing things from the top down wasn't how Europe modernised, and it won't work elsewhere either.

In cases like oil theft in Nigeria, where no one wants to follow the rules, Mushtaq says corruption may be impossible to solve directly. Instead you have to play a long game, bringing in other employment opportunities, improving health services, and deploying alternative forms of energy — in the hope that one day this will give people a viable alternative to corruption.

In this extensive interview Rob and Mushtaq cover this and much more, including:

  • How does one test theories like this?
  • Why are companies in some poor countries so much less productive than their peers in rich countries?
  • Have rich countries just legalized the corruption in their societies?
  • What are the big live debates in institutional economics?
  • Should poor countries protect their industries from foreign competition?
  • How can listeners use these theories to predict which policies will work in their own countries?
Chapters:

  • Rob’s intro (00:00:00)
  • The interview begins (00:01:55)
  • Institutional economics (00:15:37)
  • Anti-corruption policies (00:28:45)
  • Capabilities (00:34:51)
  • Why the market doesn’t solve the problem (00:42:29)
  • Industrial policy (00:46:11)
  • South Korea (01:01:31)
  • Chiang Kai-shek (01:16:01)
  • The logic of political survival (01:18:43)
  • Anti-corruption as a design of your policy (01:35:16)
  • Examples of anti-corruption programs with good prospects (01:45:17)
  • The importance of getting overseas influences (01:56:05)
  • Actually capturing the primary effect (02:03:26)
  • How less developed countries could successfully design subsidies (02:15:14)
  • What happens when horizontal policing isn't possible (02:26:34)
  • Rule of law <--> economic development (02:33:40)
  • Violence (02:38:31)
  • How this applies to developed countries (02:48:57)
  • Policies to help left-behind groups (02:55:39)
  • What to study (02:58:50)

Producer: Keiran Harris
Audio mastering: Ben Cordell
Transcriptions: Sofia Davis-Fogel

10 Sep 2021 · 3h 20min

#110 – Holden Karnofsky on building aptitudes and kicking ass

Holden Karnofsky helped create two of the most influential organisations in the effective philanthropy world. So when he outlines a different perspective on career advice than the one we present at 80,000 Hours — we take it seriously.

Holden disagrees with us on a few specifics, but it's more than that: he prefers a different vibe when making career choices, especially early in one's career.

Links to learn more, summary and full transcript.

While he might ultimately recommend similar jobs to those we recommend at 80,000 Hours, the reasons are often different.

At 80,000 Hours we often talk about ‘paths’ to working on what we currently think of as the most pressing problems in the world. That’s partially because people seem to prefer the most concrete advice possible. But Holden thinks a problem with that kind of advice is that it’s hard to take actions based on it if your job options don’t match well with your plan, and it’s hard to get a reliable signal about whether you're making the right choices.

How can you know you’ve chosen the right cause? How can you know the job you’re aiming for will be helpful to that cause? And what if you can’t get a job in this area at all?

Holden prefers to focus on ‘aptitudes’ that you can build in all sorts of different roles and cause areas, which can later be applied more directly. Even if the current role doesn’t work out, or your career goes in wacky directions you’d never anticipated (like so many successful careers do), or you change your whole worldview — you’ll still have access to this aptitude.

So instead of trying to become a project manager at an effective altruism organisation, maybe you should just become great at project management. Instead of trying to become a researcher at a top AI lab, maybe you should just become great at digesting hard problems. Who knows where these skills will end up being useful down the road?

Holden doesn’t think you should spend much time worrying about whether you’re having an impact in the first few years of your career — instead you should just focus on learning to kick ass at something, knowing that most of your impact is going to come decades into your career. He thinks as long as you’ve gotten good at something, there will usually be a lot of ways that you can contribute to solving the biggest problems.

But Holden’s most important point, perhaps, is this: Be very careful about following career advice at all.

He points out that a career is such a personal thing that it’s very easy for the advice-giver to be oblivious to important factors having to do with your personality and unique situation. He thinks it’s pretty hard for anyone to really have justified empirical beliefs about career choice, and that you should be very hesitant to make a radically different decision than you would have otherwise based on what some person (or website!) tells you to do.

Instead, he hopes conversations like these serve as a way of prompting discussion and raising points that you can apply your own personal judgment to. That's why in the end he thinks people should look at their career decisions through his aptitude lens, the '80,000 Hours lens', and ideally several other frameworks as well. Because any one perspective risks missing something important.
Holden and Rob also cover:

  • Ways to be helpful to longtermism outside of careers
  • Why finding a new cause area might be overrated
  • Historical events that deserve more attention
  • And much more

Chapters:

  • Rob’s intro (00:00:00)
  • Holden’s current impressions on career choice for longtermists (00:02:34)
  • Aptitude-first vs. career path-first approaches (00:08:46)
  • How to tell if you’re on track (00:16:24)
  • Just try to kick ass in whatever (00:26:00)
  • When not to take the thing you're excited about (00:36:54)
  • Ways to be helpful to longtermism outside of careers (00:41:36)
  • Things 80,000 Hours might be doing wrong (00:44:31)
  • The state of longtermism (00:51:50)
  • Money pits (01:02:10)
  • Broad longtermism (01:06:56)
  • Cause X (01:21:33)
  • Open Philanthropy (01:24:23)
  • COVID and the biorisk portfolio (01:35:09)
  • Has the world gotten better? (01:51:16)
  • Historical events that deserve more attention (01:55:11)
  • Applied epistemology (02:10:55)
  • What Holden has learned from COVID (02:20:55)
  • What Holden has gotten wrong recently (02:32:59)
  • Having a kid (02:39:50)

Producer: Keiran Harris
Audio mastering: Ben Cordell
Transcriptions: Sofia Davis-Fogel

26 Aug 2021 · 2h 46min

#109 – Holden Karnofsky on the most important century

Will the future of humanity be wild, or boring? It's natural to think that if we're trying to be sober and measured, and predict what will really happen rather than spin an exciting story, it's more likely than not to be sort of... dull.

But there's also good reason to think that that is simply impossible. The idea that there's a boring future that's internally coherent is an illusion that comes from not inspecting those scenarios too closely.

At least that is what Holden Karnofsky — founder of charity evaluator GiveWell and foundation Open Philanthropy — argues in his new article series titled 'The Most Important Century'. He hopes to lay out part of the worldview that's driving the strategy and grantmaking of Open Philanthropy's longtermist team, and encourage more people to join his efforts to positively shape humanity's future.

Links to learn more, summary and full transcript.

The bind is this. For the first 99% of human history the global economy (initially mostly food production) grew very slowly: under 0.1% a year. But since the industrial revolution around 1800, growth has exploded to over 2% a year.

To us in 2020 that sounds perfectly sensible and the natural order of things. But Holden points out that in fact it's not only unprecedented, it also can't continue for long. The power of compounding increases means that sustaining 2% growth for just 10,000 years (about 5% of the time humanity has already existed) would require us to turn every individual atom in the galaxy into an economy as large as the Earth's today. Not super likely. (A quick version of this arithmetic is sketched below.)

So what are the options?

First, maybe growth will slow and then stop. In that case we today live in the single minuscule slice in the history of life during which the world rapidly changed due to constant technological advances, before intelligent civilization permanently stagnated or even collapsed. What a wild time to be alive!

Alternatively, maybe growth will continue for thousands of years. In that case we are at the very beginning of what would necessarily have to become a stable galaxy-spanning civilization, harnessing the energy of entire stars among other feats of engineering. We would then stand among the first tiny sliver of all the quadrillions of intelligent beings who ever exist. What a wild time to be alive!

Isn't there another option where the future feels less remarkable and our current moment not so special? While the full version of the argument above has a number of caveats, the short answer is 'not really'. We might be in a computer simulation and our galactic potential all an illusion, though that's hardly any less weird. And maybe the most exciting events won't happen for generations yet. But on a cosmic scale we'd still be living around the universe's most remarkable time.

Holden himself was very reluctant to buy into the idea that today’s civilization is in a strange and privileged position, but has ultimately concluded "all possible views about humanity's future are wild".
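Here is a minimal sketch of that compounding arithmetic (the atoms-in-the-galaxy figure is an assumed order-of-magnitude estimate, not from the episode; the point is simply that the growth factor dwarfs it):

```python
import math

# Rough check of the compounding claim. The atoms-in-the-galaxy figure is an
# assumed order-of-magnitude estimate for the Milky Way, not from the episode.
growth_rate = 0.02        # 2% annual growth
years = 10_000
atoms_in_galaxy = 1e70    # assumed order of magnitude

growth_factor = (1 + growth_rate) ** years
print(f"10,000 years of 2% growth multiplies the economy by ~10^{math.log10(growth_factor):.0f}")
# ~10^86 -- vastly more than the ~10^70 atoms available, so even one
# Earth-sized economy per atom would not be nearly enough.
print(f"That's ~10^{math.log10(growth_factor / atoms_in_galaxy):.0f} of today's economies per atom")
```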
In the conversation Holden and Rob cover each part of the 'Most Important Century' series, including:

  • The case that we live in an incredibly important time
  • How achievable-seeming technology - in particular, mind uploading - could lead to unprecedented productivity, control of the environment, and more
  • How economic growth is faster than can be sustained for all that much longer
  • Forecasting transformative AI
  • And the implications of living in the most important century

Get this episode by subscribing to our podcast on the world’s most pressing problems and how to solve them: type 80,000 Hours into your podcasting app.

Producer: Keiran Harris
Audio mastering: Ben Cordell
Transcriptions: Sofia Davis-Fogel

19 Aug 2021 · 2h 19min

#108 – Chris Olah on working at top AI labs without an undergrad degree

Chris Olah has had a fascinating and unconventional career path.

Most people who want to pursue a research career feel they need a degree to get taken seriously. But not only does Chris not have a PhD, he doesn’t even have an undergraduate degree. After dropping out of university to help defend an acquaintance who was facing bogus criminal charges, Chris started independently working on machine learning research, and eventually got an internship at Google Brain, a leading AI research group.

In this interview — a follow-up to our episode on his technical work — we discuss what, if anything, can be learned from his unusual career path. Should more people pass on university and just throw themselves at solving a problem they care about? Or would it be foolhardy for others to try to copy a unique case like Chris’?

Links to learn more, summary and full transcript.

We also cover some of Chris' personal passions over the years, including his attempts to reduce what he calls 'research debt' by starting a new academic journal called Distill, focused just on explaining existing results unusually clearly.

As Chris explains, as fields develop they accumulate huge bodies of knowledge that researchers are meant to be familiar with before they start contributing themselves. But the weight of that existing knowledge — and the need to keep up with what everyone else is doing — can become crushing. It can take someone until their 30s or later to earn their stripes, and sometimes a field will split in two just to make it possible for anyone to stay on top of it.

If that were unavoidable it would be one thing, but Chris thinks we're nowhere near communicating existing knowledge as well as we could. Incrementally improving an explanation of a technical idea might take a single author weeks to do, but could go on to save a day for thousands, tens of thousands, or hundreds of thousands of students, if it becomes the best option available.

Despite that, academics have little incentive to produce outstanding explanations of complex ideas that can speed up the education of everyone coming up in their field. And some even see the process of deciphering bad explanations as a desirable rite of passage all should pass through, just as they did.

So Chris tried his hand at chipping away at this problem — but concluded the nature of the problem wasn't quite what he originally thought.

In this conversation we talk about that, as well as:

  • Why highly thoughtful cold emails can be surprisingly effective, but average cold emails do little
  • Strategies for growing as a researcher
  • Thinking about research as a market
  • How Chris thinks about writing outstanding explanations
  • The concept of 'micromarriages' and ‘microbestfriendships’
  • And much more

Get this episode by subscribing to our podcast on the world’s most pressing problems and how to solve them: type 80,000 Hours into your podcasting app.

Producer: Keiran Harris
Audio mastering: Ben Cordell
Transcriptions: Sofia Davis-Fogel

11 Aug 2021 · 1h 33min

#107 – Chris Olah on what the hell is going on inside neural networks

Big machine learning models can identify plant species better than any human, write passable essays, beat you at a game of Starcraft 2, figure out how a photo of Tobey Maguire and the word 'spider' are related, solve the 60-year-old 'protein folding problem', diagnose some diseases, play romantic matchmaker, write solid computer code, and offer questionable legal advice.

Humanity made these amazing and ever-improving tools. So how do our creations work? In short: we don't know.

Today's guest, Chris Olah, finds this both absurd and unacceptable. Over the last ten years he has been a leader in the effort to unravel what's really going on inside these black boxes. As part of that effort he helped create the famous DeepDream visualisations at Google Brain, reverse engineered the CLIP image classifier at OpenAI, and is now continuing his work at Anthropic, a new $100 million research company that tries to "co-develop the latest safety techniques alongside scaling of large ML models".

Links to learn more, summary and full transcript.

Despite Chris having a huge fan base thanks to his explanations of ML and his tweets, today's episode is the first long interview he has ever given. It features his personal take on what we've learned so far about what ML algorithms are doing, and what's next for this research agenda at Anthropic.

His decade of work has borne substantial fruit, producing an approach for looking inside the mess of connections in a neural network and working out what functional role each piece is serving. Among other things, Chris and his team found that every visual classifier seems to converge on a number of simple common elements in their early layers — elements so fundamental they may exist in our own visual cortex in some form.

They also found networks developing 'multimodal neurons' that would trigger in response to the presence of high-level concepts like 'romance', across both images and text, mimicking the famous 'Halle Berry neuron' from human neuroscience.

While reverse engineering how a mind works would make any top-ten list of the most valuable knowledge to pursue for its own sake, Chris's work is also of urgent practical importance. Machine learning models are already being deployed in medicine, business, the military, and the justice system, in ever more powerful roles. The competitive pressure to put them into action as soon as they can turn a profit is great, and only getting greater.

But if we don't know what these machines are doing, we can't be confident they'll continue to work the way we want as circumstances change. Before we hand an algorithm the proverbial nuclear codes, we should demand more assurance than "well, it's always worked fine so far".

But by peering inside neural networks and figuring out how to 'read their minds' we can potentially foresee future failures and prevent them before they happen.

Artificial neural networks may even be a better way to study how our own minds work, given that, unlike a human brain, we can see everything that's happening inside them — and having been posed similar challenges, there's every reason to think evolution and 'gradient descent' often converge on similar solutions.
Among other things, Rob and Chris cover:

  • Why Chris thinks it's necessary to work with the largest models
  • What fundamental lessons we've learned about how neural networks (and perhaps humans) think
  • How interpretability research might help make AI safer to deploy, and Chris’ response to skeptics
  • Why there's such a fuss about 'scaling laws' and what they say about future AI progress

Get this episode by subscribing to our podcast on the world’s most pressing problems and how to solve them: type 80,000 Hours into your podcasting app.

Producer: Keiran Harris
Audio mastering: Ben Cordell
Transcriptions: Sofia Davis-Fogel

4 Aug 2021 · 3h 9min
