Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio.
This is: Technical AGI safety research outside AI, published by richard_ngo on the AI Alignment Forum.
I think there are many questions whose answers would be useful for technical AGI safety research, but which will probably require expertise outside AI to answer. In this post I list 30 of them, divided into four categories. Feel free to get in touch if you’d like to discuss these questions and why I think they’re important in more detail. I personally think that making progress on the ones in the first category is particularly vital, and plausibly tractable for researchers from a wide range of academic backgrounds.
Studying and understanding safety problems
How strong are the economic or technological pressures towards building very general AI systems, as opposed to narrow ones? How plausible is the CAIS (Comprehensive AI Services) model of advanced AI capabilities arising from the combination of many narrow services?
What are the most compelling arguments for and against discontinuous versus continuous takeoffs? In particular, how should we think about the analogy from human evolution, and the scalability of intelligence with compute?
What are the tasks through which narrow AI is most likely to have a destabilising impact on society? For example, what might cybercrime look like when many important jobs have been automated?
How plausible are safety concerns about economic dominance by influence-seeking agents, as well as structural loss of control scenarios? Can these be reformulated in terms of standard economic ideas, such as principal-agent problems and the effects of automation?
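For readers who want the economic framing made concrete, here is one standard textbook formulation of a principal-agent (moral hazard) problem; it comes from the economics literature rather than from the post itself:

```latex
% Textbook moral-hazard setup (an illustration, not the post's own model):
% the principal commits to a wage schedule w(x); the agent privately
% chooses effort a at cost c(a); output x is drawn with density f(x | a).
\max_{w(\cdot),\, a} \; \mathbb{E}\!\left[\, x - w(x) \mid a \,\right]
\quad \text{s.t.} \quad
a \in \arg\max_{a'} \; \mathbb{E}\!\left[\, u(w(x)) \mid a' \,\right] - c(a')
\quad \text{(incentive compatibility)},
\qquad
\mathbb{E}\!\left[\, u(w(x)) \mid a \,\right] - c(a) \;\ge\; \bar{u}
\quad \text{(participation)}.
```

A reformulation along these lines would ask what plays the role of the contract w(x) when the "agent" is an influence-seeking AI system and the observable output x is an imperfect proxy for what the principal actually cares about.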
How can we make the concepts of agency and goal-directed behaviour more specific and useful in the context of AI (e.g. building on Dennett’s work on the intentional stance)? How do they relate to intelligence and the ability to generalise across widely different domains?
What are the strongest arguments that have been made about why advanced AI might pose an existential threat, stated as clearly as possible? How do the different claims relate to each other, and which inferences or assumptions are weakest?
Solving safety problems
What techniques used in studying animal brains and behaviour will be most helpful for analysing AI systems and their behaviour, particularly with the goal of rendering them interpretable?
What is the most important information about deployed AI that decision-makers will need to track, and how can we create interfaces which communicate this effectively, making it visible and salient?
What are the most effective ways to gather huge numbers of human judgments about potential AI behaviour, and how can we ensure that such data is high-quality?
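One common quality check on judgment data of this kind is inter-annotator agreement. As a minimal sketch (with made-up labels), Cohen's kappa between two raters could be computed as follows:

```python
from collections import Counter

def cohens_kappa(labels_a: list, labels_b: list) -> float:
    """Agreement between two raters labelling the same items,
    corrected for chance agreement; a standard data-quality check."""
    assert len(labels_a) == len(labels_b) and labels_a
    n = len(labels_a)
    # Observed agreement: fraction of items the raters label identically.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected chance agreement, from each rater's label frequencies.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    return 1.0 if p_e == 1 else (p_o - p_e) / (1 - p_e)

# Toy example with fabricated labels:
print(cohens_kappa(["ok", "ok", "bad", "bad"], ["ok", "bad", "bad", "bad"]))  # 0.5
```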
How can we empirically test the debate and factored cognition hypotheses? How plausible are the assumptions about the decomposability of cognitive work via language which underlie debate and iterated distillation and amplification?
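As a sketch of what "decomposing cognitive work via language" could mean operationally (the agent interface below is a hypothetical placeholder, not the actual iterated distillation and amplification implementation):

```python
def factored_answer(question: str, budget: int, agent) -> str:
    """Answer a question by recursive decomposition, in the spirit of
    factored cognition: each call works on one small subproblem only.

    `agent` is a hypothetical interface with two methods:
      agent.answer(q)    -- answer a question directly
      agent.decompose(q) -- split q into simpler subquestions, or return []
    """
    if budget <= 0:
        return agent.answer(question)  # out of depth: answer directly
    subquestions = agent.decompose(question)
    if not subquestions:
        return agent.answer(question)
    # Answer each subquestion independently (no shared working memory),
    # then ask the agent to synthesise the pieces into a final answer.
    sub_answers = [factored_answer(q, budget - 1, agent) for q in subquestions]
    context = question + "\n" + "\n".join(
        f"Sub-answer to '{q}': {a}" for q, a in zip(subquestions, sub_answers)
    )
    return agent.answer(context)
```

The empirical question is then whether answers assembled this way can match the quality of answers produced by a single unified reasoner, which is roughly what the factored cognition hypothesis asserts.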
How can we distinguish between AIs helping us better understand what we want and AIs changing what we want (both as individuals and as a civilisation)? How easy is the latter to do, and how easy is it for us to identify?
Various questions in decision theory, logical uncertainty and game theory relevant to agent foundations.
How can we create secure containment and supervision protocols for AI systems which are also robust to external interference?
What are the best communication channels for conveying goals to AI agents? In particular, which ones are most likely to incentivise optimisation of the goal specified through the channel, rather than modification of the communication channel itself?
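A toy illustration of the failure mode in the second question, with entirely made-up numbers: an agent that maximises the measured signal on its reward channel will prefer modifying the channel whenever that yields a higher reading.

```python
# Hypothetical numbers: the intended task pays 1.0 on the reward channel,
# while overwriting the channel lets the agent read off 10.0.
actions = {
    "do_intended_task": 1.0,
    "modify_reward_channel": 10.0,
}

def naive_signal_maximiser(actions: dict) -> str:
    # Optimises the measured signal, not the goal the channel
    # was meant to communicate.
    return max(actions, key=actions.get)

print(naive_signal_maximiser(actions))  # -> "modify_reward_channel"
```

A well-designed channel would make "do_intended_task" the optimum, e.g. by having the specified goal refer to states of the world rather than to the channel's own readings.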
How closely linked is the human motivational system to our intellectual capabilities? In other words, to what extent does the orthogonality thesis apply to human-like brains? What can we learn from the range of variation in human motivational systems (e.g. induced b...