Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio.
This is: Data on forecasting accuracy across different time horizons and levels of forecaster experience, published by Charles Dillon on the Effective Altruism Forum.
Key Points
Forecasting well is a valuable skill for many purposes and people, including for EA organisations aiming to identify which areas they should focus on and to anticipate the outcomes of various initiatives.
There is a limited public record of people making scored forecasts over time horizons greater than about one year. Here I use data from PredictionBook and Metaculus to study the performance of predictions over different time horizons. I also compared performance across users with different levels of forecasting practice.
Looking at individual predictors, a very common failure mode among newer or less dedicated predictors seems to be overconfidence, which is more prevalent than underconfidence across most subgroups.
Across both PredictionBook and Metaculus, there seemed to be a significant bias towards overestimating the chances of positive resolution. This effect seemed to get stronger as time to resolution increased.
At least in the PredictionBook data, there was some weak evidence to suggest prediction performance improves with making more predictions, but there were too many confounding factors here to draw any confident conclusion.
The conclusions I was able to draw from this were limited, and working to improve this by expanding the amount and quality of data available for analysis like this seems worth doing.
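To make the overconfidence finding above concrete, here is a minimal sketch of the kind of calibration check involved. This is not the post's actual analysis code, and the data, function names, and bucket boundaries are all illustrative assumptions: it groups resolved binary forecasts by stated probability and compares the mean stated probability with the realized base rate in each bucket.

```python
# Hypothetical sketch, not the author's code: detect overconfidence in
# resolved binary forecasts, where each record is (stated_probability, outcome).
from collections import defaultdict

def calibration_table(forecasts, n_buckets=5):
    """Group forecasts into probability buckets and compare the mean
    stated probability with the realized base rate in each bucket."""
    buckets = defaultdict(list)
    for p, outcome in forecasts:
        idx = min(int(p * n_buckets), n_buckets - 1)
        buckets[idx].append((p, outcome))
    table = []
    for idx in sorted(buckets):
        items = buckets[idx]
        mean_p = sum(p for p, _ in items) / len(items)
        base_rate = sum(o for _, o in items) / len(items)
        # Overconfidence shows up as mean_p sitting further from 0.5
        # than the realized base rate does.
        table.append((mean_p, base_rate, len(items)))
    return table

# Toy example: forecasts at 90% that only resolve positively 60% of the time
# would show up as (0.9, 0.6, 10) — a clearly overconfident bucket.
toy_data = [(0.9, 1)] * 6 + [(0.9, 0)] * 4
print(calibration_table(toy_data))
```

A plot of mean stated probability against base rate per bucket is the usual calibration curve; the post's figures are effectively richer versions of this table.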
This post draws a lot on niplav's Range and Forecasting Accuracy, not least for much of the code used to extract the PredictionBook forecasts, and for identifying the most promising sources of useful data.
I think this post is probably most useful for making individual forecasters aware of common failure modes so they can attempt to learn from them, and for informing decision makers about these failure modes, rather than for providing those looking to use forecasts with, e.g., a transform they should apply to long term forecasts.
Background
There has been a great deal of interest in forecasting in the EA community in recent years, particularly with the prominence of longtermist thinking. It is clearly of great interest that we be well equipped to make predictions about future events, and to understand the accuracy and failure modes of such predictions. Additionally, many of the questions we care most about have long time horizons; therefore, any evidence that helps us become better at making long term predictions in particular could be quite valuable (see also Muehlhauser, 2019).
Some potential tools for making longer term forecasts include:
extrapolating from shorter term forecasts expected to be correlated with the long term question - this transforms the question into a different problem: forecasting which intermediate milestones might usefully predict our ultimate questions, and how well.
assuming that those who are well calibrated in the short term will also do well in the long term, and using their forecasts.
The second point here is probably to a certain extent unavoidable, as most forecasters will get few totally independent iterations of making long term forecasts in their lifetimes. One could potentially make hundreds of 10 year predictions now, but lessons cannot be drawn directly from these for 10 years, and if the wrong lessons are learned, it could take another 10 year iteration to realise that. In addition, these hundreds of forecasts may not be independent. I think many forecasts made over longer time horizons will be subject to errors from the same sources, due to society wide effects such as, for example, rates of economic and technological development, or a more or less peaceful climate for international relations.
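The non-independence argument above can be illustrated with a toy simulation. Nothing here is from the post: the numbers, the single "society wide shock", and the function names are all assumptions. The point is just that when one shared shock shifts the outcome of every question at once, the realized hit rate across 100 forecasts varies far more between "worlds" than the independent-questions calculation would suggest, so 100 such forecasts carry much less evidence than 100 independent ones.

```python
# Hypothetical illustration: correlated long-term forecasts via one shared shock.
import random

def simulate_hit_rate(n_questions, shared_shock, seed):
    """Each question resolves positively with base probability 0.7,
    shifted up or down for ALL questions by one shared shock draw."""
    rng = random.Random(seed)
    shock = rng.choice([-shared_shock, shared_shock])  # one draw per 'world'
    p = min(max(0.7 + shock, 0.0), 1.0)
    return sum(rng.random() < p for _ in range(n_questions)) / n_questions

def hit_rate_variance(shared_shock, worlds=2000, n_questions=100):
    """Variance of the realized hit rate across many simulated worlds."""
    rates = [simulate_hit_rate(n_questions, shared_shock, s) for s in range(worlds)]
    mean = sum(rates) / worlds
    return sum((r - mean) ** 2 for r in rates) / worlds

# Independent case: variance is roughly p*(1-p)/n = 0.7*0.3/100 = 0.0021.
print(hit_rate_variance(0.0))
# Shared shock of 0.2: the shock term dominates, and the variance is
# roughly 20x larger, as if far fewer than 100 questions had been asked.
print(hit_rate_variance(0.2))
```

In this toy setup the shock contributes a between-worlds variance of about 0.04 on top of the ~0.002 binomial term, which is why scoring a forecaster on one decade's worth of correlated questions can be so noisy.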