
Prediction Resources


Analysis tools

  • Guesstimate

    A simple web-based tool to model uncertainties in calculations. Guesstimate's interface is similar to other spreadsheet tools, such as Excel or Google Sheets. Each model is a grid of cells, and each cell can be filled with a name and value. Functions can be used to connect cells together to represent more complex quantities.

    For example, consider the question series about the Fermi paradox. We may use the Drake equation (a "back of the envelope" estimate of whether there is intelligent life in the Milky Way other than us humans) to estimate the number of intelligent civilizations in our galaxy based on 7 different variables (see Drake equation). Each guess has its own uncertainty, and with Guesstimate you can multiply the guesses and their uncertainties together to get a probability distribution over the number of intelligent civilizations. See the following model by a Guesstimate user on this probability. Also check out public models.

    Don't forget to post your models in the comments of questions for others to see!

  • Spreadsheets such as Excel or Google Sheets

    For both theoretical modelling and basic statistical analysis. First, spreadsheets offer similar options to Guesstimate: you can create theoretical models to factorize questions, produce estimates for subquestions, and run basic Monte Carlo simulations (see here for an example of such a simulation). Second, basic statistical analysis (descriptive statistics, correlations, regressions and so on) is convenient in Excel (see here for more information). Finally, spreadsheets created on Google Sheets can be shared in the comments, to allow others to view your work.

  • Statistical Software

    Such as R, for more advanced statistical computing (linear and nonlinear modeling, classic statistical tests, time-series analysis, classification, clustering) and graphics. You can download it here for free.

  • Probability Distribution Calculators

    Such as the Normal distribution calculator, the Binomial distribution calculator, and the Poisson distribution calculator. Lastly, check out this Bayes Rule Calculator for updating your credence for yes/no questions given new information.

  • HASH

    System modeling software can generate and inform forecasts of complex systems. HASH can be used to represent complex systems and run "what-if" scenarios, to hone your intuitions and improve your predictions.
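
The Drake equation example above (multiplying several uncertain guesses to get a distribution) can also be sketched in a few lines of Python. This is a minimal illustration, not how Guesstimate itself works internally: the interval values in `DRAKE_INTERVALS`, the choice of lognormal distributions, and all function names are assumptions made for the example.

```python
import math
import random
import statistics

# Hypothetical 90% credible intervals (low, high) for the seven Drake factors.
# These numbers are illustrative placeholders, not estimates from any source.
DRAKE_INTERVALS = {
    "R_star": (1.0, 10.0),      # new stars formed per year
    "f_p": (0.2, 1.0),          # fraction of stars with planets
    "n_e": (0.1, 5.0),          # habitable planets per such star
    "f_l": (0.001, 1.0),        # fraction of those where life appears
    "f_i": (0.001, 1.0),        # fraction of those developing intelligence
    "f_c": (0.01, 1.0),         # fraction of those that become detectable
    "L": (100.0, 1_000_000.0),  # years a civilization stays detectable
}


def sample_lognormal(low, high, rng):
    """Draw from a lognormal whose 5th and 95th percentiles are low and high."""
    mu = (math.log(low) + math.log(high)) / 2
    sigma = (math.log(high) - math.log(low)) / (2 * 1.645)
    return rng.lognormvariate(mu, sigma)


def simulate_drake(n_samples, seed=0):
    """Multiply one random draw per factor, repeated n_samples times."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_samples):
        product = 1.0
        for name, (low, high) in DRAKE_INTERVALS.items():
            draw = sample_lognormal(low, high, rng)
            if name.startswith("f_"):
                draw = min(draw, 1.0)  # fractions cannot exceed 1
            product *= draw
        results.append(product)
    return results


def summarize(samples):
    """Return the 5th percentile, median, and 95th percentile."""
    cuts = statistics.quantiles(samples, n=20)  # 19 cut points, 5% apart
    return cuts[0], statistics.median(samples), cuts[-1]


if __name__ == "__main__":
    p5, median, p95 = summarize(simulate_drake(10_000))
    print(f"5%: {p5:.3g}  median: {median:.3g}  95%: {p95:.3g}")
```

Reporting percentiles rather than a single product makes the spread visible: with inputs this uncertain, the 5th and 95th percentiles typically differ by many orders of magnitude.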

Tutorials, textbooks and other resources

  • Join the Social Science Prediction Platform, which supports the "systematic collection and assessment of expert forecasts of the effects of untested social programs." It is designed to assist policy makers and social scientists by improving the accuracy of forecasts, thereby leading to more effective decision-making and improvements to experimental design and analysis.

  • Play Calibrate Your Judgment, an interactive calibration tutorial produced by the Open Philanthropy Project. This is perhaps the most useful free online calibration training currently available. Note that you must sign in with a GuidedTrack, Facebook, or Google account, so that the application can track your performance over time.

  • AI Impacts' Evidence on good forecasting practices from the Good Judgment Project summarizes the findings of the Good Judgment Project, the winning team in IARPA's 2011-2015 forecasting tournament. The article describes the various correlates of successful forecasting as well as the heuristics, forecasting methodologies, philosophical outlooks, and thinking styles that were associated with better predictions. Furthermore, it includes a helpful "recipe" for making predictions that describes how superforecasters (top 2% of forecasters) go about making their predictions.

  • Forecasting: Principles and Practice provides a comprehensive introduction to forecasting methods and presents enough information about each method for readers to use them sensibly. The book is easy to read, concise, and presumes only basic statistics knowledge.

    The book presents key concepts of forecasting, from judgmental forecasting (which can be useful when you have little or no data) to simple and multiple regression, time series decomposition, exponential smoothing (ETS), and a few more advanced topics such as neural networks (all in R). The book is optimized for providing useful advice on the making of predictions, and does not attempt to give a thorough discussion of the theoretical details behind each method.

  • Open Textbooks on Forecasting and Related Courses by Francis Diebold, and especially his Time-Series Econometrics: Forecasting, which provides an upper-level undergraduate / masters-level introduction to forecasting, broadly defined to include all aspects of predictive modeling, in economics and related fields. Having used this book for my macroeconometrics course, I highly recommend it, especially for the modeling of autoregressive processes for making point and density forecasts (which are especially useful for numeric-range predictions on Metaculus).

    The topics covered include: regression from a predictive viewpoint; conditional expectations vs. linear projections; decision environment and loss function; the forecast object, statement, horizon and information set; the parsimony principle, relationships among point, interval and density forecasts, and much more. The book can be found here, and the lecture slides covering material in the book can be found here. Diebold's resources are licensed under Creative Commons.
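
To give a flavor of the autoregressive point and density forecasts mentioned above, here is a minimal sketch of a first-order autoregression, AR(1), fitted by ordinary least squares. It is not taken from Diebold's book: the function names are hypothetical, the forecast density is assumed to be Gaussian, and parameter uncertainty is ignored, so the intervals it produces are somewhat too narrow.

```python
import math


def fit_ar1(series):
    """Estimate y_t = c + phi * y_{t-1} + e_t by ordinary least squares."""
    x = series[:-1]  # lagged values
    y = series[1:]   # current values
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    phi = sxy / sxx
    c = mean_y - phi * mean_x
    residuals = [yi - (c + phi * xi) for xi, yi in zip(x, y)]
    # Residual standard deviation with two estimated parameters.
    sigma = math.sqrt(sum(r * r for r in residuals) / (n - 2))
    return c, phi, sigma


def forecast_ar1(last_value, c, phi, sigma, horizon):
    """Return (mean, std) of the Gaussian forecast `horizon` steps ahead."""
    mean = last_value
    variance = 0.0
    for _ in range(horizon):
        mean = c + phi * mean                       # point forecast recursion
        variance = phi ** 2 * variance + sigma ** 2  # uncertainty accumulates
    return mean, math.sqrt(variance)


def interval_90(mean, std):
    """Central 90% interval of a normal distribution."""
    return mean - 1.645 * std, mean + 1.645 * std
```

For a numeric-range question, the mean is the point forecast and the (mean, std) pair describes the density forecast; `interval_90` turns it into a range. The standard deviation grows with the horizon, which is why longer-range predictions should be wider.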

Research on forecasting

Below is a small selection from the extensive research literature on forecasting.

Tips on how to become a better predictor

  • Avoid overconfidence.

    Overconfidence is a common finding in the forecasting research literature, and was found to be present in a 2016 analysis of Metaculus predictions. Overconfidence comes in many forms, such as overconfidence in intuitive judgements, explicit models, or (your own or others') domain-specific expertise.

    Generally overconfidence leads people to:

    1. neglect decision aids or other assistance, thereby increasing the likelihood of a poor decision. In experimental studies of postdiction in which each subject was provided with decision aids, subject-level expertise (and thereby confidence) was found to be correlated with lower use of reliable decision aids, and with worse predictions overall.
    2. make predictions contrary to the base rate. The base rate is the prevalence of a condition in the population under investigation. To expect the future to be substantially different from the past, one must have good evidence that i) some process crucial to bringing the usual result about will fail, and ii) the replacement process will produce a different outcome. Bayes' rule teaches us that to predict unlikely events we must have highly diagnostic information (information that you'd be unlikely to observe in the usual case). Yet predictors often rely on their confidence, rather than on the diagnosticity of the evidence, when going against the base rate.

    To counteract overconfidence, forecasters should heed five principles: (1) Consider alternatives, especially in novel or unprecedented situations for which data is lacking; (2) List reasons why the forecast might be wrong; (3) In group interaction, appoint a devil's advocate (or play the devil's advocate in the comment section!); (4) Obtain feedback about predictions (by posting them in the comments, for example); (5) Treat the feedback you receive as valuable information.

  • Break seemingly intractable problems into tractable sub-problems.

    This is Fermi-style thinking. Enrico Fermi designed the first nuclear reactor. When he wasn't doing that, he loved to tackle challenging questions such as "How many piano tuners are in Chicago?" At first glance, this seems very difficult. Fermi started by decomposing the problem into smaller parts and sorting them into the buckets of knowable and unknowable. By working at a problem this way you expose what you don't know or, as Tetlock (2016) puts it, you "flush ignorance into the open."

  • Discover the relevant base rate.

    A Metaculus time lord knows that there is nothing truly new under the sun. So, the best forecasters often conduct creative searches for comparison classes even for seemingly unique events and pose the question: How often do things of this sort happen in situations of this sort? Identify comparison classes for events, and let your predictions be informed by the base rate of occurrence in this class of events. This is often easier and more effective than trying to understand the event's workings from first principles.

  • Combine a systematic 'model-thinking' approach with an intuition-based approach.

    Whilst it is often good to use a systematic 'model-thinking' approach that relies on explicit theoretical or statistical reasoning, you should generally also use an intuition-based approach to predicting. When these two approaches yield different answers, think carefully about whether your question is the type of question that is better answered with intuitive judgments or with systematic modelling, and combine the two answers accordingly to inform your prediction. According to Kahneman, intuitive judgements about some subject are likely to be accurate only when the following three conditions hold:

    • The relevant subject exhibits a large degree of regularity.
    • One has had a sufficient amount of exposure to this subject to have been able to pick up the relevant regularities.
    • One has received enough feedback to evaluate previous intuitive judgments.

  • Look for the errors behind your mistakes.

    It's easy to justify or rationalize your failure. Don't. Own it, evaluate your track record (both resolution and calibration), and compare it to the community track record. You want to learn where you went wrong and determine ways to get better. And don't just look at failures. Evaluate successes as well, so you can determine whether you used reliable techniques for producing forecasts or whether you were just plain lucky. For example, if you have an average log-score above 0.2, this might be evidence of overconfidence, in which case you should follow the tips on counteracting overconfidence presented above.

  • Share your work in the question's comments section.

    Sharing your theoretical reasoning (such as posting your Guesstimate model), statistical reasoning, information/data sources, or dependencies with others is good practice not just because you're providing a valuable public good for our understanding of the future, but also because others may supplement your work with additional insight.
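
The point about base rates and diagnostic evidence in the overconfidence tip can be made concrete with Bayes' rule for a yes/no question. The sketch below is illustrative only: the function name and the example probabilities are assumptions, chosen to show how little a weakly diagnostic observation should move you away from a low base rate.

```python
def posterior(prior, p_evidence_if_true, p_evidence_if_false):
    """Bayes' rule for a yes/no hypothesis given one piece of evidence.

    prior: base rate of the event before seeing the evidence.
    p_evidence_if_true: chance of observing the evidence if the event happens.
    p_evidence_if_false: chance of observing it in the usual (no-event) case.
    """
    numerator = prior * p_evidence_if_true
    denominator = numerator + (1 - prior) * p_evidence_if_false
    return numerator / denominator


if __name__ == "__main__":
    base_rate = 0.05  # hypothetical: the event happens 5% of the time

    # Evidence that is also fairly common in the usual case: weakly diagnostic.
    weak = posterior(base_rate, 0.8, 0.4)

    # Evidence that is rarely seen in the usual case: highly diagnostic.
    strong = posterior(base_rate, 0.8, 0.02)

    print(f"weakly diagnostic evidence:  {weak:.3f}")
    print(f"highly diagnostic evidence: {strong:.3f}")
```

With these assumed numbers, the weakly diagnostic evidence lifts a 5% base rate to only about 10%, while the highly diagnostic evidence lifts it to about 68%. Feeling confident does not change either number; only the ratio of the two likelihoods does.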

Data Sources

General Data Sources (in no particular order)

| Data Service | Organization | Topics | Size | Ease of Use | Comments |
| --- | --- | --- | --- | --- | --- |
| Public Data Explorer | Google | All topics | Very large. Public Data Explorer aggregates public data from 113 dataset providers (such as international organizations, national statistical offices, non-governmental organizations, and research institutions). | Very easy. This is a good place to start with your search for data, since many datasets are available which are often straightforward to find. There are sometimes also great visualizations. | This is perhaps the best place to look for public data and forecasts provided from third-party data providers. Highly recommended also is the International Futures Forecasting Data on long-term forecasting and global trend analysis available on the Public Data Explorer. |

Macroeconomic & Financial Only Data Sources (in no particular order)

| Data Service | Organization | Topics | Size | Ease of Use | Comments |
| --- | --- | --- | --- | --- | --- |
| Bureau of Economic Analysis | U.S. Department of Commerce | Official macroeconomic and industry statistics, most notably reports about the gross domestic product (GDP) of the United States, as well as personal income, corporate profits and government spending | Large | Easy | |