All it takes to improve forecasting is KEEP SCORE
Will Syria’s President Assad still be in power at the end of next year?
Will Russia and China hold joint naval exercises in the Mediterranean in the next six months?
Will the Oil Volatility Index fall below 25 in 2016?
Will the Arctic sea ice mass be lower next summer than it was last summer?
Five hundred such questions of geopolitical import were posed in tournament mode to thousands of amateur forecasters by IARPA—the Intelligence Advanced Research Projects Activity—between 2011 and 2015.
(Tetlock mentioned that senior US intelligence officials opposed the project, but younger-generation staff were able to push it through.)
Extremely careful score was kept, and before long the most adept amateur “superforecasters” were doing 30 percent better than professional intelligence officers with access to classified information.
They were also better than prediction markets and drastically better than famous pundits and politicians, whom Tetlock described as engaging in a deliberately vague “ideological kabuki dance.”
What made the amateurs so powerful was Tetlock’s insistence that they score geopolitical predictions the way meteorologists score weather predictions and then learn how to improve their scores accordingly.
Meteorologists predict in percentages—“there is a 70 percent chance of rain on Thursday.”
It takes time and statistics to find out how good a particular meteorologist is.
If it in fact rained on 7 out of 10 such occasions, the meteorologist gets a high score for calibration (the right percentage) and for resolution (it mostly did rain).
Superforecasters, remarkably, assigned probability estimates of 72-76 percent to things that happened and 24-28 percent to things that didn’t.
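To make the scoring idea concrete, here is a minimal sketch in Python. The summary does not say which scoring rule the tournament used; the Brier score shown here (mean squared error between the stated probability and the 0-or-1 outcome, lower is better) is one common way to score weather forecasts and is an assumption. The function names and the sample data are hypothetical, chosen only to mirror the numbers in the text.

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between probabilities and 0/1 outcomes.

    0.0 is a perfect score; always answering 0.5 scores 0.25.
    Assumption: this is the simple binary form of the Brier score.
    """
    if not forecasts or len(forecasts) != len(outcomes):
        raise ValueError("need equally long, non-empty lists")
    squared_errors = [(p - o) ** 2 for p, o in zip(forecasts, outcomes)]
    return sum(squared_errors) / len(squared_errors)


def observed_frequency(forecasts, outcomes, stated, tol=1e-9):
    """Share of events that happened among forecasts made at `stated`.

    A forecaster is well calibrated at `stated` when this share is
    close to `stated` itself.
    """
    matching = [o for p, o in zip(forecasts, outcomes) if abs(p - stated) <= tol]
    if not matching:
        raise ValueError("no forecasts were made at that probability")
    return sum(matching) / len(matching)


# Ten "70 percent chance of rain" forecasts; it rained on 7 of the 10 days.
rain_forecasts = [0.7] * 10
rain_outcomes = [1] * 7 + [0] * 3
calibration_rate = observed_frequency(rain_forecasts, rain_outcomes, 0.7)
rain_brier = brier_score(rain_forecasts, rain_outcomes)

# A timid forecaster who always says 50 percent on four made-up questions.
hedger_brier = brier_score([0.5] * 4, [1, 1, 0, 0])

# A forecaster who says 74 percent for things that happened and
# 26 percent for things that did not (midpoints of the ranges in the text).
decisive_brier = brier_score([0.74, 0.74, 0.26, 0.26], [1, 1, 0, 0])

print(calibration_rate, rain_brier, hedger_brier, decisive_brier)
```

Under this rule the always-50-percent forecaster scores worse than the one who leans the right way on every question, which is the sense in which keeping score rewards justified confidence.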
How did they do that?
They learned, Tetlock said, to avoid falling for the “gambler’s fallacy”—detecting nonexistent patterns.
They learned objectivity—the aggressive open-mindedness it takes to set aside personal theories of public events.
They learned not to overcompensate for previous mistakes, the way American intelligence professionals overcompensated for the false negative of 9/11 with the false positive of weapons of mass destruction in Saddam’s Iraq.
They learned to analyze from the outside in—Assad is a dictator; most dictators stay in office a very long time; consider any current news out of Syria in that light.
And they learned to balance between over-adjustment to new evidence (“This changes everything”) and under-adjustment (“This is just a blip”), and between overconfidence (“100 percent!”) and over-timidity (“Um, 50 percent”).
“You only win a forecasting tournament,” Tetlock said, “by being decisive, justifiably decisive.”
Much of the best forecasting came from teams that learned to collaborate adroitly.
Diversity on the teams helped.
One important trick was to give extra weight to the best individual forecasters.
Another was to “extremize” to compensate for the conservatism of aggregate forecasts—if everyone says the chances are around 66 percent, then the real chances are probably higher.
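The two team tricks can be sketched in a few lines of Python. The summary gives no formulas, so everything here is an assumption: the weights are invented, and the extremizing transform p^a / (p^a + (1 - p)^a) with an exponent a greater than 1 is one commonly used form, not necessarily the one the tournament teams applied.

```python
def weighted_mean(probabilities, weights):
    """Pool forecasts, giving more weight to better forecasters.

    The weights are hypothetical; in practice they might come from
    each forecaster's past scores.
    """
    if not probabilities or len(probabilities) != len(weights):
        raise ValueError("need equally long, non-empty lists")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(p * w for p, w in zip(probabilities, weights)) / total


def extremize(p, a=2.0):
    """Push a pooled probability away from 0.5.

    Assumed transform: p**a / (p**a + (1 - p)**a). With a > 1, values
    above 0.5 move up, values below 0.5 move down, and 0.5 stays put.
    The default a=2.0 is an illustrative choice, not a fitted value.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability between 0 and 1")
    numerator = p ** a
    return numerator / (numerator + (1.0 - p) ** a)


# Three made-up team members; the third has the best track record,
# so that forecast counts double.
team_forecasts = [0.60, 0.66, 0.72]
team_weights = [1.0, 1.0, 2.0]

pooled = weighted_mean(team_forecasts, team_weights)
sharpened = extremize(pooled)

print(pooled, sharpened)
```

Averaging tends to pull a group toward the middle because each member holds only part of the evidence; the transform is one way to undo some of that pull. How far to push (the exponent) would have to be tuned against scored results rather than picked by hand.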
In the Q & A following his talk Tetlock was asked if the US intelligence community would incorporate the lessons of its forecasting tournament.
He said he is cautiously optimistic.
Pressed for a number, he declared, “Ten years from now I would offer the probability of .7 that there will be ten times more numerical probability estimates in national intelligence estimates than there were in 2005.”
Asked about long-term forecasting, he replied, “Here’s my long-term prediction for Long Now.
When the Long Now audience of 2515 looks back on the audience of 2015, their level of contempt for how we go about judging political debate will be roughly comparable to the level of contempt we have for the 1692 Salem witch trials.”