Russ Roberts: Well, I want to talk about, in the psychology literature particularly, this issue of priming, that was recently talked about. But we can talk about lots of things. So, what do you want to talk about? Andrew Gelman: I want to give three examples. So the first example was something that really matters, and there’s a lot of belief that early childhood intervention should work. Although the number of 42% sounds a little high. And it’s also a case where people actually care about how much it helps. It’s not enough to say—even if you could somehow prove beyond a shadow of a doubt that it had a positive effect, you’d need to know how much of an effect it is. Because it’s always being compared to other potential uses of tax dollars. Or individual dollars. So, I want to bring up two other examples. So, the second example is from a few years ago. There was a psychologist at Cornell University who did an experiment on Cornell students testing for ESP (Extra Sensory Perception). And he wrote a paper finding that these students could foretell the future. And it was one of these lab experiments—I don’t remember the details, but they could click on something and—it was one of these things where you could only know the right answer after you clicked on it. And he claimed that they were predicting the future. And if you look carefully— Russ Roberts: Andrew, I’ve got to say before you go on—when I saw the articles on that study, I thought it was from the Onion. But it’s evidently a real paper. Any one of these, by the way, strikes me as an Onion article—that people named Dennis are more likely to be dentists. Andrew Gelman: I was going to get to that one. Russ Roberts: Yeah, well, go ahead. Go with the ESP first. Andrew Gelman: So, the early childhood
intervention is certainly no Onion article. The ESP article was published in the Journal of Personality and Social Psychology, which is one of the top journals in the field. Now, when it came out, the take on it was that it was an impeccably done study; and, sure, most people didn’t believe it. I don’t even think the editor of the journal believed it. They published it nonetheless. Why did they publish it? Part of it is: we’re scientists and we don’t want to be suppressing stuff just because we don’t believe it. But part of it was the take on it—which I disagree with, by the way. At the time, the take was that this was an impeccably done study, high-quality research; it had to be published, because if you’re publishing these other things you have to publish this, too. And there’s something wrong. Once it came out, there was obviously something wrong there. Like, what did they do wrong? It was like
a big mystery. Oh, and by the way: The paper was featured, among other places, completely uncritically on the Freakonomics blog. Russ Roberts: I’m sure it made the front page of newspapers and the nightly news— Andrew Gelman: It was on the front page of the New York Times. Yeah. So in the newspaper—they were more careful in the newspaper than in Freakonomics, and they wrote something like, ‘People don’t really believe it, but this is a conundrum.’ If you look at the
paper carefully, it had so many forking paths: there’s so much p-hacking. In almost every paragraph of the results section, they try one thing; it doesn’t work; they try something else. It’s
the opposite of a controlled study. The experiment was controlled: they randomly assigned treatments. But then the analysis was completely uncontrolled. It’s super-clear that they had many more than 20 things they could have done for every section, for every experiment. It’s not at all a surprise that they could have gotten statistical significance. And what’s funny is when it came out, a lot of
people—like, the journal editor—were like, ‘Oh, this is solid work.’ Well, like, that’s what people do in psychology. This is
a standard thing. But when you look at it carefully it’s completely—it was terrible. Russ Roberts: So, in that example—I mean, what’s interesting about
that for me is that you say, ‘In the results it was clear to you.’ But of course in retrospect, in many published studies—the phrase I like is ‘We don’t get to be in the kitchen with the statistician, the economist, the psychologist. We don’t know what was accepted and rejected.’ So, one of my favorites is that baseball players whose names start with ‘K’ are more likely to strike out. Well, did you look at basketball players and see whether those whose names start with ‘A’ are more likely to have assists? Did you look at—how many things did you look at? And if you don’t tell me that—for those listening at home who are not from America, or who don’t follow baseball, or who don’t keep track of the game via scorecard: ‘K’ is the shorthand abbreviation for a strikeout—which, of course, is funny because I’m sure some athletes don’t know that either. But the claim was that they are more likely to strike out. I don’t know the full range of things that the author has tested for unless they give me what I’ve started to call the GoPro—you wear the head camera—and I get to see all your regressions; and all your different specifications; and all the assumptions you made about the sample; and who you excluded; and which outliers you dropped. Now, sometimes you get some of that detail. Sometimes authors will tell you. Andrew Gelman: This is like cops—like research [? audio garbled—Econlib Ed.] Russ Roberts: Exactly. Andrew Gelman: So, it’s actually worse than that. It’s not just all the analyses you did. It’s all the analyses you could have done. And so, some people wrote a paper, and they had a statistically significant result, and I didn’t believe it; and I gave all these reasons, and I said how it’s the garden of forking paths: if you had seen other data, you could have done your analysis differently. And they were very indignant.
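[Russ’s ‘how many things did you look at?’ question—and Gelman’s earlier point that the authors had many more than 20 things they could have done—can be made concrete with a small simulation. This is a minimal sketch, not the Bem study’s actual design; the sample size, test, and counts are illustrative assumptions. —Econlib Ed.]

```python
import random

random.seed(1)

def significant(sample):
    """Crude two-sided z-test at the 5% level for 'mean is zero',
    assuming unit variance (true here by construction)."""
    n = len(sample)
    z = (sum(sample) / n) * n ** 0.5
    return abs(z) > 1.96

trials = 2000          # simulated "studies"
tests_per_study = 20   # outcomes examined per study
hits = 0
for _ in range(trials):
    # Every outcome is pure noise: 50 draws from a standard normal.
    if any(significant([random.gauss(0, 1) for _ in range(50)])
           for _ in range(tests_per_study)):
        hits += 1

# With 20 chances, 'at least one significant result' occurs with
# probability 1 - 0.95**20, roughly 64%, even though nothing is there.
print(round(hits / trials, 2))
```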
And they said, ‘How can you dismiss what we did based on your assumption’—that’s me—how can I dismiss what they did based on my assumption about what they would have done, had the data been different? That seems super-unfair. Russ Roberts: It does. Andrew Gelman: Like, how is it that I come in from the outside? And the answer is that, if you report a p-value in your paper—the probability that a result at least as extreme would have occurred, had the data been generated at random—your p-value is literally a statement about what you would have done had the data been different. So the burden is on you. So, to get back to the person who bugs you at the cocktail party: if someone says, ‘This is statistically significant, the p-value is less than .05; therefore, had the data been noise, it’s very unlikely we would have seen this,’ they are making a statement saying, ‘Had the data looked different, we would have done the exact same analysis.’ They are making a statement about what they would have done. So, the GoPro wasn’t even quite enough. Because my take on it is that people navigate their data. So, you see an interesting pattern in some data, and then you go test it. It’s not—like, the thing with the assists, the letter ‘A’, whatever—maybe they never did that. However, had the data been different, maybe they would have looked at something different. They would have been able— Russ Roberts: And someone—I didn’t read carefully in this, but someone did write a response to that article saying that it turned out that people with the letter ‘O’ struck out even more often. What do you do with that? Which is a different variation on that—all the possible things you could have looked at. Andrew Gelman: Well, they also found that—my favorite was the lawyers—they looked at the number of lawyers named ‘Laura,’ and the number of dentists named ‘Dennis’.
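[Gelman’s ‘navigating the data’ point is subtler than ordinary multiple comparisons, and a small simulation can illustrate it. Here only one test is ever reported per dataset, but which test gets run depends on what the data look like; the two-outcome setup and numbers are made up purely for illustration. —Econlib Ed.]

```python
import random

random.seed(2)

def z_stat(sample):
    """z-statistic for 'mean is zero', assuming unit variance."""
    n = len(sample)
    return (sum(sample) / n) * n ** 0.5

trials = 2000
false_positives = 0
for _ in range(trials):
    # Two pure-noise outcomes; the researcher looks at the data first
    # and tests only whichever outcome looks more promising.
    a = [random.gauss(0, 1) for _ in range(50)]
    b = [random.gauss(0, 1) for _ in range(50)]
    chosen = a if abs(z_stat(a)) >= abs(z_stat(b)) else b
    if abs(z_stat(chosen)) > 1.96:   # the single reported test
        false_positives += 1

# Only one analysis appears in each "paper," yet the error rate is
# about 1 - 0.95**2, roughly 10%, not the nominal 5%.
print(round(false_positives / trials, 2))
```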
And there are about twice as many lawyers named ‘Laura’ and dentists named ‘Dennis’ as you would expect if the names were just at random. And I believe this. So, when I— Russ Roberts: Twice as much! How could you—it’s obviously not random! Andrew Gelman: Well, no. Well, twice as much—well, yeah. Twice as much is, first— Russ Roberts: Twice as much as what? Andrew Gelman: It’s not as ridiculous as you might think. So, it goes like this. Very few people are dentists. So, if, like, 1% of the people named ‘Dennis’ decide to become dentists because of their name, that will be enough to double the number of dentists named ‘Dennis.’ Because it’s a rare career choice. So, in some ways it’s not the most implausible story in the world. It actually takes only a small number of people choosing their career based on their name to completely do this to the statistics. But—and I bought it. I was writing about it. But then someone pointed out that the names ‘Laura’ and ‘Dennis’ were actually quite popular many years ago—like, I guess, when we were kids or even before then. And when the study was done, the lawyers and dentists in the study were mostly middle-aged people. So, in fact, they hadn’t corrected for the age distribution. So there was something that they hadn’t thought of. It was an uncontrolled study. So, I bring up the ESP only because that’s a case where it’s pretty plausible that it was just noise. And then when you look carefully at what they did, it’s pretty clear that they did just zillions of different analyses.
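[Gelman’s back-of-the-envelope argument about the dentists can be checked with made-up numbers; the 1% figures below are illustrative assumptions, not data from the study. —Econlib Ed.]

```python
# All numbers are illustrative assumptions, not data from the study.
dentist_rate_overall = 0.01   # suppose ~1% of people become dentists
name_driven_extra = 0.01      # suppose 1% of Dennises are nudged by the name

dentist_rate_among_dennises = dentist_rate_overall + name_driven_extra
print(dentist_rate_among_dennises / dentist_rate_overall)  # → 2.0
```

Because dentistry is a rare career, a tiny name-driven nudge is enough to double the rate among Dennises.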