After writing in praise of the honesty and accuracy of FiveThirtyEight’s results, I felt uncomfortable about the asymmetry in the way I’d treated Democrats and Republicans in the evaluation. In the plots I made, low-probability Democratic predictions that went wrong pop out on the left-hand side, whereas low-probability Republican predictions that went wrong would get buried in the smooth glide down to zero on the right-hand side. So I decided that what I’m really interested in is all low-probability predictions, and that I should treat them symmetrically.
For each district there is a predicted loser (PL), with probability smaller than 1/2. In about one third of the districts the PL was assigned a probability of 0. The expected number of PLs (EPL) who would win is simply the sum of all the predicted win probabilities that are smaller than 1/2. (Where multiple candidates from the same party are in the race, I’ve combined them.) The 538 EPL was 21.85. The actual number of winning PLs was 13.
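As a concrete sketch of the bookkeeping (with made-up probabilities, not the actual 538 forecasts), the EPL is just the sum of whichever side of each race sits below 1/2:

```python
# Hypothetical win probabilities for one party in seven districts
# (made-up numbers, not the real 538 data).
probs = [0.93, 0.41, 0.07, 0.65, 0.0, 0.28, 0.88]

# In each district the predicted loser's probability is whichever
# side's probability is below 1/2.
pl_probs = [min(p, 1.0 - p) for p in probs]

epl = sum(pl_probs)   # expected number of predicted losers who win
print(round(epl, 2))  # 1.3 for these made-up numbers
```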
What I am testing is whether 538 made enough wrong predictions. This is the opposite of the usual evaluation, which gives points for getting predictions right. But measured against their own stated probabilities, the number of districts that went the opposite of the way they predicted was substantially lower than their model implied it should be. That is prima facie evidence that the PL win probabilities were being padded somewhat. To be more precise, under the 538 model the number of winning PLs should be approximately Poisson distributed with parameter 21.85, meaning that the probability of 13 or fewer PLs winning is 0.030. That is kind of low, but still pretty impressive, given all the complications of the prediction game.
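The tail probability is easy to check: the number of winning PLs is a sum of many (nearly) independent indicators with small success probabilities, hence approximately Poisson with mean 21.85. A quick stdlib-only computation:

```python
import math

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam), by summing the pmf terms."""
    term = math.exp(-lam)   # P(X = 0)
    total = term
    for i in range(1, k + 1):
        term *= lam / i     # P(X = i) = P(X = i-1) * lam / i
        total += term
    return total

p = poisson_cdf(13, 21.85)  # P(13 or fewer upsets under the 538 model)
print(f"{p:.3f}")           # 0.030
```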
Below I show plots of the errors for various scenarios, measuring the cumulative error for these symmetric low predictions. (I’ve added an “Extra Tarnished” scenario, with the transformation based on the even more extreme beta(.25,.25).) I show it first without adjusting for the total number of predicted winning PLs:
We see that the tarnished scenarios predict far more PL victories than actually occurred. The actual predictions run just slightly above what you should expect, but the errors are suspiciously one-sided — that is, all in the direction of over-predicting PL victories, consistent with padding the margins slightly, erring in the direction of claiming uncertainty.
And here is an image more like the ones I had before, where all the predictions are normalised to correspond to the same number of predicted wins:
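For concreteness, here is a sketch of the kind of “tarnishing” transformation I mean — assuming (the exact convention in my plots may differ slightly) that the tarnished probability is the Beta(0.25, 0.25) CDF applied to the stated win probability, which pulls every probability towards 1/2, i.e. pads the underdog’s chances:

```python
import math

def beta_quarter_cdf(p):
    """CDF of Beta(0.25, 0.25) at p.  The substitution u = x**0.25
    removes the singularity at 0; symmetry handles p > 1/2."""
    if p > 0.5:
        return 1.0 - beta_quarter_cdf(1.0 - p)
    b = math.gamma(0.25) ** 2 / math.gamma(0.5)  # normalising constant
    upper = p ** 0.25
    n = 1000                                     # Simpson's rule (n even)
    h = upper / n
    f = lambda u: (1.0 - u ** 4) ** -0.75
    s = f(0.0) + f(upper)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(i * h)
    return 4.0 * (h / 3.0) * s / b

# A stated 10% chance becomes roughly a 31% chance:
print(round(beta_quarter_cdf(0.1), 2))  # 0.31
```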
New York Republican Representative Lee Zeldin was asked by reporter Tara Golshan how he felt about the fact that polls seem to show that a large majority of Americans — and even of Republican voters — oppose the Republican plan to reduce corporate tax rates. His response:
What I have come in contact with would reflect different numbers. So it would be interesting to see an accurate poll of 100 million Americans. But sometimes the polls get done of 1,000 [people].
Yes, that does seem suspicious, only asking 1,000 people… The 100 million people he has come in contact with are probably more typical.
Something I’ve been thinking about since the Brexit vote: There was a prevailing sentiment at the time that the British people are inherently conservative, and so would never vote to upend the international order. In fact, they did, by a small but decisive margin. But how was this “conservatism” imagined to act? The difference between 52-48 for Leave and 48-52 is happening in the minds of 4% of the population who might have decided the other way. Except that there’s nothing to tell them that they are on the margin. If you are negotiating over a policy, even if you start with some strategically maximum demand, you can look at where you are and step back if it appears you’ve crossed a dangerous line.
A referendum offers two alternatives, and one of them has to win. (Of course, a weird thing about the Brexit vote is that only one side — Remain — had a clear proposal. Every Leave voter was voting for the Leave in his mind. In retrospect, the Leave campaign is trying to stretch the mantle of democratic legitimation over their maximal demands.) There is no feedback mechanism that tells an individual “conservative” voter that the line is being crossed. Continue reading “Opinion polling can’t stabilise democracy”
From political journalist Simon Maloy in Salon:
People often note that public opinion of Hillary tends to be ossified after more than two decades spent continuously in the national political spotlight, but Trump’s unique and unrelenting awfulness as a candidate represents a timely opportunity to get voters to start thinking more positively about Hillary Clinton.
It amazes me that people can make claims like this, in light of the fact that her net favourability in Gallup polls has shifted by almost 50 points in three years. (She was at +31 in April 2015.)
People talk about Hillary Clinton’s poll-reported unpopularity as though it represented some natural fact about her. A failure of character, or a judgement on her weakness as a politician or human being. But it hasn’t always been that way. Just to check my memory, I looked up Gallup’s record: In April 2013, 64 percent of Americans surveyed had a favourable impression of her, as against 31 percent with an unfavourable impression. In May 2016 it was nearly reversed: 39 percent favourable, 54 percent unfavourable. Were there dastardly revelations about her character or public conduct in the interim? Or did she just happen to be the frontrunner in an ideologically heated Democratic primary? (By pure coincidence, the last time her relative favourability was negative was October 2000. I can’t remember what was going on then…)
As for Donald Trump (“Businessman Donald Trump”, as Gallup terms him), there has been only one Gallup survey — in June 2005 — that gave him a positive margin (51 to 38, so it wasn’t even close). Otherwise, every Gallup survey since they first asked about him in 1999 has shown net negative favourability, usually by a wide margin.
The New Republic has published a film review by Yishai Schwartz under the portentous title “The Edward Snowden Documentary Accidentally Exposes His Lies”. While I generally support — and am indeed grateful for — what Snowden has done, I am also sensitive to the problems of democratic governance raised by depending on individuals to decide that conscience commands them to break the law. We are certainly treading on procedural thin ice, and our only recourse, despite the commendable wish of Snowden himself, as well as Greenwald, to push personalities into the background, is to think carefully about the motives — and the honesty — of the man who carried out the spying. So in principle I was very interested in what Schwartz has to say.
Right up front Schwartz states what he considers to be the central dishonesty of Snowden’s case:
Throughout this film, as he does elsewhere, Snowden couches his policy disagreements in grandiose terms of democratic theory. But Snowden clearly doesn’t actually give a damn for democratic norms. Transparency and the need for public debate are his battle-cry. But early in the film, he explains that his decision to begin leaking was motivated by his opposition to drone strikes. Snowden is welcome to his opinion on drone strikes, but the program has been the subject of extensive and fierce public debate. This is a debate that, thus far, Snowden and his allies have lost. The president’s current drone strikes enjoy overwhelming public support.
“Democratic theory” is a bit ambivalent about the right of democratic majorities to annihilate the rights — and, indeed, the lives — of individuals, but the reference to “overwhelming” public support is supposed to bridge that gap. So how overwhelming is that support? Commendably, Schwartz includes a link to his source, a Gallup poll that finds 65% of Americans surveyed support “airstrikes in other countries against suspected terrorists”. Now, just stopping right there for a minute, in my home state of California, 65% support isn’t even enough to pass a local bond measure. So it’s not clear that it should be seen as enough to trump all other arguments about democratic legitimacy.
Furthermore, if you read down to the next line, you find that when the targets to be exterminated are referred to as “US citizens living abroad who are suspected terrorists” the support falls to 42%. Not so overwhelming. (Support falls even further when the airstrikes are to occur “in the US”, but since that hasn’t happened, and would conspicuously arouse public debate if it did, it’s probably not all that relevant.) Not to mention that Snowden almost surely did not mean that he was just striking out at random to undermine a government whose drone policies he disapproves of; but rather, that democratic support for policies of targeted killing might be different if the public were aware of the implications of ongoing practices of mass surveillance. Continue reading “The force of “overwhelming””
Quinnipiac has published a poll purporting to find the following facts:
- 55 percent of Americans say Edward Snowden is a “whistle-blower”, as opposed to 34 percent calling him a “traitor”;
- voters say 45 – 40 percent that the government’s anti-terrorism efforts go too far in restricting civil liberties, a reversal from a January 10, 2010 survey … when voters said 63 – 25 percent that such activities didn’t go far enough to adequately protect the country;
- While voters support the phone-scanning program 51 – 45 percent and say 54 – 40 percent that it “is necessary to keep Americans safe,” they also say 53 – 44 percent that the program “is too much intrusion into Americans’ personal privacy”.
Now, the most striking thing to me is that 88 percent of the people surveyed in January 2010 thought they knew enough about the government’s intrusion on personal privacy to even formulate an opinion — in particular, that 63 percent thought they knew enough about the scope to say that it didn’t go far enough.
But even more interesting is the formulation of the question that got 54% to agree that “the phone-scanning program” is “necessary”. (It is noteworthy that at least 4% of those surveyed both support the program and believe that it is “too much intrusion”. They must have a different concept than I have of either the word “support” or “too much”.) What they were asked was
Do you support or oppose the federal government program in which all phone calls are scanned to see if any calls are going to a phone number linked to terrorism?
Now, if you put it that way, I’d kind of support it myself. “Scanning” sounds pretty innocuous, and “phone numbers linked to terrorism” sound pretty ominous. But that’s only a small part of what’s being done. They are receiving all metadata — that’s a lot more than just a phone number — and storing them, presumably, forever. They are data-mining to try to identify patterns. They are already storing, or preparing to store, the content of all communications, so that it may be examined in depth if there is sufficient reason in the future.
And how much of this is about terrorism? We don’t know. And even if it is about terrorism right now, it won’t take long before enthusiastic or corrupt government officials think of all kinds of other legitimate purposes of government that could be promoted by just breaking down some of the petty bureaucratic restrictions on use of the data.
To put it in the crassest terms: This sort of unfocused big-data espionage may be marginally useful for catching terrorists, but it seems certain to be far more useful for pressuring or destroying political opponents of the anti-terror policies.
I probably shouldn’t be spending so much of my time thinking about U.S. election polls: I have no special expertise, and everyone else in the country has lost interest by now. But I’ve just gotten some new information about a question that was puzzling me throughout the recent election campaign: What do pollsters mean when they refer to a likely voter screen? Continue reading “Screens or Weights?”
As a sometime demographer myself, I am fascinated by the prominence of “demographics” as an explanatory concept in the recent presidential election, now already slipping away into hazy memory. Recent political journalism would barely stand without this conceptual crutch, as here and here and here. A bit more nuance here. Some pushback from the NY Times here.
The crassest expression of this concept came in an article yesterday by (formerly?) respected conservative journalist Michael Barone, explaining why his confident prediction that Mitt Romney would win the election by a large margin turned out wrong. Recall that several days before the election, despite the contrary evidence of what tens of thousands of voters were actually telling pollsters, he predicted 315 electoral votes for Romney, saying “Fundamentals usually prevail in American elections. That’s bad news for Barack Obama.” In retrospect, he says,
I was wrong because the outcome of the election was not determined, as I thought it would be, by fundamentals…. I think fundamentals were trumped by mechanics and, to a lesser extent, by demographics.
Continue reading “Are you demographic?”
Who speaks for statistics?
Ace forensic psephologist Nate Silver has attracted quite a bit of attention lately, with his 4+-year-old blog devoted to a statistical model that is intended to distil the entire range of public data into a single probabilistic prediction of the election outcome. Now, there are some clear criticisms that could be made of his approach and of his results — in particular, the obvious failure of his successive predictions to be martingales, as they would have to be if they were appropriately using all current information — but he has been remarkably clear and open about his procedures and principles, and his reasoning on matters large and small seems generally sound, if not necessarily compelling. It’s funny that his conclusions should arouse any controversy at all, given that they are hardly different (as Silver himself is quick to acknowledge) from the conclusions one would draw from a simplistic combination of poll results. His main contribution is in giving careful answers to the obvious critiques that could be proposed: What’s a reasonable estimate for the difference between state poll results and the actual election result? How correlated are polling errors? What’s the best way to average polls of varying qualities done over multiple days? And so on. In the end, the answer doesn’t differ much from what anyone with number sense would come up with in a few hours, but you don’t know that for sure until you do it. And Silver’s reputation derives from the sense and good care that he takes in posing these questions and resolving them.
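For readers wondering what the martingale requirement means here: an honest forecast is a conditional expectation of the final outcome, so today’s probability must equal the average of tomorrow’s possible probabilities — no predictable drift. A toy sketch (the ±1 “news shock” model is my own illustration, not Silver’s):

```python
# Toy model (my invention, not Silver's): the race is decided by N fair
# +/-1 "news shocks"; the candidate wins if the final margin is positive.
from functools import lru_cache

N = 15  # odd, so the final margin is never exactly zero

@lru_cache(maxsize=None)
def forecast(t, s):
    """Honest win probability given margin s after t of the N shocks."""
    if t == N:
        return 1.0 if s > 0 else 0.0
    # Tower property: average over the two equally likely next shocks.
    return 0.5 * forecast(t + 1, s + 1) + 0.5 * forecast(t + 1, s - 1)

# The martingale property: at every reachable state, today's forecast
# equals the expected value of tomorrow's forecast, so the prediction
# sequence has no predictable trend.
for t in range(N):
    for s in range(-t, t + 1, 2):
        nxt = 0.5 * forecast(t + 1, s + 1) + 0.5 * forecast(t + 1, s - 1)
        assert abs(forecast(t, s) - nxt) < 1e-12

print(forecast(0, 0))  # 0.5 by symmetry
```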
(The failure of the martingale property is actually evidence of his honesty in following the model that he set up back in the spring. He clearly would have been capable of recognising the trends that other people can see in the predictions, and introducing an ad hoc correction. He didn’t do that.) Continue reading “The Silver standard”