Rehabilitating the single-factor models

Lots of people — myself included — have mocked the penchant of a certain kind of political scientist for saying that all the superficial busyness of election campaigns is just a distraction: the campaign doesn’t matter, and neither, really, do the candidates. Presidential elections are decided by the “fundamentals” — usually one or two economic variables. Except that the models work much better for the past than for the present or future, and so end up with lots of corrections: so much for an ongoing war, so much for incumbency, or for a party having been in office too long, and so on. They seem kind of ridiculous. Obviously people care who the candidates are. And, of course, these experts agreed that those things weren’t irrelevant; they just tended to cancel each other out, because both major parties choose reasonably competent candidates who run competent campaigns.

And last year they said the fundamentals mildly favoured the Republican to win a modest victory. But the Republicans chose a ridiculous candidate who ran a flagrantly incompetent campaign. So of course this couldn’t be a test of the “fundamentals” theory. But after all that, the Republican won a modest victory. Kind of makes you think…

Making sense of the predictions

I absolutely agree that Sam Wang and the Princeton Election Consortium have a good argument for there being a 99+% chance of Clinton winning. Unfortunately, I think there’s only about a 50% chance of that argument being right. It could also be that Nate Silver is right, and the chance of a Clinton win is only 65%. Putting that all together, I come down right about where the NY Times is, at about an 85% chance of escaping apocalypse.
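In code, the aggregation is just a mixture of the two models. A minimal sketch; the 50/50 weight on the PEC argument being right is my own guess, not anything PEC or 538 published:

```python
# Model averaging over two forecasts of P(Clinton wins).
# The 0.5 weight on PEC's argument being right is an assumed prior, not theirs.
p_pec, p_538 = 0.99, 0.65   # the two models' forecasts
w_pec = 0.5                 # assumed chance the PEC argument is right

p_clinton = w_pec * p_pec + (1 - w_pec) * p_538
print(f"{p_clinton:.0%}")   # 82%, in the neighbourhood of the NYT's 85%
```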

I’ve written a bit more about how I think about the likelihoods here. But a fundamental problem with the PEC estimate is that it clearly puts very little weight on the possibility of model failure. A fundamental problem with the 538 estimates is that they are very clearly not martingales. That is, they are not consistent predictions of the future based on all available information. One way of saying this is to note that a few weeks ago Clinton had close to a 90% estimated victory probability. Now it’s 65%. That seems like a modest change, but if the first estimate was correct, the drop reflects an event that had less than a 1 in 3 chance of happening: by optional stopping, the probability that an honest forecast ever falls from 90% to 65% is at most (1 − 0.90)/(1 − 0.65) ≈ 0.29. So we’re already more than halfway to the 1-in-10 event of a Clinton loss, since a further fall from 65% to zero is itself only a roughly 1-in-3 event, and the two multiply back to the original 10%. But does anyone really think that the events of the past month have been that unlikely?
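To make the martingale point concrete, here is a minimal simulation under an invented signal model (the signal accuracy and count are arbitrary choices of mine): an honest forecast is a posterior probability updated on incoming evidence, and any such forecast respects the optional-stopping bound on big dips.

```python
import numpy as np

rng = np.random.default_rng(0)

def min_forecast(prior=0.90, q=0.6, n_signals=50):
    """One path of an honest forecast: the posterior probability that
    X = 1, updated after each of n_signals noisy bits that equal X
    with probability q. Such a posterior is a martingale.
    Returns the lowest value the forecast reaches along the way."""
    x = int(rng.random() < prior)          # the eventual outcome
    log_odds = np.log(prior / (1 - prior))
    step = np.log(q / (1 - q))             # log-likelihood ratio per bit
    lowest = prior
    for _ in range(n_signals):
        bit = x if rng.random() < q else 1 - x
        log_odds += step if bit == 1 else -step
        lowest = min(lowest, 1 / (1 + np.exp(-log_odds)))
    return lowest

# Chance the forecast ever dips from 0.90 to 0.65 or below, versus the
# optional-stopping bound (1 - 0.90)/(1 - 0.65) = 2/7, i.e. under 1 in 3.
n_runs = 20_000
dips = sum(min_forecast() <= 0.65 for _ in range(n_runs)) / n_runs
print(f"simulated P(dip to 65%): {dips:.3f}")     # comes in under the bound
print(f"optional-stopping bound: {0.10 / 0.35:.3f}")
```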

All-you-can-eat

Ars Technica reports on testimony by Mediacom, a large US cable company, explaining why they should not be required to stop capping data usage:

People thus shouldn’t complain when Internet providers impose data caps and charge more when customers go over them, he wrote. “Even though virtually every other industry prices its products and services in the same way, some people think that ISPs should be the exception and run their businesses like an all-you-can-eat buffet.”

“Virtually every other industry”… Yes, it’s pretty hard to think of any industry that offers all-you-can-eat buffets. Who could possibly afford to offer all-you-can-eat? It’s a fantasy.

Statistics and causal truth: Police edition

As usual, Andrew Sullivan — who has now returned temporarily to blogging, attracted like a moth to the Trump conflagration — manages to take a common, superficially convincing argument and express it with a moral fervour and personal conviction that make the tenuous logic really conspicuous. In this case, it’s the argument based on the much-discussed study by Roland G. Fryer, Jr. of the rates of various violent outcomes of police stops, which found that black people are more likely than white people to be physically abused by police, but not more likely to be shot.

(Here’s an excellent NY Times report, and the original study.)

…the Black Lives Matter activists, whose core and central argument is that black men are disproportionately killed by cops. The best data shows this is false… I find [the study] conclusive. Feelings do not, er, trump data in a deliberative democracy. A reader writes:

I understand that there has been the recent study suggesting that given an interaction with a police officer occurs, then the police officer is no more likely to use a gun with a black person than with a white person. However, given that many black men have a much higher rate of interaction with police (such as, anecdotally, Philando Castile, with 52 traffic stops), then is it not fair to say that black men are disproportionately killed by cops?

The point is that there is no evidence of individual racism in these police encounters, despite the impression from many chilling phone videos. The structural bias still exists as a whole, as I said, but the narrative about cops being more likely to kill a black member of the public when encountering him is false.

I have no criticism to make of the study — I have not analysed it in any depth, but it seems credibly and even impressively done — even if I find the premise absurd that a single study of such a complex phenomenon could be “conclusive”. But its findings do not “trump” the data showing that black people make up 13% of the US population, but 31% of those killed during an arrest, and 42% of those killed during an arrest when unarmed. The point is, what these facts (and many others, Fryer’s included) mean jointly depends on what we think is the reason for black people being so much more likely to be arrested in the first place.
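To spell out the reader’s point in the simplest terms, here is a toy calculation. The threefold stop-rate ratio is invented purely for illustration; none of these rates come from Fryer’s study or any other source:

```python
# Toy numbers for illustration only: hold P(shot | stop) exactly equal
# across groups and let only the stop rates differ.
pop_share = {"black": 0.13, "white": 0.62}    # share of US population
rel_stop_rate = {"black": 3.0, "white": 1.0}  # assumed stops per capita (relative)
p_shot_given_stop = 1.0                       # identical for both groups (arbitrary units)

shootings = {g: pop_share[g] * rel_stop_rate[g] * p_shot_given_stop
             for g in pop_share}
total = sum(shootings.values())
two_group_pop = sum(pop_share.values())
for g in pop_share:
    print(f"{g}: {pop_share[g] / two_group_pop:.0%} of population, "
          f"{shootings[g] / total:.0%} of those shot")
# black: 17% of population, 39% of those shot, with identical
# P(shot | stop): the disproportion can live entirely in who gets stopped.
```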


Don Quixote on sampling bias

Continuing my series on modern themes that were already thoroughly treated in Don Quixote, here is the passage where Don Quixote and Sancho Panza discuss whether it is better to be a knight-errant or a monk:

“Señor, it is better to be an humble little friar of no matter what order, than a valiant knight-errant; with God a couple of dozen of penance lashings are of more avail than two thousand lance-thrusts, be they given to giants, or monsters, or dragons.”

“All that is true,” returned Don Quixote, “but we cannot all be friars, and many are the ways by which God takes his own to heaven; chivalry is a religion, there are sainted knights in glory.”

“Yes,” said Sancho, “but I have heard say that there are more friars in heaven than knights-errant.”

“That,” said Don Quixote, “is because those in religious orders are more numerous than knights.”
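Sancho is reasoning from raw counts, Don Quixote from base rates, and a toy calculation with invented figures shows how both can be right: even if knights-errant reach heaven at ten times the friars’ rate, the friars still dominate the head count.

```python
# Invented numbers, purely to spell out Don Quixote's point.
friars, knights = 100_000, 1_000             # population sizes
p_heaven = {"friar": 0.01, "knight": 0.10}   # knights' per-capita rate is 10x

print(f"friars in heaven:  {friars * p_heaven['friar']:.0f}")    # 1000
print(f"knights in heaven: {knights * p_heaven['knight']:.0f}")  # 100
# Sancho's count favours the friars even though the knights' rate is
# higher: the sample of the blessed is dominated by the larger
# source population.
```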

Early greenhouse

I read a novel that I’d known about for a long time but had never gotten around to reading: Ursula K. Le Guin’s The Lathe of Heaven. I was startled to discover that an essentially background element of the plot of this novel, published in 1971, is the destruction of the Earth’s environment by the greenhouse effect. This has already taken place before the events of the novel, set in the early twenty-first century.

Rain was an old Portland tradition, but the warmth — 70°F on the second of March — was modern, a result of air pollution. Urban and industrial effluvia had not been controlled soon enough to reverse the cumulative trends already at work in the mid-twentieth century; it would take several centuries for the CO2 to clear out of the air, if it ever did. New York was going to be one of the larger casualties of the Greenhouse Effect, as the polar ice kept melting and the sea kept rising.

This is only incidental to the themes of the novel, which grapples with the structure of reality and the nature of dreams. But it amazed me to see global warming being confidently projected into our future, at a time when — as the climate-change skeptics never tire of pointing out — discussions of climate change tended to refer to the danger of a new Ice Age.

At least, that is my memory. According to the Google Books Ngram Viewer, though, “greenhouse effect” was mentioned in books around 1970 about as frequently as it is today; and, oddly, its frequency has declined substantially from a peak three times as high in the early 1990s.

[Screenshot: Google Books Ngram Viewer plot for “greenhouse effect”]

For example, a 1966 book titled Living on Less begins its section on “The Environment” by discussing global warming, and launches right into a description of the greenhouse effect that sounds very similar to what you might read today.

Primary sex ratio, the short version

Five months after our article with Orzack et al. appeared in PNAS, showing that the primary sex ratio (the fraction of boys conceived) is close to 50%, contradicting centuries of supposition that it was substantially higher (more male-biased), Bill Stubblefield, Jim Zuckerman and I have published a popular account of the research in Nautilus. It was an interesting experience, the back and forth with an editor to make something comprehensible and gripping for a general audience.

It didn’t end up exactly as we would have liked, but it was probably better — as an effort to explain the science and the background to a general audience — than what we would have produced entirely on our own. The layout and graphics are also very well done.

It’s now been condensed down to three paragraphs on Gizmodo. They even condensed the illustration.

Annie Get Your Prior

I was reading Sharon McGrayne’s wonderful popular (no, really!) book on the history of Bayesian statistics. At one point she mentions that George Box wrote a song for a departmental Christmas party:

There’s no theorem like Bayes’ Theorem
Like no theorem I know…

A bit later we read of Howard Raiffa and Robert Schlaifer singing:

Anything frequentists can do, Bayesians do better

(More or less… the exact text is not reproduced.) So it seems the underappreciated role of Irving Berlin in the development of Bayesian thought has yet to be adumbrated. Perhaps researchers will some day uncover such hits manqués as “How High is the Bayes Factor?”, “I’m Dreaming of a Conjugate Prior”, or even “Bayes Bless America”.

The last unbreakable code?

I noticed a brief article in The Guardian with the captivating headline “Can Google be taught poetry?”.

By feeding poems to the robots, the researchers want to “teach the database the metaphors” that humans associate with pictures, “and see what happens,” explains Corey Pressman from Neologic Labs, who are behind the project, along with Webvisions and Arizona State University….

The hope is that, with a big enough dataset, “we’ll be delighted to see we can teach the robots metaphors, that computers can be more like us, rather than the other way around,” says Pressman. “I’d like them to meet us more halfway.”

That sounds utopian, magnificent, turning away from harsh and narrow-minded informaticism to grand humane concerns. And yet, it reminded me of a recent article in the New Yorker, “Why Jihadists Write Poetry”:

Analysts have generally ignored these texts, as if poetry were a colorful but ultimately distracting by-product of jihad. But this is a mistake. It is impossible to understand jihadism—its objectives, its appeal for new recruits, and its durability—without examining its culture. This culture finds expression in a number of forms, including anthems and documentary videos, but poetry is its heart. And, unlike the videos of beheadings and burnings, which are made primarily for foreign consumption, poetry provides a window onto the movement talking to itself. It is in verse that militants most clearly articulate the fantasy life of jihad.

Whatever the motives of Neologic Labs — and I’m guessing they have a pitch to investors that doesn’t rely upon the self-actualisation of smartphones, nor on the profits to be turned from improving the quality of poetry — can we doubt that sooner or later this technology is going to be applied to improving the quality of government surveillance, escaping the literal to follow human prey down into the warrens of metaphor and allusion? It will start with terrorists, but that’s not where it will stop.

Imagine, just to begin with, China equipping its internet with a cybernetic real-time censor that can’t be fooled by symbolic language or references to obscure rock lyrics, which the software will be more familiar with than any fan. Protest movements will be extinguished before people are even aware that they were ever part of a movement.