Is it better if they spy accurately?

There’s a fascinating article in the Guardian about how Berlin has become a centre for “digital exiles”, people — mainly Americans — whose online activism has put them in the crosshairs of various security services, leading to low-level harassment, or occasionally high-level harassment, such as this frightening story:

Anne Roth, a political scientist who’s now a researcher on the German NSA inquiry, tells me perhaps the most chilling story. How she and her husband and their two children – then aged two and four – were caught in a “data mesh”. How an algorithm identified her husband, an academic sociologist who specialises in issues such as gentrification, as a terrorist suspect on the basis of seven words he’d used in various academic papers.

Seven words? “Identification was one. Framework was another. Marxist-Leninist was another, but you know he’s a sociologist… ” It was enough for them to be placed under surveillance for a year. And then, at dawn, one day in 2007, armed police burst into their Berlin home and arrested him on suspicion of carrying out terrorist attacks.

But what was the evidence, I say? And Roth tells me. “It was his metadata. It was who he called. It was the fact that he was a political activist. That he used encryption techniques – this was seen as highly suspicious. That sometimes he would go out and not take his cellphone with him… ”

He was freed three weeks later after an international outcry, but the episode has left its marks. “Even in the bathroom, I’d be wondering: is there a camera in here?”

This highlights a dichotomy that I’ve never seen well formulated, one that pertains to many legal questions concerning the damage inflicted by publication or withholding of information: Are we worried about true information or false information? Is it more disturbing to think that governments are collecting vast amounts of private and intimate information about our lives, or that much of that information (or the inferences that also count as information) is wrong?

As long as the security services are still in their Keystone Cops phase, and haven’t really figured out how to deploy the information effectively, it’s easier to get worked up about the errors, as in the story above. Once they have learned to apply the information without conspicuous blunders, the real damage will be done by the ruthless application of broadly correct knowledge of everyone’s private business, and by everyone’s crushing certainty that we have no privacy.

It’s probably a theorem that there is a maximally awful level of inaccuracy. If the information is completely accurate, then at least we avoid the injustice of false accusation. If the information is all bogus, then people will ignore it. Somewhere in between, people get used to trusting the information, and will act crushingly on the spurious as well as the accurate indications. What is that level? It’s actually amazing how much tolerance people have for errors in an information source before they will ignore it — cf. tabloid newspapers, astrology, economic forecasts — particularly if it’s a secret source that seems to give them some private inside knowledge.
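One toy way to see why there should be a worst point in the middle (the functional forms are pure invention, chosen only to illustrate the shape of the argument): suppose willingness to act on the source grows with its accuracy $p$, say like $p^2$, while the fraction of indications that are spurious is $1-p$. The harm from wrongful action then behaves like

$$ H(p) \propto p^{2}(1-p), \qquad \frac{dH}{dp} \propto 2p - 3p^{2} = 0 \;\Rightarrow\; p = \tfrac{2}{3}, $$

an interior maximum: the worst case is neither perfect accuracy nor pure bogosity. Where exactly it sits depends on how quickly trust grows with accuracy, which is the empirical question the tabloids-and-astrology observation bears on.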

On a somewhat related note, Chris Bertram at Crooked Timber has given concise expression to a reaction that I think many people have had to the revelations of pervasive electronic espionage by Western democratic governments against their own citizens:

 It isn’t long since the comprehensive surveillance of citizens… was emblematic of how communist states would trample on the inalienable rights of people in pursuit of state security. Today we know that our states do the same. I’m not making the argument that Western liberal democracies are “as bad” as those states were,… but I note that these kinds of violations were not seen back then as being impermissible because those states were so bad in other ways — undemocratic, dirigiste — but rather were portrayed to exemplify exactly why those regimes were unacceptable.


Bayesian theology

I was reading (finally, after seven years in Oxford) Thomas Hardy’s Jude the Obscure, and found the following quote from John Henry Newman:

My argument was … that absolute certitude as to the truths of natural theology was the result of an assemblage of concurring and converging probabilities … that probabilities which did not reach to logical certainty might create a mental certitude.


The paradoxes of adultery, Renaissance edition

An example frequently cited in elementary statistics courses to illustrate the unreliability of survey data is that when people are surveyed about their sexual history, men report more lifetime female partners on average than women report male partners. (A high-quality example is this UK survey from 1992, where men reported 9.9 female partners on average, while women averaged 3.4 male partners. It’s possible to tinker around the edges with effects of changes over time, and with age differences between men and women in sexual relationships, but the contradiction is really inescapable. One thing that is quite striking in this survey is the difference between the cross-sectional and longitudinal pictures, which I’ve discussed before. For example, men’s lifetime numbers of sexual partners increase with age — as they must, longitudinally — but among the women the smallest average number of lifetime sex partners is in the oldest group.)
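The reason this counts as a contradiction, and not merely a difference in behaviour, is a simple accounting identity: in a closed population every heterosexual partnership involves exactly one man and one woman, so the partnerships counted from the two sides must balance,

$$ \bar{x}_{\text{men}}\, N_{\text{men}} \;=\; \bar{x}_{\text{women}}\, N_{\text{women}}, $$

where $\bar{x}$ is the mean number of lifetime opposite-sex partners and $N$ the number of adults of each sex. Since the adult male and female populations are nearly equal in size, the two means have to be nearly equal too; partners outside the survey’s sampling frame can blur the identity a little, but not plausibly by a factor of three.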

In any case, I was reminded of this when reading Stephen Greenblatt’s popular book on the rediscovery of De rerum natura in the early 15th century by the apostolic secretary Poggio Bracciolini, and the return of Epicurean philosophy more generally into European thought. He cites a story from Poggio’s Liber Facetiarum, a sort of jokebook based on his experiences in the papal court, about

dumb priests who, baffled by the fact that nearly all the women in confession say that they have been faithful in matrimony, and nearly all the men confess to extramarital affairs, cannot for the life of them figure out who the women are with whom the men have sinned.

The CDC misunderstand screening too

Last week I mocked the Spanish health authorities who refused to treat an Ebola-exposed nurse as a probable Ebola case until her fever had crossed the screening threshold of 38 degrees Celsius (or, in the absurdly precise American translation, 100.4 degrees Fahrenheit). Well, apparently the Centers for Disease Control in the US aren’t any better:

Before flying from Cleveland to Dallas on Monday, Vinson called the CDC to report an elevated temperature of 99.5 Fahrenheit. She informed the agency that she was getting on a plane, the official said, and she wasn’t told not to board the aircraft.

The CDC is now considering putting 76 health care workers at Texas Health Presbyterian Dallas hospital on the TSA’s no-fly list, an official familiar with the situation said.

The official also said the CDC is considering lowering the fever threshold that would be considered a possible sign of Ebola. The current threshold is 100.4 degrees Fahrenheit.

Most disturbing is the fact that they don’t seem capable of combining factors. Would it be so hard to have a rule like this? For most people, let’s hold off on the hazmat suits until your fever goes above 38. But if you’ve been cleaning up the vomit of an Ebola patient for the past week, and you have any elevated temperature at all — let’s say 37.2 — it would be a good idea to get you under observation.
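A minimal sketch of what combining the two factors might look like. The thresholds and categories are just the illustrative numbers from the paragraph above, not anyone’s actual protocol:

```python
def ebola_triage(temp_celsius, known_exposure):
    """Toy two-factor screening rule: the fever threshold that triggers a
    response depends on whether there has been a known recent exposure.
    All cut-offs are illustrative, not clinical guidance."""
    threshold = 37.2 if known_exposure else 38.0
    if temp_celsius >= threshold:
        return "treat as possible Ebola case"
    if known_exposure:
        return "monitor closely"
    return "no action"

print(ebola_triage(38.0, known_exposure=True))    # treat as possible Ebola case
print(ebola_triage(37.5, known_exposure=False))   # no action
```

The point is only that the decision uses two inputs rather than one; a single fever cut-off throws away exactly the information that matters most.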

The tyranny of the 95%

The president of the National Academy of Sciences is being quoted spouting dangerous nonsense. Well, maybe not so dangerous, but really nonsense.

I found this by way of Jonathan Chait, a generally insightful and well-informed political journalist, who weighed in recently on the political response to the IPCC report on climate change. US Republican Party big shot Paul Ryan, asked whether he believes that human activity has contributed to global warming, replied, “I don’t know the answer to that question. I don’t think science does, either.” Chait rightly takes him to task for this ridiculous dodge (though he ignores the fact that Ryan was asked about his beliefs, so that his skepticism may reflect a commendable awareness of the cognitive theories of Stephen Stich, and his need to reflect upon the impossibility of speaking scientifically, or introspecting coherently, about the contents of beliefs), but the form of his criticism left me troubled:

In fact, science does know the answer. Climate scientists believe with a 95 percent level of certainty (the same level of certainty as their belief in the dangers of cigarette smoking) that human activity is contributing to climate change.

Tracking through his links, I found that he’d copied this comparison between climate change and the hazards of smoking pretty much verbatim from another blog, and that it ultimately derived from this “explanation” from the AP:

Some climate-change deniers have looked at 95 percent and scoffed. After all, most people wouldn’t get on a plane that had only a 95 percent certainty of landing safely, risk experts say.

But in science, 95 percent certainty is often considered the gold standard for certainty.

[…]

The president of the prestigious National Academy of Sciences, Ralph Cicerone, and more than a dozen other scientists contacted by the AP said the 95 percent certainty regarding climate change is most similar to the confidence scientists have in the decades’ worth of evidence that cigarettes are deadly.

Far be it from me to challenge the president of the National Academy of Sciences, particularly if it’s the “prestigious” National Academy of Sciences, or more than a dozen other scientists, but the technical term for this is “bollocks”.

False positives, false confidence, and Ebola

Designing a screening test is hard. You have a large population, almost all of whom do not have whichever condition you’re searching for. Thus, even with a tiny probability of error, most of the cases you pick up will be incorrect — false positives, in the jargon. So you try to set the bar reasonably high; but set it too high and you’ll miss most of the real cases — false negatives.
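To see how quickly the false positives take over, here is a toy calculation (every number is invented for illustration): a condition carried by 1 person in 10,000, and a test that catches 99% of true cases while wrongly flagging 1% of the healthy.

```python
# Toy base-rate calculation; all numbers are invented for illustration.
prevalence = 1 / 10_000        # P(condition) in the screened population
sensitivity = 0.99             # P(positive test | condition)
false_positive_rate = 0.01     # P(positive test | no condition)

p_positive = prevalence * sensitivity + (1 - prevalence) * false_positive_rate
ppv = prevalence * sensitivity / p_positive   # P(condition | positive test), by Bayes' rule

print(f"P(condition | positive test) = {ppv:.1%}")   # about 1%: ~99 of 100 positives are false
```

Push the threshold up to shrink that 1% false-positive rate and the 99% sensitivity falls with it; that is the trade-off described above.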

On the other hand, if you have a suspicion of the condition in a particular case, it’s much easier. You can set the threshold much lower without being swamped by false positives. What would be really dumb is to use the same threshold from the screening test to judge a case where there are individual grounds for suspicion. But that’s apparently what doctors in Spain did with the nurse who was infected with Ebola. From the Daily Beast:

When Teresa Romero Ramos, the Spanish nurse now afflicted with the deadly Ebola virus first felt feverish on September 30, she reportedly called her family doctor and told him she had been working with Ebola patients just like Thomas Eric Duncan who died today in Dallas. Her fever was low-grade, just 38 degrees Celsius (100 degrees Fahrenheit), far enough below the 38.6-degree Ebola red alert temperature to not cause alarm. Her doctor told her to take two aspirin, keep an eye on her fever and keep in touch.

She was caring for Ebola patients and she developed a fever, but they decided not to treat it as a possible case of Ebola because her fever was 0.6 degrees below the screening threshold.

A failure of elementary statistical understanding, and who knows how many lives it will cost.

Absence of correlation does not imply absence of causation

By way of Andrew Sullivan we have this attempt by Philip N. Cohen to apply statistics to answer the question: does texting while driving cause accidents? Or rather, he marshals data to ridicule the new book by Matt Richtel on a supposed epidemic of traffic fatalities, particularly among teens, caused by texting while driving. He has some good points about the complexity of the evidence, and a good general point that people like to fixate on some supposed problem with current cars or driving practices, to distract their attention from the fact that automobiles are inherently dangerous, so that the main thing that causes more fatalities is more driving. But then he has this weird scatterplot, which is supposed to be a visual knock-down argument:

[Scatterplot of phone subscriptions vs. traffic fatalities by state. Caption: “We need about two phones per person to eliminate traffic fatalities…”]

So, basically no correlation between the number of phone subscriptions in a state and the number of traffic fatalities. So, what does that prove? Pretty much nothing, I would say. It’s notable that there is really very little variation in the number of mobile phones among the states, and at the lowest level there’s still almost one per person. (Furthermore, I would guess that most of the adults with no mobile phone are poor, and likely don’t have an automobile either.) Once you have one mobile phone, there’s no reason to think that a second one will substantially increase the amount of texting you do behind the wheel.

Whether X causes Y is a separate question from whether variation in X is linked to variation in Y. You’d like to think that a sociologist like Cohen would know this. A well-known example: No one would doubt that human intelligence is a product of the human brain (most directly). But variations in intelligence are uncorrelated with variations in brain size. (Which doesn’t rule out the possibility that more subtle measurements could find a physical correlate.) This is particularly true with causes that are saturated, as with the one phone per person level.

You might imagine a Cohen-like war-crimes investigator deciding that the victims were not killed by bullets, because we find no correlation between the number of bullets in a gun and the fate of the corresponding victim.

Just to be clear: I’m not claiming that evidence like this could never be relevant. But when you’re clearly in the saturation region, with a covariate that is only loosely connected to the factor in question, it’s obviously just misleading.
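A toy simulation of the saturation point (the model and all of its numbers are invented): fatality rates depend strongly on phone ownership by construction, but because ownership barely varies from state to state, the cross-state correlation still comes out near zero.

```python
import numpy as np

rng = np.random.default_rng(0)
n_states = 50

# Invented model: phone ownership is "saturated" (it hardly varies by state),
# while other factors (miles driven, road conditions, ...) vary a lot.
phones_per_person = rng.uniform(0.95, 1.05, n_states)
other_factors = rng.normal(0.0, 1.0, n_states)

# Build in a genuine causal effect of phones on the fatality rate.
fatality_rate = 10.0 * phones_per_person + 5.0 * other_factors + rng.normal(0.0, 1.0, n_states)

r = np.corrcoef(phones_per_person, fatality_rate)[0, 1]
print(f"cross-state correlation: {r:.2f}")   # small, despite the built-in effect
```

What little variation remains in the saturated cause is swamped by everything else, so the scatterplot is flat even though, in this toy world, taking the phones away entirely would change the outcome dramatically.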

It’s a good thing they didn’t stop at 12…

The BBC reports today on the most recent THE (Times Higher Education) global university rankings. The article is illustrated with a grinning, texting stock-photo student (I’m genuinely baffled as to what value these atmospheric photos are thought to add to news articles) above the caption

The rankings rate universities worldwide on 13 measures, including teaching.

Wow! These rankings of higher education institutions were pretty thorough, if they even went so far as to include the quality of TEACHING among their 13 factors! If they’d had sufficient bandwidth for 14 factors they might have ranked them on the quality of their wine collections. Then Oxford would have come out tops for sure.

[Photo caption: “Devices like this one are sometimes still used to watch the BBC!”]

Low unemployment rates for math/stat PhDs

I was interested to read of a recent NSF study, which found only 2.1% unemployment in the US for people with doctoral degrees in science, engineering, and health fields. That’s only about 1/3 of the rate in the general population over age 25. But I found it even more striking that within that group, those with doctorates in mathematics and statistics had lower unemployment than those in any other field, at 1.2%.

Cornpone opinions in academia

I was commenting recently on the attempt by University of Illinois (Urbana-Champaign) Chancellor Phyllis Wise to explain to all of us addleheaded profs that her ability (and that of US employers more generally) to fire people for expressing their opinions really has nothing at all to do with freedom of speech or academic freedom:

People are mixing up this individual personnel issue with the whole question of freedom of speech and academic freedom.

Political scientist Corey Robin has taken up the same quote, and explained how pervasive it is, and how fundamental it is to the machinery of repression in the US. It seems like one of those dogmas that is patently absurd to the uninitiated, but for those inside the machine (and by “the machine”, I mean simply mainstream American thinking about politics) it is self-evident.

Robin has nothing on Mark Twain, who wrote more than a century ago:

It is by the goodness of God that in our country we have those three unspeakably precious things: freedom of speech, freedom of conscience, and the prudence never to practice either of them.

He explained at greater length in his great essay “Corn-pone Opinions”, telling of a young slave whom he knew in his boyhood, who told him

“You tell me whar a man gits his corn pone, en I’ll tell you what his ‘pinions is.”

I can never forget it. It was deeply impressed upon me. By my mother. Not upon my memory, but elsewhere. She had slipped in upon me while I was absorbed and not watching. The black philosopher’s idea was that a man is not independent, and cannot afford views which might interfere with his bread and butter. If he would prosper, he must train with the majority; in matters of large moment, like politics and religion, he must think and feel with the bulk of his neighbors, or suffer damage in his social standing and in his business prosperities. He must restrict himself to corn-pone opinions — at least on the surface. He must get his opinions from other people; he must reason out none for himself; he must have no first-hand views.