False positives, false confidence, and Ebola

Designing a screening test is hard. You have a large population, almost all of whom do not have whichever condition you’re searching for. Thus, even with a tiny probability of error, most of the cases you pick up will be incorrect — false positives, in the jargon. So you try to set the bar reasonably high; but set it too high and you’ll miss most of the real cases — false negatives.
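To put rough numbers on this, here is a minimal sketch of the Bayes' rule arithmetic. The prevalence, sensitivity and false-positive rate are made up purely for illustration; they are not figures for Ebola or for any real test.

```python
def positive_predictive_value(prior, sensitivity, false_positive_rate):
    """Probability that someone who tests positive really has the condition."""
    true_positives = prior * sensitivity
    false_positives = (1.0 - prior) * false_positive_rate
    return true_positives / (true_positives + false_positives)


# Hypothetical screening scenario: 1 person in 10,000 has the condition,
# the test catches 90% of real cases and wrongly flags 1% of healthy people.
screening = positive_predictive_value(
    prior=0.0001, sensitivity=0.9, false_positive_rate=0.01
)
print(f"Share of positives that are real: {screening:.3f}")
```

With these assumed numbers, fewer than one in a hundred of the people flagged actually have the condition, even though the test itself is wrong only 1% of the time on healthy people.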

On the other hand, if you have a suspicion of the condition in a particular case, it’s much easier. You can set the threshold much lower without being swamped by false positives. What would be really dumb is to use the same threshold from the screening test to judge a case where there are individual grounds for suspicion. But that’s apparently what doctors in Spain did with the nurse who was infected with Ebola. From the Daily Beast:

When Teresa Romero Ramos, the Spanish nurse now afflicted with the deadly Ebola virus first felt feverish on September 30, she reportedly called her family doctor and told him she had been working with Ebola patients just like Thomas Eric Duncan who died today in Dallas. Her fever was low-grade, just 38 degrees Celsius (100 degrees Fahrenheit), far enough below the 38.6-degree Ebola red alert temperature to not cause alarm. Her doctor told her to take two aspirin, keep an eye on her fever and keep in touch.

She was caring for Ebola patients, she developed a fever, but they decided not to treat it like a possible case of Ebola because her fever was 0.6 degrees below the screening threshold for Ebola.

A failure of elementary statistical understanding, and who knows how many lives it will cost.

Absence of correlation does not imply absence of causation

By way of Andrew Sullivan we have this attempt by Philip N. Cohen to apply statistics to answer the question: does texting while driving cause accidents? Or rather, he marshals data to ridicule the new book by Matt Richtel on a supposed epidemic of traffic fatalities, particularly among teens, caused by texting while driving. He has some good points about the complexity of the evidence, and a good general point that people like to fixate on some supposed problem with current cars or driving practices, to distract their attention from the fact that automobiles are inherently dangerous, so that the main thing that causes more fatalities is more driving. But then he has this weird scatterplot, which is supposed to be a visual knock-down argument:

We need about two phones per person to eliminate traffic fatalities...

So, basically no correlation between the number of phone subscriptions in a state and the number of traffic fatalities. So, what does that prove? Pretty much nothing, I would say. It’s notable that there is really very little variation in the number of mobile phones among the states, and at the lowest level there’s still almost one per person. (Furthermore, I would guess that most of the adults with no mobile phone are poor, and likely don’t have an automobile either.) Once you have one mobile phone, there’s no reason to think that a second one will substantially

Whether X causes Y is a separate question from whether variation in X is linked to variation in Y. You’d like to think that a sociologist like Cohen would know this. A well-known example: No one would doubt that human intelligence is a product of the human brain (most directly). But variations in intelligence are uncorrelated with variations in brain size. (Which doesn’t rule out the possibility that more subtle measurements could find a physical correlate.) This is particularly true with causes that are saturated, as with the one phone per person level.

You might imagine a Cohen-like war-crimes investigator deciding that the victims were not killed by bullets, because we find no correlation between the number of bullets in a gun and the fate of the corresponding victim.

Just to be clear: I’m not claiming that evidence like this could never be relevant. But when you’re clearly in the saturation region, with a covariate that is only loosely connected to the factor in question, it’s obviously just misleading.
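The saturation point can be illustrated with a toy simulation. Everything in it is an assumption made for the sake of the sketch: the model in `fatality_rate`, the noise level, and the ranges of phones per person are invented, not fitted to Cohen's data or to anything else.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def fatality_rate(phones_per_person, noise):
    # Hypothetical causal model: phones raise the fatality rate, but the
    # effect saturates once there is one phone per person.
    exposure = min(phones_per_person, 1.0)
    return 5.0 + 10.0 * exposure + noise


def simulate(low, high, n, rng):
    """Correlation between phones and fatalities for n made-up regions."""
    phones = [rng.uniform(low, high) for _ in range(n)]
    deaths = [fatality_rate(p, rng.gauss(0.0, 1.0)) for p in phones]
    return pearson(phones, deaths)


rng = random.Random(0)
r_unsaturated = simulate(0.0, 1.0, 500, rng)  # below one phone per person
r_saturated = simulate(1.0, 1.6, 500, rng)    # everyone already has a phone
print(f"unsaturated range: r = {r_unsaturated:.2f}")
print(f"saturated range:   r = {r_saturated:.2f}")
```

In this toy world phones cause fatalities by construction. The correlation is strong where phone ownership varies below one per person, and hovers around zero once every region is past that level, which is the only range the scatterplot covers.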

It’s a good thing they didn’t stop at 12…

The BBC reports today on the most recent THE global university rankings. The article is illustrated with a grinning, texting stock-photo student (I’m genuinely baffled as to what value these atmospheric photos are thought to add to news articles) above the caption

The rankings rate universities worldwide on 13 measures, including teaching.

Wow! These rankings of higher education institutions were pretty thorough, if they even went so far as to include the quality of TEACHING among their 13 factors! If they’d had sufficient bandwidth for 14 factors they might have ranked them on the quality of their wine collections. Then Oxford would have come out tops for sure.

Devices like this one are sometimes still used to watch the BBC!

Frege and sexual abuse

Slate’s Amanda Hess has written about the case of Rehtaeh Parsons, a Nova Scotia girl who committed suicide last year, four years after being the victim of bullying over a photograph of her being sexually assaulted. She became famous across Canada after the police originally refused to prosecute those who assaulted her. The national, and then international, outcry inspired some creativity among the reluctant police, who have now successfully prosecuted one of the perpetrators for child pornography.

The main point of the article was to comment on how

the judge in the case has barred Canadian journalists and everyday citizens from repeating the girl’s name in newspapers, on television, over the radio, and on social media. He cited a portion of Canadian criminal code that bans the publication of a child pornography victim’s name in connection to any legal proceeding connected to that alleged crime.

She quotes Halifax reporter Ryan Van Horne on the perverse effect:

If you say the name “Rehtaeh” in Nova Scotia… you’ll be met with “instant recognition” of the case and all of the issues it represents. But when Van Horne asks locals, “You know that victim in that high-profile child pornography case?” he draws blanks. The famous circumstances surrounding Rehtaeh Parsons’ bullying and death don’t fit the traditional conception of a child pornography case, which makes linking the two difficult if reporters aren’t allowed to use her name and photograph.

This sounds like a horrible version of Frege’s Morning-Star/Evening-Star puzzle: News media (including social media) are allowed to talk about Rehtaeh Parsons (the famous child victim of sexual abuse and online harassment); and they are allowed to talk about the victim in that high-profile child pornography case. But they are barred from talking about Rehtaeh Parsons as the victim in that child pornography case. In Fregean terms, it’s as though we banned any reference to the “morning star”, but were still allowed to talk about the evening star.

Of course, there’s nothing terribly unusual here: Often important privacy concerns turn on concealing the identity of what appear to be two different individuals. It only seems so perverse here because the person whose privacy would implicitly be protected is 1) famous for her role in this case; and 2) deceased, which means that the only people whose privacy is being protected are the police officials who screwed up so badly in the first place.