Occasional reflections on Life, the World, and Mathematics

Posts tagged ‘statistics’

Montaigne on random controlled experiments

In the past I’ve read a few individual essays by Montaigne, but lately I’ve been really enjoying reading them systematically, partly listening to the English-language audiobook, partly reading the lovely annotated French edition by Jean Céard et al. It’s fascinating to see the blend of inaccessibly foreign worldview with ideas that seem at times astoundingly modern. For example, in the essay titled “On the resemblance of children to their fathers” (which seems to have almost nothing at all to say about the resemblance of children to their fathers), in the course of disparaging contemporary medicine Montaigne suddenly anticipates the need for random controlled trials, while at the same time despairing of such a daunting intellectual project. After acknowledging a few minor cases in which physicians seem to have learned something from experience he continues:

Mais en la plus part des autres experiences, à quoy ils disent avoir esté conduis par la fortune, et n’avoir eu autre guide que le hazard, je trouve le progrez de cette information incroyable. J’imagine l’homme, regardant au tour de luy le nombre infiny des choses, plantes, animaux, metaulx. Je ne sçay par où luy faire commencer son essay : et quand sa premiere fantasie se jettera sur la corne d’un elan, à quoy il faut prester une creance bien molle et aisée : il se trouve encore autant empesché en sa seconde operation. Il luy est proposé tant de maladies, et tant de circonstances, qu’avant qu’il soit venu à la certitude de ce poinct, où doit joindre la perfection de son experience, le sens humain y perd son Latin : et avant qu’il ait trouvé parmy cette infinité de choses, que c’est cette corne : parmy cette infinité de maladies, l’epilepsie : tant de complexions, au melancholique : tant de saisons, en hyver : tant de nations, au François : tant d’aages, en la vieillesse : tant de mutations celestes, en la conjonction de Venus et de Saturne : tant de parties du corps au doigt. A tout cela n’estant guidé ny d’argument, ny de conjecture, ny d’exemple, ny d’inspiration divine, ains du seul mouvement de la fortune, il faudroit que ce fust par une fortune, parfaictement artificielle, reglée et methodique Et puis, quand la guerison fut faicte, comment se peut il asseurer, que ce ne fust, que le mal estoit arrivé à sa periode ; ou un effect du hazard ? ou l’operation de quelque autre chose, qu’il eust ou mangé, ou beu, ou touché ce jour là ? ou le merite des prieres de sa mere-grand ? Davantage, quand cette preuve auroit esté parfaicte, combien de fois fut elle reiterée ? et cette longue cordée de fortunes et de rencontres, r’enfilée, pour en conclure une regle.

But in most other experiences, where they claim to have been led by accidents, having no other guide than chance, I find the progress of this information hard to believe. I imagine a man looking about him at the infinite number of things, plants, animals, metals. I don’t know where he would start. And when his first whim took him to an elk horn, which might be easy to believe in, he would find his second step blocked: there are so many diseases, so many individual circumstances, that before he could arrive at any certainty on this point, he will have arrived at the end of human sense: before he could find, among this infinity of things, that it is this horn; among the infinity of diseases, epilepsy; among the individual conditions, the melancholic temperament; among all the seasons, winter; among all the nations, the French; among all the ages, the elderly; among all the astrological conditions, the conjunction of Venus and Saturn; among all the parts of the body, the finger. And all of this, being led by no argument, by no conjecture, by no prior examples, by no divine inspiration, but purely by chance, it must be achieved by the most completely artificial, methodical and regulated turn of chance. And suppose the cure has been accomplished, how could you tell whether the disease might not have simply run its course, or the improvement occurred purely by chance? Or if it might not have been the effect of some other factor, something he ate, or drank, or touched on that day? Or the merit of his grandmother’s prayers? And if you could provide complete proof in one case, how many times would you need to repeat the trial, and this long series of random encounters, before you could conclusively determine the rule?

Post-existing climate conditions

According to the NY Times, insurers have been taking advantage of climate-change fears to raise prices for flood insurance. Now that the presidential election has conclusively proved that the greenhouse effect is a Chinese hoax to make Americans look stupid, or rather less productive, I think the Congress needs to move beyond minor defensive measures like abandoning the Paris accord, and move instead to aggressively defend Americans’ God-given right to build decadent structures in flood zones: Just as health insurers are now prohibited from inquiring about or taking account of “pre-existing conditions”, flood insurers need to be prohibited from taking account of (hoax) research about “post-existing” (future) climate conditions in determining flood insurance prices. Prices may be based only on past flood records.

This can be combined into a single consumer-rights bill with Mike Pence’s initiative to ban life insurance premiums that discriminate against tobacco users. As Pence wrote in 2000,

Time for a quick reality check. Despite the hysteria from the political class and the media, smoking doesn’t kill… Nine out of ten smokers do not contract lung cancer.

What’s all this hysteria for? Smoking is even safer than Russian Roulette. (Five out of six players don’t get shot!)

Making sense of the predictions

I absolutely agree that Sam Wang and the Princeton Election Consortium have a good argument for there being a 99+% chance of Clinton winning. Unfortunately, I think there’s only about a 50% chance of his argument being right. It could also be that Nate Silver is right, that there is a 65% chance. Putting that all together, I come down right about where the NY Times is, at about an 85% chance of escaping apocalypse.
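The averaging in that paragraph can be written out explicitly. This is only a rough sketch: it assumes the two forecasts are combined by weighting each with the credence given to its model, and that the remaining 50% goes entirely to the 65% forecast. The function name is made up for illustration.

```python
def mixture_probability(forecasts, weights):
    """Average several model forecasts, weighted by the credence given to each model."""
    if len(forecasts) != len(weights):
        raise ValueError("need one weight per forecast")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(p * w for p, w in zip(forecasts, weights))


# Assumption: 99% (PEC) and 65% (538), each model given credence one half.
p_combined = mixture_probability([0.99, 0.65], [0.5, 0.5])
print(f"combined probability: {p_combined:.2f}")
```

With these inputs the combined figure comes out a little above 80%, which is in the neighbourhood of the "about 85%" quoted above.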

I’ve written a bit more about how I think about the likelihoods here. But a fundamental problem with the PEC estimate is that it clearly puts very little weight on the possibility of model failure. A fundamental problem with the 538 estimates is that they are very clearly not martingales. That is, they are not consistent predictions of the future based on all available information. One way of saying this is to note that a few weeks ago Clinton had close to a 90% estimated victory probability. Now it’s 65%. That seems like a modest change, but if the first estimate was correct, the current estimate reflects an event that had less than a 1 in 3 chance of happening. So we’re more than halfway there. But does anyone really think that the events of the past month have been that unlikely?
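One way to reconstruct the "less than 1 in 3" figure (the post does not spell out its derivation, so treat this as an assumption) is the maximal inequality for martingales: if a forecast is a martingale taking values in [0, 1] and starts at p0, then 1 minus the forecast is a non-negative martingale starting at 1 - p0, so the chance that the forecast ever falls to q or below is at most (1 - p0)/(1 - q). A minimal sketch, with a hypothetical function name:

```python
def max_prob_of_drop(p_start, p_later):
    """Upper bound on the chance that a [0, 1]-valued martingale forecast
    starting at p_start ever falls to p_later or below.

    Applies the maximal inequality to the non-negative martingale 1 - forecast.
    """
    if not 0.0 <= p_later < p_start <= 1.0:
        raise ValueError("need 0 <= p_later < p_start <= 1")
    return (1.0 - p_start) / (1.0 - p_later)


# Figures from the text: about 90% a few weeks ago, 65% now.
bound = max_prob_of_drop(0.90, 0.65)
print(f"chance of such a drop is at most {bound:.3f}")
```

The bound is 0.1/0.35, a bit under 0.29, which agrees with "less than a 1 in 3 chance".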

4 digits of separation

Conspiracy theorists are working overtime to discredit all the women who report having been molested by Donald Trump. (Trump’s near-legendary non-disclosure and non-disparagement clauses in all contracts, which pretty much exclude reports from any woman who ever worked for him — and even campaign volunteers — are the only thing keeping the numbers reasonably manageable.) The “pussy” video that kicked this all off was released as part of a joint plot by international Zionists and the gnomes of Zurich. And the woman who was groped while sitting next to Trump on a plane was lying (because supposedly first-class armrests in 1980s planes didn’t go up) and was an agent of the Clinton Foundation, since her telephone number (a convenient excuse for exposing her private information) is identical to one for a staff member at the foundation. Except,

While the article Delauzon’s tweet linked to claims that Leeds shared a phone number with the Clinton Foundation, the two phone numbers differed by several digits.

But obviously the story doesn’t end there. Granted, she was not actually working for the Clinton Foundation. You have to ask yourself, what are the odds that someone who was supposedly not connected at all with that organisation would happen to have a telephone number that was so similar? The question answers itself.

Donald Trump’s prodigious prostate

Let us accept for a moment the claim that Donald Trump’s medical condition is uniformly excellent. You might still expect that random medical test results should be about average for a healthy man. (Not meaning BP or heart rate or god-forbid testosterone, which you would expect to reflect his hyperpowerful masculinity.) I was looking at this report, released a few weeks ago, which included Trump’s test result for PSA (prostate-specific antigen). High levels can be signs of an enlarged prostate, or prostate cancer. But Trump’s doctor reports his level at 0.15. According to this study men over 70 with normal prostate have a median level of 1.9 (it doesn’t seem to depend much on age above 70). If we make the very conservative assumption that the distribution falls off linearly from 1.9 down to 0, we would estimate that less than 1% of men have PSA scores below 0.2.
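The "less than 1%" estimate can be checked with a few lines of arithmetic. The sketch below is one reading of the linear assumption: the density is taken to rise in a straight line from zero at PSA 0 up to the median, with half of all men below the median. The function name is invented for the illustration.

```python
def fraction_below(x, median):
    """Fraction of men with PSA below x, assuming the density rises linearly
    from 0 at PSA = 0 up to the median, with total mass 1/2 below the median.

    Density f(t) = t / median**2 on [0, median], so CDF(x) = 0.5 * (x / median)**2.
    """
    if not 0.0 <= x <= median:
        raise ValueError("x must lie between 0 and the median")
    return 0.5 * (x / median) ** 2


median_psa = 1.9  # median for men over 70 with normal prostate, as quoted in the text
share_below_02 = fraction_below(0.2, median_psa)
share_below_reported = fraction_below(0.15, median_psa)
print(f"below 0.20: {share_below_02:.2%}")
print(f"below 0.15: {share_below_reported:.2%}")
```

Under this assumption roughly half a percent of men fall below 0.2, and fewer still below the reported 0.15.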

Maybe Trump has no prostate.

Alternatively, he should really be compared with a younger reference group, because his pact with Mephistopheles and/or regular consumption of the blood of virgins keeps him youthful.

Statistics and causal truth: Police edition

As usual, Andrew Sullivan — who has now returned temporarily to blogging, attracted like a moth to the Trump conflagration — manages to take a common, superficially convincing argument, and express it with moral fervour and personal conviction that makes the tenuous logic really conspicuous. In this case, it’s the argument based on the much-discussed study by Roland G. Fryer, Jr. of the rate of various violent outcomes of police stops, finding that black people are more likely than white to be physically abused by police, but not more likely to be shot.

(Here’s an excellent NY Times report, and the original study.)

…the Black Lives Matter activists, whose core and central argument is that black men are disproportionately killed by cops. The best data shows this is false… I find [the study] conclusive. Feelings do not, er, trump data in a deliberative democracy. A reader writes:

I understand that there has been the recent study suggesting that given an interaction with a police officer occurs, then the police officer is no more likely to use a gun with a black person than with a white person. However, given that many black men have a much higher rate of interaction with police (such as, anecdotally, Philando Castile, with 52 traffic stops), then is it not fair to say that black men are disproportionately killed by cops?

The point is that there is no evidence of individual racism in these police encounters, despite the impression from many chilling phone videos. The structural bias still exists as a whole, as I said, but the narrative about cops being more likely to kill a black member of the public when encountering him is false.

I have no criticism to make of the study (I have not analysed it in any depth, but it seems credibly and even impressively done), even if I find the premise absurd, that a single study of such a complex phenomenon could be “conclusive”. But its findings do not “trump” the data that black people make up 13% of the US population, but 31% of those killed during an arrest, and 42% of those killed during an arrest when unarmed. The point is, what these facts (and many others) mean jointly depends on what we think is the reason for black people being so much more likely to be arrested.
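To see what the percentages above amount to per head of population, the shares can be turned into a rate ratio. This is plain arithmetic on the three figures quoted in the text, comparing black people with everyone else; it says nothing about the cause, which is exactly the open question. The function name is hypothetical.

```python
def per_capita_rate_ratio(share_of_population, share_of_deaths):
    """Ratio of the per-capita death rate in a group to the rate in the rest
    of the population, computed from the group's shares alone."""
    group_rate = share_of_deaths / share_of_population
    rest_rate = (1.0 - share_of_deaths) / (1.0 - share_of_population)
    return group_rate / rest_rate


# Shares quoted in the text: 13% of population, 31% and 42% of deaths.
ratio_all = per_capita_rate_ratio(0.13, 0.31)
ratio_unarmed = per_capita_rate_ratio(0.13, 0.42)
print(f"killed during arrest: {ratio_all:.1f} times the rate of everyone else")
print(f"killed during arrest while unarmed: {ratio_unarmed:.1f} times")
```

On these figures the per-capita rate is about three times that of the rest of the population overall, and nearly five times for those who were unarmed.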

More damned statistics

The news website Vox published this chart last year, by Dara Lind, based on FBI data on people killed by police during arrest. The most chilling thing about it is that refined statistical analysis on people killed by police is possible, with all kinds of elaborate subgroup analyses. That’s because there were 426 cases in that year. In general I’m all in favour of more data, which makes it possible to study problems in a more refined way, but I’m happy that the statistics gathered by the Independent Police Complaints Commission don’t have much to work with: In the same year there were 15 deaths in or following police custody in England and Wales.

UPDATE: I thought the US number seemed surprisingly small: only about 5 times the UK number on a per capita basis, despite the fact that British police don’t routinely carry firearms. In fact, The Guardian’s documentation of all police killings in the US lists 1146 people killed by police in 2015. I presume this has to do with the fact that the FBI statistic only counts people killed during arrest.
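The per capita comparison is easy to reproduce. The population figures in this sketch are rough assumptions that do not come from the post (about 320 million for the US, about 57 million for England and Wales), so the ratios are only indicative. Note too that the Guardian count is for 2015, which need not be the year of the other two figures.

```python
# Assumed populations, in millions (approximate, not taken from the post).
US_POPULATION_M = 320.0
ENGLAND_WALES_POPULATION_M = 57.0


def deaths_per_million(deaths, population_millions):
    """Deaths per million inhabitants."""
    return deaths / population_millions


uk_rate = deaths_per_million(15, ENGLAND_WALES_POPULATION_M)
fbi_rate = deaths_per_million(426, US_POPULATION_M)
guardian_rate = deaths_per_million(1146, US_POPULATION_M)

fbi_ratio = fbi_rate / uk_rate
guardian_ratio = guardian_rate / uk_rate
print(f"FBI figure: {fbi_ratio:.1f} times the England and Wales rate")
print(f"Guardian figure: {guardian_ratio:.1f} times the England and Wales rate")
```

With these assumed populations the FBI count gives a ratio of about 5, as stated, while the Guardian count gives a ratio well above 10.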

Damned statistics

At a conference talk on the “reproducibility crisis” in psychology, the speaker quoted a relative issuing the commonplace anti-statistics apothegm “You can prove anything with statistics”. It’s a funny sort of claim, because it is self-undermining. Outside of a seminar on Popperian scientific philosophy no one would say “you can prove anything with numerology” or “you can prove anything with astrology”. Those who are not in thrall to these methods of divination find them either entertaining or ridiculous, but 

Is it because statistics is too abstruse for ordinary people to criticise? No one says, you can prove anything with quantum mechanics. Or, for that matter, mathematics.

Statistics is sufficiently precise and rigid and generally reliable to be authoritative, but leaves enough flexibility for experts to disagree and for manipulative misapplications to still hew close to standard procedure, and sufficiently abstruse that most people can’t figure out whether they’re being manipulated.

An interesting parallel is the Shakespearean dictum “The devil can cite scripture for his purpose.”

Don Quixote on sampling bias

Continuing my series on modern themes that were already thoroughly treated in Don Quixote, here is the passage where Don Quixote and Sancho Panza discuss whether it is better to be a knight errant or a monk:

“Señor, it is better to be an humble little friar of no matter what order, than a valiant knight-errant; with God a couple of dozen of penance lashings are of more avail than two thousand lance-thrusts, be they given to giants, or monsters, or dragons.”

“All that is true,” returned Don Quixote, “but we cannot all be friars, and many are the ways by which God takes his own to heaven; chivalry is a religion, there are sainted knights in glory.”

“Yes,” said Sancho, “but I have heard say that there are more friars in heaven than knights-errant.”

“That,” said Don Quixote, “is because those in religious orders are more numerous than knights.”

Rapid growth

A lot of EU citizens who live in Britain are worried that they will be forced out if the UK voters decide next month to withdraw from the EU. The Leave campaign dismisses this, and all concerns that anyone might have about this radical step, as “Project Fear”:

Clearly any EU citizen that is legally here if we come out of the EU would absolutely have the right to remain here. Any other suggestion is just absurd.

Given that the main point of Brexit is to reduce immigration from the Continent, and given that tempers are likely to flare when the fate of said migrants (on both sides) is negotiated, and given that current UK law clearly would not give most of the EU citizens who are here the right to permanent residency, it’s clearly not absurd to worry. To adapt an old saw, even those whom political campaigners are trying to make paranoid have real reasons to worry.

Well, from the NY Times, here’s some non-evidence:

Rose Carey, the head of immigration at Charles Russell Speechlys, a global law firm based in London, said she had seen an “unprecedented amount” of applications for British citizenship in the last few months.

“Historically, E.U. nationals didn’t really bother applying for a British passport,” she said. “It used to be a couple hundred a year to now five queries a week.”

From a couple of hundred a year to five a week — that’s pretty rapid growth!
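Putting both numbers on the same time scale shows how rapid the growth is. A small sketch, taking "a couple of hundred" to mean roughly 200 (an assumption) and a year to have 52 weeks:

```python
WEEKS_PER_YEAR = 52


def annualised(per_week):
    """Convert a weekly count into a yearly one."""
    return per_week * WEEKS_PER_YEAR


before_per_year = 200           # assumed reading of "a couple hundred a year"
now_per_year = annualised(5)    # "five queries a week"
growth_factor = now_per_year / before_per_year
print(f"now: {now_per_year} a year, {growth_factor:.1f} times the earlier level")
```

Five a week is 260 a year, so the "unprecedented" surge is an increase of about 30% on the assumed earlier level, and that is before asking whether queries and applications are the same thing.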
