Occasional reflections on Life, the World, and Mathematics

Posts tagged ‘statistics’

Parliamentary mortality

An article in the New Statesman raised the question of whether the Conservatives could lose their hold on power via by-elections over the next few years only to dismiss the possibility because by-elections simply don’t happen frequently enough. The reason? Reduced mortality rates. Quite sensible, but then this strange claim was made:

In 1992-7, the last time that the Conservatives had seven by-elections in a parliament, life expectancy was 15 years lower than it is today.

Ummm… If life expectancy had increased by 15 years over the last 20 years, we’d be getting close to achieving mortality escape velocity. In fact, the increase has been about 5 years for men and 4 years for women.

But that raised for me the somewhat morbid question: How many MPs would be expected to die in the next 5 years? An approximate age distribution of MPs is available here. It’s for the last parliament, but I’ll assume it remains pretty similar. It’s interesting that Labour had twice as large a proportion (25% vs 12%) in the over-60 category. In addition, I’ll make the following assumptions:

  1. Within coarse age categories the distribution is the same between parties. (This is required to deal with the fact that the numbers by party are only divided into three age categories.)
  2. Since I don’t have detailed mortality data by class or occupation, I’ll simply treat MPs as being 5 years younger than their calendar age, since that’s the difference in median age at death between men in managerial occupations and men in average occupations.
  3. I assume women to have the same age distribution as men.
  4. I’m using 2013 mortality rates from the Human Mortality Database.

My calculations say that the expected number of deaths over the next 5 years is about 6.4 Conservatives and 6.5 Labour. So we can estimate that the probability of at least 7 by-elections due to deceased Tory MPs is just a shade under 50%.
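A rough check on that last figure, as a minimal sketch: it treats the number of Conservative deaths as approximately Poisson with mean 6.4. The Poisson model and the helper name `poisson_tail` are assumptions made for illustration; the original calculation may well have combined individual death probabilities instead.

```python
import math

def poisson_tail(mean, k):
    """Return P(X >= k) for X ~ Poisson(mean)."""
    below = sum(math.exp(-mean) * mean**i / math.factorial(i) for i in range(k))
    return 1.0 - below

# 6.4 is the expected number of Conservative deaths quoted in the post.
# Modelling the count as Poisson is an assumption of this sketch.
p_seven_or_more = poisson_tail(6.4, 7)
print(f"P(at least 7 deaths) = {p_seven_or_more:.3f}")
```

Under that assumption the probability comes out at roughly 46%, which fits "a shade under 50%".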

Good words

There has been a lot of reporting on this recent poll, where people were asked what word first came to mind when they thought of President Trump. Here are the top 20 responses (from 1,079 American adults surveyed):

word          responses
idiot         39
incompetent   31
liar          30
leader        25
unqualified   25
president     22
strong        21
businessman   18
ignorant      16
egotistical   15
asshole       13
stupid        13
arrogant      12
trying        12
bully         11
business      11
narcissist    11
successful    11
disgusting    10
great         10

The fact that idiot, incompetent, and liar head the list isn’t great for him. But Kevin Drum helpfully coded the words into “good” and “bad”.

What strikes me is that even the “good” words aren’t really very good. If you’re asked what word first comes to mind when you think of President Trump and you answer president, that sounds to me more passive-aggressive than positive. Similarly, you need a particular ideological bent to consider businessman and business to be inherently positive qualities. Leader — I don’t know, I guess der Führer is a positive figure for those who admire that sort of thing. Myself, I prefer to know where we’re being led. If we include that one, there are 4 positive words, 4 neutral words, and 12 negative. (I’m including trying as neutral because I don’t know if people mean “working hard to do his job well”, which sounds like at least a back-handed compliment, or “trying my patience”.)
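The tally in that paragraph can be reproduced with a short sketch. The counts are copied from the list above, and the positive and neutral sets follow the coding described in the paragraph (with leader counted as positive); the function name `tally` is just illustrative.

```python
# Responses per word, copied from the top-20 list.
counts = {
    "idiot": 39, "incompetent": 31, "liar": 30, "leader": 25,
    "unqualified": 25, "president": 22, "strong": 21, "businessman": 18,
    "ignorant": 16, "egotistical": 15, "asshole": 13, "stupid": 13,
    "arrogant": 12, "trying": 12, "bully": 11, "business": 11,
    "narcissist": 11, "successful": 11, "disgusting": 10, "great": 10,
}

# Coding as described in the post; every other word is treated as negative.
positive = {"leader", "strong", "successful", "great"}
neutral = {"president", "businessman", "business", "trying"}

def tally(counts, positive, neutral):
    """Count words and responses in each sentiment category."""
    words = {"positive": 0, "neutral": 0, "negative": 0}
    responses = {"positive": 0, "neutral": 0, "negative": 0}
    for word, n in counts.items():
        if word in positive:
            label = "positive"
        elif word in neutral:
            label = "neutral"
        else:
            label = "negative"
        words[label] += 1
        responses[label] += n
    return words, responses

words, responses = tally(counts, positive, neutral)
print(words)
print(responses)
```

Counting respondents rather than words gives 67 positive, 63 neutral and 226 negative among these top-20 answers, so the weighted picture is no kinder than the word count.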

Montaigne on random controlled experiments

In the past I’ve read a few individual essays by Montaigne, but lately I’ve been really enjoying reading them systematically, partly listening to the English-language audiobook, partly reading the lovely annotated French edition by Jean Céard et al. It’s fascinating to see the blend of inaccessibly foreign worldview with ideas that seem at times astoundingly modern. For example, in the essay titled “On the resemblance of children to their fathers” (which seems to have almost nothing at all to say about the resemblance of children to their fathers), in the course of disparaging contemporary medicine Montaigne suddenly anticipates the need for random controlled trials, while at the same time despairing of such a daunting intellectual project. After acknowledging a few minor cases in which physicians seem to have learned something from experience he continues:

Mais en la plus part des autres experiences, à quoy ils disent avoir esté conduis par la fortune, et n’avoir eu autre guide que le hazard, je trouve le progrez de cette information incroyable. J’imagine l’homme, regardant au tour de luy le nombre infiny des choses, plantes, animaux, metaulx. Je ne sçay par où luy faire commencer son essay : et quand sa premiere fantasie se jettera sur la corne d’un elan, à quoy il faut prester une creance bien molle et aisée : il se trouve encore autant empesché en sa seconde operation. Il luy est proposé tant de maladies, et tant de circonstances, qu’avant qu’il soit venu à la certitude de ce poinct, où doit joindre la perfection de son experience, le sens humain y perd son Latin : et avant qu’il ait trouvé parmy cette infinité de choses, que c’est cette corne : parmy cette infinité de maladies, l’epilepsie : tant de complexions, au melancholique : tant de saisons, en hyver : tant de nations, au François : tant d’aages, en la vieillesse : tant de mutations celestes, en la conjonction de Venus et de Saturne : tant de parties du corps au doigt. A tout cela n’estant guidé ny d’argument, ny de conjecture, ny d’exemple, ny d’inspiration divine, ains du seul mouvement de la fortune, il faudroit que ce fust par une fortune, parfaictement artificielle, reglée et methodique. Et puis, quand la guerison fut faicte, comment se peut il asseurer, que ce ne fust, que le mal estoit arrivé à sa periode ; ou un effect du hazard ? ou l’operation de quelque autre chose, qu’il eust ou mangé, ou beu, ou touché ce jour là ? ou le merite des prieres de sa mere-grand ? Davantage, quand cette preuve auroit esté parfaicte, combien de fois fut elle reiterée ? et cette longue cordée de fortunes et de rencontres, r’enfilée, pour en conclure une regle.

But in most other experiences, where they claim to have been led by accidents, having no other guide than chance, I find the progress of this information hard to believe. I imagine a man looking about him at the infinite number of things, plants, animals, metals. I don’t know where he would start. And when his first whim took him to an elk horn, which might be easy to believe in, he would find his second step blocked: There are so many diseases, so many individual circumstances, that before he could arrive at any certainty on this point, he will have arrived at the end of human sense: before he could find, among this infinity of things, that it is this horn; among the infinity of diseases, epilepsy; among the individual conditions, the melancholic temperament; among all the seasons, winter; among all the nations, the French; among all the ages, the elderly; among all the astrological conditions, the conjunction of Venus and Saturn; among all the parts of the body, the finger. And all of this, being led by no argument, by no conjecture, by no prior examples, by no divine inspiration, but purely by chance, it must be achieved by the most completely artificial, methodical and regulated turn of chance. And suppose the cure has been accomplished, how could you tell whether the disease might not have simply run its course, or the improvement occurred purely by chance? Or if it might not have been the effect of some other factor, something he ate, or drank, or touched on that day? Or the merit of his grandmother’s prayers? And if you could provide complete proof in one case, how many times would you need to repeat the trial, and this long series of random encounters, before you could conclusively determine the rule?

Post-existing climate conditions

According to the NY Times, insurers have been taking advantage of climate-change fears to raise prices for flood insurance. Now that the presidential election has conclusively proved that the greenhouse effect is a Chinese hoax to make Americans look stupid, or rather less productive, I think the Congress needs to move beyond minor defensive measures like abandoning the Paris accord, and move instead to aggressively defend Americans’ God-given right to build decadent structures in flood zones: Just as health insurers are now prohibited from inquiring about or taking account of “pre-existing conditions”, flood insurers need to be prohibited from taking account of (hoax) research about “post-existing” (future) climate conditions in determining flood insurance prices. Prices may be based only on past flood records.

This can be combined into a single consumer-rights bill with Mike Pence’s initiative to ban life insurance premiums that discriminate against tobacco users. As Pence wrote in 2000,

Time for a quick reality check. Despite the hysteria from the political class and the media, smoking doesn’t kill… Nine out of ten smokers do not contract lung cancer.

What’s all this hysteria for? Smoking is even safer than Russian Roulette. (Five out of six players don’t get shot!)

Making sense of the predictions

I absolutely agree that Sam Wang and the Princeton Election Consortium have a good argument for there being a 99+% chance of Clinton winning. Unfortunately, I think there’s only about a 50% chance of his argument being right. It could also be that Nate Silver is right, that there is a 65% chance. Putting that all together, I come down right about where the NY Times is, at about an 85% chance of escaping apocalypse.
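The blend described above is a two-model mixture. A minimal sketch, assuming exactly equal 50/50 weight on the two forecasts (the post says only "about a 50% chance", and the labels in the dictionary are illustrative):

```python
# (weight, forecast probability of a Clinton win) for each model.
# Equal weights are an assumption based on "about a 50% chance".
models = {
    "PEC": (0.5, 0.99),
    "538": (0.5, 0.65),
}

blended = sum(weight * prob for weight, prob in models.values())
print(f"Blended probability: {blended:.2f}")
```

With those inputs the mixture is 0.82, in the same neighbourhood as the figure of about 85% quoted above.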

I’ve written a bit more about how I think about the likelihoods here. But a fundamental problem with the PEC estimate is that it clearly puts very little weight on the possibility of model failure. A fundamental problem with the 538 estimates is that they are very clearly not martingales. That is, they are not consistent predictions of the future based on all available information. One way of saying this is to note that a few weeks ago Clinton had close to a 90% estimated victory probability. Now it’s 65%. That seems like a modest change, but if the first estimate was correct, the current estimate reflects an event that had less than a 1 in 3 chance of happening. So we’re more than halfway there. But does anyone really think that the events of the past month have been that unlikely?
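The "less than 1 in 3" figure can be recovered from a standard bound. If the forecast is a martingale, then so is the loss probability 1 - p, which is non-negative, so Markov's inequality caps the chance of it growing from 0.10 to 0.35 or more. A minimal sketch; the function name is illustrative and the reasoning is a reconstruction, not necessarily how the figure was originally derived:

```python
def max_prob_of_drop(start, now):
    """Upper bound on P(estimate <= now) for a martingale forecast that began at `start`.

    Applies Markov's inequality to the loss probability 1 - p, which is a
    non-negative martingale with mean 1 - start.
    """
    if not 0 < now <= start < 1:
        raise ValueError("expected 0 < now <= start < 1")
    return (1 - start) / (1 - now)

# Values from the post: about 90% a few weeks ago, 65% now.
bound = max_prob_of_drop(0.90, 0.65)
print(f"Chance of falling from 90% to 65% is at most {bound:.3f}")
```

The bound is 0.10 / 0.35, a little under 29%, which is indeed less than 1 in 3.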

4 digits of separation

Conspiracy theorists are working overtime to discredit all the women who report having been molested by Donald Trump. (Trump’s near-legendary non-disclosure and non-disparagement clauses in all contracts, which pretty much exclude reports from any woman who ever worked for him — and even campaign volunteers — are the only thing keeping the numbers reasonably manageable.) The “pussy” video that kicked this all off was released as part of a joint plot by international Zionists and the gnomes of Zurich. And the woman who was groped while sitting next to Trump on a plane was lying (because supposedly first-class armrests in 1980s planes didn’t go up) and was an agent of the Clinton Foundation, since her telephone number (a convenient excuse for exposing her private information) is identical to one for a staff member at the foundation. Except,

While the article Delauzon’s tweet linked to claims that Leeds shared a phone number with the Clinton Foundation, the two phone numbers differed by several digits.

But obviously the story doesn’t end there. Granted, she was not actually working for the Clinton Foundation. You have to ask yourself: what are the odds that someone who was supposedly not connected at all with that organisation would happen to have a telephone number that was so similar? The question answers itself.

Donald Trump’s prodigious prostate

Let us accept for a moment the claim that Donald Trump’s medical condition is uniformly excellent. You might still expect that random medical test results should be about average for a healthy man. (Not meaning BP or heart rate or god-forbid testosterone, which you would expect to reflect his hyperpowerful masculinity.) I was looking at this report, released a few weeks ago, which included Trump’s test result for PSA (prostate-specific antigen). High levels can be a sign of an enlarged prostate, or of prostate cancer. But Trump’s doctor reports his level at 0.15. According to this study, men over 70 with a normal prostate have a median level of 1.9 (it doesn’t seem to depend much on age above 70). If we make the very conservative assumption that the density of PSA values falls off linearly from 1.9 down to 0, we would estimate that less than 1% of men have PSA scores below 0.2.
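The "less than 1%" estimate follows from the geometry of a linear density. A minimal sketch, assuming the density rises linearly from zero at PSA 0 up to the median of 1.9, with half of all men below the median; the function name is illustrative:

```python
def fraction_below(x, median):
    """P(PSA < x) when the density rises linearly from 0 at PSA = 0 up to the median.

    Half the probability mass lies below the median, and a linear density
    gives a cumulative probability proportional to x**2 on [0, median].
    """
    if not 0 <= x <= median:
        raise ValueError("x must lie between 0 and the median")
    return 0.5 * (x / median) ** 2

# Median of 1.9 and threshold of 0.2 are taken from the post.
p_below = fraction_below(0.2, 1.9)
print(f"Estimated share of men with PSA below 0.2: {p_below:.2%}")
```

Under this assumption the share is about 0.55%, and the share at or below the reported 0.15 is smaller still.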

Maybe Trump has no prostate.

Alternatively, he should really be compared with a younger reference group, because his pact with Mephistopheles and/or regular consumption of the blood of virgins keeps him youthful.

Statistics and causal truth: Police edition

As usual, Andrew Sullivan (who has now returned temporarily to blogging, attracted like a moth to the Trump conflagration) manages to take a common, superficially convincing argument, and express it with a moral fervour and personal conviction that make the tenuous logic really conspicuous. In this case, it’s the argument based on the much-discussed study by Roland G. Fryer, Jr. of the rate of various violent outcomes of police stops, finding that black people are more likely than white people to be physically abused by police, but not more likely to be shot.

(Here’s an excellent NY Times report, and the original study.)

…the Black Lives Matter activists, whose core and central argument is that black men are disproportionately killed by cops. The best data shows this is false… I find [the study] conclusive. Feelings do not, er, trump data in a deliberative democracy. A reader writes:

I understand that there has been the recent study suggesting that given an interaction with a police officer occurs, then the police officer is no more likely to use a gun with a black person than with a white person. However, given that many black men have a much higher rate of interaction with police (such as, anecdotally, Philando Castile, with 52 traffic stops), then is it not fair to say that black men are disproportionately killed by cops?

The point is that there is no evidence of individual racism in these police encounters, despite the impression from many chilling phone videos. The structural bias still exists as a whole, as I said, but the narrative about cops being more likely to kill a black member of the public when encountering him is false.

I have no criticism to make of the study (I have not analysed it in any depth, but it seems credibly and even impressively done), even if I find absurd the premise that a single study of such a complex phenomenon could be “conclusive”. But it does not “trump” the data that black people make up 13% of the US population, but 31% of those killed during an arrest, and 42% of those killed during an arrest when unarmed. The point is, what these facts (and many others) mean jointly depends on what we think is the reason for black people being so much more likely to be arrested.
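To put those percentages on a per-capita footing, here is a minimal sketch that compares the implied death rate for black people with the rate for everyone else. Treating the rest of the population as a single comparison group is a simplifying assumption of this sketch, and the function name is illustrative.

```python
def per_capita_ratio(share_of_deaths, share_of_population):
    """Ratio of a group's per-capita death rate to that of the rest of the population."""
    group_rate = share_of_deaths / share_of_population
    rest_rate = (1 - share_of_deaths) / (1 - share_of_population)
    return group_rate / rest_rate

# Shares quoted in the post: 13% of population, 31% of deaths during arrest,
# 42% of deaths during arrest when unarmed.
ratio_all = per_capita_ratio(0.31, 0.13)
ratio_unarmed = per_capita_ratio(0.42, 0.13)
print(f"Killed during arrest: {ratio_all:.1f}x the rate of everyone else")
print(f"Killed during arrest while unarmed: {ratio_unarmed:.1f}x")
```

The per-capita rate is the encounter rate multiplied by the risk per encounter. If the risk per encounter really were equal across groups, a per-capita ratio of about 3 would have to come entirely from a roughly threefold difference in how often encounters happen, which is exactly the quantity whose cause is in dispute.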


More damned statistics

The news website Vox published this chart last year, by Dara Lind, based on FBI data on people killed by police during arrest. The most chilling thing about it is that refined statistical analysis on people killed by police is possible, with all kinds of elaborate subgroup analyses. That’s because there were 426 cases in that year. In general I’m all in favour of more data, which makes it possible to study problems in a more refined way, but I’m happy that the statistics gathered by the Independent Police Complaints Commission don’t have much to work with: In the same year there were 15 deaths in or following police custody in England and Wales.

UPDATE: I thought the US number seemed surprisingly small, only about 5 times the UK number on a per capita basis, despite the fact that British police don’t routinely carry firearms. In fact, The Guardian’s documentation of all police killings in the US lists 1146 people killed by police in 2015. I presume this has to do with the fact that the FBI statistic only counts people killed during arrest.
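The per-capita comparison can be sketched as follows. The population figures are rough assumptions added for illustration (about 321 million for the US and about 58 million for England and Wales); they do not come from the post, and the two death counts are not defined in the same way.

```python
# Rough mid-2010s populations in millions. These are assumptions of this
# sketch, not figures from the post.
US_POPULATION_M = 321
ENGLAND_WALES_POPULATION_M = 58

def deaths_per_million(deaths, population_millions):
    """Deaths per million residents."""
    return deaths / population_millions

uk_rate = deaths_per_million(15, ENGLAND_WALES_POPULATION_M)
fbi_ratio = deaths_per_million(426, US_POPULATION_M) / uk_rate
guardian_ratio = deaths_per_million(1146, US_POPULATION_M) / uk_rate

print(f"FBI count: {fbi_ratio:.1f}x the England and Wales rate")
print(f"Guardian count: {guardian_ratio:.1f}x the England and Wales rate")
```

With these populations the FBI figure is about 5 times the England and Wales rate, matching the estimate above, while the Guardian count is closer to 14 times.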

Damned statistics

At a conference talk on the “reproducibility crisis” in psychology, the speaker quoted a relative issuing the commonplace anti-statistics apothegm “You can prove anything with statistics”. It’s a funny sort of claim, because it is self-undermining. Outside of a seminar on Popperian scientific philosophy no one would say “you can prove anything with numerology” or “you can prove anything with astrology”. Those who are not in thrall to these methods of divination find them either entertaining or ridiculous, but 

Is it because statistics is too abstruse for ordinary people to criticise? No one says, you can prove anything with quantum mechanics. Or, for that matter, mathematics.

Statistics is sufficiently precise, rigid and generally reliable to be authoritative, yet it leaves enough flexibility for experts to disagree and for manipulative misapplications to still hew close to standard procedure, and it is sufficiently abstruse that most people can’t figure out whether they’re being manipulated.

An interesting parallel is the Shakespearean dictum “The devil can cite scripture for his purpose.”
