Occasional reflections on Life, the World, and Mathematics

Posts tagged ‘statistics’

Natural frequencies and individual propensities

I’ve just been reading Gerd Gigerenzer’s book Reckoning with Risk, about risk communication, mainly a plaidoyer for the use of “natural frequencies” in place of probabilities: statements in the form “In how many cases out of 100 similar cases of X would you expect Y to happen”. He cites one study of forensic psychiatry experts who were presented with a case study and asked to estimate the likelihood of the individual being violent in the next six months. Half the subjects were asked “What is the probability that this person will commit a violent act in the next six months?” The other half were asked “How many out of 100 women like this patient would commit a violent act in the next six months?” Looking at these questions, it was obvious to me that the latter question would elicit lower estimates. Which is indeed what happened: The average response to the first question was about 0.3; the average response to the second was about 20.

What surprised me was that Gigerenzer seemed perplexed by this consistent difference in one direction (though, obviously, not by the fact that the experts were confused by the probability statement). He suggested that those answering the first question were thinking about the same patient being released multiple times, which didn’t make much sense to me.

What I think is that the experts were thinking of the individual probability as a hidden fact, not a statistical statement. Asked to estimate this unknown probability it seems natural that they would be cautious: thinking it’s somewhere between 10 and 30 percent they would not want to underestimate this individual’s probability, and so would conservatively state the upper end. This is perfectly consistent with them thinking that, averaged over 100 cases they could confidently state that about 20 would commit a violent act.

Male nurses and politically incorrect comments on gender

I was just reading this article by journalist Conor Friedersdorf, complaining about how Canadian psychologist Jordan Peterson is being unfairly treated by journalists, who try to twist his subtle anti-feminist arguments into crude anti-feminist slurs. He certainly has a point. But then one comes to comments like this:

[Interviewer]: Is gender equality desirable?

Peterson: If it means equality of outcome then it is almost certainly undesirable. That’s already been demonstrated in Scandinavia. Men and women won’t sort themselves into the same categories if you leave them to do it of their own accord. It’s 20 to 1 female nurses to male, something like that. And approximately the same male engineers to female engineers. That’s a consequence of the free choice of men and women in the societies that have gone farther than any other societies to make gender equality the purpose of the law. Those are ineradicable differences––you can eradicate them with tremendous social pressure, and tyranny, but if you leave men and women to make their own choices you will not get equal outcomes.

20 to 1? That seems really high. For nurses and for engineers. So I decided to do something rude, and check the numbers. For nurses, I found these statistics. There’s a lot of variation in Scandinavia. In Denmark it seems like about 20:1 female to male. But in Norway it’s 9:1. In Iceland it’s 100:1. Looking further afield, in Israel and Italy 20% of nurses are male. And in the Netherlands nearly 25%. This does not look like an ineradicable difference to me. It looks like path dependence and social context.

What about engineers? Here Peterson is, to use the technical term, talking out of his ass. There is no country in the EU with such an extreme gender imbalance for engineers: The most extreme is the UK, with about a 10:1 male to female ratio. In Sweden it’s 3:1, in Norway 4:1, and in Denmark 5:1. In Latvia the fraction of female engineers is up to 30%.
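Since the figures above mix ratios (20:1) with percentages (20% male), it may help to put them on one scale. Here is a minimal Python sketch; the helper name `minority_share` is my own invention, and the inputs are just the rounded figures quoted above, not fresh data.

```python
def minority_share(ratio):
    """Convert a majority:minority ratio of `ratio`:1 into the minority's share."""
    return 1 / (ratio + 1)

# Female:male nurse ratios quoted above (rounded figures from the text)
for country, ratio in [("Denmark", 20), ("Norway", 9), ("Iceland", 100)]:
    print(f"{country}: {minority_share(ratio):.1%} of nurses are male")

# Male:female engineer ratios quoted above (rounded figures from the text)
for country, ratio in [("UK", 10), ("Sweden", 3), ("Norway", 4), ("Denmark", 5)]:
    print(f"{country}: {minority_share(ratio):.1%} of engineers are female")
```

On this scale a 20:1 ratio is a minority share of just under 5%, while the 20% male nurses in Israel and Italy correspond to a ratio of 4:1.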

I think, if you want to make provocative “I’m just trying to be rational here” public arguments, you kind of have an obligation not to make up your supporting facts.

Why people hate statisticians

Andrew Dilnot, former head of the UK Statistics Authority and current warden (no really!) of Nuffield College, gave a talk here last week, at our annual event honouring Florence Nightingale qua statistician. The ostensible title was “Numbers and Public policy: Why statistics really matter”, but the title should have been “Why people hate statisticians”. This was one of the most extreme versions I’ve ever seen of a speaker shopping trite (mostly right-wing) political talking points by dressing them up in statistics to make the dubious assertions seem irrefutable, and to make the trivially obvious look ingenious.

I don’t have the slides from the talk, but video of a similar talk is available here. He spent quite a bit of his talk trying to debunk the Occupy Movement’s slogan that inequality has been increasing. The 90:10 ratio bounced along near 3 for a while, then rose to 4 during the 1980s (the Thatcher years… who knew?!), and hasn’t moved much since. Case closed. Oh, but wait, what about other measures of inequality, you may ask. And since you might ask, he had to set up some straw men to knock down. He showed the same pattern for five other measures of inequality. Case really closed.

Except that these five were all measuring the same thing, more or less. The argument people like Piketty have been making is not that the 90th percentile has been doing so much better than the 10th percentile, but that increases in wealth have been concentrated in ever smaller fractions of the population. None of the measures he looked at was designed to capture that process. The Gini coefficient, which looks like it measures the whole distribution because it is a population average, is actually extremely insensitive to extreme concentration at the high end. Suppose the top 1% has 20% of the income. Changes of distribution within the top 1% cannot shift the Gini coefficient by more than about 3% of its current value. He also showed the 95:5 ratio, and lo and behold, that kept rising through the 90s, then stopped. All consistent with the main critique of rising income inequality.
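To get a feel for how little the Gini coefficient reacts to concentration within the top 1%, here is a toy calculation in Python. Everything about the population is an assumption made for illustration: 10,000 people, the bottom 99% sharing 80% of income equally, and the top 1% holding the remaining 20% either equally or with one person taking as much as possible while the other 99 stay level with everyone else.

```python
def gini(incomes):
    """Gini coefficient via the sorted-rank formula."""
    xs = sorted(incomes)
    n = len(xs)
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * sum(xs)) - (n + 1) / n

n_bottom, n_top = 9_900, 100      # toy population of 10,000 (assumption)
floor = 80 / n_bottom             # bottom 99% share 80 units equally (assumption)

# Top 1% holds 20 units, split equally among its 100 members
equal_top = [floor] * n_bottom + [20 / n_top] * n_top

# Same 20 units, but 99 members sit at the floor and one person takes the rest
one_winner = [floor] * n_bottom + [floor] * (n_top - 1) + [20 - floor * (n_top - 1)]

g_equal, g_winner = gini(equal_top), gini(one_winner)
rel_change = (g_winner - g_equal) / g_equal
print(f"Gini, equal top 1%: {g_equal:.4f}")
print(f"Gini, one winner:   {g_winner:.4f}")
print(f"Relative change:    {rel_change:.1%}")
```

In this toy setup, going from perfect equality to near-total concentration within the top 1% moves the Gini coefficient by only about 1% of its value. A more realistic, more unequal bottom 99% would raise the baseline Gini, which should make the relative shift smaller still.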

Since he’s obviously not stupid, and obviously understands economics much better than I do, it’s hard to avoid thinking that this was all smoke and mirrors, intended to lull people to sleep about rising inequality, under the cover of technocratic expertise. It’s a well-known trick: Ignore the strongest criticism of your point of view, and give lots of details about weak arguments. Mathematical details are best. “Just do the math” is a nice slogan. Sometimes simple (or complex) calculations can really shed light on a problem that looks to be inextricably bound up with political interests and ideologies. But sometimes not. And sometimes you just have to accept that a political economic argument needs to be melded with statistical reasoning, and you have to be open about the entirety of the argument.

Small samples

New York Republican Representative Lee Zeldin was asked by reporter Tara Golshan how he felt about the fact that polls seem to show that a large majority of Americans — and even of Republican voters — oppose the Republican plan to reduce corporate tax rates. His response:

What I have come in contact with would reflect different numbers. So it would be interesting to see an accurate poll of 100 million Americans. But sometimes the polls get done of 1,000 [people].

Yes, that does seem suspicious, only asking 1,000 people… The 100 million people he has come in contact with are probably more typical.
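For the record, here is roughly what a sample of 1,000 buys you. This is a minimal Python sketch using the textbook normal approximation for a proportion; it assumes a simple random sample and the worst case of a 50/50 split, which real polls (with weighting and non-response) only approximate. The function name `margin_of_error` is mine.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion from a simple random sample."""
    return z * math.sqrt(p * (1 - p) / n)

for n in (1_000, 100_000_000):
    print(f"n = {n:>11,}: +/- {margin_of_error(n):.2%}")
```

A poll of 1,000 is good to within about 3 percentage points under these assumptions, which is plenty to distinguish a large majority from a minority. Polling 100 million people would shrink the sampling error to around a hundredth of a percentage point, at somewhat greater expense.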

Parliamentary mortality

An article in the New Statesman raised the question of whether the Conservatives could lose their hold on power via by-elections over the next few years, only to dismiss the possibility because by-elections simply don’t happen frequently enough. The reason? Reduced mortality rates. Quite sensible, but then this strange claim was made:

In 1992-7, the last time that the Conservatives had seven by-elections in a parliament, life expectancy was 15 years lower than it is today.

Ummm… If life expectancy had increased by 15 years over the last 20 years, we’d be getting close to achieving mortality escape velocity. In fact, the increase has been about 5 years for men and 4 years for women.

But that raised for me the somewhat morbid question: How many MPs would be expected to die in the next 5 years? Approximate age distribution of MPs is available here. It’s for the last parliament, but I’ll assume it remains pretty similar. It’s interesting that Labour had twice as large a proportion (25% vs 12%) in the over-60 category. In addition, I’ll make the following assumptions:

  1. Within coarse age categories the distribution is the same between parties. (This is required to deal with the fact that the numbers by party are only divided into three age categories.)
  2. Since I don’t have detailed mortality data by class or occupation, I’ll simply treat them as being 5 years younger than their calendar age, since that’s the difference in median age at death between men in managerial occupations and average occupations.
  3. I assume women to have the same age distribution as men.
  4. I’m using 2013 mortality rates from the Human Mortality Database.

My calculations say that the expected number of deaths over the next 5 years is about 6.4 Conservatives and 6.5 Labour. So we can estimate that the probability of at least 7 by-elections due to deceased Tory MPs is just a shade under 50%.
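The last step can be sketched in a few lines of Python. The assumption here, which is mine for the purpose of illustration, is that the number of deaths is approximately Poisson with the expected value given above; that is a standard approximation for a sum of many small independent risks, but not necessarily exactly how the figure above was obtained.

```python
import math

def prob_at_least(k, mean):
    """P(X >= k) for X ~ Poisson(mean)."""
    below = sum(math.exp(-mean) * mean**i / math.factorial(i) for i in range(k))
    return 1 - below

# Expected deaths over 5 years, taken from the paragraph above
p_tory = prob_at_least(7, 6.4)
print(f"P(at least 7 Conservative deaths) = {p_tory:.3f}")
```

Under the Poisson assumption the probability comes out at roughly 0.46, consistent with "a shade under 50%".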

Good words

There has been a lot of reporting on this recent poll, where people were asked what word first came to mind when they thought of President Trump. Here are the top 20 responses (from 1,079 American adults surveyed):

idiot         39
incompetent   31
liar          30
leader        25
unqualified   25
president     22
strong        21
businessman   18
ignorant      16
egotistical   15
asshole       13
stupid        13
arrogant      12
trying        12
bully         11
business      11
narcissist    11
successful    11
disgusting    10
great         10

The fact that idiot, incompetent, and liar head the list isn’t great for him. But Kevin Drum helpfully coded the words into “good” and “bad”:

What strikes me is that even the “good” words aren’t really very good. If you’re asked what word first comes to mind when you think of President Trump and you answer president, that sounds to me more passive-aggressive than positive. Similarly, you need a particular ideological bent to consider businessman and business to be inherently positive qualities. Leader — I don’t know, I guess der Führer is a positive figure for those who admire that sort of thing. Myself, I prefer to know where we’re being led. If we include that one, there are 4 positive words, 4 neutral words, and 12 negative. (I’m including trying as neutral because I don’t know if people mean “working hard to do his job well”, which sounds like at least a back-handed compliment, or “trying my patience”.)
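Counting words treats idiot (39 mentions) the same as great (10), so it is worth tallying mentions as well. A short Python sketch follows; the assignment of words to categories is my reading of the paragraph above (leader counted as positive, trying as neutral), not Kevin Drum's coding.

```python
counts = {
    "idiot": 39, "incompetent": 31, "liar": 30, "leader": 25, "unqualified": 25,
    "president": 22, "strong": 21, "businessman": 18, "ignorant": 16,
    "egotistical": 15, "asshole": 13, "stupid": 13, "arrogant": 12, "trying": 12,
    "bully": 11, "business": 11, "narcissist": 11, "successful": 11,
    "disgusting": 10, "great": 10,
}

# Category assignments as described in the text above (an interpretation)
positive = {"leader", "strong", "successful", "great"}
neutral = {"president", "businessman", "business", "trying"}

totals = {"positive": 0, "neutral": 0, "negative": 0}
for word, n in counts.items():
    if word in positive:
        totals["positive"] += n
    elif word in neutral:
        totals["neutral"] += n
    else:
        totals["negative"] += n

print(totals)
```

By mentions rather than by words, the negative responses account for 226 of the 356 answers in the top 20, against 67 positive and 63 neutral.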

Montaigne on random controlled experiments

In the past I’ve read a few individual essays by Montaigne, but lately I’ve been really enjoying reading them systematically — partly listening to the English-language audiobook, partly reading the lovely annotated French edition by Jean Céard et al. It’s fascinating to see the blend of inaccessibly foreign worldview with ideas that seem at times astoundingly modern. For example, in the essay titled “On the resemblance of children to their fathers” (which seems to have almost nothing at all to say about the resemblance of children to their fathers), in the course of disparaging contemporary medicine Montaigne suddenly anticipates the need for random controlled trials — while at the same time despairing of such a daunting intellectual project. After acknowledging a few minor cases in which physicians seem to have learned something from experience he continues:

Mais en la plus part des autres experiences, à quoy ils disent avoir esté conduis par la fortune, et n’avoir eu autre guide que le hazard, je trouve le progrez de cette information incroyable. J’imagine l’homme, regardant au tour de luy le nombre infiny des choses, plantes, animaux, metaulx. Je ne sçay par où luy faire commencer son essay : et quand sa premiere fantasie se jettera sur la corne d’un elan, à quoy il faut prester une creance bien molle et aisée : il se trouve encore autant empesché en sa seconde operation. Il luy est proposé tant de maladies, et tant de circonstances, qu’avant qu’il soit venu à la certitude de ce poinct, où doit joindre la perfection de son experience, le sens humain y perd son Latin : et avant qu’il ait trouvé parmy cette infinité de choses, que c’est cette corne : parmy cette infinité de maladies, l’epilepsie : tant de complexions, au melancholique : tant de saisons, en hyver : tant de nations, au François : tant d’aages, en la vieillesse : tant de mutations celestes, en la conjonction de Venus et de Saturne : tant de parties du corps au doigt. A tout cela n’estant guidé ny d’argument, ny de conjecture, ny d’exemple, ny d’inspiration divine, ains du seul mouvement de la fortune, il faudroit que ce fust par une fortune, parfaictement artificielle, reglée et methodique Et puis, quand la guerison fut faicte, comment se peut il asseurer, que ce ne fust, que le mal estoit arrivé à sa periode ; ou un effect du hazard ? ou l’operation de quelque autre chose, qu’il eust ou mangé, ou beu, ou touché ce jour là ? ou le merite des prieres de sa mere-grand ? Davantage, quand cette preuve auroit esté parfaicte, combien de fois fut elle reiterée ? et cette longue cordée de fortunes et de rencontres, r’enfilée, pour en conclure une regle.

But in most other experiences, where they claim to have been led by accidents, having no other guide than chance, I find the progress of this information hard to believe. I imagine a man looking about him at the infinite number of things, plants, animals, metals. I don’t know where he would start. And when his first whim took him to an elk horn, which might be easy to believe in, he would find his second step blocked: There are so many diseases, so many individual circumstances, that before he could arrive at any certainty on this point, he will have arrived at the end of human sense: before he could find, among this infinity of things, that it is this horn; among the infinity of diseases, epilepsy; among the individual conditions, the melancholic temperament; among all the ages, the elderly; among all the astrological conditions, the conjunction of Venus and Saturn; among all the parts of the body, the finger. And all of this, being led by no argument, by no prior examples, by no divine inspiration, but purely by chance, it must be achieved by the most completely artificial, methodical and regulated turn of chance. And suppose the cure has been accomplished, how could you tell whether the disease might not have simply run its course, or the improvement occurred purely by chance? Or if it might not have been the effect of some other factor, something he ate, or drank, or touched on that day? Or the merit of his grandmother’s prayers? And if you could provide complete proof in one case, how many times would you need to repeat the trial, and this long series of random encounters, before you could conclusively determine the rule?

Post-existing climate conditions

According to the NY Times, insurers have been taking advantage of climate-change fears to raise prices for flood insurance. Now that the presidential election has conclusively proved that the greenhouse effect is a Chinese hoax to make Americans look stupid less productive, I think the Congress needs to move beyond minor defensive measures like abandoning the Paris accord, and move instead to aggressively defend Americans’ God-given right to build decadent structures in flood zones: Just as health insurers are now prohibited from inquiring about or taking account of “pre-existing conditions”, flood insurers need to be prohibited from taking account of (hoax) research about “post-existing” (future) climate conditions in determining flood insurance prices. Prices may be based only on past flood records.

This can be combined into a single consumer-rights bill with Mike Pence’s initiative to ban life insurance premiums that discriminate against tobacco users. As Pence wrote in 2000,

Time for a quick reality check. Despite the hysteria from the political class and the media, smoking doesn’t kill… Nine out of ten smokers do not contract lung cancer.

What’s all this hysteria for? Smoking is even safer than Russian Roulette. (Five out of six players don’t get shot!)

Making sense of the predictions

I absolutely agree that Sam Wang and the Princeton Election Consortium have a good argument for there being a 99+% chance of Clinton winning. Unfortunately, I think there’s only about a 50% chance of his argument being right. It could also be that Nate Silver is right, that there is a 65% chance. Putting that all together, I come down right about where the NY Times is, at about an 85% chance of escaping apocalypse.

I’ve written a bit more about how I think about the likelihoods here. But a fundamental problem with the PEC estimate is that it clearly puts very little weight on the possibility of model failure. A fundamental problem with the 538 estimates is that they are very clearly not martingales. That is, they are not consistent predictions of the future based on all available information. One way of saying this is to note that a few weeks ago Clinton had close to a 90% estimated victory probability. Now it’s 65%. That seems like a modest change, but if the first estimate was correct, the current estimate reflects an event that had less than a 1 in 3 chance of happening. So we’re more than halfway there. But does anyone really think that the events of the past month have been that unlikely?
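The "less than 1 in 3" figure can be reproduced with a standard bound, assuming that is the reasoning intended: if the estimates form a martingale, so does the estimated probability of a Clinton loss, and a nonnegative martingale that starts at q has at most a q/b chance of ever reaching level b (Ville's inequality, a form of Doob's maximal inequality). A minimal Python sketch, with a function name of my own choosing:

```python
def max_chance_of_reaching(start, level):
    """Upper bound on the chance that a nonnegative martingale starting at
    `start` ever reaches `level` (Ville's / Doob's maximal inequality)."""
    return min(1.0, start / level)

# Estimated probability of a Clinton loss, then and now (figures from the text)
q_then = 1 - 0.90
q_now = 1 - 0.65

bound = max_chance_of_reaching(q_then, q_now)
print(f"Chance of the loss probability ever reaching {q_now:.0%}: at most {bound:.3f}")
```

The bound is 0.10/0.35, or about 0.29. It says nothing about which particular news events moved the forecast, only that a self-consistent forecaster starting at 90% should see a swing this large fewer than three times in ten.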

4 digits of separation

Conspiracy theorists are working overtime to discredit all the women who report having been molested by Donald Trump. (Trump’s near-legendary non-disclosure and non-disparagement clauses in all contracts, which pretty much exclude reports from any woman who ever worked for him — and even campaign volunteers — are the only thing keeping the numbers reasonably manageable.) The “pussy” video that kicked this all off was released as part of a joint plot by international Zionists and the gnomes of Zurich. And the woman who was groped while sitting next to Trump on a plane was lying (because supposedly first-class armrests in 1980s planes didn’t go up) and was an agent of the Clinton Foundation, since her telephone number (a convenient excuse for exposing her private information) is identical to one for a staff member at the foundation. Except,

While the article Delauzon’s tweet linked to claims that Leeds shared a phone number with the Clinton Foundation, the two phone numbers differed by several digits.

But obviously the story doesn’t end there. Granted, she was not actually working for the Clinton Foundation. You have to ask yourself, what are the odds that someone who was supposedly not connected at all with that organisation would happen to have a telephone number that was so similar? The question answers itself.
