Oxbridge Delenda Est

I’ve been thinking for a long time that, for all their merits as individual institutions, and all the advantages they offer to their faculty (like myself), students, and alumni (like myself), the hierarchical structure of tertiary education that defines their role (a structure from which they benefit, and which they nurture) is fundamentally destructive.

I wrote an essay on this theme, and it has now appeared in the political magazine Current Affairs.

The poisoned roots of German anti-vax sentiment

I’ve long thought it odd that Germany, where politics is generally fairly rational and science education in particular is generally quite good, has such broad acceptance of homeopathy and a variety of other forms of quackery, and a special word — Schulmedizin (“academic medicine”) — to express a dismissive attitude toward what elsewhere would be called just “medicine”, or perhaps “evidence-based medicine”. I was recently looking into the history of this, and found that attacks on Schulmedizin — or “verjudete Schulmedizin” (Jewified academic medicine) — were as much a part of Nazi state science policy as “German mathematics” and “Aryan physics”.

Medicine in the Third Reich was a weird mixture of modern virology and pseudo-scientific “racial hygiene”. The celebrated physician Erwin Liek wrote:

Es ist mein Glaube, dass das deutsche Volk berufen ist, nach und nach eine ganz neue, rein deutsche Heilkunst zu entwickeln.
(It is my belief that the German people is called, step by step, to develop a completely new, purely German art of healing.)

Liek was appealing for a synthesis of Schulmedizin with traditional German treatment. As with Aryan physics*, the Nazi state was careful not to push the healthy German understanding so far as to undermine important technology and industry. But the appeal to average people’s intuitive discomfort with modern science was a powerful propaganda tool that they couldn’t resist using, as in the 1933 cartoon “The vaccination” from Der Stürmer, which shows an innocent blonde Aryan mother uncomfortably watching her baby being vaccinated by a fiendish Jewish doctor. The caption reads “This puts me in a strange mood / Poison and Jews seldom do good.”

1933 Der Stürmer cartoon “The vaccination”: a blonde German mother looks on anxiously as a beastly Jewish doctor vaccinates her baby.

Today’s anti-vaxers fulminating against Schulmedizin and the Giftspritze (poison shot) are not necessarily being consciously anti-Semitic, but the vocabulary and the paranoid conspiracy thinking are surely not unconnected.

* Heisenberg was famously proud of having protected “Jewish physics” from being banned at his university, considering himself a hero for continuing to teach relativity theory, even while not objecting to the expulsion of the Jewish physicists, and agreeing not to attach their names to their work. Once, when I was browsing in the science section of a Berlin bookstore in the early 1990s, a man started chatting with me. He told me that he had worked for decades as a radio engineer in the GDR, and then launched, apropos of nothing, into a long monologue about how wonderful Heisenberg was, and how he had courageously defended German science during the Third Reich.

Gender and the Metropolis (algorithm)

I’ve always heard that the Metropolis algorithm was invented for H-bomb calculations by Nicholas Metropolis and Edward Teller. But I was just looking at the original paper, and discovered that there are five authors: Metropolis, Rosenbluth, Rosenbluth, Teller, and Teller. It is particularly striking to see two repeated surnames, and a bit of research uncovers that these were two married couples: Arianna Rosenbluth and Marshall Rosenbluth, and Augusta Teller and Edward Teller. Arianna Rosenbluth (née Wright), in particular, appears to have been a formidable character: according to her Wikipedia page, she completed her physics PhD at Harvard at the age of 22.

In keeping with the 1950s conception of computer programming as women’s work, the two women were responsible for all the programming — a heroic undertaking in those pre-programming-language days, on the MANIAC I — and Arianna Rosenbluth did all the programming for the final paper.

And also in keeping with the expectations of the time, and more depressingly, according to the Wikipedia article: “After the birth of her first child, Arianna left research to focus on raising her family.”
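As an aside, for readers who know the algorithm only by name: the core of what those heroic machine-code programs implemented is, in its simplest modern form, only a few lines. Here is a minimal sketch in Python (the function name and the Gaussian proposal are my choices for illustration, not the paper’s):

```python
import math
import random

def metropolis(log_density, x0, steps, scale=1.0):
    """Random-walk Metropolis: propose a symmetric move and accept it
    with probability min(1, p(x') / p(x)); otherwise keep the old point."""
    x = x0
    samples = []
    for _ in range(steps):
        proposal = x + random.gauss(0.0, scale)
        # Comparing logs avoids overflow in the density ratio.
        if math.log(random.random()) < log_density(proposal) - log_density(x):
            x = proposal
        samples.append(x)
    return samples

# Example: sampling from a standard normal distribution.
draws = metropolis(lambda x: -0.5 * x * x, x0=0.0, steps=10_000)
```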

Can’t look away

Many years ago I read to my daughter a children’s book in which a little girl learning to ride a bicycle keeps running into objects like trees and lampposts. A bicycle instructor explains to her that when you become too fixated on an obstacle, it exerts a strong psychological pull, so that the very urgency of evading it leads to a crash.

I used to wonder whether this was a real phenomenon. I don’t anymore…

Actually, I’ve long thought the second Iraq War was an example of the same phenomenon. There was no possibility that there wouldn’t be a war, because once they’d started to consider it, Bush and Blair couldn’t bear not to see how it would turn out.

The first principle of statistical inference

When I first started teaching basic statistics, I thought about how to explain the importance of statistical hypothesis testing. I focused on a textbook example (specifically, Freedman, Pisani, and Purves, Statistics, 3rd ed., sec. 28.2) of a data set that seems to show a higher proportion of women than men being right-handed. I pointed out that we could think of many possible explanations: girls are pressured more to conform; women are more rational, hence left-brain-centred. But before we invest too much time and credibility in abstruse theories to explain the phenomenon, we should first make sure that the phenomenon is real, that it’s not just the kind of fluctuation that could happen by accident. (It turns out that the phenomenon is real. I don’t know whether either of my explanations is valid, or whether anyone has a more plausible theory.)
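To make the principle concrete, here is the kind of check I have in mind, as a minimal sketch in Python. The counts are hypothetical, purely for illustration, not the textbook’s data:

```python
from scipy.stats import chi2_contingency

# Hypothetical handedness counts (illustrative only):
# rows are women and men, columns are right- and left-handed.
table = [[934,   66],
         [1070, 113]]

chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2 = {chi2:.2f}, p = {p:.3f}")
# A large p would say: the apparent difference could easily be chance,
# so don't bother theorising about it yet.
```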

I thought of this when I heard about the strange Oxford-AstraZeneca vaccine serendipity announced this week. In the third vaccine success announced in as many weeks, the researchers reported about 70% efficacy, which is good, but not nearly as impressive as the 95% efficacy of the mRNA vaccines announced earlier in the month. But the strange thing was that a subset of the test subjects, who received only a half dose at the first injection and a full dose later, showed 90% efficacy. Experts have been all over the news media trying to explain how some weird idiosyncrasies of the human immune system and the chimpanzee adenovirus vector could make a smaller dose more effective. Here’s a summary from Science:

Researchers especially want to know why the half-dose prime would lead to a better outcome. The leading hypothesis is that people develop immune responses against adenoviruses, and the higher first dose could have spurred such a strong attack that it compromised the adenovirus’ ability to deliver the spike gene to the body with the booster shot. “I would bet on that being a contributor but not the whole story,” says Adrian Hill, director of Oxford’s Jenner Institute, which designed the vaccine…

Some evidence also suggests that slowly escalating the dose of a vaccine more closely mimics a natural viral infection, leading to a more robust immune response. “It’s not really mechanistically pinned down exactly how it works,” Hill says.

Because the different dosing schemes likely led to different immune responses, Hill says researchers have a chance to suss out the mechanism by comparing vaccinated participants’ antibody and T cell levels. The 62% efficacy, he says, “is a blessing in disguise.”

Others have pointed out that the populations receiving the full dose and the half dose were substantially different: The half dose was given by accident to a couple of thousand subjects at the start of the British arm of the study. These were exclusively younger, healthier individuals, something that could also explain the higher efficacy, in a less benedictory fashion.

But before we start arguing over these very interesting explanations, much less trying to use them to “suss out the mechanism”, the question we should be asking is: is the effect real? The Science article quotes immunologist John Moore asking “Was that a real, statistically robust 90%?” To ask that question is to answer it resoundingly: No.

They haven’t provided much data, but the AstraZeneca press release does give enough clues:

One dosing regimen (n=2,741) showed vaccine efficacy of 90% when AZD1222 was given as a half dose, followed by a full dose at least one month apart, and another dosing regimen (n=8,895) showed 62% efficacy when given as two full doses at least one month apart. The combined analysis from both dosing regimens (n=11,636) resulted in an average efficacy of 70%. All results were statistically significant (p<=0.0001)

Note two tricks they play here. First of all, they give those (n = big number) figures, which make it seem reassuringly as though they have an impressively big study. But these are the numbers of people vaccinated, which is completely irrelevant for judging the uncertainty in the estimate of efficacy. The reason you need such huge numbers of subjects is so that you can get moderately large numbers where it counts: the number of subjects who become infected. Further, while it is surely true that the “results” were highly statistically significant — that is, the efficacy in each individual group was not zero — this tells us nothing about whether we can be confident that the efficacy is actually higher than what has been considered the minimum acceptable level of 50%, or — and this is crucial for the point at issue here — whether the two groups differed from each other.

They report a total of 131 cases. They don’t say how many cases were in each group, but if we assume that equal numbers of subjects received the vaccine and the placebo in each group, then we can back-calculate the rest. We end up with 98 cases in the full-dose group (of which 27 received the vaccine) and 33 cases in the half-dose group (of which 3 received the vaccine). Just 33! Using the Clopper-Pearson exact method, we obtain 90% confidence intervals of (.781, .975) for the efficacy of the half dose and (.641, .798) for the efficacy of the full dose. Clearly some overlap there, and not much to justify drawing substantive conclusions from the difference between the two groups — which may actually be zero, or close to it.
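For anyone who wants to check the arithmetic, here is a minimal sketch in Python of that back-calculation. This is my own reconstruction, not AstraZeneca’s analysis: it treats the number of vaccine-arm cases among all cases in each group as binomial, and converts the complement of that share into a rough efficacy figure, under the equal-arms assumption above:

```python
from scipy.stats import beta

def clopper_pearson(x, n, conf=0.90):
    """Exact (Clopper-Pearson) confidence interval for a binomial proportion."""
    alpha = 1 - conf
    lower = beta.ppf(alpha / 2, x, n - x + 1) if x > 0 else 0.0
    upper = beta.ppf(1 - alpha / 2, x + 1, n - x) if x < n else 1.0
    return lower, upper

# Cases in the vaccine arm, out of all cases in each dosing group
# (assuming equal numbers received vaccine and placebo).
for label, x, n in [("half dose", 3, 33), ("full dose", 27, 98)]:
    lo, hi = clopper_pearson(x, n)
    # One minus the vaccinated share of cases, as a crude efficacy bound.
    print(f"{label}: roughly ({1 - hi:.3f}, {1 - lo:.3f})")
```

Run as written, this should reproduce intervals like those quoted above, and it makes the key point visible: with only 33 cases in the half-dose group, the interval is wide.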

The return of quota sampling

Everyone knows about the famous Dewey Defeats Truman headline fiasco, and that the Chicago Daily Tribune was inspired to its premature announcement by erroneous pre-election polls. But why were the polls so wrong?

The Social Science Research Council set up a committee to investigate the polling failure. Its report, published in 1949, listed a number of faults, even disparaging the very notion of trying to predict the outcome of a close election. But one important methodological criticism — the one that most influenced the later development of political polling, and became the standard lesson in statistics textbooks — was the critique of quota sampling. (An accessible summary of lessons from the 1948 polling fiasco, by the renowned psychologist Rensis Likert, was published just a month after the election in Scientific American.)

Serious polling at the time was divided between two general methodologies: random sampling and quota sampling. Random sampling, as the name implies, works by attempting to select from the population of potential voters entirely at random, with each voter equally likely to be selected. This was still considered too theoretically novel to be widely used, whereas quota sampling had been established practice for Gallup since the mid-1930s. In quota sampling the voting population is modelled by demographic characteristics, based on census data, and each interviewer is assigned a quota of respondents to fill in each category: 51 women and 49 men, say, a certain number in the age range 21-34, or specific numbers in each “economic class” — Roper, for example, had five of these, one of which in the 1940s was “Negro”. The interviewers were allowed great latitude in filling their quotas, finding people at home or on the street.

In a sense, we have returned to quota sampling, in the more sophisticated guise of “weighted probability sampling”. Since hardly anyone responds to a survey — response rates are typically no more than about 5% — there’s no way the people who do respond can be representative of the whole population. So pollsters model the population — or the supposed voting population — and reweight the responses they do get, according to demographic characteristics. If Black women over age 50 are thought to be as common in the voting population as white men under age 30, but we have twice as many of the former as the latter among our respondents, we count each response of the latter twice as much as each of the former in the final estimates. It’s just a way of making a quota sample after the fact, without the stress of specifically looking for representatives of particular demographic groups.
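As a toy sketch of the reweighting (the group labels and counts are hypothetical, and a real pollster would weight on many variables at once):

```python
# Two groups assumed equally common in the voting population,
# but unequally represented among respondents (hypothetical numbers).
population_share = {"black_women_over_50": 0.5, "white_men_under_30": 0.5}
sample_counts    = {"black_women_over_50": 200, "white_men_under_30": 100}
n = sum(sample_counts.values())

# Weight = population share / sample share, so under-represented
# groups count for more in the final estimate.
weights = {g: population_share[g] / (sample_counts[g] / n)
           for g in population_share}
print(weights)  # {'black_women_over_50': 0.75, 'white_men_under_30': 1.5}
# Each under-sampled respondent counts twice as much as an over-sampled one.
```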

Consequently, it has most of the deficiencies of a quota sample. The difficulty of modelling the electorate has gotten quite a bit of attention in the modern context: we know fairly precisely how demographic groups are distributed in the population, but we can only theorise about how they will be distributed among voters at the next election. At the same time, it is straightforward to construct these theories, to describe them, and to test them after the fact. The more serious problem — the one that was emphasised in the committee’s report, but has been less emphasised recently — is in the nature of how the quotas are filled. The reason for probability sampling is that taking whichever respondents are easiest to get — a “sample of convenience” — is sure to give you a biased sample. If you sample people from telephone directories in 1936, it’s easy to see how the sample ends up biased against the favoured candidate of the poor. If you take a sample of convenience within a small demographic group, such as middle-income people, it won’t be easy to recognise how the sample is biased, but it may still be biased.

For whatever reason, in the 1930s and 1940s, within each demographic group Republicans were easier for the interviewers to contact than Democrats. Maybe they were just culturally more similar to the interviewers, and so easier to walk up to on the street. And it may very well be that, within each demographic group today, Democrats are more likely to respond to a poll than Republicans. If there is such an effect, it’s hard to correct for, except by simply discounting Democrats by a certain factor based on past experience. (In fact, effects like this can be seen in polling fluctuations, where events in the news lead one side or the other to feel discouraged, and so become less likely to respond to polls. Studies have suggested that this effect explains much of the short-term fluctuation in election polls during a campaign.)

Interestingly, one of the problems the committee found with the 1948 polling that has particular relevance for the Trump era was the failure to consider education as a significant demographic variable:

All of the major polling organizations interviewed more people with college education than the actual proportion in the adult population over 21 and too few people with grade school education only.

Exotic animal farming

I remember when people were muttering about Covid-19 being all the fault of the weird Chinese and their weird obsession with eating weird animals like pangolins.

So now we have a second version of Covid, which may start a completely novel pandemic, and it comes from the weird Europeans and their weird obsession with wearing the fur of weird animals like minks. Apparently it was well known that Covid was spreading widely among the minks, but the animals were too valuable to give up on, so they tried to get away with culling just the obviously sick ones. And now we can just hope that they get the new plague out of Denmark under control before it becomes a second pandemic.

But the people who advocate just giving up on eating and wearing animals are still treated as something between dreamy mystics and lunatics…

Less than zero, part 2

In a long-ago post I wrote about how huge debts don’t make you poor, and illustrated this with the story of real-estate mogul Donald Trump: large negative fortunes are closer to large positive fortunes than either is to zero. (I later had to correct my interpretation, on discovering that the counterintuitive behaviour of Trump’s creditors was largely a reflection of their involvement in money laundering.)

Now we learn from the NY Times that Trump has been paying $750 a year in federal income tax as president. Presumably that’s just an arbitrary number he made up so that he could say it wasn’t zero. (Apparently even Trump has some limits to his explicit lying.)

But here’s the thing: $750 is probably worse than $0. People have been assuming he paid no taxes, but that sounds like a generic insult. $750 is too specific (as well as too small). The number becomes a shorthand for his tax-dodging, and it invites people to compare their own tax bills to Trump’s.

This demonstrates once again how absurdly miserly Donald Trump is, above and beyond his criminality. He had to choose an amount to pay purely for the symbolism, anticipating that he might one day need to tell average Americans how much he had paid. He could certainly have afforded an amount that even Americans of modest means would not find risible. At least four figures…