When I first started teaching basic statistics, I thought about how to explain the importance of statistical hypothesis testing. I focused on a textbook example (specifically, Freedman, Pisani, Purves Statistics, 3rd ed., sec 28.2) of a data set that seems to show more women being right-handed than men. I pointed out that we could think of many possible explanations: Girls are pressured more to conform, women are more rational — hence left-brain-centred. But before we invest too much time and credibility in abstruse theories to explain the phenomenon, we should first make sure that the phenomenon is real, that it’s not just the kind of fluctuation that could happen by accident. (It turns out that the phenomenon is real. I don’t know if either of my explanations is valid, or if anyone has a more plausible theory.)

I thought of this when I heard about the strange Oxford-AstraZeneca vaccine serendipity that was announced this week. In the third vaccine success announced in as many weeks, the researchers reported about 70% efficacy, which is good, but not nearly as impressive as the 95% efficacy of the mRNA vaccines announced earlier in the month. But the strange thing was, they found that a subset of the test subjects who received only a half dose at the first injection, and a full dose later, showed 90% efficacy. Experts have been all over the news media trying to explain how some weird idiosyncrasies of the human immune system and the chimpanzee adenovirus vector could make a smaller dose more effective. Here’s a summary from *Science*:

Researchers especially want to know why the half-dose prime would lead to a better outcome. The leading hypothesis is that people develop immune responses against adenoviruses, and the higher first dose could have spurred such a strong attack that it compromised the adenovirus’ ability to deliver the spike gene to the body with the booster shot. “I would bet on that being a contributor but not the whole story,” says Adrian Hill, director of Oxford’s Jenner Institute, which designed the vaccine…

Some evidence also suggests that slowly escalating the dose of a vaccine more closely mimics a natural viral infection, leading to a more robust immune response. “It’s not really mechanistically pinned down exactly how it works,” Hill says.

Because the different dosing schemes likely led to different immune responses, Hill says researchers have a chance to suss out the mechanism by comparing vaccinated participants’ antibody and T cell levels. The 62% efficacy, he says, “is a blessing in disguise.”

Others have pointed out that the populations receiving the full dose and the half dose were substantially different: The half dose was given by accident to a couple of thousand subjects at the start of the British arm of the study. These were exclusively younger, healthier individuals, something that could also explain the higher efficacy, in a less benedictory fashion.

But before we start arguing over these very interesting explanations, much less trying to use them to “suss out the mechanism”, the question they should be asking is: Is the effect real? The *Science* article quotes immunologist John Moore asking “Was that a real, statistically robust 90%?” To ask that question is to answer it resoundingly: No.

They haven’t provided much data, but the AstraZeneca press release does give enough clues:

One dosing regimen (n=2,741) showed vaccine efficacy of 90% when AZD1222 was given as a half dose, followed by a full dose at least one month apart, and another dosing regimen (n=8,895) showed 62% efficacy when given as two full doses at least one month apart. The combined analysis from both dosing regimens (n=11,636) resulted in an average efficacy of 70%. All results were statistically significant (p<=0.0001)

Note two tricks they play here. First of all, they give those (n=big number) figures, which make it seem reassuringly like they have an impressively big study. But these are the numbers of people vaccinated, which is completely irrelevant for judging the uncertainty in the estimate of efficacy. The reason you need such huge numbers of subjects is so that you can get moderately large numbers where it counts: the number of subjects who become infected. Further, while it is surely true that the “results” were highly statistically significant (that is, the efficacy in each individual group was not zero), this tells us nothing about whether we can be confident that the efficacy is actually higher than what has been considered the minimum acceptable level of 50%, or, and this is crucial for the point at issue here, whether the two groups were different from each other.

They report a total of 131 cases. They don’t say how many cases were in each group, but if we assume that there were equal numbers of subjects getting the vaccine and the placebo in all groups then we can back-calculate the rest. We end up with 98 cases in the full-dose group (of which 27 received the vaccine) and 33 cases in the half-dose group, of which 3 received the vaccine. Just 33! Using the Clopper-Pearson exact method on the share of cases that fell in the placebo arm, we obtain 90% confidence intervals of (.781, .975) for the half dose and (.641, .798) for the full dose. With equal allocation, a placebo share q corresponds to an efficacy of 2 − 1/q, so these translate into efficacy intervals of roughly (.72, .97) for the half dose and (.44, .75) for the full dose. Clearly some overlap there, and not much to justify drawing substantive conclusions from the difference between the two groups, which may actually be zero, or close to 0.
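For anyone who wants to check the arithmetic, here is a minimal sketch of both steps in plain Python. It rests on assumptions of mine that the press release does not confirm: 1:1 allocation between vaccine and placebo, so that efficacy is 1 − (vaccine cases / placebo cases), and a combined efficacy obtained by simply pooling the two regimens. The function names are my own. The exact interval is computed for the vaccinated share of cases and then mapped to efficacy, so the endpoints it prints depend on that choice and need not match, digit for digit, any figures quoted above.

```python
from math import comb

TOTAL_CASES = 131
# Efficacies as reported in the press release.
REPORTED = {"half": 0.90, "full": 0.62, "combined": 0.70}


def efficacy(vaccine_cases, placebo_cases):
    """Efficacy under the ASSUMED 1:1 allocation: 1 - vaccine/placebo cases."""
    return 1 - vaccine_cases / placebo_cases


def back_calculate(total=TOTAL_CASES, reported=REPORTED):
    """Brute-force the integer case split that best fits the reported figures.

    Returns (half_vaccine, half_placebo, full_vaccine, full_placebo).
    The combined efficacy is ASSUMED to come from naive pooling.
    """
    best, best_err = None, float("inf")
    for vh in range(total + 1):
        for ph in range(1, total - vh + 1):
            for vf in range(total - vh - ph):
                pf = total - vh - ph - vf  # at least 1 by construction
                err = (
                    (efficacy(vh, ph) - reported["half"]) ** 2
                    + (efficacy(vf, pf) - reported["full"]) ** 2
                    + (efficacy(vh + vf, ph + pf) - reported["combined"]) ** 2
                )
                if err < best_err:
                    best, best_err = (vh, ph, vf, pf), err
    return best


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _bisect_decreasing(f, target, iters=100):
    """Solve f(p) = target on [0, 1] for a function f that decreases in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(x, n, level=0.90):
    """Exact (Clopper-Pearson) interval for a binomial proportion x/n."""
    alpha = 1 - level
    lower = 0.0 if x == 0 else _bisect_decreasing(
        lambda p: binom_cdf(x - 1, n, p), 1 - alpha / 2)
    upper = 1.0 if x == n else _bisect_decreasing(
        lambda p: binom_cdf(x, n, p), alpha / 2)
    return lower, upper


def efficacy_interval(vaccine_cases, total_cases, level=0.90):
    """Map the interval for the vaccinated share p to efficacy 1 - p/(1 - p)."""
    p_lo, p_hi = clopper_pearson(vaccine_cases, total_cases, level)
    # Efficacy falls as p rises, so the endpoints swap.
    return 1 - p_hi / (1 - p_hi), 1 - p_lo / (1 - p_lo)


vh, ph, vf, pf = back_calculate()
print("half dose:", vh, "vaccine cases out of", vh + ph)
print("full dose:", vf, "vaccine cases out of", vf + pf)
print("half-dose 90% CI for efficacy:", efficacy_interval(vh, vh + ph))
print("full-dose 90% CI for efficacy:", efficacy_interval(vf, vf + pf))
```

The brute-force search is crude but transparent: with only 131 cases to distribute there are few enough integer splits to try them all. The thing to look at in the output is whether the lower end of the half-dose interval sits below the upper end of the full-dose interval.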