The return of quota sampling

Everyone knows about the famous Dewey Defeats Truman headline fiasco, and that the Chicago Daily Tribune was inspired to its premature announcement by erroneous pre-election polls. But why were the polls so wrong?

The Social Science Research Council set up a committee to investigate the polling failure. Their report, published in 1949, listed a number of faults, including disparaging the very notion of trying to predict the outcome of a close election. But one important methodological criticism — and the one that significantly influenced the later development of political polling, and became the primary lesson in statistics textbooks — was the critique of quota sampling. (An accessible summary of lessons from the 1948 polling fiasco by the renowned psychologist Rensis Likert was published just a month after the election in Scientific American.)

Serious polling at the time was divided between two general methodologies: random sampling and quota sampling. Random sampling, as the name implies, works by attempting to select from the population of potential voters entirely at random, with each voter equally likely to be selected. This was still considered too theoretically novel to be widely used, whereas quota sampling had been established by Gallup since the mid-1930s. In quota sampling the voting population is modelled by demographic characteristics, based on census data, and each interviewer is assigned a quota to fill of respondents in each category: 51 women and 49 men, say, a certain number in the age range 21-34, or specific numbers in each “economic class” — of which Roper, for example, had five, one of which in the 1940s was “Negro”. The interviewers were allowed great latitude in filling their quotas, finding people at home or on the street.

In a sense, we have returned to quota sampling, in the more sophisticated version of “weighted probability sampling”. Since hardly anyone responds to a survey (response rates are typically no more than about 5%), there’s no way the people who do respond can be representative of the whole population. So pollsters model the population, or the supposed voting population, and reweight the responses they do get proportionately, according to demographic characteristics. If Black women over age 50 are thought to be as common in the voting population as white men under age 30, but we have twice as many of the former as the latter in the sample, we count each response from the latter twice as much as each response from the former in the final estimates. It’s just a way of making a quota sample after the fact, without the stress of specifically looking for representatives of particular demographic groups.
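The reweighting arithmetic can be sketched in a few lines of Python. This is only an illustration under invented assumptions: the group labels "A" and "B", the candidates "X" and "Y", and all the numbers are hypothetical, and real pollsters weight on many variables at once.

```python
from collections import Counter

def poststratification_weights(sample_groups, population_shares):
    """Return one weight per respondent so that the weighted group
    shares in the sample match the modelled population shares."""
    counts = Counter(sample_groups)
    n = len(sample_groups)
    # weight = (modelled share of the group) / (observed share in the sample)
    return [population_shares[g] / (counts[g] / n) for g in sample_groups]

def weighted_share(responses, weights, candidate):
    """Weighted fraction of respondents choosing the given candidate."""
    total = sum(weights)
    chosen = sum(w for r, w in zip(responses, weights) if r == candidate)
    return chosen / total

# Hypothetical sample: groups A and B are assumed equally common among
# voters, but group A turned up twice as often among the respondents.
groups = ["A"] * 4 + ["B"] * 2
responses = ["X", "X", "X", "Y", "Y", "Y"]
shares = {"A": 0.5, "B": 0.5}

weights = poststratification_weights(groups, shares)
raw = responses.count("X") / len(responses)
adjusted = weighted_share(responses, weights, "X")
print(weights, raw, adjusted)
```

Each respondent from the under-represented group B ends up with twice the weight of a respondent from group A, which is the "count twice as much" rule described above.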

Consequently, it has most of the deficiencies of a quota sample. The difficulty of modelling the electorate is one that has gotten quite a bit of attention in the modern context: We know fairly precisely how demographic groups are distributed in the population, but we can only theorise about how they will be distributed among voters at the next election. At the same time, it is straightforward to construct these theories, to describe them, and to test them after the fact. The more serious problem, and the one that was emphasised in the committee’s report on the 1948 polls but has received less attention recently, lies in how the quotas are filled. The reason for probability sampling is that taking whichever respondents are easiest to get (a “sample of convenience”) is sure to give you a biased sample. If you sample people from telephone directories in 1936 then it’s easy to see how the sample ends up biased against the favoured candidate of the poor. If you take a sample of convenience within a small demographic group, such as middle-income people, then it won’t be easy to recognise how the sample is biased, but it may still be biased.

For whatever reason, in the 1930s and 1940s, within each demographic group the Republicans were easier for the interviewers to contact than the Democrats. Maybe they were just culturally more like the interviewers, so easier for them to walk up to on the street. And it may very well be that within each demographic group today Democrats are more likely to respond to a poll than Republicans. And if there is such an effect, it’s hard to correct for it, except by simply discounting Democrats by a certain factor based on past experience. (In fact, these effects can be measured in polling fluctuations, where events in the news lead one side or the other to feel discouraged, and to be less likely to respond to the polls. Studies have suggested that this effect explains much of the short-term fluctuation in election polls during a campaign.)
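To see why demographic weighting cannot repair this, here is a small expected-value calculation in Python. Everything in it is a made-up assumption for illustration: two demographic cells of equal size, party "D" responding at 6% and party "R" at 4% within each cell, and a true overall D share of exactly 50%.

```python
def weighted_poll_estimate(cells):
    """Expected weighted poll share for party D, assuming the weights
    restore each demographic cell to its correct population share."""
    estimate = 0.0
    for cell in cells:
        d = cell["true_d_share"]
        # Expected respondents from each party are proportional to
        # (share of the cell) x (that party's response rate).
        d_resp = d * cell["response_rate_d"]
        r_resp = (1 - d) * cell["response_rate_r"]
        observed_d = d_resp / (d_resp + r_resp)
        estimate += cell["population_share"] * observed_d
    return estimate

# Hypothetical cells: the demographics are modelled perfectly, but D
# supporters are somewhat more willing to answer within every cell.
cells = [
    {"population_share": 0.5, "true_d_share": 0.6,
     "response_rate_d": 0.06, "response_rate_r": 0.04},
    {"population_share": 0.5, "true_d_share": 0.4,
     "response_rate_d": 0.06, "response_rate_r": 0.04},
]

truth = sum(c["population_share"] * c["true_d_share"] for c in cells)
estimate = weighted_poll_estimate(cells)
print(truth, estimate)
```

Under these assumptions the weighted estimate comes out near 60% for D against a true 50%, even though every cell has exactly the right weight. The weights fix the mix of cells, not the mix of people within a cell.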

Interestingly, one of the problems that the committee found with the 1948 polling has relevance for the Trump era: the failure to consider education as a significant demographic variable.

All of the major polling organizations interviewed more people with college education than the actual proportion in the adult population over 21 and too few people with grade school education only.

Exotic animal farming

I remember when people were muttering about Covid-19 being all the fault of the weird Chinese and their weird obsession with eating weird animals like pangolins.

So now we have a second version of Covid, that may start a completely novel pandemic, and it comes from the weird Europeans and their weird obsession with wearing the fur of weird animals like minks. Apparently, it was well known that Covid was spreading widely among the minks, but the animals were too valuable to give up on, so they tried to get away with just culling the obviously sick ones. And now we can just hope that they can get the new plague out of Denmark under control before it becomes a second pandemic.

But the people who advocate just giving up on eating and wearing animals are still treated as something between dreamy mystics and lunatics…

Less than zero, part 2

In a long-ago post I wrote about how huge debts don’t make you poor, and illustrated this with the story of real-estate mogul Donald Trump. Negative large fortunes are closer to positive large fortunes than either is to zero. (I later had to correct my interpretation, on discovering that the counterintuitive behaviour of Trump’s creditors was largely a reflection of their involvement in money laundering.)

Now we learn from the New York Times that Trump has been paying $750 in federal income tax each year as president. Presumably that’s just an arbitrary number that he made up so that he could say it wasn’t zero. (Apparently even Trump has some limits to his explicit lying.)

But here’s the thing: $750 is probably worse than $0. People have been assuming he wasn’t paying taxes. It sounds like a general insult. $750 is too specific (as well as being too small). The number becomes a shorthand for his tax-dodging, as well as inviting people to compare their own tax bills to Trump’s.

This demonstrates again how absurdly miserly Donald Trump is, above and beyond his criminality. He had to choose an amount to pay purely for the symbolism of possibly needing to tell average Americans how much he had paid. He could certainly have afforded to choose an amount large enough that even Americans of modest means would not find it risible. At least four figures…

The opposite of a superficial lie

“The opposite of a fact is a falsehood. But the opposite of a profound truth may very well be another profound truth.”

Niels Bohr

The news media have gotten themselves tangled up, from the beginning of the Trump era, in the epistemological question of whether any statement can objectively be called a lie. Yes, Trump says things that are untrue, that contradict objectively known facts, but are they lies? Does he have the appropriate mens rea to lie, the intention to deceive, or is that just a partisan insult?

The opposite problem has gotten too little attention. Just because Donald Trump says something that corresponds to objective facts, one cannot infer that he is speaking the truth. (We don’t really have a word in English to correspond to the opposite of lie, in this dichotomy.) A good example is the controversy over Trump’s private and public comments on the incipient Coronavirus pandemic in February and March of this year. On February 7, 2020, Trump told Woodward

You just breathe the air and that’s how it’s passed. And so that’s a very tricky one. That’s a very delicate one. It’s also more deadly than even your strenuous flus.

This is quite an accurate statement, and also very different than what he was saying publicly. On February 10 he said, in a campaign speech,

I think the virus is going to be — it’s going to be fine.

And February 26 in an official White House pandemic task force briefing:

The 15 [case count in the U.S.] within a couple of days is going to be down to close to zero. … This is a flu. This is like a flu.

When you see that someone has been saying one thing in public and something completely different in private, it’s natural to interpret the former as lying and the latter as the secret truth — particularly when, as in this case, the private statement is known to be, in fact, true, and the public statement false. And particularly when the speaker later says

I wanted to always play it down. I still like playing it down, because I don’t want to create a panic.

With Trump, though, this interpretation is likely false.

The thing is, while his statement of February 7 was true, he could not have known it was true. No one knew it was true. Any number of responsible public-health officials were making statements at the time that resembled his public ones. For example, Anthony Fauci on February 19:

Fauci doesn’t want people to worry about coronavirus, the danger of which is “just minuscule.” But he does want them to take precautions against the “influenza outbreak, which is having its second wave.”

“We have more kids dying of flu this year at this time than in the last decade or more,” he said. “At the same time people are worrying about going to a Chinese restaurant. The threat is (we have) a pretty bad influenza season, particularly dangerous for our children.”

And it’s not just Americans under the thumb of Trump. February 6, the day before Trump’s remark to Woodward, the head of the infectious disease clinic at a major Munich hospital, where some of the first German Covid-19 patients were being treated, told the press that “Corona is definitely not more dangerous than influenza,” and criticised the panic that was coming from exaggerated estimates of mortality rates.

Researchers were posting their data and models in real time, but there just wasn’t enough understanding possible then. This is the kind of issue where the secret information that a government has access to is of particularly limited value.

So how are we to interpret Trump’s statements? I think the key is that Trump is not a liar per se, he is a conman and a bullshitter, someone to whom the truth of his statements is completely irrelevant.

In early February he probably did receive a briefing where it was raised as one possibility that the novel coronavirus was highly lethal and airborne, and as another that it was mild and would disappear on its own. In talking to elite journalist Bob Woodward he delivered up the most frightening version, not because he believed it was true, but because it seemed most impressive, making him seem like the mighty keeper of dangerous secrets. When talking to the public he said something different, because he had other motives. It’s purely coincidence that what he said in private turned out to be true.

It would be poetic justice if Trump were to be damaged by the bad luck of one time accidentally having told the truth.

The model didn’t fail us, we failed the model

THE ALGORITHM. It’s all anyone can talk about, when they’re talking about universities these days. Illustrative of the unique ability of the current UK government to take a challenging societal problem in hand, and transform it into a flaming chaos that simultaneously exacerbates divisions and satisfies no one.

In this case, it’s about the assignment of marks in A-levels (18-year-olds) released last week, and GCSEs (16-year-olds) still ahead this week. Scotland had its own small version of the fiasco that played out earlier in the week for their own Scottish Higher exams, but the UK government, responsible for English A-levels, managed not only not to learn from the Scottish situation and change course early, it managed to parlay the political challenge into a systemic disaster for higher education that will now roll on for at least the next year or two.

Like any great governing disaster, this one has been years in the making. Pupils doing A-levels used to have intermediate exams (AS levels) after the first year of their two-year course, as well as significant amounts of coursework that counted for a substantial portion of their final marks. AS levels were progressively eliminated and coursework reduced over the past decade in England (but not Wales), as the Conservatives seem to have believed that today’s pupils were being inappropriately coddled by having too little stress and uncontrollable randomness in their lives, leaving several weeks of exams right at the end of their course as the only determinant of the marks that would decide high-stakes competition for university places. Then they cancelled the exams, in a panicked response to the first wave of Covid-19. Leaving them with nothing.

Weirdly, it’s not as though they don’t have frequent exams during their school (and university) time. But these exams are called “mock exams”, and don’t count for anything in the end.

Which brings us to THE ALGORITHM. How do you assign marks to students when you don’t have any exams? Teachers have quite a lot of information, even if it doesn’t formally count for anything in the regular process. (Weirdly, teachers are regularly expected to produce “predicted grades” based on mock exams, coursework, and general impressions, because the official marks arrive too late for university admissions.) But on average they tend to be overly optimistic. One might also say either generous or strategic, since the university admission offer that results from an overpredicted A-level grade is not necessarily withdrawn when the exam result falls short of it, whereas the university place that is lost from an underprediction is almost impossible to make good.

If you were a mindless machine-learning bot trying to optimise the accuracy of prediction of missing marks in an overall minimum-mean-error way, you would take data about each student’s family income, ethnicity, sex, parents’ occupations, and region, all of which are likely to be correlated to exam scores. But that would seem outrageously biased: Why should the young person with wealthy parents get higher A-level marks than the one with poor parents, after they had the same mock exam grades? The machine-learning answer is, because that’s what’s happened with real grades in the past. The wealthy family is likely to provide more support, maybe tutors, a more stable environment for studying for the exams. The child of the poor family may have been working hard since year 12, but there’s a much higher chance that the family would have had a crisis — maybe a parent losing a job, illness, homelessness — that would have distracted from exam preparation and led to underperformance at the exam. And since that might have happened in reality, that needs to be reflected in our optimal prediction algorithm.

But that looks bad, so the Ofqual boffins used past school performance as a proxy. Effectively, they said that each school gets the same marks this year that they got last year. Teacher evaluations were used to rank the students in each subject, to decide which students get the school’s quota of A*’s, etc. If you made the bad life choice to go to a low-performing school where no one in living memory has scored better than B in chemistry, then B is the ceiling for your marks, no matter what scores you may personally have been achieving on your mockeries.
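The core of that idea fits in a short Python sketch. To be clear, this is a deliberately simplified toy and not Ofqual’s actual procedure, which had further adjustment steps: the function name, the student names, and the historical grade shares are all invented for illustration.

```python
def assign_grades_by_rank(ranked_students, historical_distribution):
    """Hand out a school's historical grade shares in teacher-rank order.

    ranked_students: names, best first, as ranked by the teacher.
    historical_distribution: (grade, share) pairs, best grade first,
        with the shares summing to 1.
    """
    n = len(ranked_students)
    grades = {}
    cumulative = 0.0
    position = 0
    for grade, share in historical_distribution:
        cumulative += share
        # Everyone ranked above this cumulative cut-off gets the grade.
        cutoff = min(round(cumulative * n), n)
        while position < cutoff:
            grades[ranked_students[position]] = grade
            position += 1
    # Anyone left over through rounding gets the lowest listed grade.
    lowest = historical_distribution[-1][0]
    for name in ranked_students[position:]:
        grades[name] = lowest
    return grades

# Hypothetical chemistry class at a school whose past results were
# 40% B, 40% C, 20% D, and never anything higher.
students = ["Asha", "Ben", "Chloe", "Dev", "Ewan"]
history = [("B", 0.4), ("C", 0.4), ("D", 0.2)]
result = assign_grades_by_rank(students, history)
print(result)
```

Nothing about Asha’s own work enters the calculation except her rank. The best grade available to her is whatever the school’s history contains, which in this made-up case is a B.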

Averaged over the whole population of English students your misfortune is just a small blemish on an overall excellent prediction.

It’s a good illustration of the problems of ethical machine learning. People say, if you don’t want your algorithm to be biased based on gender, don’t include gender information in the dataset. But if you instead include height information, say, the algorithm will learn all the gender bias in the training set and assign it to the height variable.
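A toy example in Python makes the point concrete. The data are entirely invented: past scores in this hypothetical training set depend only on gender (the bias), heights merely differ by gender, and the model is an ordinary least-squares line fitted on height alone.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical training set: (gender, height in cm, past score).
training = [
    ("F", 160, 50), ("F", 165, 50), ("F", 170, 50),
    ("M", 175, 60), ("M", 180, 60), ("M", 185, 60),
]

# Gender is left out of the model entirely; only height is used.
heights = [h for _, h, _ in training]
scores = [s for _, _, s in training]
slope, intercept = fit_line(heights, scores)

def mean_prediction(gender):
    """Average predicted score for one gender, using height only."""
    preds = [slope * h + intercept for g, h, _ in training if g == gender]
    return sum(preds) / len(preds)

gap = mean_prediction("M") - mean_prediction("F")
print(slope, gap)
```

The fitted model never sees gender, yet its predictions differ by gender by roughly 7.7 points, most of the 10-point gap built into the data. The bias has simply moved onto the height coefficient.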

Just to rub salt in the wounds, there was an extra fillip for students in small — heavily private — schools: Since average performance fluctuates more in small groups, courses with 15 or fewer students had their (generally higher) teacher predictions more heavily weighted in their final marks, and those with 5 or fewer received their teacher predictions unfiltered.

Now, this way of using past school performance seems… surprising, to those of us who have been involved in UK university admissions in the past, given the extent of government and public outrage every year when the elite universities once again draw their intake from a very small sliver of UK secondary schools, predominantly private schools. You might think that this outrage reflects a belief that the differences in average exam performance, that drive most of the differential in university admissions, are unfair, that they do not accurately represent student ability, performance, and potential. If you believed that, you might propose a very different way of using school performance to assign marks, namely: Every school gets the same proportion of A*, A, B, etc., to be allocated within each subject according to the teacher rankings. I’m not advocating this method, but it is no more extreme, in its own way, than the application of past school performance that was actually implemented.

To the extent that A-level marks are primarily a tool for sorting graduates for university admissions, this would function somewhat similarly to the practice of some US states, of guaranteeing admission to their state universities to a certain percentile of every high school in the state. This leverages housing and school segregation to benefit equality, as opposed to the opposite.

The fact that my algorithm seems obviously unfair to individuals, while the other algorithm was seen as not only credible but actually self-evident, reflects nothing but naked ideology about the nature of class.

Education minister (a position whose relationship to that of education secretary confuses me) Nick Gibb responded to the fiasco thus:

“So the model itself was fair, it was very popular, it was widely consulted upon. The problem arose in the way in which the three phases of the application of that model – the historic data of the school, the prior attainment of the cohort of pupils at the school, and then the national standard correction – it’s that element of the application of the model that I think there is a concern.”

The minister went on: “The application of the model is a regulatory approach and it’s the development of that that emerged on the Thursday when the algorithm was published. And at that stage it became clear that there were some results that were being published on Thursday and Friday that were just not right and they were not what the model had intended.”

The poor misunderstood beast. It meant well…

Count no statue happy…

Count no man happy till he dies.

Sophocles, Oedipus the King (trans. Robert Fagles)

Must no one at all, then, be called happy while he lives; must we, as Solon says, see the end? Even if we are to lay down this doctrine, is it also the case that a man is happy when he is dead? […] for both evil and good are thought to exist for a dead man, as much as for one who is alive but not aware of them; e.g. honours and dishonours and the good or bad fortunes of children and in general of descendants.

Aristotle, Nicomachean Ethics, Book 1 (trans. W D Ross)

In all of the discussion of racist statues one fundamental point is rarely mentioned: Above all, public statues represent the unwillingness of “great men” to simply go away. Those who bestrode their narrow world like a Colossus are loath to let death remove them from the scene, so like the stuffed dodo in a diorama they have their effigies propped up in the public square.

While they lived they received the adulation of the crowds, and the opprobrium of their opponents. If the great one’s supporters need a public icon as a focus for their devotions, the icon will have to continue to participate in the hurly-burly of public life, including the scrutiny of their lives and deeds brought on by shifting ethical standards. If Winston Churchill were alive today he would rightly have paint and rotten tomatoes flung at him by those appalled at his racist ideas and actions. Reasonable people can believe that his near-genocidal actions in Bengal, among other places inhabited by darker-skinned people, are more significant than a few well-crafted speeches that bucked up the spirits of the Island Race. Reasonable people did think so during his life. The place where one is beyond praise or blame is called the grave, and no one is suggesting disinterring WC’s bones, though an earlier generation of Tories did exactly that with Oliver Cromwell, after the tide of history turned against him.

His supporters are welcome to hide his statues away in private shrines, or public museums. If you put them up in public you have to accept that people are going to continue to engage with them. Sometimes angrily. Sometimes disorderly.

Plagues and statues

I’ve been reading Camus’ La Peste, hoping to obtain some insight into one of the great crises of the present, and finding him commenting on a completely different one. At the height of the epidemic of the novel, the narrator comments on the aspect of the silent, immobilised city, and expresses resentment toward the statues that are permanently in that condition.

La grande cité silencieuse n’était plus alors qu’un assemblage de cubes massifs et inertes, entre lesquels les effigies taciturnes de bienfaiteurs oubliés ou d’anciens grands hommes étouffés à jamais dans le bronze s’essayaient seules, avec leurs faux visages de pierre ou de fer, à évoquer une image dégradée de ce qui avait été l’homme. Ces idoles médiocres trônaient sous un ciel épais, dans les carrefours sans vie, brutes insensibles qui figuraient assez bien le règne immobile où nous étions entrés ou du moins son ordre ultime, celui d’une nécropole où la peste, la pierre et la nuit auraient fait taire enfin toute voix.

The huge, silent city had become nothing more than a collection of solid, inert cubes, where the taciturn effigies of forgotten benefactors or ancient great men were suffocated forever in bronze, evoking a solitary, degraded image of what man had once been. These mediocre idols sat enthroned under a thick sky, in the lifeless crossroads, unfeeling beasts that symbolised well the immobilised realm we had entered, or at least its ultimate order, that of a necropolis where plague, stone, and night would have finally silenced any voice.

I’ve commented before on how odd it is that, just because some of our ancestors chose to cast their images in heavy bronze or marble and plonk them down at significant sites in our cities, we should feel obliged to keep them there. But I assumed that the current attacks on statues of racists were unrelated to the pandemic situation, mere coincidence of crises, except perhaps that the lockdown left people with lots of pent-up energy.

But maybe there’s something about coping with an epidemic that inspires iconoclasm?

Adrift on the Covid Sea

Political leaders in many countries — but particularly in the US and UK — are in thrall above all to the myth of progress. Catastrophes may happen, but then they get better. And to superficial characters like Johnson and Trump, the improvements seem automatic. It’s like a law of nature.

So, we find ourselves having temporarily stemmed the flood of Covid infections, with governments laying out fantastic plans for “reopening”. Even though nothing significant has changed. The only thing that could make this work, absent a vaccine, would be an efficient contact tracing system or a highly effective treatment for the disease. Neither of which we have. But we still have a timeline for opening up pubs and cinemas (though less important facilities like schools are still closed, at least for many year groups).

It’s like we had been adrift for days in a lifeboat on the open ocean, carefully conserving our supplies. And there’s still no rescue in sight, but Captain Johnson announces that since we’re all hungry from limiting our food rations, and the situation has now stabilised, we will now be transitioning toward full rations.