The return of quota sampling

Everyone knows about the famous Dewey Defeats Truman headline fiasco, and that the Chicago Daily Tribune was inspired to its premature announcement by erroneous pre-election polls. But why were the polls so wrong?

The Social Science Research Council set up a committee to investigate the polling failure. Their report, published in 1949, listed a number of faults, including disparaging the very notion of trying to predict the outcome of a close election. But one important methodological criticism — and the one that significantly influenced the later development of political polling, and became the primary lesson in statistics textbooks — was the critique of quota sampling. (An accessible summary of lessons from the 1948 polling fiasco by the renowned psychologist Rensis Likert was published just a month after the election in Scientific American.)

Serious polling at the time was divided between two general methodologies: random sampling and quota sampling. Random sampling, as the name implies, works by attempting to select from the population of potential voters entirely at random, with each voter equally likely to be selected. This was still considered too theoretically novel to be widely used, whereas quota sampling had been established by Gallup since the mid-1930s. In quota sampling the voting population is modelled by demographic characteristics, based on census data, and each interviewer is assigned a quota to fill of respondents in each category: 51 women and 49 men, say, a certain number in the age range 21-34, or specific numbers in each “economic class” — of which Roper, for example, had five, one of which in the 1940s was “Negro”. The interviewers were allowed great latitude in filling their quotas, finding people at home or on the street.

In a sense, we have returned to quota sampling, in the more sophisticated version of “weighted probability sampling”. Since hardly anyone responds to a survey (response rates are typically no more than about 5%), there’s no way the people who do respond can be representative of the whole population. So pollsters model the population, or the supposed voting population, and reweight the responses they do get proportionately, according to demographic characteristics. If Black women over age 50 are thought to be as common in the voting population as white men under age 30, but we have twice as many of the former as the latter among our respondents, we count each response from the latter twice as much as each response from the former in the final estimates. It’s just a way of making a quota sample after the fact, without the stress of specifically looking for representatives of particular demographic groups.
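To make the reweighting arithmetic concrete, here is a minimal sketch. The two groups, their assumed 50/50 split among voters, and the response counts are all invented for illustration; real pollsters weight on many variables at once and their procedures differ in detail.

```python
from collections import Counter

def poststratification_weights(sample_groups, population_shares):
    """Weight for each group = assumed population share / share among respondents."""
    n = len(sample_groups)
    counts = Counter(sample_groups)
    return {g: population_shares[g] / (counts[g] / n) for g in counts}

def weighted_share(responses, weights):
    """responses is a list of (group, supports_candidate) pairs."""
    total = sum(weights[g] for g, _ in responses)
    support = sum(weights[g] for g, s in responses if s)
    return support / total

# Hypothetical toy data: groups "A" and "B" are assumed equally common among
# voters, but "A" turned up twice as often as "B" among the respondents.
responses = (
    [("A", True)] * 30 + [("A", False)] * 10
    + [("B", True)] * 5 + [("B", False)] * 15
)
population_shares = {"A": 0.5, "B": 0.5}  # assumed model of the electorate

weights = poststratification_weights([g for g, _ in responses], population_shares)
raw = sum(s for _, s in responses) / len(responses)
adjusted = weighted_share(responses, weights)

print(f"weights: {weights}")
print(f"raw support:      {raw:.3f}")
print(f"weighted support: {adjusted:.3f}")
```

Each respondent from the under-represented group ends up counting twice as much as one from the over-represented group, which is the after-the-fact quota described above.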

Consequently, it has most of the deficiencies of a quota sample. The difficulty of modelling the electorate is one that has gotten quite a bit of attention in the modern context: We know fairly precisely how demographic groups are distributed in the population, but we can only theorise about how they will be distributed among voters at the next election. At the same time, it is straightforward to construct these theories, to describe them, and to test them after the fact. The more serious problem, and the one that was emphasised in the committee’s report on the 1948 polls but has been less emphasised recently, is in the nature of how the quotas are filled. The reason for probability sampling is that taking whichever respondents are easiest to get (a “sample of convenience”) is sure to give you a biased sample. If you sample people from telephone directories in 1936 then it’s easy to see how the sample ends up biased against the favoured candidate of the poor. If you take a sample of convenience within a small demographic group, such as middle-income people, then it won’t be easy to recognise how the sample is biased, but it may still be biased.

For whatever reason, in the 1930s and 1940s, within each demographic group the Republicans were easier for the interviewers to contact than the Democrats. Maybe they were just culturally more like the interviewers, so easier for them to walk up to on the street. And it may very well be that within each demographic group today Democrats are more likely to respond to a poll than Republicans. And if there is such an effect, it’s hard to correct for it, except by simply discounting Democrats by a certain factor based on past experience. (In fact, these effects can be measured in polling fluctuations, where events in the news lead one side or the other to feel discouraged, and to be less likely to respond to the polls. Studies have suggested that this effect explains much of the short-term fluctuation in election polls during a campaign.)
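A small numerical sketch shows why demographic weighting cannot repair this kind of bias. All numbers here are hypothetical: two equally sized groups, and supporters of one party assumed to respond at 6% against 5% for the other, within every group.

```python
def observed_share(true_share, response_rate_d, response_rate_r):
    """Share of D supporters among respondents in one group, when the two
    sides respond to the poll at different rates."""
    d = true_share * response_rate_d
    r = (1 - true_share) * response_rate_r
    return d / (d + r)

# Hypothetical electorate: the true overall D share is exactly 50%.
groups = {
    "group 1": {"voter_share": 0.5, "true_d_share": 0.60},
    "group 2": {"voter_share": 0.5, "true_d_share": 0.40},
}
rate_d, rate_r = 0.06, 0.05  # assumed response rates, same in every group

true_total = sum(g["voter_share"] * g["true_d_share"] for g in groups.values())

# Weight each group to its correct share of voters, as a pollster would.
weighted_estimate = sum(
    g["voter_share"] * observed_share(g["true_d_share"], rate_d, rate_r)
    for g in groups.values()
)

print(f"true D share:            {true_total:.3f}")
print(f"perfectly weighted poll: {weighted_estimate:.3f}")
```

Even with the group sizes weighted exactly right, the estimate comes out several points too high for the side that answers more readily, because the distortion sits inside each group, where the weights cannot reach it.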

Interestingly, one of the problems that the committee found with the 1948 polling, and one with relevance for the Trump era, was the failure to consider education as a significant demographic variable:

All of the major polling organizations interviewed more people with college education than the actual proportion in the adult population over 21 and too few people with grade school education only.

Less than zero, part 2

In a long-ago post I wrote about how huge debts don’t make you poor, and illustrated this with the story of real-estate mogul Donald Trump. Negative large fortunes are closer to positive large fortunes than either is to zero. (I later had to correct my interpretation, on discovering that the counterintuitive behaviour of Trump’s creditors was largely a reflection of their involvement in money laundering.)

Now we learn from the New York Times that Trump has been paying $750 in federal income tax each year as president. Presumably that’s just an arbitrary number that he made up so that he could say it wasn’t zero. (Apparently even Trump has some limits to his explicit lying.)

But here’s the thing: $750 is probably worse than $0. People have been assuming he wasn’t paying taxes. It sounds like a general insult. $750 is too specific (as well as being too small). The number becomes a shorthand for his tax-dodging, as well as inviting people to compare their own tax bills to Trump’s.

This demonstrates again how absurdly miserly Donald Trump is, above and beyond his criminality. He had to choose an amount to pay purely for the symbolism of possibly needing to tell average Americans how much he had paid. He could certainly have afforded to choose an amount large enough that even Americans of modest means would not find it risible. At least four figures…

The opposite of a superficial lie

“The opposite of a fact is a falsehood. But the opposite of a profound truth may very well be another profound truth.”

Niels Bohr

The news media have gotten themselves tangled up, from the beginning of the Trump era, in the epistemological question of whether any statement can objectively be called a lie. Yes, Trump says things that are untrue, that contradict objectively known facts, but are they lies? Does he have the appropriate mens rea to lie, the intention to deceive, or is that just a partisan insult?

The opposite problem has gotten too little attention. Just because Donald Trump says something that corresponds to objective facts, one cannot infer that he is speaking the truth. (We don’t really have a word in English to correspond to the opposite of lie, in this dichotomy.) A good example is the controversy over Trump’s private and public comments on the incipient Coronavirus pandemic in February and March of this year. On February 7, 2020, Trump told Woodward

You just breathe the air and that’s how it’s passed. And so that’s a very tricky one. That’s a very delicate one. It’s also more deadly than even your strenuous flus.

This is quite an accurate statement, and also very different than what he was saying publicly. On February 10 he said, in a campaign speech,

I think the virus is going to be — it’s going to be fine.

And February 26 in an official White House pandemic task force briefing:

The 15 [case count in the U.S.] within a couple of days is going to be down to close to zero. … This is a flu. This is like a flu.

When you see that someone has been saying one thing in public and something completely different in private, it’s natural to interpret the former as lying and the latter as the secret truth — particularly when, as in this case, the private statement is known to be, in fact, true, and the public statement false. And particularly when the speaker later says

I wanted to always play it down. I still like playing it down, because I don’t want to create a panic.

With Trump, though, this interpretation is likely false.

The thing is, while his statement of February 7 was true, he could not have known it was true. No one knew it was true. Any number of responsible public-health officials were making statements at the time that sounded much like what Trump was saying in public. For example, Anthony Fauci on February 19:

Fauci doesn’t want people to worry about coronavirus, the danger of which is “just minuscule.” But he does want them to take precautions against the “influenza outbreak, which is having its second wave.”

“We have more kids dying of flu this year at this time than in the last decade or more,” he said. “At the same time people are worrying about going to a Chinese restaurant. The threat is (we have) a pretty bad influenza season, particularly dangerous for our children.”

And it’s not just Americans under the thumb of Trump. February 6, the day before Trump’s remark to Woodward, the head of the infectious disease clinic at a major Munich hospital, where some of the first German Covid-19 patients were being treated, told the press that “Corona is definitely not more dangerous than influenza,” and criticised the panic that was coming from exaggerated estimates of mortality rates.

Researchers were posting their data and models in real time, but there just wasn’t enough understanding possible then. This is the kind of issue where the secret information that a government has access to is of particularly limited value.

So how are we to interpret Trump’s statements? I think the key is that Trump is not a liar per se, he is a conman and a bullshitter, someone to whom the truth of his statements is completely irrelevant.

In early February he probably did receive a briefing where the possibility that the novel coronavirus was highly lethal and airborne was raised, as well as the possibility that it was mild and would disappear on its own. In talking to elite journalist Bob Woodward he delivered up the most frightening version, not because he believed it was true, but because it seemed most impressive, making him seem like the mighty keeper of dangerous secrets. When talking to the public he said something different, because he had other motives. It’s purely coincidence that what he said in private turned out to be true.

It would be poetic justice if Trump were to be damaged by the bad luck of one time accidentally having told the truth.

Adrift on the Covid Sea

Political leaders in many countries — but particularly in the US and UK — are in thrall above all to the myth of progress. Catastrophes may happen, but then they get better. And to superficial characters like Johnson and Trump, the improvements seem automatic. It’s like a law of nature.

So, we find ourselves having temporarily stemmed the flood of Covid infections, with governments laying out fantastic plans for “reopening”. Even though nothing significant has changed. The only thing that could make this work, absent a vaccine, would be an efficient contact tracing system or a highly effective treatment for the disease. Neither of which we have. But we still have a timeline for opening up pubs and cinemas (though less important facilities like schools are still closed, at least for many year groups).

It’s like we had been adrift for days in a lifeboat on the open ocean, carefully conserving our supplies. And there’s still no rescue in sight, but Captain Johnson announces that since we’re all hungry from limiting our food rations, and the situation has now stabilised, we will now be transitioning toward full rations.

“Zelensky loves your ass”

There’s a lot of competition for the weirdest moments in the Ukraine bribery-extortion-political meddling affair that underlies the current impeachment hearings, but for me there’s not much that can compete with the testimony of diplomat David Holmes that he overheard hotel-magnate-cum-ambassador Gordon Sondland telling Trump that Zelensky would “do anything you ask for” because Zelensky “loves your ass”.

My first reaction on reading this — I may have understood it differently had I heard it spoken — was that it was most bizarre for a head of state to be commenting (favourably or unfavourably) on the intimate anatomy of the US president. And that Trump didn’t strike me as someone particularly concerned about his toned glutes.

I quickly realised that this is not actually an erotic compliment, but rather an application of the somewhat gangster argot that uses “ass” as a general intensifier. I am reminded of the section of Gravity’s Rainbow titled “On the phrase ‘ass backwards’”, where the literal-minded Berlin drug dealer Säure Bummer asks a group of Americans

Why do you speak of certain reversals — machinery connected wrong, for instance, as being “Ass backwards”? I can’t understand that. Ass usually is backwards, right? You ought to be saying “ass forwards,” if backwards is what you mean.

After a typical digression about umlauts and helicopters Seaman Bodine replies

“‘Ass’ is an intensifier, as in ‘mean ass’, ‘stupid ass’ — well, when something is very backwards, by analogy you’d say ‘backwards ass.’”

“But ‘ass backwards’ is ‘backwards ass’ backwards,” Säure objects.

“But gee that doesn’t make it mean forwards.”

I’m still not exactly sure what “he loves your ass” actually means and, in particular, whether it conveys an erotic charge.

The no quid pro quo party

Hearing Donald Trump and all his lackeys repeating “no quid pro quo” ad nauseam gave me flashbacks to an earlier Republican president:

This was sufficiently prominent to be parodied in Doonesbury:

Rick: Sir, off the record, what’s the deal with Honduras? It really is starting to look like you cut a deal with President Suavo to support the Contras…
Bush: Rick, that’s just a bunch of needless, reckless speculation, so let me help you out, fella…The word of the President of the United States, me, George Bush, is there was no pro quo! Repeat, no… quid…pro…quo! Ergo, no de facto or de jure nolo contendere!
Reporters: Quis? Quois?

Trump supporters are ignoring the base (rate) — Or, Ich möcht’ so gerne wissen, ob Trumps erpressen

One of the key insights from research on decision-making — from Tversky and Kahneman, Gigerenzer, and others — is the “base rate fallacy”: in judging new evidence people tend to ignore the underlying (prior) likelihood of various outcomes. A famous example, beloved of probability texts and lectures, is the reasonably accurate — 99% chance of a correct result — test for a rare disease (1 in 10,000 in the population). A randomly selected person with a positive test has a 99% chance of not having the disease, since correct positive tests on the 1 in 10,000 infected individuals are far less common than false positive tests on the other 9,999.
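The arithmetic behind this example is a direct application of Bayes’ rule. Here is a minimal sketch, assuming that “99% chance of a correct result” means the test is right 99% of the time both for people who have the disease and for people who don’t (the function and parameter names are just illustrative).

```python
def posterior_given_positive(prevalence, sensitivity, specificity):
    """P(disease | positive test) by Bayes' rule."""
    true_pos = prevalence * sensitivity               # infected and correctly flagged
    false_pos = (1 - prevalence) * (1 - specificity)  # healthy but wrongly flagged
    return true_pos / (true_pos + false_pos)

p = posterior_given_positive(prevalence=1 / 10_000, sensitivity=0.99, specificity=0.99)
print(f"P(disease | positive test)    = {p:.4f}")      # 0.0098
print(f"P(no disease | positive test) = {1 - p:.4f}")  # 0.9902
```

With these numbers the false positives outnumber the true positives by about 101 to 1, which is where the 99% comes from.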

This seems to fit into a more general pattern of prioritising new and/or private information over public information that may be more informative, or at least more accessible. Journalists are conspicuously prone to this bias. As Brexit blogger Richard North has lamented repeatedly, UK journalists would breathlessly hype the latest leaks of government planning documents revealing the extent of adjustments that would be needed for phytosanitary checks at the border, for instance, or for aviation, when the same information had been available for a year in official planning documents on the European Commission website. This psychological bias was famously exploited by WWII British intelligence operatives in Operation Mincemeat, where they dropped a corpse carrying fake plans for an invasion of Greece and Sardinia into the sea, where they knew it would wind up on the shore in Spain. They knew that the Germans would take the information much more seriously if they thought they had found it covertly. In my own experience of undergraduate admissions at Oxford I have found it striking the extent to which people consider what they have seen in a half-hour interview to be the deep truth about a candidate, outweighing the evidence of examinations and teacher evaluations.

Which brings us to Donald Trump, who has been accused of colluding with foreign governments to defame his political opponents. He has done his collusion both in private and in public. He famously announced in a speech during the 2016 election campaign, “Russia, if you’re listening, I hope you’re able to find the 30,000 emails that are missing. I think you will probably be rewarded mightily by our press.” And just the other day he said “I would think that if [the Ukrainian government] were honest about it, they’d start a major investigation into the Bidens. It’s a very simple answer. They should investigate the Bidens because how does a company that’s newly formed, and all these companies, and by the way, likewise, China should start an investigation into the Bidens because what happened in China is just about as bad as what happened with Ukraine.”

It seems pretty obvious. But no, that’s public information. Trump has dismissed his appeal to Russia as “a joke”, and just yesterday Senator Marco Rubio contended that the fact that the appeal to China was so blatant and public shows that it probably wasn’t “real”, that Trump was “just needling the press knowing that you guys are going to get outraged by it.” The private information is, of course, being kept private, and there seems to be a process by which formerly shocking secrets are moved into the public sphere gradually, so that they slide imperceptibly from being “shocking if true” to “well-known, hence uninteresting”.

I am reminded of the epistemological conundrum posed by the Weimar-era German cabaret song, “Ich möcht’ so gern wissen, ob sich die Fische küssen”:

Ich möcht’ so gerne wissen
Ob sich die Fische küssen –
Unterm Wasser sieht man’s nicht
Na, und überm Wasser tun sie’s nicht!

I would so like to know
if fish sometimes kiss.
Underwater we can’t see it.
And out of the water they never do it.

Election hacking, part 2

Given that the official US government response to Russian interference in the 2016 presidential election has been essentially nothing, and the unofficial response from Trump and his minions has been to welcome future assistance, I’ve been assuming that 2020 will be open season for any other intelligence agencies with a good cyberwar division to have a go. Why should the Russians have all the fun?

And number one on my list would be the Israeli Mossad. Is this so obvious that no one thinks it worth mentioning, or so wrong-headed that even crazy people don’t think of it? They’re technologically sophisticated, have excellent contacts in the US political establishment, and they have already demonstrated the absence of any compunction about interfering in US internal affairs. They are also highly motivated: Having bet the entire US-Israel relationship on the premise that Trumpism will rule in the US forever, Israel’s security essentially requires the destruction of US democracy. At least, that’s how they’ll rationalise it to themselves.

Something to think about, one week before Israel’s parliamentary election.

May we compare Anne Frank’s case to the Holocaust?

Following up on my earlier post on the unequivocal rejection by many authorities — including the US Holocaust Museum — of any comparison between the concentration camps in which Central American migrants are being interned in the US, and Nazi atrocities. No one is being gassed, no one is being murdered, no one is being worked to death. They are simply being interned in unsafe and unsanitary conditions for indeterminate periods.

And here it occurs to me that if we are being very careful about our historical analogies, we really need to strike out one of the most celebrated stories that (erroneously) is placed in this context, that of Anne Frank. The USHMM includes a page about her life and diary, and the “Holocaust Encyclopedia” describes her as “among the most well-known of the six million Jews who died in the Holocaust.” But was she really? Anne and her sister were undocumented migrants in The Netherlands, rounded up in a police raid and deported to Germany. They were not sent to a death camp, but to Bergen-Belsen, which is commonly referred to as a concentration camp, but that is obviously misleading, since people could think Jews were being gassed there. Nobody killed them there. They just happened to die (like most of their fellow prisoners) of typhus.

Indeed, we should consider Primo Levi’s contention that everyone who survived Auschwitz did so because of some freak combination of exceptional events and exceptional personal qualities (not necessarily positive):

At a distance of years one can today definitely affirm that the history of the Lagers has been written almost exclusively by those who, like myself, never fathomed them to the bottom. Those who did so did not return, or their capacity for observation was paralysed by suffering and incomprehension.

So if the true generic experience of the Holocaust belonged only to those who died, maybe it is inappropriate to compare anyone’s experience to the Holocaust, including that of its victims.