Everyone knows about the famous Dewey Defeats Truman headline fiasco, and that the Chicago Daily Tribune was led to its premature announcement by erroneous pre-election polls. But why were the polls so wrong?
The Social Science Research Council set up a committee to investigate the polling failure. Its report, published in 1949, listed a number of faults, among them the very attempt to predict the outcome of a close election. But one important methodological criticism, the one that significantly influenced the later development of political polling and became the standard lesson in statistics textbooks, was the critique of quota sampling. (An accessible summary of the lessons of the 1948 polling fiasco, by the renowned psychologist Rensis Likert, was published in Scientific American just a month after the election.)
Serious polling at the time was divided between two general methodologies: random sampling and quota sampling. Random sampling, as the name implies, works by attempting to select from the population of potential voters entirely at random, with each voter equally likely to be selected. This was still considered too theoretically novel to be widely used, whereas quota sampling had been established by Gallup since the mid-1930s. In quota sampling the voting population is modelled by demographic characteristics, based on census data, and each interviewer is assigned a quota of respondents to fill in each category: 51 women and 49 men, say, a certain number in the age range 21-34, or specific numbers in each "economic class", of which Roper, for example, had five, one of which in the 1940s was "Negro". The interviewers were allowed great latitude in filling their quotas, finding people at home or on the street.
In a sense, we have returned to quota sampling, in the more sophisticated version of "weighted probability sampling". Since hardly anyone responds to a survey — response rates are typically no more than about 5% — there's no way the people who do respond can be representative of the whole population. So pollsters model the population — or the supposed voting population — and reweight the responses they do get proportionately, according to demographic characteristics. If Black women over age 50 are thought to be as common in the voting population as white men under age 30, but we have twice as many of the former as the latter among our respondents, we count each response from the latter group twice as heavily as each from the former in the final estimates. It's just a way of making a quota sample after the fact, without the stress of specifically looking for representatives of particular demographic groups.
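The reweighting arithmetic can be sketched in a few lines. This is a minimal illustration, not any pollster's actual procedure; the cell names and counts are invented to mirror the example above (two groups assumed equally common among voters, but sampled 2:1):

```python
# Post-hoc reweighting, the modern descendant of quota sampling:
# each respondent in a demographic cell gets a weight equal to the
# cell's assumed share of the electorate divided by its share of the
# actual respondents, so over-represented cells are scaled down and
# under-represented cells are scaled up.

# Hypothetical modelled shares and respondent counts:
population_share = {"black_women_over_50": 0.5, "white_men_under_30": 0.5}
respondents = {"black_women_over_50": 200, "white_men_under_30": 100}

total = sum(respondents.values())
weights = {
    cell: population_share[cell] / (count / total)
    for cell, count in respondents.items()
}

# Each respondent in the under-sampled group now counts exactly
# twice as much as one in the over-sampled group, and the weighted
# cell totals match the modelled 50/50 electorate.
print(weights)
```

Note that the weights only repair imbalances *between* the modelled cells; they say nothing about who, within a cell, chose to respond.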
Consequently, it has most of the deficiencies of a quota sample. The difficulty of modelling the electorate is one that has gotten quite a bit of attention in the modern context: We know fairly precisely how demographic groups are distributed in the population, but we can only theorise about how they will be distributed among voters at the next election. At the same time, it is straightforward to construct these theories, to describe them, and to test them after the fact. The more serious problem — and the one that was emphasised in the commission's report on the 1948 polls, but has been less emphasised recently — is in the nature of how the quotas are filled. The reason for probability sampling is that taking whichever respondents are easiest to get — a "sample of convenience" — is sure to give you a biased sample. If you sample people from telephone directories in 1936, it's easy to see how the sample ends up biased against the favoured candidate of the poor. If you take a sample of convenience within a small demographic group, such as middle-income people, it won't be easy to recognise how the sample is biased, but it may still be biased.
For whatever reason, in the 1930s and 1940s, within each demographic group the Republicans were easier for the interviewers to contact than the Democrats. Maybe they were just culturally more like the interviewers, so easier for them to walk up to on the street. And it may very well be that within each demographic group today Democrats are more likely to respond to a poll than Republicans. And if there is such an effect, it’s hard to correct for it, except by simply discounting Democrats by a certain factor based on past experience. (In fact, these effects can be measured in polling fluctuations, where events in the news lead one side or the other to feel discouraged, and to be less likely to respond to the polls. Studies have suggested that this effect explains much of the short-term fluctuation in election polls during a campaign.)
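If within-group response rates really do differ by party, demographic weighting cannot detect or correct the resulting bias. A toy calculation makes the point; the 50/50 split and the 6% versus 4% response rates are invented for illustration, not measured values:

```python
# One demographic cell, assumed to contain equal numbers of Democrats
# and Republicans. Suppose Democrats answer polls at 6% and Republicans
# at 4% -- a within-cell difference that weighting by demographics
# cannot see, because both parties sit in the same cell.
dems, reps = 50_000, 50_000

dem_respondents = dems * 6 // 100  # 6% response rate -> 3,000 responses
rep_respondents = reps * 4 // 100  # 4% response rate -> 2,000 responses

estimated_dem_share = dem_respondents / (dem_respondents + rep_respondents)
print(f"estimated Democratic share: {estimated_dem_share:.0%}")
# estimated Democratic share: 60%  (though the true share is 50%)
```

A ten-point polling error from a two-percentage-point gap in response rates shows why this kind of nonresponse is so dangerous, and why the only available fix is the crude one described above: discounting one side by a factor based on past experience.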
Interestingly, one of the problems the commission found with the 1948 polling that has renewed relevance in the Trump era was the failure to consider education as a significant demographic variable.
All of the major polling organizations interviewed a higher proportion of people with a college education, and a lower proportion with only a grade-school education, than their actual shares of the adult population over 21.