The end of the Turing test

The Turing test has always had a peculiar place in the philosophy of mind. Turing’s brilliant insight was that we should be able to replace the apparently impossible task of developing a consensus definition of the words ‘machine’ and ‘think’ with a possibly simpler procedural one: Can a machine succeed at the “imitation game”, whose goal is to convince a neutral examiner that it (and not its human opponent) is the real human? Or, to frame it more directly — and this is how it tends to be interpreted — can a computer carry on a natural-language conversation without being unmasked by an interrogator who is primed to suspect that it might not be human?

Turing’s argument was that, while it certainly is possible to be intelligent without passing the test — even humans may be intelligent while being largely or entirely nonverbal — we should be able to agree on some observable activities, short of being literally human in all ways, that would certainly suffice to persuade us that the attribution of human-like intelligence is warranted. The range of skills required to carry on a wide-ranging conversation makes that ability a plausible stand-in for what is now referred to as general intelligence. (The alert interrogator is a crucial part of this claim, as humans are famously gullible about seeing human characteristics reflected in simple chatbots, forces of nature, or even the moon.)

If we won’t accept any observable criteria for intelligence, Turing points out, then it is hard to see how we can justify attributing intelligence even to other humans. He specifically takes on, in his essay, the argument (which he attributes to a Professor Jefferson) that a machine cannot be intelligent merely because it performs certain tasks. Machine intelligence, Jefferson argued, is impossible because

No mechanism could feel (and not merely artificially signal, an easy contrivance) pleasure at its successes, grief when its valves fuse, be warmed by flattery, be made miserable by its mistakes, be charmed by sex, be angry or depressed when it cannot get what it wants.

Turing retorts that this leads to the solipsistic view that

the only way by which one could be sure that a machine thinks is to be the machine and to feel oneself thinking. One could then describe these feelings to the world, but of course no one would be justified in taking any notice. Likewise according to this view the only way to know that a man thinks is to be that particular man.

In principle everyone could doubt the content of everyone else’s consciousness, but “instead of arguing continually over this point it is usual to have the polite convention that everyone thinks.” Turing then goes on to present an imagined dialogue that has since become a classic, in which the computer riffs on Shakespeare sonnets, Dickens, the seasons, and Christmas. The visceral impact of the computer’s free-flowing expression of sentiment and understanding, Turing then suggests, is such that “I think that most of those who support the argument from consciousness could be persuaded to abandon it rather than be forced into the solipsist position.” He compares it, charmingly, to a university oral exam, by which it is established that a student has genuinely understood the material, rather than being able simply to reproduce rote phrases mechanically.

I used to accept this argument, but reflecting on ChatGPT has forced me to reconsider. This is a predictive text-generation tool, recently made available, that can produce competent texts based on arbitrary prompts. It’s not quite ready to pass the Turing test, but it’s easy to see how a successor program — maybe GPT-4, the version that is expected to be made available to the public next year — might. And it’s also clear that nothing like this software could be considered intelligent.

Thinking about why not helps to reveal flaws in Turing’s reasoning that were papered over by his clever rhetoric. Turing specifically argues against judging the machine by its “disabilities”, such as its lack of limbs or its electronic rather than biological nervous system. This sounds very open-minded, but the inclination to assign mental states to fellow humans rather than to computers is not irrational. We know that other humans have a mental architecture similar to our own, and so are not likely to be solving problems of intellectual performance in fundamentally different ways. Modern psychology and neurobiology have, in fact, shown this intuition to be occasionally untrue: apparently intelligent behaviours can be purely mechanical, and this is particularly true of calculation and language.

In this respect, GPT-3 may be seen as performing a kind of high-level glossolalia, akin to receptive aphasia, in which someone produces long strings of grammatical words that are devoid of meaning. Human brain architecture links the production of grammatical speech to representations of meaning, but these are still surprisingly independent mechanisms. Simple word associations can produce long sentences with little or no content. GPT-3 has much more complex associational mechanisms, but only the meanings that are implicit in verbal correlations. It turns out that you can get very far — probably all the way to a convincing intellectual conversation — without any representation of the truth or significance of the propositions being formed.

It’s a bit like the obvious cheat that Turing referred to, “the inclusion in the machine of a record of someone reading a sonnet, with appropriate switching to turn it on from time to time”, but on a level and complexity that he could not imagine.

ChatGPT does pass one test of human-like behaviour, though. It’s been programmed to refuse to answer certain kinds of questions. I heard a discussion in which it was mentioned that it refused to give specific advice about travel destinations, responding with something like “I’m not a search engine. Try Google.” But when the query was changed to “Write a script in which the two characters are a travel agent and a customer, who comes with the following query…”, it returned exactly the response that was being sought, with very precise information.

It reminds me of the Kasparov vs Deep Blue match in 1997, when a computer first defeated a world chess champion. The headlines were full of “human intelligence dethroned”, and so on. I commented at the time that it just showed that human understanding of chess had advanced to a point that we could mechanise it, and that I would consider a computer intelligent only when we have a program that is supposed to be doing accounting spreadsheets but instead insists on playing chess.


Suicides at universities, and elsewhere

The Guardian is reporting on the inquest results concerning the death by suicide of a physics student at Exeter University in 2021. Some details sound deeply disturbing, particularly the account of his family contacting the university “wellbeing team” to tell them about his problematic mental state, after poor exam results a few months earlier (about which he had also written to his personal tutor), but

that a welfare consultant pressed the “wrong” button on the computer system and accidentally closed the case. “I’d never phoned up before,” said Alice Armstrong Evans. “I thought they would take more notice. It never crossed my mind someone would lose the information.” She rang back about a week later but again the case was apparently accidentally closed.

Clearly this university has structural problems with the way it cares for student mental health. I’m inclined, though, to focus on the statistics, and the way they are used in the reporting to point at a broader story. At Exeter, we are told, there have been (according to the deceased student’s mother) 11 suicides in the past 6 years. The university responds that “not all of the 11 deaths have been confirmed as suicides by a coroner,” and the head of physics and astronomy said “staff had tried to help Armstrong Evans and that he did not believe more suicides happened at Exeter than at other universities.”

This all sounds very defensive. But the article just leaves these statements there as duelling opinions, whereas some of the university’s claims are assertions of fact, which the journalists could have checked objectively. In particular, what about the claim that no more suicides happen at Exeter than at other universities?

While suicide rates for specific universities are not easily accessible, we do have national suicide rates broken down by age and gender (separately). Nationally, we see from ONS statistics that suicide rates have been roughly constant over the past 20 years, and that there were 11 suicides per 100,000 population in Britain in 2021. That is, 16/100,000 among men and 5.5/100,000 among women. In the relevant 20-24 age group the rate was also 11. Averaged over the previous 6 years the suicide rate in this age group was 9.9/100,000; if the gender ratio was the same, then we get 14.4/100,000 men and 5.0/100,000 women.

According to the Higher Education Statistics Agency, the total number of person-years of students between the 2015/2016 and 2020/2021 academic years was 81,795 female, 69,080 male, and 210 other. This yields a prediction of around 14.5 deaths by suicide in a comparable age group over a comparable time period. Thus, if the number of 11 in six years is correct, there were still fewer deaths by suicide at the University of Exeter than in a comparable random sample of the rest of the population.
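The arithmetic is simple enough to check. As a sketch (using the six-year averaged rates of 14.4/100,000 for men and 5.0/100,000 for women quoted above, and ignoring the 210 “other” person-years), a few lines of Python compute the expected count:

```python
# Expected suicides in a population matching Exeter's student body,
# combining HESA person-year totals with the ONS rates for ages 20-24
# averaged over the six years in question.
person_years = {"male": 69_080, "female": 81_795}   # HESA, 2015/16-2020/21
rate_per_100k = {"male": 14.4, "female": 5.0}       # ONS, ages 20-24

expected = sum(person_years[g] * rate_per_100k[g] / 100_000
               for g in person_years)
print(f"Expected deaths by suicide over six years: {expected:.1f}")
```

This comes out at roughly fourteen, against the reported eleven.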

It’s not that this young man’s family should be content that this is just one of those things that happens. There was a system in place that should have protected him, and it failed. Students are under a lot of stress, and need support. But non-students are also under a lot of stress, and also need support. It’s not that the students are being pampered: they definitely should have institutionalised, well-trained, and sympathetic personnel they can turn to in a crisis. But where are the “personal tutors” for the 20-year-olds who aren’t studying, but who are struggling with their jobs, or their families, or just the daily grind of living? And what about the people in their 40s and 50s, whose suicide rates are 50% higher than those of younger people?

Again, it would be a standard conservative response to say, We don’t get that support, so no one should get it. Suck it up! A more compassionate response is to say, students obviously benefit from this support, so let’s make sure it’s delivered as effectively as possible. And then let’s think about how to ensure that everyone who needs it gets helped through their crises.

The Queen’s two bodies

There’s something that baffles me about the public discussion of ERII’s legacy: Why do so many people feel comfortable lauding the late monarch as the (no longer) living embodiment of the nation when she’s waving to the crowd and dispensing Christmas bromides, but just a befuddled girl when her imperial government is committing crimes against humanity?

The cognitive dissonance is extreme: What kind of monsters would we be were we to be charmed by a person responsible for the murder and torture of thousands? Therefore she was not responsible. Therefore, implicitly — since she was responsible for everything — these crimes did not occur.

And just to be extra clear, I am not doubting the expert claim that Harold Macmillan lied like a rug to keep Her Majesty in the dark on the sordid details of the Empire, or lied to the public to pretend that he did. The living embodiment of the nation embodies its crimes as well as its virtues. She can’t embody the spirit of Paddington Bear while remaining free of any taint of the Hola massacre. The victims of Her Majesty’s government are her royal victims, whether or not her mortal body participated, whether or not it was indeed aware.

The alternative is, monarchy is just bullshit, just celebrity culture with extra-fancy headgear. That seems to be the genuine belief revealed by the public’s response.

How to do (presidential) things with words

Donald Trump’s home has been raided by the FBI. While there has been no official announcement of the object of the raid, most are assuming that the government is looking for official documents that the former president may have taken with him from the White House. And particular concern has been raised about possible secret (classified) documents. This raises an interesting legal question, because it is generally accepted that the president has broad latitude to classify and declassify any information.

One of the great texts of modern Anglo-American philosophy of language is J L Austin’s How to Do Things with Words. The title is brilliant, of course, and it compelled me to pick it up off a friend’s bookshelf and read it before I’d ever heard of it or knew how significant it was. For someone who had immersed himself as a teenager in the early twentieth-century mathematico-logical approach to language, Austin’s simple point was a revelation: Language is not solely (or even mainly) about making statements about the world that can be judged on their truth value. (Wittgenstein had already led me into this terrain, but Austin is much more concrete, and not so oracular.)

Austin’s point is that there is a whole class of “speech acts”: Verbal utterances that are not true or false, but actions. Examples are

  • Making a promise;
  • Naming something (e.g., a ship christening, one of Austin’s examples);
  • Issuing a challenge, bet, or threat;
  • Marrying (meaning here, performing the ceremony, though also one of the parties making marriage vows);
  • Making an order;
  • Handing down a legal ruling.

Crucial to Austin’s analysis is that we need different categories for describing the success of such utterances: not truth, but appropriateness. Basically, there needs to be an accepted conventional procedure for conducting the act, with agreement that the procedure has a certain effect, and the uttering of the words must have an established role in that procedure. And the procedure must have been carried out in the correct circumstances, by appropriate people, and in the correct manner.

Which brings us back to the sticky-fingered former president. One of Trump’s lackeys is insisting that Trump can’t have broken the law regarding classified information, because he declassified all of it before he stole it. (Regardless of whether the information was officially classified, he presumably still contravened the Presidential Records Act by taking the government documents, but that seems like a more politically venial crime than mishandling classified information.)

“The White House counsel failed to generate the paperwork to change the classification markings, but that doesn’t mean the information wasn’t declassified,” Kash Patel, a former staffer for Rep. Devin Nunes (R-CA) and, briefly, a Pentagon employee, told Breitbart in May.

“I was there with President Trump when he said ‘We are declassifying this information,’” Patel added.

There is an established procedure for declassifying documents, which may be invoked by a president, but it is more complicated than the president simply declaring “I declassify thee”. (For one thing, how would you define the blast radius of such an order? Has the president declassified all information held by the government? Everything written on papers in the general direction the president is gesturing at? What about an encrypted laptop in the same room?) “Per a 2009 executive order, markings on classified material need to be updated to reflect changes in their status.”

Patel went on to suggest that Trump had been betrayed, but that his order to “declassify” should retain legal force.

“It’s petty bureaucracy at its finest, government simpletons not following a president’s orders to have them marked ‘declassified,’” Patel said. “The president has unilateral authority to declassify documents — anything in government. He exercised it here in full.”

In Austin’s framework, there is a conventional procedure being invoked here, and the president is the appropriate person to invoke it. But the procedure was not carried out in the correct manner. It is rather as though an eager couple in a hurry appears in church. They haven’t registered their marriage (28 days required by law in England), and they don’t have time for a full ceremony. The priest says “I declare you married” and sends them on their way.

Trump’s lackey treats this as a mere matter of “petty bureaucracy”, but the need to exercise power through formal procedures is an important check on autocracy. In the Third Reich the Führer’s will was paramount, even if it had not been expressed. Germans were supposed to “work toward the Führer”. Requiring explicit instructions in specific forms creates a modicum of transparency and accountability.

There’s a certain formality two-step here that is immensely corrosive of public responsibility. You start with the observation that the president has the right to do X if he chooses. It’s a plenary power, potentially dangerous, so it is hemmed in by various complications and procedures. In particular, he needs to invoke the power explicitly, which he cannot do, with the required specificity, to an unlimited extent. And then you start to say, well, it’s his power, he could exercise it any time he wants, so it’s mere pettifogging to insist that he actually have gone through the rigmarole of invoking it, and pretty soon everyone is just working toward the leader, guessing what the law currently is.

Socialist maths

The recent decision by the state of Florida to ban a slew of mathematics textbooks from its schools because of their links to banned concepts has attracted much attention. The website Popular Information has pored through the banned texts to try to suss out what the verboten ideological content might be. Some books seem to have impermissibly encouraged students to work together and treat each other with respect. Another may have set off alarms because it included, among its capsule biographies of mathematicians, some non-white individuals.

I’ve always wondered, though, why it’s not considered problematic that books persistently teach the concept of division with problems that require that a fixed amount of wealth — 10 cookies, say — be allocated equally among a group of children. No consideration of whether some of the children might be smarter, or work harder, or just be closer to the cookie jar, and thus be entitled to a larger share. Pretty much the definition of socialism!

(More generally, it always fascinated me, in my years spent as an observer on the playground, that it was taken for granted that toddlers were always being pressured to share their toys, and learning to share was seen as a developmental milestone, whereas we do not expect adults to be willing or able to share anything at any time.)

Opine borders

Boris Johnson has aroused the ire of many classical historians for his dubious claim that the Roman Empire was destroyed by “uncontrolled immigration”. What is most striking is the unquestioned implication that when Romans moved outward, conquering and enslaving their neighbours, that was GLORY, and much to be lamented when it was (possibly) destroyed by their ultimate failure to prevent people from “the east” from migrating in the opposite direction. It seems to me, if there’s anyone who had a problem with uncontrolled migration from the east it was Carthage.

Neanderthals and women

The article seems to have good intentions, but this headline in today’s Guardian is the most sexist I’ve seen in some time. It sounds like the men were hard at work “creating language”, and some women helped out with some testing, and maybe brought snacks. Also some Neanderthals came by and lent a hand. And apes.

Last and First Antisemites

There’s something fascinating about 19th and 20th century English antisemitism. In continental Europe hatred of Jews was seen as fundamentally political, hence controversial, and was viewed with some distaste by many bien-pensant intellectuals.

Not so in England, where antisemitism was never so passionate or violent, but also never particularly controversial until the Nazis went and gave it a bad name. It’s all over the literature, hardly seeming to demand any comment, as I noted with some surprise a while back about the gratuitous antisemitism in The Picture of Dorian Gray.

Anyway, I just got around to reading for the first time Olaf Stapledon’s Last and First Men. It’s a remarkable piece of work, barely a novel, giving a retrospective overview of about a billion years of human history from the perspective of the dying remnant of humanity eking out its last days on Neptune. And the early parts, at least, are blatantly antisemitic. Chapter 4 tells of a time, still only thousands rather than millions of years in our future, when all racial and national distinctions have vanished through intermixing of populations and the creation of a world state. There is just one exception: the Jews. They are still there, defining themselves as a separate “tribe” that uses their native “cunning” — specifically, financial cunning — to dominate their weaker-minded and less ruthless fellow humans:

The Jews had made themselves invaluable in the financial organization of the world state, having far outstripped the other races because they alone had preserved a furtive respect for pure intelligence. And so, long after intelligence had come to be regarded as disreputable in ordinary men and women, it was expected of the Jews. In them it was called satanic cunning, and they were held to be embodiments of the powers of evil… Thus in time the Jews had made something like “a corner” in intelligence. This precious commodity they used largely for their own purposes; for two thousand years of persecution had long ago rendered them permanently tribalistic, subconsciously if not consciously. Thus when they had gained control of the few remaining operations which demanded originality rather than routine, they used this advantage chiefly to strengthen their own position in the world… In them intelligence had become utterly subservient to tribalism. There was thus some excuse for the universal hate and even physical repulsion with which they were regarded; for they alone had failed to make the one great advance, from tribalism to a cosmopolitanism which in other races was no longer merely theoretical. There was good reason also for the respect which they received, since they retained and used somewhat ruthlessly a certain degree of the most distinctively human attribute, intelligence.

Finding the mitochondrial Na’ama

I was having a conversation recently about Biblical ancestry and the antediluvian generations, and it got me thinking about how scientists sometimes like to use biblical references as attention-grabbing devices, without actually bothering to understand what they’re referring to — in this case, the so-called “mitochondrial Eve”. The expression was not used in the 1987 Nature paper that first purported to calculate the genealogical time back to the most recent common ancestor (MRCA) of all present-day humans in the female line, but it was central to the publicity around the paper at the time, including in academic journals such as Science.

The term has come to be fully adopted by the genetics community, even while they lament the misunderstandings that it engenders among laypeople — in particular, the assumption that “Eve” must in some sense have been the first woman, or must have been fundamentally different from all the other humans alive at the time. The implication is that the smart scientists were making a valiant effort to talk to simple people in terms they understand, taking the closest approximation (Eve) to the hard concept (MRCA), and the simple bible-y people need to make an effort on their part to understand what they’re really talking about.

In fact, calling this figure Eve is a blunder, and it reveals a fundamental misunderstanding of the biblical narrative. Eve is genuinely a common ancestor of all humans, according to Genesis, but she is not the most recent in any sense, and suggesting that she is can only cause confusion. The MRCA in the Bible is someone else, namely the wife of Noah. Appropriately, she is not named, but if we want a name for her, the midrashic Genesis Rabbah calls her Na’ama. She has other appropriate characteristics as well, which would lead people toward a more correct understanding. To begin with, she lived many generations after the first humans. She lived amid a large human population, but a catastrophic event led to a genetic bottleneck that only she and her family survived. (That’s not quite the most likely scenario, but it points in the right direction.) And perhaps most important — though this reflects the core sexism of the biblical story — there was nothing special about her. She just happened to be in the right place at the right time, namely, partnered with the fanatic boat enthusiast when the great flood happened.
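The underlying idea, that all matrilineal lines in a finite population coalesce in a single, unremarkable woman who lived long after the population’s origin, can be illustrated with a toy simulation (my own sketch, not anything from the 1987 paper). Under a simple Wright–Fisher model with a constant female population of size n, the matrilineal MRCA is typically found about 2n generations back:

```python
import random

def generations_to_mrca(n, rng):
    """Trace the maternal lines of n present-day individuals backwards,
    each line choosing a mother uniformly at random in the previous
    generation, until all lines have coalesced in a single woman."""
    lines = set(range(n))  # distinct maternal lines still open
    gens = 0
    while len(lines) > 1:
        # Pick a mother for each open line; collisions mean coalescence.
        lines = {rng.randrange(n) for _ in lines}
        gens += 1
    return gens

rng = random.Random(1)
n = 200  # constant female population size (a toy number)
trials = [generations_to_mrca(n, rng) for _ in range(20)]
avg = sum(trials) / len(trials)
print(f"average generations back to the matrilineal MRCA: {avg:.0f}")
# Coalescent theory predicts about 2n generations on average.
```

Note that the “mitochondrial Eve” this finds is just whoever happens to sit at the coalescence point; rerun history and it would be a different, equally ordinary woman.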

Gender and the Metropolis (algorithm)

I’ve always heard of the Metropolis algorithm as having been invented for H-bomb calculations by Nicholas Metropolis and Edward Teller. But I was just looking at the original paper, and discovered that there are five authors: Metropolis, Rosenbluth, Rosenbluth, Teller, and Teller. It is particularly striking to see two repeated surnames, and a bit of research uncovers that these were two married couples: Arianna Rosenbluth and Marshall Rosenbluth, and Augusta Teller and Edward Teller. In particular, Arianna Rosenbluth (née Wright) appears to have been a formidable character, according to her Wikipedia page: she completed her physics PhD at Harvard at the age of 22.

In keeping with the 1950s conception of computer programming as women’s work, the two women were responsible for all the programming — a heroic undertaking in those pre-programming-language days, on the MANIAC I — and Arianna Rosenbluth in particular did all the programming for the final paper.
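For readers who know the name but not the method: the heart of what the Rosenbluths programmed is a remarkably short accept/reject loop. Here is a minimal sketch in modern Python (sampling a standard normal rather than their original hard-sphere system, and using only the density up to a constant, which is the algorithm’s whole point):

```python
import math
import random

def metropolis(log_p, x0, step, n_samples, rng):
    """Minimal random-walk Metropolis sampler.

    Proposes a symmetric move x -> x + U(-step, step) and accepts it
    with probability min(1, p(x')/p(x)) -- the 1953 acceptance rule.
    """
    x = x0
    samples = []
    for _ in range(n_samples):
        proposal = x + rng.uniform(-step, step)
        # Always accept uphill moves; accept downhill moves with
        # probability equal to the density ratio.
        if math.log(rng.random()) < log_p(proposal) - log_p(x):
            x = proposal
        samples.append(x)  # the current state is recorded either way
    return samples

# Target: standard normal, known only up to its normalising constant.
rng = random.Random(0)
draws = metropolis(lambda x: -0.5 * x * x, x0=0.0, step=2.0,
                   n_samples=50_000, rng=rng)
mean = sum(draws) / len(draws)
print(f"sample mean of the draws: {mean:.3f}")
```

The same dozen lines, with a different log_p, drive everything from statistical physics to Bayesian inference, which gives some measure of what the team achieved on the MANIAC I.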

And also in keeping with the expectations of the time, and more depressingly, according to the Wikipedia article “After the birth of her first child, Arianna left research to focus on raising her family.”