Your shadow genetic profile

So, the “Golden State Killer” has been caught, after forty years. Good news, to be sure, and it’s exciting to hear of the police using modern data systems creatively:

Investigators used DNA from crime scenes that had been stored all these years and plugged the genetic profile of the suspected assailant into an online genealogy database. One such service, GEDmatch, said in a statement on Friday that law enforcement officials had used its database to crack the case. Officers found distant relatives of Mr. DeAngelo’s and, despite his years of eluding the authorities, traced their DNA to his front door.

And yet… This is just another example of how all traditional notions of privacy are crumbling in the face of the twin assaults from information technology and networks. We see this in the way Facebook generates shadow profiles with information provided by your friends and acquaintances, even if you’ve never had a Facebook account. It doesn’t matter how cautious you are about protecting your own data: As long as you are connected to other people, quite a lot can be inferred about you from your network connections, or assembled from bits that you share with people to whom you are connected.

Nowhere is this more true than with genetic data. When DNA identification started being used by police, civil-liberties and privacy activists in many countries forced stringent restrictions on whose DNA could be collected, and under what circumstances it could be kept and catalogued. But now, effectively, everyone’s genome is public. It was noticed a few years back that it was possible to identify (or de-anonymise) participants in the Personal Genome Project by drawing on patterns of information in their phenotypes. Here’s a more recent discussion of the issue. But those people had knowingly allowed their genotypes to be recorded and made publicly available. In the Golden State Killer case we see that random samples of genetic material can be attributed to individuals purely based on their biological links to other people who volunteered to be genotyped.

The next step will be, presumably, “shadow genetic profiles”: A company like GEDmatch — or the FBI — could generate imputed genetic profiles for anyone in the population, based solely on knowledge of their relationships to other people in their database, whether voluntarily (for the private company) or compulsorily (FBI).
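The arithmetic behind such imputation is not exotic. Here is a toy sketch (everything below is invented for illustration — real imputation works segment-by-segment from identified stretches of shared DNA, not from these crude averages): the expected fraction of the genome shared identical-by-descent with a relative falls off by roughly a factor of two per degree of relationship, and a handful of genotyped relatives can jointly cover a surprising fraction of an unobserved person’s genome.

```python
# Toy sketch: how much of an ungenotyped person's genome could, in
# principle, be imputed from relatives who *are* in a database.
# Expected identical-by-descent (IBD) sharing with a relative of
# degree d is roughly 2 ** -d (1 = parent/sibling, 2 = uncle or
# grandparent, 3 = first cousin).

def expected_ibd_fraction(degree: int) -> float:
    """Approximate expected IBD sharing for a relative of the given degree."""
    return 2.0 ** -degree

def imputable_fraction(relative_degrees) -> float:
    """Rough upper bound on the genome fraction recoverable by pooling
    several genotyped relatives, treating their shared segments as
    independent (which overstates coverage for close kin)."""
    uncovered = 1.0
    for d in relative_degrees:
        uncovered *= 1.0 - expected_ibd_fraction(d)
    return 1.0 - uncovered

# Two first cousins and an uncle in the database:
print(round(imputable_fraction([3, 3, 2]), 3))  # → 0.426
```

Even this crude bound suggests that a few cousins who mailed in their saliva put well over a third of your genome within reach.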

Just browsing

Among the first orders of business for the Conservatives, now that they have a majority, is to increase their ability to spy on the general public — for only the most noble of reasons bien sûr:

That law, labelled a snooper’s charter, would have required internet and mobile phone companies to keep records of customers’ browsing activity, social media use, emails, voice calls, online gaming and text messages for a year. 

It occurred to me that a reasonably effective defense against government snooping on your browsing history (and, indeed, Google snooping on your browsing history) might be to have a browser that is constantly active, and searches for random search terms whenever it is not being actively used.

Some ideas:

  1. The random browsing should not be completely arbitrary. It should include sufficient numbers of securityphilic keywords to make the logs hard to sift for genuinely suspicious activity.
  2. You don’t want the real searches to stand out as topically coherent. You’d want the choice of search terms to crawl through topic space.
  3. You might want to embed the real searches in the crawl. Suppose I type “David Cameron smashed restaurant” into my search window, when the browser, on its own initiative, has just searched for “spurious GCHQ bomb plots”. Instead of carrying out my search immediately, it interpolates thematically: maybe a dozen searches like “spurious David Cameron bomb plots” and “spurious David Cameron bomb restaurant”.
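Idea 3 can be sketched in a few lines (this is one hypothetical interpolation scheme, not a worked-out design — a real tool would also vary query length, word order, and timing): morph the current chaff query into the real one by swapping one word per step, so the real search arrives as just the last step of a gradual drift.

```python
import random

def interpolate_queries(chaff: str, real: str, rng=None):
    """Return intermediate queries morphing word-by-word from the chaff
    query to the real one, changing positions in random order, so the
    real search is just one more step in the drift."""
    rng = rng or random.Random()
    current, target = chaff.split(), real.split()
    # Pad the shorter query so word positions line up.
    n = max(len(current), len(target))
    current += [""] * (n - len(current))
    target += [""] * (n - len(target))
    positions = [i for i in range(n) if current[i] != target[i]]
    rng.shuffle(positions)
    steps = []
    for i in positions:
        current[i] = target[i]
        steps.append(" ".join(w for w in current if w))
    return steps

for q in interpolate_queries("spurious GCHQ bomb plots",
                             "David Cameron smashed restaurant"):
    print(q)
```

In use, the steps would be spread out in time among the ordinary chaff searches, so the transition is no more conspicuous than the background noise.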

Do all babies look alike?

And if not, why don’t they have any privacy rights with regard to their photographs?

Here is the illustration provided by the BBC on its home page for a report on the decision to approve fertility procedures that take genetic material from three different people:

Not a three-person baby, but manufacturers promise they will look similar to this model.

One wonders what purpose this photograph serves. Are there readers who see the headline and think, “Wait, babies, I’ve heard of them. Can’t quite remember what they look like…” In what sense is this an illustration of the article? It’s not even a newborn infant. They might as well have shown a 90-year-old lady, because making three-person babies inevitably leads to the eventual creation of three-person 90-year-olds. It might be even more relevant to show an elderly person, because that’s the goal: the purpose of the procedure is to improve the health and longevity of the humans so conceived.

They could have used their stock photograph of weirdly lighted lab technicians pipetting something into a test tube instead.

I’m wondering, who is this baby who is standing in for a “three-person baby”? I’m used to seeing children have their features blurred out in news photos. But, of course, this one was presumably a “volunteer” model. One baby can stand in for all babies. (As long as it’s white, of course.)

What would they do with the data?

The Conservatives and the security services are ramping up the propaganda for the digital panopticon, now particularly pressuring US-based social network companies to give up their quaint ideas of privacy. If you’re not with the snoopers you’re with the terrorists and the paedophiles.

“Terrorists are using the internet to communicate with each other and we must not accept that these communications are beyond the reach of the authorities or the internet companies themselves,” [David Cameron] told MPs after the report was published.

“Their networks are being used to plot murder and mayhem. It is their social responsibility to act on this.”

This refers to the government report on the murder of the soldier Lee Rigby by the Islamist extremists Michael Adebolajo and Michael Adebowale, which accuses Facebook (not by name — the name of the company was only leaked to the press, for some reason) of failing to inform the security services that the killers had been carrying on conversations on Facebook about plans to murder a soldier.

Try this out with regard to telephone service: If criminals were found to have plotted a killing on the telephone — not that such things ever happened before there was Facebook — would that be taken to prove that the telecoms are responsible for monitoring the content of every phone call? What about the post? What if they didn’t use electronic media, but fiendishly took advantage of the fact that there is currently no electronic surveillance in everyone’s bedrooms?

Why aren’t the security services who have been downloading all of our communications, including everything on Facebook, supposedly to protect us from terrorism, responsible for detecting the terrorist chats?

Those who see no problem with the collection of vast quantities of private data by various security services, or who see it as a necessary evil, tend to assume that Western democracies can ensure through legal structures that the information is used in the public interest, in the defence of democracy. Others believe this is naïve. There is nothing about Western democracy that nullifies the basic truths of humanity, and how people respond to the temptations of power.

If you are having difficulty imagining what our wise and good protectors in the security services might get up to if they had access to a complete collection of correspondence, maps of contacts, and purchasing history for everyone in the country — indeed, for most of the world — consider this historical affair that has recently been in the news.

“Networks of choice”

I’ve commented before on the brilliant satirical sketch from the 2008 US election campaign, in which John McCain’s campaign staff discussed an ad accusing Barack Obama of proposing tax breaks for child molesters. “Did he really do that?” asks the candidate. “He proposed tax breaks for all Americans, and some Americans are child molesters” was the answer.

Last year, Home Secretary Theresa May applied this joke structure to the abuse of anti-terror laws to attack press freedom. Now the Guardian reports that the new director of GCHQ Robert Hannigan is keeping up this satiric tradition. He

has used his first public intervention since taking over at the helm of Britain’s surveillance agency to accuse US technology companies of becoming “the command and control networks of choice” for terrorists.

Fortunately, since it’s just the terrorists who prefer using Google and Apple and Facebook and Twitter they’ll be easy for the security services to target. The honest people are presumably all using the upstanding high-quality British online services. Because they have nothing to hide.

What does it mean, by the way, that the Guardian calls this a “public intervention” (rather than a speech, as other political figures are described as delivering)? It sounds ominous.

Default settings, encryption, and privacy

One essay that powerfully shaped my intellect in my impressionable youth was Douglas Hofstadter’s Changes in Default Words and Images, Engendered by Rising Consciousness, which appeared in the November 1982 issue of Scientific American (back when Scientific American was good), along with Hofstadter’s associated satire A Person Paper on Purity in Language. Hofstadter’s point is that we are constantly filling in unknown facts about the world with default assumptions, which we can’t recognise unless they happen to collide with facts that are discovered later.

He illustrates this with the riddle, popular among feminists in the 1970s, that begins with the story of a man driving in a car with his young son. The car runs off the road and hits a tree, and the man is killed instantly. The boy is brought to the hospital, prepped for surgery, and then the surgeon takes one look at him and says “I can’t operate on this boy. He’s my son.” As Hofstadter tells it, when this story was told at a party, people were able to conceive of explanations involving metempsychosis more quickly than they could come to the notion that the surgeon was a woman. It’s not that they considered it impossible for a woman to be a surgeon. It’s just that you can’t think of a human being without a sex, so it gets filled in with the default sex “male”.

(The joke wouldn’t really work today, I imagine. Not only are there so many women surgeons that it’s hard to have a very strong default assumption, but the boy could have two fathers. On the other hand, a “nurse” has a very strong female default, so much so that a male nurse is frequently called a “male nurse”, to avoid confusion.)


Identifiability

A hot topic in statistics is the problem of anonymisation of data. Medical records clearly contain highly sensitive, private information. But if I extract just the blood pressure measurements for purposes of studying variations in blood pressure over time, it’s hard to see any reason for keeping those data confidential.

But what happens when you want to link up the blood pressure with some sensitive data (current medications, say), and look at the impact of local pollution, so you need at least some sort of address information? You strip out the names, of course, but is that enough? There may be only one 68-year-old man living in a certain postcode. It could turn into one of those logic puzzles where you are told that Mary likes cantaloupe and has three tattoos, while John takes cold baths and dances samba, along with a bunch of other clues, and by putting it all together in an appropriate grid you can determine that Henry is adopted and it’s Sarah’s birthday. Some sophisticated statistical work, particularly in the peculiar field of algebraic statistics, has gone into defining the conditions under which there can be hidden relations among the data that would allow individuals to be identified with high probability.
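The simplest version of the worry can be made concrete (the records below are invented for illustration): count how many rows of a “de-identified” release are already unique on their quasi-identifiers alone — in the language of k-anonymity, the rows with k = 1.

```python
from collections import Counter

# Invented "anonymised" release: (age, sex, postcode) — no names anywhere.
records = [
    (68, "M", "OX2 6GG"),
    (34, "F", "OX2 6GG"),
    (34, "F", "OX2 6GG"),
    (68, "M", "OX1 3LB"),
    (51, "F", "OX1 3LB"),
]

counts = Counter(records)
# Rows whose quasi-identifier combination occurs exactly once: anyone who
# knows these three facts about a person can pick out their full record.
unique_rows = [r for r, c in counts.items() if c == 1]
print(len(unique_rows))  # → 3
```

Three of the five rows need no name at all: age, sex, and postcode already pin them down.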

I thought of this careful and subtle body of work when I read this article about private-sector mass surveillance of automobile license plates — another step in the Cthulhu-ised correlations of otherwise innocuous information that modern information technology is enabling. Two companies are suing the state of Utah to block a law that prevents them from using their own networks of cameras to record who is travelling where when, and use that information for blackmail market research.

The Wall Street Journal reports that DRN’s own website boasted to its corporate clients that it can “combine automotive data such as where millions of people drive their cars … with household income and other valuable information” so companies can “pinpoint consumers more effectively.” Yet, in announcing its lawsuit, DRN and Vigilant argue that their methods do not violate individual privacy because the “data collected, stored or provided to private companies (and) to law enforcement … is anonymous, in the sense that it does not contain personally identifiable information.”

So, in their representation, data are suitably anonymised if they don’t actually include the name and address. We’re just tracking vehicles. Could be anyone inside… We’re just linking it up with those vehicles’ household incomes. Presumably they’re going to target ads for high-grade oil and new tires at those cars, or something.


What happens if you forget the key?

Courts in the US and the UK have recently been ruling that criminal suspects may be forced to reveal cryptographic keys that encode files that may include incriminating evidence. US courts have been divided on whether this infringes on the otherwise absolute right to avoid self-incrimination. I’ve never taken that argument very seriously — it’s certainly not in the spirit of the right to refuse to assist in prosecuting oneself to allow people to hide documentary evidence of a crime just because the revelation would be “speech”. But while people may be compelled to testify in court, and in some democracies may be required to assist police by correctly identifying themselves, it’s not usual for people to be compelled by law to reveal particular information, particularly when they may not know it. While perjury charges may be brought against those who testify falsely, the inevitable unreliability of memory makes perjury convictions difficult — and, I would have thought, impossible when the subject simply pleads ignorance rather than testifying to a falsehood.

In fact, the strongest argument for a right not to reveal a password is that it’s not the hidden data that are protected by the right against self-incrimination, but rather the admission that you know the password, hence are at least in some way in control of and responsible for them, that cannot be compelled. According to the Regulation of Investigatory Powers Act 2000 (that was apparently a banner year for civil liberties in the UK), “failing to disclose an encryption key” is an offence in itself. In 2009 a man was jailed for 13 months for refusing on principle to provide encryption keys to the authorities, despite the fact that he was not suspected of any crime other than not cooperating with the police.

I have encrypted volumes on my laptop hard drive — with old exam papers — whose passwords I’ve forgotten. I probably should delete them, but I haven’t gotten around to it, and maybe I’ll remember one of these days. Even if I did delete them, they’d still be there on my hard drive unless I took exceptional measures. So if customs officials ever took an interest in my laptop while I was entering the UK, I could end up in prison for up to two years. The only things I could do to protect myself would be to destroy the hard drive or to have it securely erased — which is itself suspicious.

Unlike most other criminal offences, the offence of withholding a cryptographic key is impossible to prove, but also impossible to disprove. It is even impossible for anyone but the accused even to know whether or not there has been any offence. And if there has been no criminal offence — if the accused does not, in fact, know the key — there is no way to prove that. It is the democratic state’s version of the plight of the man being tortured for information that he does not have, so that he has nothing to offer to end the suffering.

Along these lines, I was wondering about the current state of the right to silence in British law, and there came a revelation in the form of the British authorities (oddly, the news reports are all vague about which authorities it was; presumably the UK Border Agency, but maybe agents from a secret GCHQ data-mining task force) detaining the partner of journalist Glenn Greenwald under schedule 7 of the Terrorism Act 2000. According to the Guardian,

Those stopped under schedule 7 have no automatic right to legal advice and it is a criminal offence to refuse to co-operate with questioning.

This is pretty frightening, particularly when these laws are being so blatantly abused to settle political grudges.

How do you tell the difference between eavesdropping and ineptitude?

So, apparently, the Nassau County (New York) Police have a “joint terrorism task force”, and they can monitor residents’ web searches in more or less real time. And they paid a visit to a family that had searched for pressure cookers and backpacks online, as well as having revealed interest in news about the Boston bombing. I’m not an expert, but I don’t think it’s a good idea to go telling everyone that law enforcement is monitoring the contents of web activity. Aside from the fact that the monitoring itself is probably illegal, revealing operational capabilities tends to get people stranded in the holding area of foreign airports, where Americans get stabbed in the back. And it doesn’t matter what your motivations were for revealing the information. (Pressure cookers? Really? The whole reason for using pressure cookers to make bombs is that there are millions of them around, a very large fraction of which will likely never be used to kill or maim civilians. And backpacks.)

The next Edward Snowden should avoid contacting a journalist directly. Instead, he can just tip off local law enforcement to an important national security journalist’s involvement in some nefarious plot, and then feed them with the appropriate keywords that he’s trying to communicate. He’ll probably get a medal.

The story, as reported by the aspiring terrorist herself, has some delightful details that sound like they came from Monty Python:

Meanwhile, they were peppering my husband with questions. Where is he from? Where are his parents from? They asked about me, where was I, where do I work, where do my parents live. Do you have any bombs, they asked. Do you own a pressure cooker? My husband said no, but we have a rice cooker. Can you make a bomb with that? My husband said no, my wife uses it to make quinoa. What the hell is quinoa, they asked.

Again, I’m no expert on interrogation, but I’m just going to hazard a guess that “Can you make a bomb with that?” is not the sort of question that frequently leads to actionable intelligence.

Privacy rights in Germany

Unlike the US, Germany has a constitutional court that doesn’t kowtow as soon as the government yells “National Security”. The US Supreme Court, by contrast, has chosen to rewrite Catch-22 as a legal judgement, saying effectively that no one has standing to challenge secret government surveillance programs, because they are secret, hence no one can prove (using information the government will allow to appear in open court) that they have been affected.

Deutschlandfunk’s science programme Forschung Aktuell has been reporting this week on problems of information technology, security, and privacy, and today I learned (transcript in German)

In 2001 the police chief of Baden-Württemberg Erwin Hetger demanded a programme of advance data storage, by which all connection data of web surfers in Germany would be stored for six months.

“I think we cannot allow the Internet to become effectively a law-free zone. Hence my clear and unambiguous recommendation: Whoever moves about on the Internet must be willing to accept that his connection data are stored for a fixed period of time.”

The Bundestag did, in fact, pass such a law in 2007. But in 2010 the Constitutional Court annulled the law.

While such advance data storage is not necessarily impossible under the German constitution, the constitutional requirements for such an action would be very strict, and were not satisfied by the law that was passed.

The president of the Constitutional Court Hans-Jürgen Papier specifically emphasised that if such data were to be stored, it would have to be done in a more secure way than the law had required.

The contrast of this process — where the basic parameters of privacy rights and government snooping are set by the normal democratic process of legislatures passing laws that are then reviewed in publicly accessible court decisions — with the Anglo-American one just makes clear how supine the US courts and Congress have been, as has been the UK parliament.