Clever Cryptography Could Protect Privacy in Covid-19 Contact-Tracing Apps

[ Original @ Wired ]

Researchers are racing to achieve the benefits of location-tracking without the surveillance. Systems like the one being developed at MIT would alert you to potential exposure without giving away your movements.

Before the Covid-19 pandemic, any system that used smartphones to track locations and contacts sounded like a dystopian surveillance nightmare. Now, it sounds like a dystopian surveillance nightmare that could also save millions of lives and rescue the global economy. The paradoxical challenge: to build that vast tracking system without it becoming a full-on panopticon.

Since Covid-19 first appeared, governments and tech firms have proposed—and in some cases already implemented—systems that use smartphone data to track where people go and with whom they interact. These so-called contact-tracing apps help public health officials get ahead of the spread of Covid-19, which may in turn allow an easing of social distancing requirements.

The downside is the inherent loss of privacy. If abused, raw location data could reveal sensitive information about everything from political dissent to journalists’ sources to extramarital affairs. But as these systems roll out, teams of cryptographers have been racing to do the seemingly impossible: Enable contact-tracing systems without mass surveillance, building apps that notify potentially exposed users without handing over location data to the government. In some cases, they’re trying to keep even an infected individual’s test results private while still warning anyone who might have entered their physical orbit.

“This is possible,” says Yun William Yu, a professor of mathematics at the University of Toronto who has worked with one group developing a contact-tracing app for the Canadian government. “You can develop an app that both serves contact-tracing and preserves privacy for users.” Richard Janda, a privacy-focused law professor at McGill University working on the same contact-tracing project, says they hope to “flatten the curve on authoritarianism” as well as infections. “We’re trying to ensure that the way this rolls out is with consent, with privacy protection, and that we don’t regret after the virus has passed—as we hope it does—that we’ve all handed over information to public authorities that we shouldn’t have given.”

WIRED spoke to researchers at three of the leading projects offering designs for privacy-preserving contact-tracing apps—all of whom are also collaborating with each other to varying degrees. Here are some of their approaches to the problem.

Bluetooth Contact Tracing

The best way to protect geolocation data from abuse, argues Stanford computer scientist Cristina White, is not to collect it in the first place. So Covid-Watch, the project White leads, instead anonymously tracks contacts between individuals based on their phones’ Bluetooth signals. It never needs to record location data, or even to tie those Bluetooth communications to someone’s identity.

Covid-Watch uses Bluetooth as a kind of proximity detector. The app constantly pings out Bluetooth signals to nearby phones, looking for others that might be running the app within about two meters, or six and a half feet. If two phones spend 15 minutes in range of each other, the app considers them to have had a “contact event.” They each generate a unique random number for that event, record the numbers, and transmit them to each other.
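A minimal sketch of what such a contact event might look like in code, written in Python with hypothetical names; the real Covid-Watch exchange over Bluetooth, and its identifier format, may differ:

```python
import secrets
import time

# Hypothetical sketch of a Covid-Watch-style contact event; names and sizes
# are illustrative, not taken from the actual app.

def new_event_number() -> str:
    """Generate this phone's random number for a contact event."""
    return secrets.token_hex(16)  # 128 bits of randomness, effectively unguessable

def record_contact_event(local_log: list, received_number: str) -> str:
    """After ~15 minutes in Bluetooth range: generate our number, log both
    numbers locally, and return ours so it can be sent to the other phone."""
    my_number = new_event_number()
    local_log.append({"mine": my_number, "theirs": received_number,
                      "time": time.time()})
    return my_number
```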

If a Covid-Watch user later believes they’re infected with Covid-19, they can ask their health care provider for a unique confirmation code. (Covid-Watch would distribute those confirmation codes only to caregivers, to prevent spammers or faulty self-diagnoses from flooding the system with false positives.) When that confirmation code is entered, the app would upload all the contact event numbers from that phone to a server. The server would then send out those contact event numbers to every phone in the system, where the app would check if any of them matched its own log of contact events from the last two weeks. If any of the numbers match, the app alerts the user that they made contact with an infected person, and displays instructions or a video about getting tested or self-quarantining.
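The matching step is then a purely local check against whatever numbers the server broadcasts. A rough sketch continuing the example above:

```python
TWO_WEEKS_SECONDS = 14 * 24 * 3600

def check_exposure(local_log: list, reported_numbers: set, now: float) -> bool:
    """Compare the numbers broadcast by the server against the phone's own
    contact-event log from the last two weeks. Runs entirely on the device."""
    cutoff = now - TWO_WEEKS_SECONDS
    return any(
        event["time"] >= cutoff
        and (event["mine"] in reported_numbers or event["theirs"] in reported_numbers)
        for event in local_log
    )
```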

“People’s identities aren’t tied to any contact events,” says White. “What the app uploads instead of any identifying information is just this random number that the two phones would be able to track down later but that nobody else would, because it’s stored locally on their phones.”

Redacted Location Tracing

Bluetooth tracing has limitations, though. Apple blocks its use for apps running in the background of iOS, a privacy safeguard intended to prevent exactly the sort of tracking that now seems so necessary. The novel coronavirus that causes Covid-19 can also remain on some surfaces for extended periods of time, meaning infection can happen without phones having the opportunity to communicate. Which means GPS location tracking will likely play a role in contact-tracing apps, too, with all of the privacy risks that come with sharing a map of your movements.

One MIT project called Private Kit: Safe Paths, which says it’s already in discussions with the WHO, is working on a way to exploit GPS while minimizing surveillance. MIT’s app is rolling out in iterations, starting with a simple prototype that allows people to log their locations and share them with health care providers if they’re diagnosed with Covid-19. The current version asks users to tell health care providers which sensitive locations—like homes or workplaces—should be redacted, rather than letting users redact them themselves. But the next iteration of the app will build in the ability to sort all the recorded locations of users diagnosed as Covid-19 positive into “tiles” covering a few square miles each, and then cryptographically “hash” each piece of location and time data. That hashing process uses a one-way function to transform each location and timestamp in a user’s history into a unique number—a process designed to be irreversible, so the hashes can’t be used to obtain the underlying location and time information. Only those hashes, sorted by which several-square-mile tile they fall into, would be stored on a server.
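A rough sketch of the tile-and-hash idea, using SHA-256 as a stand-in for whatever one-way function Safe Paths actually uses; the tile size, coordinate snapping, and time windows here are illustrative guesses, not the project’s real parameters:

```python
import hashlib

TILE_DEGREES = 0.05  # roughly a few miles per tile side; illustrative only

def tile_for(lat: float, lon: float) -> tuple:
    """Bucket a coordinate into a coarse map tile."""
    return (round(lat / TILE_DEGREES), round(lon / TILE_DEGREES))

def hash_point(lat: float, lon: float, timestamp: int) -> str:
    """One-way hash of a snapped location and five-minute time window.
    Every phone must snap identically or the hashes will never match."""
    snapped = (round(lat, 4), round(lon, 4), timestamp // 300)
    return hashlib.sha256(repr(snapped).encode()).hexdigest()

def publish_trail(redacted_trail):
    """Group an infected user's hashed points by tile for upload to the server."""
    by_tile = {}
    for lat, lon, ts in redacted_trail:  # sensitive places already removed upstream
        by_tile.setdefault(tile_for(lat, lon), set()).add(hash_point(lat, lon, ts))
    return by_tile
```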

To check whether they’ve crossed paths with an infected person, a Safe Paths user will choose the “tiles” on a map that they’ve traveled in. Their app then downloads all the hashes of the timestamped locations of infected users within those tiles, performs the same hashing function on the timestamped locations in the user’s own history, and alerts them if any of the resulting hashes match the downloaded ones. A match means they were at the same place, at roughly the same time, as someone who’s Covid-19 positive.
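Continuing the sketch above, the healthy user’s check runs entirely on their own phone:

```python
def check_my_trail(my_trail, downloaded_hashes: set) -> list:
    """Hash the user's own (never-uploaded) history and flag any point that
    matches a hash downloaded for the tiles they selected."""
    return [
        (lat, lon, ts)
        for lat, lon, ts in my_trail
        if hash_point(lat, lon, ts) in downloaded_hashes
    ]
```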

“For the infected person, there’s protection because their information has been already redacted and hashed,” says Ramesh Raskar, a professor in MIT’s Media Lab leading the project. The server stores only a collection of hashes, not the legible location trails of infected users. “For the healthy people, there is no privacy compromise at all because they’re doing all the calculation on their own phone.”

Hashing Servers and Mix Nets

That system is still far from perfect, Raskar readily admits. If the government agency or health care organization controlling the server wants to violate the privacy of the infected users, it’s still possible to “crack” those hashes by hashing all the possible times and locations on a map. That would determine every possible hash, and allow someone to match them with the downloaded database, obtaining its raw timestamped location data—just as hackers try dictionaries of every possible password to crack the hashes in stolen password databases. A malicious user, on the other hand, could only use that technique to crack the set of locations in the tiles they were able to download; the tile scheme is designed to prevent users from downloading and cracking the entire hashed location collection.
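That weakness is essentially a dictionary attack: because map coordinates and timestamps have limited precision, an attacker holding the hashes can simply enumerate every plausible point. A toy illustration against the sketch above:

```python
import itertools

def crack(downloaded_hashes: set, lat_grid, lon_grid, time_windows):
    """Brute-force every plausible snapped point in a tile; any hash that
    appears in the stolen database reveals a real timestamped location.
    The grids must use the same snapping precision as the app itself."""
    recovered = []
    for lat, lon, window in itertools.product(lat_grid, lon_grid, time_windows):
        if hash_point(lat, lon, window * 300) in downloaded_hashes:
            recovered.append((lat, lon, window * 300))
    return recovered
```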

But MIT’s Raskar says Private Kit: Safe Paths is already planning yet another iteration of the system that would avoid that hash-cracking problem. To do so, it would use two servers, a hashing server and a storage server, controlled by different organizations. Only the hashing server would possess a secret key necessary to perform the hashing function, so that the storage server couldn’t crack the hashes of uploaded locations. Thanks to some other mathematical sleights of hand, the hashing server would only handle encrypted locations, too, and thus never possess anyone’s sensitive data.
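One way to realize that split is a keyed hash, sketched below with an HMAC; this toy version leaves out the extra step the article describes, in which locations reach the hashing server only in encrypted form:

```python
import hashlib
import hmac

# Hashing server: holds the secret key but stores nothing.
HASHING_KEY = b"held only by the hashing server"  # illustrative placeholder

def keyed_hash(snapped_point: bytes) -> str:
    """Only the hashing server can compute this; without the key, the storage
    server cannot enumerate times and places to crack its own database."""
    return hmac.new(HASHING_KEY, snapped_point, hashlib.sha256).hexdigest()

# Storage server: keeps keyed hashes it has no way of reproducing itself.
stored_hashes: set = set()

def store(hashed_point: str) -> None:
    stored_hashes.add(hashed_point)
```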

Another group of computer scientists from the University of Pennsylvania, the University of Toronto, and McGill University has proposed yet another, equally convoluted system to the Canadian government. Their system would report redacted, hashed location trails of infected users to a health care authority through a so-called mix network: a collection of at least three servers controlled by different entities. Like the onion routing system used by the anonymity software Tor, each of those intermediary servers would mix up the hashed, timestamped locations of users before passing them on to the next one, so that by the time they reach the government authority storing the hashes, that final server wouldn’t be able to associate any of those hashed locations with a specific user.

The organizations controlling the intermediate servers would only be able to piece together the full location trails if they colluded. The final server in the hands of the government agency administering the system would still possess all the hashed, timestamped locations necessary to tell users if they’d potentially been infected by being present at one of those locations.
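A minimal mix-net sketch, using layered encryption with the Fernet cipher from Python’s third-party cryptography package as a stand-in; a real mix network would use public-key encryption so that only each server can peel its own layer:

```python
import random
from cryptography.fernet import Fernet

# Three mix servers run by different organizations; in this toy model each
# holds a symmetric Fernet key rather than a public/private keypair.
mix_ciphers = [Fernet(Fernet.generate_key()) for _ in range(3)]

def wrap(report: bytes) -> bytes:
    """Client side: wrap a hashed-location report in three layers of
    encryption, with the outermost layer for the first server in the chain."""
    blob = report
    for cipher in reversed(mix_ciphers):
        blob = cipher.encrypt(blob)
    return blob

def run_mix(batch: list) -> list:
    """Each server peels one layer and shuffles the batch before forwarding,
    so the final health authority can't link reports back to senders."""
    for cipher in mix_ciphers:
        batch = [cipher.decrypt(blob) for blob in batch]
        random.shuffle(batch)
    return batch

# Example: the authority ends up with the hashes, but in no meaningful order.
final_reports = run_mix([wrap(b"hash-of-location-1"), wrap(b"hash-of-location-2")])
```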

The three projects described above—Covid-Watch, Private Kit: Safe Paths, and the Canadian consortium—aren’t necessarily in competition. MIT’s Raskar, for example, says he’s spoken to both the other groups, and sees Private Kit: Safe Paths as a framework for building contact-tracing apps that could incorporate some features from the other projects based on what government agencies ask for, such as mix networks or Bluetooth proximity sensing. “Every country, every organization can choose what parts they want to use: They can use the hashing scheme, the encryption scheme, the Bluetooth scheme, or not,” Raskar says. “It’s like Lego pieces that they can assemble.”

Tradeoffs

Of course, clever cryptography doesn’t mean anything without buy-in from health care organizations, governments, and users. When it comes to that adoption, different players in the system may be at odds, says Covid-Watch creator White. Users may appreciate privacy, but health care workers and governments don’t necessarily want to build a system that prevents them from, say, proactively notifying users who have been potentially exposed to Covid-19, or even actively tracking the location of infected or potentially exposed people.

As a result, White says, projects making some privacy compromises like MIT’s Private Kit: Safe Paths are getting more buy-in from public health agencies than her Bluetooth-centered Covid-Watch system. “Public health agencies really don’t want to do the kind of thing that we’re proposing because they do want more data,” White says. “But I think we’re more providing what the public might want.”

Still, as serious as the threat of surveillance might be, White says, now is not the time to insist on a perfectly private system before rolling out a contact-tracing app. “If you have to make a little bit of a tradeoff, that’s fine, too. Because something like this needs to happen in order for people to come out of quarantine,” White says. “We’re glad that something like this Bluetooth system exists where you really make no privacy tradeoff if you design it right. But if this didn’t exist, we’d probably be advocating for something else. Because we want to save lives.”