Legality of scientific research at stake in CFAA lawsuit (Q&A)

It might sound improbable, but the United States’ pre-eminent antihacking law could be used to block research into computer algorithms’ influence on everything from law enforcement methods to which cute cat photos pop up in your Facebook feed.

That’s the argument in a lawsuit filed against the U.S. government over the Computer Fraud and Abuse Act at the end of June. The legal complaint, brought by the American Civil Liberties Union on behalf of academic researchers and journalists, asserts that the CFAA “unconstitutionally criminalizes research aimed at uncovering whether online algorithms result in racial, gender, or other illegal discrimination in areas such as employment and real estate.”

Specifically, the lawsuit is challenging the portion of the CFAA that makes it illegal to violate a website’s terms of service. That section in particular worries University of Michigan at Ann Arbor communications professor Christian Sandvig, one of the case’s four academic plaintiffs. (First Look Media, which publishes The Intercept, is also a plaintiff.)

The U.S. government has 60 days from the filing of the lawsuit, July 1, to file a response.

“Regardless of the outcome, something has to change. In the past, the CFAA has been portrayed as a security researcher’s problem. As things have become more computerized, in a sense, we’re all security researchers now.” — Christian Sandvig, University of Michigan professor

Sandvig worries that government and corporate entities can stifle his work at will because of what CFAA critics call an overly broad, unfair law. As we explained in our primer on the CFAA earlier this year, offenses under the 30-year-old law include testing the security of a computer network, even when legally hired to do so; speculating in writing about how a school’s computer network can be vulnerable to hackers; and even downloading documents on a publicly available website.

Courts today are split on how to interpret the CFAA. Successful challenges to the law often have centered on “whether the CFAA’s prohibition on ‘exceeding authorized access’ extends to forbidden uses of information to which the computer user has authorized access,” says Harley Geiger, who as senior legislative counsel to Rep. Zoe Lofgren (D-Calif.) from 2012 to 2014 worked extensively on an ultimately unsuccessful CFAA reform attempt named after the late computer rights activist Aaron Swartz.

Sandvig spoke with The Parallax about why he decided to join the current legal action against the CFAA. What follows is an edited transcript of our conversation.

Q: Why is this research-based challenge to the CFAA coming up now? Have researchers just not felt threatened by the law before?

I’m not a legal expert, but it seems like the CFAA could be used to prosecute anything, not just computer abuse but also research. The situation we’re in now with social media and algorithms is that computers and code mediate everything.

We’ve been calling the research I’ve been doing over the last couple of years algorithm auditing. The approach is familiar from computer security research, but we’re using it for social research: studying the consequences of very complicated systems. The idea of auditing those systems’ output is powerful.

In finance, it’s really normal to have audits. In computer security, it’s normal to test your own systems. But what if we applied that approach to social science and computer systems? In the future, we might have buy-in from the people and companies being audited.

The CFAA lawsuit throws into sharp relief that technology doesn’t have to be the way it is now. Corporations that use algorithms don’t have to run our lives. Computer algorithms might make decisions that are discriminatory.

But as it is now, companies could use the CFAA to stop our research, if they want.

With this lawsuit, what do you think your chances of success are?

I’m feeling pretty good about it. I obviously hope the case is successful. Regardless of the outcome, something has to change. In the past, the CFAA has been portrayed as a security researcher’s problem. As things have become more computerized, in a sense, we’re all security researchers now.

Joseph Weizenbaum once wrote that long programs have no authors. When something gets really big, you might not know how a specific author influenced the end result. A company might be happy to have someone audit its algorithm and help determine what it does.

Problems related to unaudited computer code aren’t limited to huge companies like Google and Facebook, right?

The government, in the past, has used the very broad umbrella of security and law enforcement to exempt itself from algorithm auditing. There are all kinds of negative consequences.

Virginia Eubanks, for example, wrote about welfare eligibility determined by algorithms. And a Snowden-leaked slideshow showed that machine-learning algorithms are used to determine whom to target with drones.

Is the kind of behavioral prediction shown in Minority Report what a world without algorithm auditing looks like?

Some of these decisions are going to have to be made with law enforcement. We have to come together and decide what we want as a society. I tend to be optimistic that can happen. But there are definitely a lot of negative feelings about what can be done with computers.

Why is it so hard to get algorithms right? Isn’t code just code?

There’s this idea that technology, broadly, is bad. I love computers and technology. My father taught me to program in RPG on an IBM System/360. But I feel that things have shifted with social media.

Increasingly, computers are scary, and they’re filtering out important voices or extracting data from you. On social media, there has been a series of follies and errors. But there’s another side to it: you could use machine learning to build programs that detect discrimination.
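
As a rough illustration of the kind of check Sandvig is describing, here is a minimal Python sketch that compares positive-outcome rates across groups in logged decisions and flags a large gap. The toy data, the column names, and the use of an “80 percent” threshold are stand-ins for illustration, not anything drawn from his studies.

```python
# Minimal sketch of an automated discrimination check on a system's outputs.
# Assumes we have logged decisions (e.g., which users were shown a job ad)
# alongside a protected attribute; all names and data here are hypothetical.

from collections import defaultdict

def selection_rates(records, group_key, outcome_key):
    """Compute the positive-outcome rate for each group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for r in records:
        totals[r[group_key]] += 1
        positives[r[group_key]] += 1 if r[outcome_key] else 0
    return {g: positives[g] / totals[g] for g in totals}

def disparate_impact_ratio(rates):
    """Ratio of the lowest to the highest selection rate (the '80 percent rule')."""
    return min(rates.values()) / max(rates.values())

if __name__ == "__main__":
    # Toy audit data: which users were shown a high-paying job ad.
    logged = [
        {"gender": "women", "shown_ad": True},
        {"gender": "women", "shown_ad": False},
        {"gender": "women", "shown_ad": False},
        {"gender": "men", "shown_ad": True},
        {"gender": "men", "shown_ad": True},
        {"gender": "men", "shown_ad": False},
    ]
    rates = selection_rates(logged, "gender", "shown_ad")
    print(rates)                          # e.g., {'women': 0.33, 'men': 0.67}
    print(disparate_impact_ratio(rates))  # 0.5 here; a value well below 0.8 would be flagged
```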

One of the reasons algorithms are difficult to get right is complexity. In the past, you had an email client that displayed messages in reverse-chronological order. The algorithm just sorted by date and reversed the order.

Now we’re seeing public reports of Internet services using 500 factors in a formula, an equation, or code. Those factors are weighted, and they change all the time. The systems are now complex enough that no one person can understand how an algorithm works.
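
To make that contrast concrete, here is a toy Python sketch: the first ordering is the old reverse-chronological rule, and the second combines a handful of invented, weighted signals the way a modern ranking formula might. The factor names and weights are made up for illustration; real services reportedly use hundreds of signals, and the weights change over time.

```python
# The old inbox: a single, legible rule -- sort messages by date, newest first.
from datetime import datetime

messages = [
    {"subject": "Lunch?", "date": datetime(2016, 6, 28), "sender_affinity": 0.9, "topic_score": 0.2},
    {"subject": "Quarterly report", "date": datetime(2016, 6, 30), "sender_affinity": 0.1, "topic_score": 0.8},
    {"subject": "Weekend photos", "date": datetime(2016, 6, 29), "sender_affinity": 0.7, "topic_score": 0.5},
]

reverse_chron = sorted(messages, key=lambda m: m["date"], reverse=True)

# A modern feed: many weighted signals combined into one opaque score.
# These factor names and weights are invented for the example.
weights = {"recency": 0.2, "sender_affinity": 0.5, "topic_score": 0.3}
newest = max(m["date"] for m in messages)

def ranking_score(m):
    recency = 1.0 / (1.0 + (newest - m["date"]).days)  # fresher = closer to 1
    return (weights["recency"] * recency
            + weights["sender_affinity"] * m["sender_affinity"]
            + weights["topic_score"] * m["topic_score"])

ranked_feed = sorted(messages, key=ranking_score, reverse=True)

print([m["subject"] for m in reverse_chron])  # ['Quarterly report', 'Weekend photos', 'Lunch?']
print([m["subject"] for m in ranked_feed])    # ['Weekend photos', 'Lunch?', 'Quarterly report']
```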

There’s some great work by Nick Seaver, who’s sort of an anthropologist of engineers, like the Jane Goodall of engineers. He talks about how no single one of the hundreds of engineers working on little bits of code can see the end result.

What other factors make code complex?

Transparency, or the lack of it. The fact that we each see different results when we search Google makes it hard to track how it works. Personalization is great, but it’s hard to say exactly what Google is doing.

These companies also see these algorithms as really important intellectual property, and I think they’re overprotective of them. In a previous paper, we called this approach “Consumer Reports for algorithms”: a collective effort, with people working together to gather information.

I don’t think that there’s anybody in a computer department intentionally writing super-racist algorithms. But it’s really in everyone’s interest in this space to have a Lemon Law for algorithms because nobody running a job website wants to have gender discrimination.

It used to be that researchers went door-to-door with paper and pencil. But most things are now mediated by computers. So if social research is going to continue, you need social researchers who are going to be dealing with computers in some way. I think that this is uncontroversial for most people.