The Digital Deluge and the Age of AI

    For the Provocations series, in conjunction with UCI’s “The Future of the Future: The Ethics and Implications of AI” conference.

    It’s hard to wrap one’s head around the volume of content coursing through the veins of social media platforms. In 2018, YouTube told Congress that its users uploaded 450 hours of video per minute to the platform. Facebook and other major social media companies are in the same zone. That “content” is user expression, and it covers everything imaginable at a scale that is practically inconceivable. When people talk about the flood of information in the digital age, this is what they mean.

    Some percentage of that content will run afoul of the rules the platforms set under their terms of service, variously known as Community Guidelines, Community Standards, and so forth. These rules forbid “terrorist” or “extremist” content, harassment, “hate speech,” child endangerment, promotion of self-harm, and other categories of “harmful content.” The scale of new posts makes human-only content moderation impossible. Companies have therefore had to find ways to locate the content that poses the most difficult problems under their rules or under the laws of the countries where they operate.

    Two tools dominate. The first, which has been part of content moderation since early in the lives of the platforms, involves only humans: users themselves notify companies of alleged rule violations. This kind of flagging offers users some measure (or at least a feeling) of influence over (or at least input into) their platform experiences. But flagging is also subject to serious abuse. It is easy for groups to coordinate flagging campaigns to harass and troll other users, triggering suspensions or takedowns of posts even where no violation of company standards is at issue. Platform owners have not been good at adjusting to the abuse of flagging, but it remains a tool that would be difficult to imagine them abandoning, especially because, for targets of harassment and abuse in particular, it offers some recourse to remedies.

    The other tool involves machines — algorithmic identification of violating content via mechanisms collectively placed under the rubric of Artificial Intelligence (AI). AI has worked well for addressing some problems, such as child sexual exploitation, where there is a broad social and political consensus about what constitutes a rule violation and the imagery is readily identifiable. In general, AI works better for images than text. Machine learning allows companies to determine what kind of content should be deleted or reviewed, or to figure out how to order and curate news and search results, or to decide which friends or tweets or stories to recommend, “training” the software over time to know the difference between the acceptable and the unacceptable.
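    To make the idea of “training” concrete, the following is a minimal, purely illustrative sketch in Python (assuming the scikit-learn library) of how a toy text classifier might learn from human-labeled examples. The posts, labels, and routing comments are invented for illustration and do not describe any platform’s actual system.

        # Illustrative sketch only: a toy moderation classifier trained on a handful
        # of human-labeled posts. Real systems are far larger and often multimodal,
        # and they inherit the biases of their human-supplied labels and rules.
        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.linear_model import LogisticRegression
        from sklearn.pipeline import make_pipeline

        # Hypothetical examples labeled by human reviewers.
        posts = [
            "Have a great weekend, everyone!",
            "Check out my new recipe blog.",
            "People like you are subhuman and should disappear.",
            "We will hurt anyone who shows up at the rally.",
        ]
        labels = ["acceptable", "acceptable", "violating", "violating"]

        # "Training": the model learns statistical patterns from the labeled data.
        model = make_pipeline(TfidfVectorizer(), LogisticRegression())
        model.fit(posts, labels)

        # At moderation time, a new post is scored and then routed on that basis,
        # for example removed automatically or queued for human review.
        print(model.predict(["Anyone from that group who shows up will get hurt."]))

    In practice, a confidence score from a model like this would typically decide whether a post is acted on automatically or sent to human reviewers, which is where the concerns about bias and opacity discussed below come in.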

    The public’s impression of AI is that it is machines taking over, but for now, for the foreseeable future, and certainly in content moderation, it is really human programming and the leveraging of that power, a power that is massive in corporate hands. The machines have a lot of difficulty with text, with all the variations of satire, irony, misdirection, and colloquial choppiness that are natural to language. They have difficulty with human difference and have reinforced racial, gender, and other biases to damaging effect. Even worse, as the scholar Safiya Noble argues in her book Algorithms of Oppression, “racism and sexism are part of the architecture and language of technology.” And all of this is not merely because they are machines that “cannot know” in the sense of human intelligence. It is also because they are human-driven.

    Humans often do not know the answers about meaning either, at least not on a first review. The programmers have biases, and those who create rules for the programmers have biases, sometimes baked-in biases having to do with gender, race, politics, and much else of consequence. Exacerbating these substantive problems, AI’s operations are opaque to most users and pose serious challenges to the transparency of speech regulation and moderation.

    Still, all the major social media companies are using AI and recommitting to it. Mark Zuckerberg has made clear that he expects machines to handle an increasing share of Facebook’s content moderation. In November 2018, he said that the “single most important improvement in enforcing our policies is using artificial intelligence to proactively report potentially problematic content to our team of reviewers, and in some cases to take action on the content automatically as well.” These tools are not yet ready to handle the toughest challenges, he acknowledged, but Facebook and YouTube in particular are clearly banking on AI to solve significant content-moderation challenges as the volume of content remains astronomical (and may continue to expand).

    This plan is okay, as far as it goes. There are, however, real risks. It is a risk to speech if these companies fail to build into their AI tools respect for the fundamental rights that users have, not only to express themselves and participate in the public life the platforms offer but also to know the grounds for content and account decisions taken against them. At the moment, it is not clear that the platforms are prepared to uphold these rights and standards, as they appear to prioritize AI’s operation over user rights. AI content moderation is opaque, even more opaque than the notoriously difficult-to-discern processes by which the companies, with humans and machines, moderate content today.

    Worse, company confidence in “proactive moderation” (i.e., assessing content before it appears online) is feeding government fantasies of a complaisant AI’s power to filter all sorts of expression at the point of upload. European leaders have repeatedly expressed the belief that AI can solve public problems such as hate speech, terrorist content, disinformation, and copyright infringement. That belief is helping to usher in the Age of AI, which risks becoming an age of content regulation largely hidden from the tools of public accountability.

    The enabling of AI filters, especially proactive ones, poses a clear danger to political speech online. Who does the coding? Who decides what to hunt for and what to flag? Do we expect these decisions to be made by companies? By governments? By the courts? By the pressure of trolls and bots? These questions always matter. There is also always the possibility that suppressing one group’s speech will enable another kind of speech, which could be politically or even physically dangerous to various groups of citizens.

    AI already dominates our information ecosystem, and it is poised to dominate our future. But if we leave its development only to the market, or only to politics, or only to those with power, we will end up creating a dystopic information environment that ignores freedom of expression and dramatically interferes with democratic norms.

     

    David Kaye is Clinical Professor of Law at UCI and the UN Special Rapporteur on the Promotion and Protection of the Right to Freedom of Opinion and Expression. This post is adapted from material in his 2019 book, Speech Police: The Global Struggle to Govern the Internet.