Faced with a rising backlash over the spread of disinformation in the aftermath of the 2016 elections, Facebook last year came up with a seemingly straightforward solution: It created an online library of all the advertisements on the social network.
Transparency, it decided, was the best disinfectant.
Ads would stay in the library for seven years, letting ordinary users see who was pushing what messages and how much they were paying to do it. Facebook gave researchers and journalists deeper access, allowing them to extract information directly from the library so they could create their own databases and tools to analyze the ads — and ferret out disinformation that had slipped past the social network’s safeguards.
“We know we can’t protect elections alone,” Facebook said when it unveiled the latest version of its Ad Library in March. “We’re committed to creating a new standard of transparency and authenticity for advertising.”
But instead of setting a new standard, Facebook appears to have fallen short. While ordinary users can look up individual ads without a problem, access to the library’s data is so plagued by bugs and technical constraints that it is effectively useless as a way to comprehensively track political advertising, according to independent researchers and two previously unreported studies on the archive’s reliability, one by the French government and the other by researchers at Mozilla, the maker of the Firefox web browser.
The problems raise new questions about Facebook’s commitment to battling disinformation, and reflect the struggles of big tech firms and governments across the world to counter it.
United States officials are already grappling with Russian attempts to interfere in the 2020 presidential race, and are powerless to stop American tricksters from joining the fray, because the tricksters’ speech is protected by the First Amendment. In Europe, an ambitious effort to build an early warning system fell flat during European Parliament elections in May, producing no alerts, despite Russian disinformation campaigns that officials said were designed to sway public opinion and depress voter turnout.
For Facebook, in particular, it is an especially challenging moment: The Federal Trade Commission on Wednesday ordered the company to pay a record $5 billion fine for privacy violations, and the company agreed to better police how it handles its users’ data. The measures, though, will do little to help the company with its disinformation problem.
The Mozilla researchers, who provided their report to The New York Times, had originally set out to track political advertising ahead of the European elections using the application program interface, or A.P.I., that Facebook set up to provide access to the library’s data. They instead ended up documenting problems with Facebook’s library after managing to download the information they needed on only two days in a six-week span because of bugs and technical issues, all of which they reported to Facebook.
In one instance, Jason Chuang, a Mozilla researcher, engaged in a lengthy back-and-forth with Facebook about a bug that crashed a search after 59 pages of results. Weeks later, a Facebook representative sent a message saying, “This is unfortunately a won’t fix for now.”
The representative added, “We wait for improvement to arrive in the long term and we’ll keep tracking it internally.”
Facebook later sent a message saying that it had fixed the bug, and that the search could be done without crashing the library. But the note came from a person who had not handled the previous report, and it was sent in a different message chain about a separate issue the Mozilla researchers were encountering. And as recently as this week, the researchers said the library still crashed when they tried to check whether the bug was fixed.
On two other occasions, the researchers said Facebook blocked them from reporting fresh bugs. The reason? They had already reported too many.
“One could just call it broken,” said Laura Edelson, a researcher at New York University who has spent months trying to use the library to build her own database of political advertising in the United States.
The library was a centerpiece of Facebook’s response to the growing anger and the prospect of government regulation it faced around the world. It was one of a number of initiatives — including expanded fact-checking efforts and changes to how its news feed was ranked — that executives promoted as examples of how Facebook was a responsible corporate citizen that could be trusted to fix its own problems. Sheryl Sandberg, the chief operating officer, even included the archive among the “strong steps” the company had taken to curb abuse on the network when she testified before the Senate Intelligence Committee in September.
In response to questions for this article, Satwik Shukla, a Facebook product manager, said, “We were the first to introduce this level of ads transparency, and it remains a priority.”
He added that Facebook was working hard to address issues with access to the library’s data through the A.P.I. and “continually seeks feedback from researchers and journalists.”
Facebook’s rivals have also sought to make advertising on their platforms more transparent. Google has set up its own archive, allowing researchers to download data on ads bought through the company, including ones on its search pages and ones that run on YouTube, which Google owns.
Google’s archive appears to be functioning better than Facebook’s, the Mozilla researchers and French officials found. But critics say it is missing a crucial component: It does not include what are known as issue ads, that is, ads that seek to sway opinion about a topic instead of a candidate. Many of the ads placed by Russian trolls in 2016, for instance, were about issues, such as the Black Lives Matter movement. Google has said that it is looking at ways to include issue ads in the archive.
Twitter, which accounts for a tiny percentage of political advertising, has set up its own Ad Transparency Center. But the French researchers found that the initiative did not offer a comprehensive accounting of political ads on the platform. Twitter said that only a small percentage of video ads were missing, and that they would soon be incorporated into the library.
Twitter also does not allow users to download the data wholesale, though that can be done through a process the French researchers said was “time-consuming, requires advanced programming skills and entails potential violations of Twitter’s” rules.
Facebook’s critics acknowledged that the company was doing more than its competitors to open up advertising to independent scrutiny. Many also noted that paid advertising represented only a small portion of disinformation on social media.
Senator Mark Warner, a Virginia Democrat who has been pushing the Honest Ads Act, which would require tech companies to maintain online libraries of political advertising, said he appreciated Facebook’s efforts.
But “given the resources that Facebook has available, I would like to see more,” Mr. Warner said in response to a question from The Times.
Providing access to tens of millions of ads through an A.P.I. is not a simple proposition, but it is also not an engineering feat for a company like Facebook. With 2.4 billion users, Facebook routinely rolls out complicated new features and products at scales that few tech firms could hope to manage.
“This is not like a problem that technology hasn’t solved and they’re really trying to do their best,” said Ms. Edelson, the N.Y.U. researcher. “No, that’s not what is going on. These are fixable problems.”
Ms. Edelson, who has provided data obtained from Facebook’s library to The Times, is a veteran computer engineer who has spent two decades collecting and storing large amounts of data. She said she was able to extract data from Facebook’s library by “crafting an extremely careful strategy to navigate around the bumps.”
Ms. Edelson wrote software to get around some problems, such as “infinite loops” — that is, when the library gets stuck returning the same results over and over again. Many of her other fixes were jury-rigged bits of code to handle specific technical problems, she said.
Extracting all the data from the library, she said, “is probably impossible if you are following all the rules.”
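The “infinite loop” failure Ms. Edelson described, in which a paged search keeps returning the same results, is the kind of problem a client can guard against by remembering which pages it has already seen. The sketch below is illustrative only and is not her software or Facebook’s A.P.I.: the `fetch_page` callback, the cursor values and the simulated `looping_fetch` endpoint are all hypothetical stand-ins.

```python
from typing import Callable, Iterator, List, Optional, Tuple

# A page is a list of ad identifiers plus a cursor for the next page
# (None when there are no more pages). Both are assumptions made for
# this sketch, not the shape of any real API response.
Page = Tuple[List[str], Optional[str]]


def iterate_pages(
    fetch_page: Callable[[Optional[str]], Page],
    max_pages: int = 10_000,
) -> Iterator[List[str]]:
    """Yield pages of results, stopping with an error if a page repeats."""
    seen_pages = set()
    cursor: Optional[str] = None
    for _ in range(max_pages):
        ad_ids, next_cursor = fetch_page(cursor)
        fingerprint = tuple(ad_ids)
        if fingerprint in seen_pages:
            # The same results came back again: treat it as a loop.
            raise RuntimeError("pagination loop detected: page repeated")
        seen_pages.add(fingerprint)
        yield ad_ids
        if next_cursor is None:
            return
        cursor = next_cursor
    raise RuntimeError("gave up after max_pages pages")


def looping_fetch(cursor: Optional[str]) -> Page:
    """Simulated buggy endpoint: the second page points back to itself."""
    pages = {None: (["ad1", "ad2"], "b"), "b": (["ad3"], "b")}
    return pages[cursor]


if __name__ == "__main__":
    try:
        for page in iterate_pages(looping_fetch):
            print(page)
    except RuntimeError as err:
        print("stopped:", err)
```

A guard like this only keeps a collector from spinning forever. It does not recover the ads the loop was hiding, which is why such workarounds fall short of a comprehensive download.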
Reporters from The Times tried to use data from the library to analyze political advertising ahead of last year’s midterm elections. But their work was thwarted by bugs and technical limits imposed by Facebook on how information could be searched and retrieved.
The Mozilla analysts and French officials, who also provided their research to The Times, reported a slew of similar problems. They found that identical searches often returned different results, and that the library became unreliable and often crashed when they tried to extract large amounts of information. The bugs and technical limits made it functionally impossible to track political advertising in some places, the French reported, citing the United States as an example.
In late May, Facebook reported there were 3.8 million ads in the American library. With each search limited to 2,000 results, the researchers needed to do 1,900 searches to collect all the data, “which we found impossible to achieve in the two weeks we tried,” they said.
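The researchers’ figure follows from simple division, and the short calculation below reproduces it from the two numbers reported above. The function name is made up for illustration, and the calculation assumes every search returns the full 2,000 results with no overlap between searches, which is the best case.

```python
def searches_needed(total_ads: int, max_results_per_search: int) -> int:
    """Smallest number of searches that could cover every ad (best case)."""
    # Ceiling division using integers only, so no rounding error creeps in.
    return -(-total_ads // max_results_per_search)


if __name__ == "__main__":
    # 3.8 million ads in the American library, 2,000 results per search.
    print(searches_needed(3_800_000, 2_000))  # 1900
```

Any crash, duplicate result or inconsistent search pushes the real number of searches above that floor.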
The French officials also found that Facebook sometimes removed ads without explanation. They said 31 percent of the ads in the French library were removed in the week before the European elections, including at least 11 that violated French electoral law.
The company later told the researchers that the deletions were the result of a labeling problem. But Matti Schneider, the French foreign ministry official who oversaw the research, said it was important to see all the ads, even those that did not comply with Facebook’s labeling rules.
The deletions raise questions about “the trust one can put in research based on such shaky ground,” he said.