Remember that iconic scene in “All the President’s Men” where hours tick by at the Library of Congress as reporters Woodward and Bernstein flip through mounting piles of index cards, each one memorializing a book requested by the White House?
Chances are that if Post reporters needed that same information today, it would be kept in an Excel spreadsheet that could be sorted, searched and alphabetized in a matter of seconds.
Electronic databases are making it possible for journalists to analyze and present information that previously would have overwhelmed the limits of human patience. With access to raw government data, the Texas Tribune can publish not just a list of the highest-paid state employees, but a searchable database that allows inquisitive readers to mine for their own data nuggets (for instance, that in 2011-12, there was a school-bus driver who got paid $117,000 in a single year).
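For readers curious what "sorted and searched in seconds" looks like in practice, here is a minimal sketch in Python using only the standard library. Everything in it is an assumption made for illustration: the column names, the function names and the sample rows are invented, and it is not the Texas Tribune's actual data or method.

```python
import csv
import io

# Hypothetical sample data: names, titles and pay figures are invented
# for illustration and do not come from any real payroll file.
SAMPLE_CSV = """name,title,agency,annual_pay
A. Rivera,School Bus Driver,Example ISD,117000
B. Chen,Analyst,Example Agency,64000
C. Okafor,Director,Example Agency,152000
D. Smith,School Bus Driver,Example ISD,31000
"""


def load_rows(text):
    """Parse CSV text into a list of dicts, converting pay to an integer."""
    rows = list(csv.DictReader(io.StringIO(text)))
    for row in rows:
        row["annual_pay"] = int(row["annual_pay"])
    return rows


def top_paid(rows, n):
    """Return the n highest-paid rows, highest first."""
    return sorted(rows, key=lambda r: r["annual_pay"], reverse=True)[:n]


def search_title(rows, keyword, min_pay=0):
    """Return rows whose title contains keyword (case-insensitive)
    and whose pay is at least min_pay."""
    keyword = keyword.lower()
    return [
        r for r in rows
        if keyword in r["title"].lower() and r["annual_pay"] >= min_pay
    ]


rows = load_rows(SAMPLE_CSV)
highest = top_paid(rows, 2)
outliers = search_title(rows, "bus driver", min_pay=100000)

for r in outliers:
    print(r["name"], r["title"], r["annual_pay"])
```

The same few lines would work on a file with hundreds of thousands of rows, which is the practical difference between receiving a text file and receiving a stack of printouts.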
While reporters’ ability to crunch data is increasing at warp speed — the result of better technology plus better training — many government agencies are still mired in the quill-pen era. Those requesting public records often are frustrated to find that a request for a government agency’s “data” produces a towering stack of computer printouts — costing thousands in needless copying expenses — when a spreadsheet or text file would have been quicker, cheaper and much easier to analyze.
Sometimes government agencies refuse to produce electronic records out of concern over data security (for instance, that a table of employee names might contain hidden metadata compromising information that is legally confidential). Sometimes, it’s pretty clear that the agency is simply being obstructionist — knowing that a talented journalist can manipulate electronic records to reveal patterns or anomalies that would go undetected on paper.
State freedom-of-information laws normally entitle the requester to specify the format in which records are to be produced. An agency is not obligated to create new records to satisfy a request — if the records exist only on paper, then state law does not compel the agency to enter the statistics into a spreadsheet just because the requester would find that more useful. But if the spreadsheet already exists, then the agency must make that format available if requested.
A recent wave of favorable court rulings is reinforcing the public’s right of access to government data in the form that’s convenient for the requester, not for the government agency. Here are some highlights:
- On Monday, the California Supreme Court ruled that the Sierra Club was entitled to access to a database of some 640,000 parcels of land kept by Orange County, Calif. The county tried to hide behind an exemption in the California Public Records Act that enables state agencies to withhold “computer software” that the agencies design. The database, said the justices in Sierra Club v. Superior Court of Orange County, was not “software.” Accordingly, it was covered by the Public Records Act and subject to production on request.
- An Illinois court ruled May 28 that a state agency was required to provide a requester with an “unlocked” version of a spreadsheet that the requester could search and reorganize. The requester, a researcher compiling a report about the reliability of those creepy automated red-light cameras, wanted a spreadsheet of tickets issued at each camera location. The state provided a PDF of the spreadsheet, but not the spreadsheet itself. In Fagel v. Department of Transportation, the Illinois Appellate Court, First District, said that wasn’t good enough. State law entitles the requester to obtain a document “in the format in which it is maintained,” meaning the original spreadsheet format and not the less-useful “snapshot” of the spreadsheet that the DOT offered.
- In Johnson v. Broussard, a Louisiana appeals court sided with the head of a pharmacists’ trade association who sought access to the state’s licensure database for pharmacists. The state balked at the request, arguing that the database contained confidential information that would require laborious redactions, and offered instead to produce a much more limited paper mailing list. But Louisiana’s First Circuit Court of Appeals decided June 7 that the requester didn’t have to settle for the mailing list. Under the Louisiana Public Records Act, the court ruled, the requester is entitled to the data in the format in which the agency keeps it: in this case, a searchable database (with a relatively minimal $500 expense for redaction of information legally exempt from disclosure).
These rulings and many more like them are creating a growing body of legal precedent that agencies cannot play hide-the-data by raising blanket claims of “privacy” or “inconvenience.”
Journalists who want access to large data files should always consider a visit or a phone call before filing a written request, to inquire how records are stored internally. In addition to making the records easier to analyze, receiving them electronically can also avoid the per-page duplication fees that can shoot open-records bills into the stratosphere.