Reporter’s Toolbox: EPA Data Moves Toward Openness — Again
EDITOR'S NOTE: This story is one in a series of special reports from SEJournal that looks ahead to key issues in the coming year. Visit the full “2022 Journalists’ Guide to Energy & Environment” special report for more.
By Joseph A. Davis
The universe of data from the U.S. Environmental Protection Agency is expanding and getting easier to use — making data journalism on the environment beat more productive.
EPA data has a long history, one that goes back to before 1986, when Congress passed the law that set up the Toxics Release Inventory. For context, when the TRI first went online, people were using 1200 bits-per-second telephone modems and the web didn’t exist.
But the history of public data at EPA is checkered. The visionary “fishbowl” of the first Administrator Ruckelshaus years was later suppressed during the second Bush administration, only to burst into the open again under the Obama administration. During the Trump era, people worried the data would vanish altogether.
Now, under President Biden, transparency is improving again.
Gateway to environment datasets
EPA’s environmental dataset gateway arrays regional datasets as well as national ones. Image: Screenshot of EPA website.
Rather than recount the sins of the past, Toolbox wants to point you to EPA’s open data page, which links ultimately to almost all the data resources the agency offers. It leads to a catalog of EPA data resources, including the agency’s environmental dataset gateway.
One nifty and newish feature of that gateway is that it arrays regional datasets as well as national ones.
That gateway can take you well beyond the world of EPA’s once cutting-edge Envirofacts warehouse. Envirofacts is still running and may be a good entry point for generalists. Its strength is that it allows you to search many of EPA’s historic pollution databases at the same time (a “multisystem search”).
In the years since Envirofacts began, EPA has evolved toward increasing use of geospatial databases. Today, another portal offers one-stop access to geospatial downloads.
Still a very handy tool for journalists is EPA’s ECHO (Enforcement and Compliance History Online) database. It is especially useful for tracking a given facility’s (or company’s) history of permitted pollution discharges and emissions (and violations and their resolution).
Emphasis on openness, standards
One thing that is hard to ignore about presentations of EPA data since Biden took office is an emphasis on the “open.” That includes a full presentation of the data policies and guidance on which EPA’s data openness is based, including federal laws and executive orders.
One striking feature of these is how often EPA’s policies mandating openness go back (at least) to the Obama years — policies which may have been ignored or violated during the Trump years, but which were never repealed.
Another newish thing in EPA’s data offerings is the emphasis on standards and their visibility. Some of these standards are government-wide. One of the most important standards for agency data is consistency — which allows data to be meaningfully compared, compiled and crunched.
If you are a fan of EPA data, you could (or should) be familiar with EPA’s Facility Registry Service, or FRS. This is a system that standardizes identification of facilities and companies, making them comparable across datasets. It also allows tracking of a given plant’s parent company and subsidiaries, which gives journalists a way of probing companies’ environmental performance across geographic space.
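For reporters who work with downloaded files, the value of a shared facility identifier can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field name `registry_id`, the dataset names and every row below are invented, and real EPA downloads will have their own column names and formats.

```python
from collections import defaultdict

# Hypothetical sample rows; field names and values are invented for illustration.
emissions = [
    {"registry_id": "F001", "pollutant": "benzene", "pounds": 1200},
    {"registry_id": "F002", "pollutant": "toluene", "pounds": 300},
    {"registry_id": "F001", "pollutant": "toluene", "pounds": 50},
]
violations = [
    {"registry_id": "F001", "violation": "late report"},
    {"registry_id": "F003", "violation": "permit exceedance"},
]


def join_by_facility(*named_datasets):
    """Group rows from several datasets under each shared facility ID."""
    merged = defaultdict(lambda: defaultdict(list))
    for name, rows in named_datasets:
        for row in rows:
            merged[row["registry_id"]][name].append(row)
    return merged


profiles = join_by_facility(("emissions", emissions), ("violations", violations))

# Facilities that appear in both datasets are candidates for a closer look.
both = sorted(
    fid for fid, p in profiles.items() if p["emissions"] and p["violations"]
)
print(both)
```

Without a common identifier, the same plant could appear under slightly different names or addresses in each file, and a match like this would require error-prone name cleanup.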
Another key standard for EPA data is the chemical identification data standard. This allows standardized information about the many chemicals EPA does (or doesn’t) regulate. Such systems (e.g., CAS number) have existed for years. But as EPA reaches out to assess and regulate a much wider array of substances, it is encountering chemicals that are not in the lexicon or haven’t been studied yet.
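One practical payoff of a standardized identifier like the CAS number is that it can be checked mechanically. CAS Registry Numbers end in a check digit, computed by weighting the other digits by their position from the right and taking the sum modulo 10. The sketch below is a minimal illustration of that check, not an EPA tool; the function name `is_valid_cas` is made up for this example.

```python
import re


def is_valid_cas(cas: str) -> bool:
    """Return True if the string is a well-formed CAS number with a correct check digit."""
    match = re.fullmatch(r"(\d{2,7})-(\d{2})-(\d)", cas)
    if not match:
        return False
    digits = match.group(1) + match.group(2)
    check = int(match.group(3))
    # Weight each digit by its position counted from the right (1, 2, 3, ...).
    total = sum(int(d) * w for w, d in enumerate(reversed(digits), start=1))
    return total % 10 == check


print(is_valid_cas("7732-18-5"))  # water
print(is_valid_cas("7732-18-4"))  # same number with a wrong check digit
```

A check like this catches typos in a spreadsheet column of CAS numbers, but it says nothing about whether a chemical is regulated or has been studied.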
Toolbox explained why this is increasingly important in our recent report on computational toxicology and EPA’s CompTox Dashboard. You can find other SEJournal articles about EPA data in Toolboxes on its CAMEO software, air quality monitor mapping and early release of Toxics Release Inventory data (as well as TipSheets on toxic chemicals data and using TRI to find toxic threats).
Toolbox will continue pointing out data sources that could lead environmental journalists to good stories. EPA itself is trying to do something similar. Check out this list of EPA databases and you might get ideas. And here’s more about EPA data.
Joseph A. Davis is a freelance writer/editor in Washington, D.C. who has been writing about the environment since 1976. He writes SEJournal Online's TipSheet, Reporter's Toolbox and Issue Backgrounder, and curates SEJ's weekday news headlines service EJToday and @EJTodayNews. Davis also directs SEJ's Freedom of Information Project and writes the WatchDog opinion column.
* From the weekly news magazine SEJournal Online, Vol. 6, No. 39. Content from each new issue of SEJournal Online is available to the public via the SEJournal Online main page. Subscribe to the e-newsletter here. And see past issues of the SEJournal archived here.