Climate Scientists Are Scrambling to Protect Their Data. Should Public Health Researchers Be Doing the Same?



The scientific community has generally been appalled by the various statements/actions/threats/tweets/insinuations/insults coming out of the Trump administration.  This isn’t meant to be a political statement; it is simply based on the scientific community’s response to having their life’s work questioned by a small group of people who don’t know anything about what they do.  For example, scientists have even organized a protest march.  That’s right, even a group of people who are more than happy to spend decades quibbling over the most minute of details has unanimously and expeditiously agreed that the new administration seems to have taken the tone of the Know-Nothing Party of the 1850s, and that this is a danger to the very scientific process.


Climate scientists in particular have begun taking action to protect the very foundation of their work: the data that has been amassed over the decades and currently resides in various repositories sitting in government offices.  While it would be fun to think about a NOAA scientist going rogue, somehow becoming an amazing hacker, and throwing up protective firewalls everywhere, they are doing something much more mundane: they’re creating backups.  Specifically, they are creating publicly accessible backups that live outside the government’s servers, and organizing everything with a Google Docs spreadsheet to boot.  When I read about this, my first thought was: Shouldn’t public health scientists be doing this too?

Public health is, sadly, just as politicized as climate science.  As Sarah Gollust, a health policy professor at the University of Minnesota, points out in this article, “Public health is inherently ‘paternalistic,’” and those who are against government intervention in most or all things view this as a form of government overreach.  Because of this, it is seemingly impossible to debate the topic of public health with certain people, even in the face of overwhelming evidence.  The largest contributors to increases in human lifespans have been public health programs that mandate proper sewage systems, clean water access, vaccines, laws against smoking, seat belt usage…I could go on.

Public health is somewhat unique in the number of different information sources it pulls from in order to conduct research.  Yes, there is reporting from healthcare providers that filters up through the city and/or county, then the state, and on to the national level. That reporting chain is a complete mess of siloed, disparate data repositories, though it is in the process of improving.  Beyond that, researchers also look to environmental data to know what conditions people are living in. They look to food regulations to better understand the decisions people face when they choose what foods to eat. They look to law enforcement statistics to see what dangers people are facing…except regarding guns of course…they can’t study anything about guns.

Without data, and good data at that, the scientific method can’t proceed.  This is why threats to remove data sources from existence, say by getting rid of the EPA, cause notable frustration amongst all rational people.  Why would anyone want knowledge growth to stop?  We can all come up with answers to that question of course, but none of them serve the best interests of humanity.  Right now, if a regulation is rolled back to allow coal mining companies to pollute creeks and other water bodies (randomly picked example), we have data that can be used to say that this specific kind of pollution, at those levels, can lead to negative health outcomes for people.  Imagine if we didn’t have the data in the first place…or as a wilder example, imagine if J60-related codes (Black Lung Disease) couldn’t be used as a cause of death.

This isn’t meant to be fear-mongering. I’m not saying there is a high probability of all data relating to public health being completely wiped out, but there is a high probability that data relating to certain political objectives could be obfuscated or prohibited from being gathered.  This isn’t new.  You can scroll back up and click on the link relating to the CDC not being able to do research on gun violence because somehow that “clashed” with a political opinion.  Not stopping a specific study, mind you, but stopping anyone from having the data so they could ask questions of it.  So it may be imperative that data sets relating to public health start being copied and distributed to sources outside government control, at least until elected officials become a little less adversarial.  Hopefully that is sooner rather than later, because the loss or prohibition of data sets could have disastrous consequences for areas of research, consequences that could last for decades.

If anyone has started this project, please let me know.


I attended a Meetup on February 25 with a small group of programmers and hobbyists who are part of a larger effort to scrape all vital government data sets and back them up to publicly accessible sites.  The focus of this particular meeting was working on “uncrawlable” data sets, starting with NOAA’s worldwide weather buoy data.  Admittedly, passionate efforts like this restore my faith in humanity.  Check out the Meetup link to see where they are storing the data, too.  Each department is encapsulated in a Google Docs spreadsheet with links.


