Is your research data server secure?

There is an article on Inside Higher Ed about a professor at the University of North Carolina who was demoted after a security breach on a server she used to store her research data. The data included 114,000 Social Security Numbers. Apparently the University was going to fire her, but a “faculty hearings committee” persuaded the University to demote her to associate professor and cut her pay in half instead.

The article doesn’t really go into enough detail for me to comment on the degree to which she was at fault, but considering the outcome, it seems that she must have been at fault at some level. In fact, the faculty review board did not dispute that the researcher “was accountable for the breach according to existing university policy.” But based on this article, dismissal, or even the final outcome of keeping her tenure but cutting her pay in half, seems a pretty drastic penalty unless she callously disregarded the security of the records or was collecting data she shouldn’t have been, and neither seems to apply in this case.

However, it does bring up some interesting questions.

The researcher said she “did everything I knew to do, but I did not know how to secure a machine.” I am sure that is so, but if she is collecting this data, she should know that she is taking responsibility for it. And if she doesn’t know how to “secure a machine,” why did she put the data on a machine connected to the Internet?

The report partially addresses the issue of her not knowing how to secure a server by saying that the researcher hired a “university software programmer” to maintain the server, but that person wasn’t certified. Personally, I don’t think certification means much, so I wouldn’t fault the researcher for hiring someone without certification. That said, the fact that the person was a programmer does not necessarily mean they knew how to secure a server. I know some programmers who are great sysadmins and some who couldn’t administer a system to save their lives. Without knowing the person’s background, it is hard to say whether the person was qualified. Honestly, I think there is probably more to this part of the story.

The defenders in the article point to “systemic institutional failure,” but that is a slippery slope. The only thing I can see that comes close to a systemic failure is the “system” allowing researchers and research labs to run their own servers, or possibly approving the collection of the data in the first place. Do these supporters really want central IT controlling and locking down everything and/or approving what type of research data they can gather? I doubt it. However, with this freedom comes responsibility and accountability. This is not “systemic failure.” Assuming the server wasn’t properly secured, there was a failure, but it was not systemic. (If the server was properly secured, which sounds unlikely from reading this article, then there was no human “failure,” only a technical one or “just” a crime.)

The article points out that the researcher consistently gave the person she hired an “excellent” rating as a systems administrator (although, given her own acknowledgement that she does not know how to secure a server, I am not sure how she could evaluate that, or for that matter what that rating means in UNC’s context). One thing that any manager needs to take into consideration when evaluating a person is that those evaluations may mean something down the road. The article makes no mention of what happened to the programmer.

The comments on the article are quite interesting. There are a number of responses on both sides. One of the most pertinent comments is by someone posting as Lala. Lala wrote:

Granted that UNC overreacted — dismissal is a bit too much in this case — but why would anyone put Social Security numbers in a dataset that is stored online? This is Data Security 101 stuff. The researcher certainly wasn’t qualified to judge the security of the system’s firewalls but she certainly was aware that she was storing easily-identifiable data in a potentially risky location.

I think this is a very good question. Unless you are absolutely sure you are doing everything you can to protect the data (which the researcher apparently wasn’t, based on her own admission that she didn’t know about computer security and on the faculty review board’s conclusion), it seems like a very risky proposition to even collect the data in the first place, let alone put it on the Internet. Unfortunately, Social Security Numbers have been used as a unique identifier for years, but I know many institutions (and specifically many libraries) have gone to great lengths to stop holding that information. Without being knowledgeable about this particular research, I can’t say why the researcher needed the SSNs or whether she considered using some other identifier. This is really what I think is the crux of the matter. Even well-secured servers can be hacked (ask Google, the United States Department of Defense, or Iran). The best way to make sure that private data such as Social Security Numbers aren’t hacked is to not have them in the first place.
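To make “some other identifier” concrete, here is a minimal sketch in Python of one common approach: swap each SSN for a random study ID before the data goes anywhere near a networked server, and keep the SSN-to-ID lookup table offline. This is only an illustration under my own assumptions, not anything from the article or the researcher's setup. The function name `assign_study_ids`, the `ssn` field name, and the record layout are all hypothetical.

```python
import secrets


def assign_study_ids(records, ssn_field="ssn"):
    """Return (deidentified_records, crosswalk).

    Hypothetical helper: each record is a dict containing an SSN under
    `ssn_field`. The SSN is dropped and replaced with a random study ID.
    The crosswalk maps SSN -> study ID and is the only place the SSNs
    survive, so it should stay off any Internet-facing machine.
    """
    crosswalk = {}
    deidentified = []
    for record in records:
        ssn = record[ssn_field]
        # Reuse the same random ID when the same person appears again.
        if ssn not in crosswalk:
            crosswalk[ssn] = secrets.token_hex(8)
        # Copy every field except the SSN itself.
        clean = {k: v for k, v in record.items() if k != ssn_field}
        clean["study_id"] = crosswalk[ssn]
        deidentified.append(clean)
    return deidentified, crosswalk


if __name__ == "__main__":
    # Made-up sample rows, for illustration only.
    sample = [
        {"ssn": "000-00-0001", "visits": 3},
        {"ssn": "000-00-0002", "visits": 1},
        {"ssn": "000-00-0001", "visits": 5},
    ]
    clean_rows, lookup = assign_study_ids(sample)
    for row in clean_rows:
        print(row)
```

Because the IDs are random rather than derived from the SSNs, someone who steals only the de-identified file cannot work backwards to the numbers. Of course, this only helps if the research does not need the SSNs on the server itself, which I can't judge from the article.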