blog.ecorrado.us

Ramblings about library technology, open source software, and other adventures!

 

Is your research data server secure? 2011 January 27

Filed under: technology — ecorrado @ 17:01:29

There is an article on Inside Higher Ed about a professor at the University of North Carolina who was demoted after a security breach on a server she was using to store her research data. The data included 114,000 Social Security Numbers. Apparently the University was going to fire her, but a “faculty hearings committee” persuaded the University to demote her to associate professor and cut her pay in half instead.

The article doesn’t really go into enough detail for me to comment on the degree to which she was at fault, but considering the outcome, it seems that she must have been at fault at some level. In fact, the faculty review board did not dispute that the researcher “was accountable for the breach according to existing university policy.” But based on this article, it seems that dismissal, or even the final outcome of keeping her tenure but cutting her pay in half, is a pretty drastic penalty unless she callously disregarded the security of the records or was collecting data she shouldn’t have been, neither of which seems to apply in this case.

However, it does bring up some interesting questions.

The researcher said she “did everything I knew to do, but I did not know how to secure a machine.” I am sure that is so, but if she is collecting this data she should know that she is taking responsibility for it and if she doesn’t know how to “secure a machine”, why did she put it on a machine connected to the Internet?

The report partially addresses the issue of her not knowing how to secure a server by saying that the researcher hired a “university software programmer” to maintain the server, but that person wasn’t certified. Personally, I don’t think certification means much, so I wouldn’t fault the researcher for hiring someone without certification. That said, the fact that the person was a programmer does not necessarily mean they knew about securing a server. I know some programmers who are great sysadmins and I know some who couldn’t administer a system to save their life. Without knowing the person’s background it is hard to say whether the person was qualified. Honestly, I think there is probably more to this part of the story.

The defenders in the article point to “systemic institutional failure” but that is a dangerous slippery slope. The only thing coming close to systemic failure that I can see might be the “system” allowing researchers and research labs to run their own servers, or possibly approving the collection of the data in the first place. Do these supporters really want central IT controlling and locking down everything and/or approving what type of research data they can gather? I doubt it. However, with this freedom comes responsibility and accountability. This is not “systemic failure.” Assuming the server wasn’t properly secured, there was a failure, but it was not “systemic” (if the server was properly secured, which sounds unlikely reading this article, then there was no human “failure” but either a technical one or “just” a crime).

The article points out that the researcher consistently gave the person she hired an “excellent” rating as a systems administrator (although, given her own acknowledgement that she does not know how to secure a server, I am not sure how she could evaluate that, or for that matter what that rating means in UNC’s context). One thing that any manager needs to take into consideration when evaluating a person is that those evaluations may mean something down the road. The article makes no mention of what happened to the programmer.

The comments on the article are quite interesting. There are a number of responses on both sides. One of the most pertinent comments is by someone posting as Lala. Lala wrote:

Granted that UNC overreacted — dismissal is a bit too much in this case — but why would anyone put Social Security numbers in a dataset that is stored online? This is Data Security 101 stuff. The researcher certainly wasn’t qualified to judge the security of the system’s firewalls but she certainly was aware that she was storing easily-identifiable data in a potentially risky location.

I think this is a very good question. Unless you are absolutely sure you are doing everything you can to protect the data (which the researcher apparently wasn’t, based on her own admission that she didn’t know about computer security and on the faculty review board’s conclusion), it seems like a very risky proposition to even collect the data in the first place, let alone put it on the Internet. Unfortunately Social Security Numbers have been used as a unique identifier for years, but I know many institutions (and specifically many libraries) have gone to great lengths to not have that information any more. Without being knowledgeable about this particular research I can’t say why the researcher needed the SSN or if she considered using some other identifier. This is really what I think is the crux of the matter. Even well-secured servers can be hacked (ask Google, the United States Department of Defense, or Iran). The best way to make sure that private data such as Social Security Numbers aren’t hacked is to not have them in the first place.
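To make the idea of “some other identifier” a bit more concrete, here is a minimal sketch in Python of one common approach: deriving a study identifier from the SSN with a keyed hash (HMAC-SHA256) before the data ever reaches the research server. This is only an illustration under my own assumptions, not what the researcher did or what UNC requires. The function name `pseudonymize` and the sample SSN are made up for the example. It also assumes the key is generated once and kept somewhere other than the server holding the data, because an unkeyed hash of a nine-digit number is easy to brute-force.

```python
import hashlib
import hmac
import secrets


def pseudonymize(ssn: str, key: bytes) -> str:
    """Return a stable study identifier derived from an SSN with HMAC-SHA256.

    Hypothetical helper for illustration only.
    """
    # Normalize so "123-45-6789" and "123456789" map to the same identifier.
    digits = "".join(ch for ch in ssn if "0" <= ch <= "9")
    if len(digits) != 9:
        raise ValueError("expected a 9-digit SSN")
    return hmac.new(key, digits.encode("ascii"), hashlib.sha256).hexdigest()


# Assumption: the key is created once and stored offline, away from the
# research server. It is generated inline here only to keep the sketch runnable.
key = secrets.token_bytes(32)
study_id = pseudonymize("123-45-6789", key)  # made-up SSN
```

The same person always gets the same identifier, so records can still be linked, but the dataset on the server no longer contains the SSN itself. Whether this is acceptable for a given study is a question for the researcher and the review board, not something the code can answer.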

 
 

iPads in the University Classroom 2011 January 22

Filed under: libraries,technology — ecorrado @ 11:01:06

The Wired Campus section of the Chronicle of Higher Education has a brief column about the use of iPads instead of textbooks in one class at Notre Dame. Really, there is not enough information in the article to say if tablets like iPads will be a bad or good replacement in the long term (i.e. no control group reported on, maybe it is just this professor’s style that works, etc.). However, it does seem it was successful, and with proper apps (like ones that can annotate PDFs) it is a viable option for course materials. However, judging from this paragraph:

And when it came time for their computer-based final exam, 39 of the 40 students in class put away their iPads in favor of a laptop.

iPads don’t replace a laptop, so why not just deliver materials to laptops? Still, that said, it is good to see higher education looking into these new tools. We just have to remember that while some things are new and shiny, they might not actually be better for a particular purpose, so we need to keep a critical eye when considering and evaluating new technology.

 
 

U.S. Academic Libraries switching to Koha in 2010 2011 January 4

Filed under: libraries,technology — ecorrado @ 06:01:15

One of the things that interested me in Marshall Breeding’s “ILS Turnover Reverse report from Library Technology Guides” was which libraries were switching to Koha. In particular I was interested in which academic libraries switched to Koha in 2010. As commenters on my earlier blog post about my “Thoughts on Library Technology Guides’ ILS Turnover Report” pointed out, there are some questions about the data. In my opinion, most of the questions, at least those about numbers, are more problematic outside of the United States and a few other countries. For some of the reasons behind this, see Marshall Breeding’s comment on my last blog post about this report, where he discusses how he gathers the data used in this report. For that reason, I decided to limit this post to U.S. academic libraries that switched to Koha in 2010.

According to my count [1], 15 U.S. academic libraries switched to Koha from another ILS and one more, a trade school named Antonelli College, went from no ILS to using Koha. What I was interested in looking at was the profiles of the schools, and in particular the number of volumes [2] and the type and size of the patron population served. I was also interested in looking at which libraries are listed as being independently supported. To a lesser degree I wanted to see if there was anything particularly interesting in who academic libraries were choosing to acquire Koha support from.

All of the U.S. academic libraries switching to Koha have fewer than 140,000 volumes (at least as far as I can tell) [3]. The two largest are in the New York City metro area and are getting support from LibLime. It is possible (likely?) that they are using Koha via the WALDO consortium, which has a partnership with PTFS/LibLime. The only other U.S. academic library to switch in 2010 that has more than 100,000 volumes is D’Youville College. D’Youville is listed as independent; however, the demo video of their new catalog on their Website shows that they are hosted via PTFS/LibLime as well and may possibly also be contracting through WALDO [4]. In other words, the larger U.S. academic libraries that moved to Koha in 2010 are doing so via LibLime/PTFS, and I am pretty sure they are using “LibLime Enterprise Koha” and not the Open Source version. According to the LibLime Website, LibLime Enterprise Koha enhancements include many acquisitions enhancements and enhanced authority control. Here I need to plead ignorance of recent Koha developments in this area and of how “enhanced” LibLime Enterprise Koha really is in these areas, but from previous experience, I am under the impression that these are areas where the Open Source version needs some development to attract larger academic libraries [5]. Many libraries still do not use acquisitions within the ILS or make extensive use of authority records (any use?), so these are not always a high priority when selecting an ILS in smaller libraries. However, when you start getting closer to medium-sized academic libraries they become more of an issue. In other words, I am not surprised that the U.S. academic libraries that are switching to Koha are small academic libraries, and that the larger ones that are migrating are switching to LibLime Enterprise Koha. Although the largest of the bunch selected LibLime, ByWater did attract some schools with volume counts that were not much smaller. Goddard College, for example, has 97,000 volumes and two others have about 75,000 records.

Besides D’Youville, the other library that is listed as independent is University of Science and Arts of Oklahoma. This is the third largest in terms of volume counts to make the switch in 2010. I wanted to check their catalog to see what it looked like, but it is currently unavailable. If they truly are independent, it would be interesting to hear about their experiences migrating to Koha.

The academic libraries that migrated serve diverse types of schools. There are trade schools, 2-year community colleges, 4-year schools, graduate schools, and seminaries. Therefore it doesn’t look like the type of college or university being served is a factor for those who have selected Koha.

Of the schools that switched to Koha, 4 were using Koha, 3 Unicorn, and 2 Horizon. Single schools had EOS.Web, Virtua, Winnebago Spectrum, Athena, and Millennium.

Notes:
[1] Defining what is an academic library can be tricky sometimes. While it is easy to say Binghamton University Libraries, for example, is an academic library, there are places that fall into the gray area like trade schools, advanced research institutes, etc. Also, if a school is based in the United States, but the library is in London as part of an undergraduate program, is it a U.S. academic library? (FWIW: In this case I said no.) So, you might count more or fewer libraries than I did. However, for purposes of this inquiry, I don’t think it is a factor since the ones I didn’t include really weren’t “outliers” in terms of size or scope.

[2] I used a variety of methods to get volume counts. Mostly though, I looked at what the libraries self-reported either on Library Technology Reports or somewhere else.

[3] There was one larger academic library to make the switch in the United Kingdom. Staffordshire University has approximately 180,000 volumes and switched to Koha with support from PTFS-Europe.

[4] This demonstrates some of the concerns members of the Koha community have about whether or not the self-reporting of Koha service providers is accurate.

[5] As I mentioned in the past I do support a Koha install for a small collection (> 1000 records). I did look at some of these issues briefly while installing Koha and migrating items to the new install. I didn’t notice anything that made me think these features are not still lacking compared to their proprietary counterparts, but I did not look closely, so I may be wrong and I welcome any information that shows me they can do the same things, as streamlined, as something like Millennium, Voyager, or Aleph.

 
 

Academic Search Engine Spamming 2011 January 3

Filed under: libraries,technology — ecorrado @ 17:01:58

Jonathan Rochkind had a really interesting blog post commenting on a recent article about “Academic Search Engine Spam and Google Scholar’s Resilience Against it” published by Joeran Beel and Bela Gipp in Journal of Electronic Publishing. The article (and Rochkind’s blog post) discusses how scholars could manipulate citation counts and visibility in Web-based academic search engines like Google Scholar. It is unclear what the risk-reward factor for this would be, but if it can be done, I am sure at least a few scholars will try to do it. However, it is also true, as Beel and Gipp point out, that citation gaming is not at all new. Some publishers and journals actively encourage people to cite from their journal(s), and there are citation circles and of course self-citing.

I am not really sure how much we should be worried about this, or at least whether we should worry about it MORE than we do about the whole idea of using dubious measures such as citation counts in promotion and tenure decisions to begin with. As Rochkind sums it up:

Once you start to look too carefully, the whole academic publishing endeavor can start to seem like a somewhat arbitrary game played by agreed upon rules in order to justify tenure decisions, rather than attempt to share knowledge with ones peers or the world or in general. In this light though, the possibility of gaming Google Scholar is perhaps less alarming, as it’s really just business as usual.

Happy reading.