Article about SkyRiver

After being away for a short while at a conference, I am catching up on some week-old e-mails. One e-mail I received was about an article in ALCTS Newsletter Online about Michigan State University’s experience with the new SkyRiver bibliographic utility. The bottom line, according to the article, is that they saved about $80,000 a year and didn’t see a loss in productivity once the catalogers became used to using SkyRiver instead of OCLC for copy cataloging. They did say, however, that coverage of foreign-language materials was lacking. They also mentioned the cost-prohibitiveness of uploading records to OCLC. Anyway, if you are at all interested in this alternative to OCLC, the article gives a nice, albeit brief, overview of Michigan State’s experience with SkyRiver.

Code4Lib Journal Issue 10 Published

The tenth issue of Code4Lib Journal was published this morning. I was the Coordinating Editor (CE) this time around. When I volunteered to be the CE, I was afraid it was going to be a lot of work. Fortunately, while there was a fair amount of work involved, it wasn’t overwhelming. This is because the authors and the rest of the editorial committee are passionate about Code4Lib and really put a lot of effort and dedication into the Journal. My thanks to all the editors and authors.

The articles in Issue 10 are:

Editorial Introduction: The Code4Lib Journal Experiment, Rejection Rates, and Peer Review
Edward M. Corrado

Code4Lib Journal has been a successful experiment. With success, questions have arisen about the scholarly nature and status of the Journal. In this editorial introduction we take a look at the question of Code4Lib Journal’s rejection rates and peer review status.

Building a Location-aware Mobile Search Application with Z39.50 and HTML5
MJ Suhonos

This paper presents MyTPL (http://www.mytpl.ca/), a proof-of-concept web application intended to demonstrate that, with a little imagination, any library with a Z39.50 catalogue interface and a web server with some common open-source tools can readily provide their own location-aware mobile search application. The complete source code for MyTPL is provided under the GNU GPLv3 license, and is freely available at: http://github.com/mjsuhonos/mytpl
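
(As a side note for anyone who wants to experiment with the Z39.50 piece of this: a basic catalogue search only takes a few lines of code. The sketch below uses the PyZ3950 library against the Library of Congress’s public Z39.50 server as an example target; it is purely illustrative and is not taken from MyTPL.)

```python
# Minimal Z39.50 title search using the PyZ3950 library (illustrative only,
# not the MyTPL code). The Library of Congress's public server is used as an
# example target; swap in your own catalogue's host, port, and database.
from PyZ3950 import zoom

conn = zoom.Connection('z3950.loc.gov', 7090)
conn.databaseName = 'VOYAGER'
conn.preferredRecordSyntax = 'USMARC'

query = zoom.Query('CCL', 'ti="open source"')
results = conn.search(query)

for i in range(min(5, len(results))):   # show the first five raw MARC records
    print(results[i])

conn.close()
```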

OpenRoom: Making Room Reservation Easy for Students and Faculty
Bradley D. Faust, Arthur W. Hafner, and Robert L. Seaton

Scheduling and booking space is a problem facing many academic and public libraries. Systems staff at the Ball State University Libraries addressed this problem by developing a user friendly room management system, OpenRoom. The new room management application was developed using an open source model with easy installation and management in mind and is now publicly available.

Map it @ WSU: Development of a Library Mapping System for Large Academic Libraries
Paul Gallagher

The Wayne State Library System launched its library mapping application in February 2010, designed to help locate materials in the five WSU libraries. The system works within the catalog to show the location of materials, as well as provides a web form for use at the reference desk. Developed using PHP and MySQL, it requires only minimal effort to update using a unique call number overlay mechanism. In addition to mapping shelved materials, the system provides information for any of the over three hundred collections held by the WSU Libraries. Patrons can do more than just locate a book on a shelf: they can learn where to locate reserve items, how to access closed collections, or get driving maps to extension center libraries. The article includes a discussion of the technology reviewed and chosen during development, an overview of the system architecture, and lessons learned during development.
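
(The “call number overlay” idea is the clever part, and it is easy to picture with a toy example. The sketch below is purely illustrative, not the WSU Libraries’ code: it keeps a small table of shelf ranges and returns the map location whose bounds bracket a crudely normalized call number.)

```python
# Illustrative call-number-to-shelf lookup (not the WSU implementation).
# The ranges, locations, and the very crude normalization are placeholders;
# real LC call number normalization is considerably more involved.
def normalize(call_number):
    """Uppercase and collapse whitespace so plain string comparison
    roughly follows shelf order (good enough for a toy example)."""
    return " ".join(call_number.upper().split())

SHELF_RANGES = [
    # (range start, range end, map location)
    ("PN 1000", "PN 1999", "3rd floor, north range 12"),
    ("PN 2000", "PN 2999", "3rd floor, north range 13"),
]

def locate(call_number):
    cn = normalize(call_number)
    for start, end, location in SHELF_RANGES:
        if normalize(start) <= cn <= normalize(end):
            return location
    return None

print(locate("PN 1995.9 .D6"))   # -> 3rd floor, north range 12
```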

Creating a Library Database Search using Drupal
Danielle M. Rosenthal & Mario Bernardo

When Florida Gulf Coast University Library was faced with having to replace its database locator, they needed to find a low-cost, non-staff intensive replacement for their 350 plus databases search tool. This article details the development of a library database locator, based on the methods described in Leo Klein’s “Creating a Library Database Page using Drupal” online presentation. The article describes how the library used Drupal along with several modules, such as CCK, Views, and FCKeditor. It also discusses various Drupal search modules that were evaluated during the process.

Implementing a Real-Time Suggestion Service in a Library Discovery Layer
Benjamin Pennell and Jill Sexton

As part of an effort to improve user interactions with authority data in its online catalog, the UNC Chapel Hill Libraries have developed and implemented a system for providing real-time query suggestions from records found within its catalog. The system takes user input as it is typed to predict likely title, author, or subject matches in a manner functionally similar to the systems found on commercial websites such as google.com or amazon.com. This paper discusses the technologies, decisions and methodologies that went into the implementation of this feature, as well as analysis of its impact on user search behaviors.
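
(Stripped of the authors’ actual infrastructure, the core idea is prefix matching against an index of headings as the user types. Here is a toy sketch of that idea, purely illustrative and not the UNC implementation, which the article describes in detail.)

```python
# Toy prefix-suggestion sketch (illustrative only, not the UNC code).
# In a real system the headings would come from the catalog's indexes and
# matching would be handled server-side by a search engine.
import bisect

HEADINGS = sorted([
    "twain, mark",
    "twelfth night",
    "twentieth century literature",
    "two cities, a tale of",
])

def suggest(prefix, limit=10):
    prefix = prefix.lower()
    start = bisect.bisect_left(HEADINGS, prefix)
    matches = []
    for heading in HEADINGS[start:]:
        if not heading.startswith(prefix) or len(matches) >= limit:
            break
        matches.append(heading)
    return matches

print(suggest("tw"))   # ['twain, mark', 'twelfth night', ...]
```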

Creating Filtered, Translated Newsfeeds
James E. Powell, Linn Marks Collins, Mark L. B. Martinez

Google Translate’s API creates the possibility to leverage machine translation to both filter global newsfeeds for content regarding a specific topic, and to aggregate filtered feed items as a newsfeed. Filtered items can be translated so that the resulting newsfeed can provide basic information about topic-specific news articles from around the globe in the desired language of the consumer. This article explores a possible solution for inputting alternate words and phrases in the user’s native language, aggregating and filtering newsfeeds programmatically, managing filter terms, and using Google Translate’s API.
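
(The basic loop the authors describe, fetch feeds, match topic terms in several languages, translate the hits, is easy to picture in a few lines. The sketch below is illustrative only; it uses the feedparser library, and translate() is a hypothetical stand-in for a call to a machine translation API such as Google Translate’s.)

```python
# Sketch of filtering multilingual newsfeeds by topic (illustrative only).
# feedparser is a real library; translate() is a hypothetical stand-in for
# a machine translation API call (Google Translate in the article).
import feedparser

def translate(text, target="en"):
    # Placeholder: a real implementation would call a translation API here.
    return text

FEEDS = ["http://example.com/news-es.rss", "http://example.com/news-fr.rss"]
TOPIC_TERMS = {"earthquake", "terremoto", "séisme"}  # topic in several languages

def filtered_items(feeds, terms):
    for url in feeds:
        for entry in feedparser.parse(url).entries:
            text = "%s %s" % (entry.get("title", ""), entry.get("summary", ""))
            if any(term.lower() in text.lower() for term in terms):
                yield {"title_en": translate(entry.get("title", "")),
                       "link": entry.get("link")}

for item in filtered_items(FEEDS, TOPIC_TERMS):
    print(item["title_en"], item["link"])
```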

Metadata In, Library Out. A Simple, Robust Digital Library System
Tonio Loewald, Jody DeRidder

Tired of being held hostage to expensive systems that did not meet our needs, the University of Alabama Libraries developed an XML schema-agnostic, light-weight digital library delivery system based on the principles of “Keep It Simple, Stupid!” Metadata and derivatives reside in openly accessible web directories, which support the development of web agents and new usability software, as well as modification and complete retrieval at any time. The file name structure is echoed in the file system structure, enabling the delivery software to make inferences about relationships, sequencing, and complex object structure without having to encapsulate files in complex metadata schemas. The web delivery system, Acumen, is built of PHP, JSON, JavaScript and HTML5, using MySQL to support fielded searching. Recognizing that spreadsheets are more user-friendly than XML, an accompanying widget, Archivists Utility, transforms spreadsheets into MODS based on rules selected by the user. Acumen, Archivists Utility, and all supporting software scripts will be made available as open source.
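
(To make the filename-driven approach concrete: if item and page identifiers are embedded in the file names, grouping and sequencing can be inferred with a simple parse and sort. The naming scheme in the sketch below is hypothetical, not Acumen’s actual convention.)

```python
# Hypothetical filename convention <collection>_<item>_<page>.jpg (NOT
# Acumen's actual scheme), used to show how grouping and page order can be
# inferred from names alone, without wrapping files in complex metadata.
from collections import defaultdict

files = [
    "u0001_0000002_0002.jpg",
    "u0001_0000002_0001.jpg",
    "u0001_0000003_0001.jpg",
]

items = defaultdict(list)
for name in files:
    collection, item, page = name.rsplit(".", 1)[0].split("_")
    items[(collection, item)].append((int(page), name))

for (collection, item), pages in sorted(items.items()):
    ordered = [name for _, name in sorted(pages)]
    print(collection, item, ordered)
```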

AudioRegent: Exploiting SimpleADL and SoX for Digital Audio Delivery
Nitin Arora

AudioRegent is a command-line Python script currently being used by the University of Alabama Libraries’ Digital Services to create web-deliverable MP3s from regions within archival audio files. In conjunction with a small-footprint XML file called SimpleADL and SoX, an open-source command-line audio editor, AudioRegent batch processes archival audio files, allowing for one or many user-defined regions, particular to each audio file, to be extracted with additional audio processing in a transparent manner that leaves the archival audio file unaltered. Doing so has alleviated many of the tensions of cumbersome workflows, complicated documentation, preservation concerns, and reliance on expensive closed-source GUI audio applications.
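
(For readers who have not used SoX, the core operation AudioRegent automates boils down to commands like the ones sketched below. This is a simplified illustration, not AudioRegent itself; the file names and region times are placeholders, and MP3 output assumes an MP3-capable SoX build.)

```python
# Simplified sketch of extracting regions from an archival file with SoX
# (illustrative only, not AudioRegent). Filenames and region times are
# placeholders; MP3 output assumes an MP3-capable SoX build.
import subprocess

regions = [
    # (output file, start position, duration) -- SoX accepts mm:ss values
    ("track01.mp3", "0:00", "3:25"),
    ("track02.mp3", "3:25", "4:10"),
]

for outfile, start, duration in regions:
    # "sox <in> <out> trim <start> <duration>" reads the archival master
    # and writes a new file, leaving the source untouched.
    subprocess.check_call(
        ["sox", "archival_master.wav", outfile, "trim", start, duration]
    )
```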

Automatic Generation of Printed Catalogs: An Initial Attempt
Jared Camins-Esakov

Printed catalogs are useful in a variety of contexts. In special collections, they are often used as reference tools and to commemorate exhibits. They are useful in settings, such as in developing countries, where reliable access to the Internet—or even electricity—is not available. In addition, many private collectors like to have printed catalogs of their collections. All the information needed for creating printed catalogs is readily available in the MARC bibliographic records used by most libraries, but there are no turnkey solutions available for the conversion from MARC to printed catalog. This article describes the development of a system, available on github, that uses XSLT, Perl, and LaTeX to produce press-ready PDFs from MARCXML files. The article particularly focuses on the two XSLT stylesheets which comprise the core of the system, and do the “heavy lifting” of sorting and indexing the entries in the catalog. The author also highlights points where the data stored in MARC bibliographic records requires particular “massaging,” and suggests improvements for future attempts at automated printed catalog generation.
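
(A much cruder version of the MARC-to-LaTeX idea can be prototyped quickly. The sketch below is not the author’s system; it uses the pymarc library rather than XSLT and skips the sorting and indexing the article is actually about, but it shows the basic shape of the conversion.)

```python
# Crude MARCXML-to-LaTeX sketch using the pymarc library (illustrative only;
# the system described in the article uses XSLT, Perl, and LaTeX and handles
# sorting, indexing, and far more fields).
from pymarc import parse_xml_to_array

def latex_escape(text):
    for ch in "&%$#_{}":
        text = text.replace(ch, "\\" + ch)
    return text

records = parse_xml_to_array("records.xml")   # path to a MARCXML file
with open("catalog_body.tex", "w") as out:
    for record in records:
        title = record["245"]["a"] if record["245"] else ""
        author = record["100"]["a"] if record["100"] else ""
        out.write("\\item %s %s\n" % (latex_escape(author or ""),
                                      latex_escape(title or "")))
```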

Easing Gently into OpenSRF, Parts 1 and 2
Dan Scott

The Open Service Request Framework (or OpenSRF, pronounced “open surf”) is an inter-application message passing architecture built on XMPP (aka “jabber”). The Evergreen open source library system is built on an OpenSRF architecture to support loosely coupled individual components communicating over an OpenSRF messaging bus. This article introduces OpenSRF, demonstrates how to build OpenSRF services through simple code examples, explains the technical foundations on which OpenSRF is built, and evaluates OpenSRF’s value in the context of Evergreen.

S3 for Backup, Is It Worth It?

I’ve been using Amazon’s S3 to back up my blog for a while now, and I really like it for that purpose. My blog has very little data, so I end up getting billed $0.01 a month. It probably costs Amazon more money to bill me than they make off of me! This got me looking at using S3 to back up ELUNA‘s document repository. Currently, we have about 12 GB of data. Assuming we transfer all 12 GB in and out each month (which we wouldn’t, but I’m just saying), it comes out to $4.65 a month for regular storage and $4.05 for reduced redundancy storage, according to the AWS calculator. Not bad – especially considering this is certainly an over-estimate (and if we ever did need to restore, I wouldn’t be worried about a few bucks to get my data back!). I have other free-to-me options, but this seems like a pretty good deal to me and is something I am considering suggesting to ELUNA.

However, does Amazon S3 scale as a backup solution for my library? It does, but I think only to a point. Let’s say you have 500 GB of images, video, and other data. It will cost you $50 a month for Reduced Redundancy and $75 a month for regular storage (and can you imagine telling the boss, “I’m sorry, Amazon lost our data, we were using the Reduced Redundancy plan”? I don’t think so.) – not counting data transfer, which could double the costs. That is $600 to $900 a year. Maybe that is still reasonable depending on the nature of your project, but you can see it quickly grows to a point where other, local options that you have more control over are looking more and more reasonable.
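
For anyone who wants to redo this arithmetic for their own collection, here is the back-of-the-envelope calculation I am using. The per-GB rates below are what I believe the 2010 US pricing to be (they reproduce the figures above), but check the AWS calculator for current numbers.

```python
# Back-of-the-envelope S3 cost estimate. Rates are my assumed 2010 US
# pricing per GB; check the AWS calculator for current numbers.
STORAGE_STANDARD = 0.15   # $/GB-month
STORAGE_RRS = 0.10        # $/GB-month, Reduced Redundancy Storage
TRANSFER_IN = 0.10        # $/GB
TRANSFER_OUT = 0.15       # $/GB (first GB out each month free)

def monthly_cost(gb, storage_rate):
    transfer = gb * TRANSFER_IN + max(gb - 1, 0) * TRANSFER_OUT
    return gb * storage_rate + transfer

print(round(monthly_cost(12, STORAGE_STANDARD), 2))   # ~4.65 (12 GB, standard)
print(round(monthly_cost(12, STORAGE_RRS), 2))        # ~4.05 (12 GB, RRS)
print(500 * STORAGE_RRS, 500 * STORAGE_STANDARD)      # 50.0 75.0 (storage only)
```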

We are lucky enough to have a campus-wide IT department that does a good job handling backups, so this isn’t something we are considering here in the library, but it seems to me that it could be a good solution if you hit the sweet spot when compared to local storage options. Obviously, the advantage of Amazon being off-site storage shouldn’t be overlooked, which pushes that sweet spot into a somewhat higher price range. I have multiple locations where I could store a network-based backup device, so that isn’t as huge of a deal for me. Still, I’d say it is worth other libraries investigating if they have a non-huge amount of data that needs to be backed up.

Just Say No to Overpriced Journals

I read an interesting piece in the Chronicle today, “University of California Tries Just Saying No to Rising Journal Costs.” There are some interesting stats, but the big one that jumps out is that Nature wants to charge an average of over $17K per journal (for each of their 67 journals). Really, the only way to get a handle on the exorbitant cost of journals is for libraries to just say no to paying for them, and, more importantly, for faculty to just say no: say no to publishing in them, and say no to requiring (or “highly recommending”) publication in them. I hope the University of California system does apply pressure and call for a faculty boycott of these journals.

I know publishers will argue that there are costs to publish, and I agree. Even though I support Open Access, I understand commercial (even if some of them don’t call themselves commercial) publishers wanting to recoup costs. But if their costs are so high that they need to charge so much, they need to rethink their publishing model. I am not saying that the journal I am involved with, the Code4Lib Journal, is in the same category as the Nature journals, but do you know what we charge? Zero. More importantly, do you know what our expenses are? Zero. If Nature needs to charge $17K a journal, they need to look at where all that overhead is, make some changes, and get the costs down to a much more reasonable amount.

Nylink shutting down

I just read a press release that says that Nylink will wind down its operations over the next 12 months. Nylink, like other regional OCLC networks, had to look for different revenue streams due to changes at OCLC and the economy in general. However, unlike most (all?) of the other regional networks, it was tied to a state institution, in this case the State University of New York. This restricted Nylink’s options, such as merging with other regional networks. I have only been in New York for two years, so I don’t have a lot of experience with Nylink, but they seem like good people who were trying to do a good job for the libraries in the State of New York, and I am sure they will be missed. I wish the employees of Nylink a good final 12 months and good luck in the future.

Koha – LibLime kerfuffle continues

I, and others, had hoped that LibLime being acquired by PTFS would lead to an improvement in the strained relations between LibLime and a large number of other Koha community members. For a brief while it looked like that was going to happen – maybe not completely, but at least it looked like an improvement in the situation was possible. Recently, however, there has been a major setback.

To some degree it has been a bit of a he-said-she-said situation, and since I wasn’t involved in the discussions, I don’t know all the details. While on principle I support the Koha community committee members’ position, I am not sure what really happened, and I think it was a mistake to call off the meeting with PTFS. I can sympathize with the committee members’ stated reasons for doing so (as described in their blog post), but cutting off communication isn’t going to get anyone anywhere. Yes, an open e-mail or IRC discussion would be better than a closed phone call, but a closed phone call is better than no communication at all. And the excuse one committee member gave, that a call at 6:00 AM his time was not practical to participate in, is disappointing to say the least. If this is an important issue for the worldwide Koha community, you need to make time and adjust your schedule. After all, it is going to be 6:00 AM (or worse) somewhere around the world.

PTFS has also made a post about their position on this situation. To read more views, check out the Koha e-mail list archives.

Open Source FUD

As people who read this blog regularly no doubt know, I am a big supporter of Open Source Software and its use in libraries. I am happy to say that here at Binghamton we use a number of Open Source applications – some built specifically for the library environment, such as E-Prints, and other more general applications such as wiki and blog software. We also use a lot of proprietary software, including the Aleph ILS, Metalib [1], ContentDM, Content Pro, and so on. All of these applications, whether they are Open Source or proprietary, have their pluses and minuses.

In the past, the Open Source community had to deal with a lot of FUD. As Bob Molyneux reportedly described it at the Evergreen Conference, “people used to ask ‘Open source? You’re going to use code written by a bunch of dope-smokin’ hippies?’ Now they are a bit more educated.” Thankfully, I have found that to be the case as well, and that is a good thing. However, Nicole Engard’s post reminded me of something that has been a slight annoyance to me lately.

Over the last year or so, at a number of conferences and in blog posts, I’ve been hearing criticisms of proprietary offerings from library vendors such as SirsiDynix, III, and Ex Libris. The criticisms usually relate to some feature a product doesn’t have. For example, maybe a particular ILS doesn’t have relevancy ranking. The presenter or blogger will fairly point that out, but then they will extrapolate the issue to all proprietary ILSs, saying something like “we had to use Open Source because the proprietary systems don’t support X, Y, or Z.” The problem is, they do not mean that all proprietary systems don’t support X, Y, or Z. They mean that the particular one their institution chose to use does not. I don’t know why they do this, whether it is because they are ill-informed or maybe just careless, but I’m sure most Open Source advocates wouldn’t want to be judged by the worst, or most limited, Open Source project out there. Why judge all proprietary offerings based on the limitations of some of them?

If you want to make the argument that Open Source is philosophically better than proprietary software, I am all for it. However, if you are comparing feature sets, please be specific about what you mean and don’t lump all proprietary solutions, or for that matter all Open Source solutions, together. While not as divisive as some of the FUD used against Open Source in the past, it is still FUD, and these overgeneralizations have no place in the conversation, in my opinion.

[1] For those that don’t know, in the interest of full disclosure, I am a member of the Ex Libris Users of North America‘s Steering Committee.

Uphill Battle on Digital Preservation

Inside Higher Ed had a nice review of a symposium that focused on the Uphill Battle of Digital Preservation. The article points out some of the many challenges of digital preservation, especially with born-digital information. In many respects I believe this is much more of a policy problem than a technical problem. Yes, you need technology to preserve and make information available, but that technology exists (see the Internet Archive). The bigger issue is having scholars realize the importance of preserving their stuff, and, more importantly, making sure that they have incentives and structures to do just that.

The incentives can include requirements that any scholarly output originating from a grant, sabbatical, etc. be deposited. They can also include tying some promotion and tenure requirements to submitting materials for preservation, and making sure guidelines are written so that publishing in Open Access journals is not looked at as less worthy than publishing in other journals. Certainly, which carrots will work best will depend on the institution and the discipline, but the point is that there needs to be some form of incentive. Professors are busy people, and they are going to focus on what will make them succeed at a university; if digital preservation is not one of those things, for most of them it will get ignored or be a very low priority.

Tied to professors being busy is that it must be made as simple as possible for them to submit stuff for preservation. One way to do this is to have someone (most likely in the library) to whom they can just e-mail the paper, URL, etc., and who will do the rest. The burden of preservation should not be too cumbersome. After all, faculty are paid to research and teach in their given field. They are not paid (unless it is their field) to be experts in digital preservation. Librarians and archivists have long been experts in preserving information, and they should continue to be so regardless of format.

Do Webinars always suck?

Dean Dad over at Inside Higher Ed asks the question, “Why do webinars always suck?” and then goes on to explain the ways they suck. I actually attend my fair share of webinars, so I obviously don’t think they always suck, but I know I’ve been disappointed in them way more often than I should be. As Dean Dad points out, the fact that you don’t have to travel is a big plus considering the economy, but I never get as much out of them as a face-to-face meeting and usually get less out of them than I would a pre-recorded session. Also, if the webinar is more than an hour, forget it. My mind has had enough at that point. I think that is why I really dislike virtual conferences. Multiple hours staring at a screen and listening to someone present is just not a replacement for being there in person.

Besides the idea of listening to someone through a computer, the other negative is that often you are sitting alone somewhere watching them. No human companionship. No person to talk to about the session. No “free” coffee. But I digress… I have found that when I watch a webinar with someone else it is usually a much better experience. That is why at work I often ask other people if they are interested and arrange to watch it together if the topic is relevant. Otherwise, like Sibyl suggested in one of the comments, I bring something else to do – although I can guarantee it won’t be Mafia Wars or Farmville!

What do you think? Do webinars always suck?

Ninth issue of Code4Lib Journal published

The ninth issue of the Code4Lib Journal was published today. There are some really good articles in this issue. In fact, I think this is one of the better issues we have published so far, so I encourage you to check it out.

One particular article I’d like to point out is Sibyl Schaefer’s article on Challenges in Sustainable Open Source: A Case Study. In this article, Ms. Schaefer points out the challenges in creating a community around a Free/Open Source project that has a limited audience. In the example case study she discusses software for archival description and data management, but I believe the issues would be similar in many other projects as well. If you are involved in leadership of, or are otherwise heavily invested in, a Free/Open Source project, I’d highly encourage you to read it. Not only does she offer insight into the challenges this particular project faced, but she also offers suggestions on a way forward that I think will be useful for any software project trying to create a sustainable community.

If you are not a Free/Open Source developer but are just looking for a few good, free applications for managing MARC records and links to electronic journals, you may want to read Brandy Klug’s article on Wrangling Electronic Resources: A Few Good Tools. It provides information about MarcEdit and three different link checkers: Link Valet, W3C Link Checker, and Xenu’s Link Sleuth.

Below are the complete contents/abstracts of issue 9:

Editorial Introduction – Moving Forward
Carol Bean

Welcoming new editors, and reflecting on the sustainability factor.

A Principled Approach to Online Publication Listings and Scientific Resource Sharing
Jacquelijn Ringersma, Karin Kastens, Ulla Tschida and Jos van Berkum

The Max Planck Institute (MPI) for Psycholinguistics has developed a service to manage and present the scholarly output of their researchers. The PubMan database manages publication metadata and full-texts of publications published by their scholars. All relevant information regarding a researcher’s work is brought together in this database, including supplementary materials and links to the MPI database for primary research data. The PubMan metadata is harvested into the MPI website CMS (Plone). The system developed for the creation of the publication lists allows the researcher to create a selection of the harvested data in a variety of formats.

Querying OCLC Web Services for Name, Subject, and ISBN
Ya’aqov Ziso, Ralph LeVan, and Eric Lease Morgan

Using Web services, search terms can be sent to WorldCat’s centralized authority and identifier files to retrieve authorized terminology that helps users get a comprehensive set of relevant search results. This article presents methods for searching names, subjects or ISBNs in various WorldCat databases and displaying the results to users. Exploiting WorldCat’s databases in this way opens up future possibilities for more seamless integration of authority-controlled vocabulary lists into new discovery interfaces and a reduction in libraries’ dependence on local name and subject authority files.

Challenges in Sustainable Open Source: A Case Study
Sibyl Schaefer

The Archivists’ Toolkit is a successful open source software package for archivists, originally developed with grant funding. The author, who formerly worked on the project at a participating institution, examines some of the challenges in making an open source project self-sustaining past grant funding. A consulting group hired by the project recommended that — like many successful open source projects — they rely on a collaborative volunteer community of users and developers. However, the project has had limited success fostering such a community. The author offers specific recommendations for the project going forward to gain market share and develop a collaborative user and development community, with more open governance.

Using Cloud Services for Library IT Infrastructure
Erik Mitchell

Cloud computing comes in several different forms and this article documents how service, platform, and infrastructure forms of cloud computing have been used to serve library needs. Following an overview of these uses the article discusses the experience of one library in migrating IT infrastructure to a cloud environment and concludes with a model for assessing cloud computing.

Creating an Institutional Repository for State Government Digital Publications
Meikiu Lo and Leah M. Thomas

In 2008, the Library of Virginia (LVA) selected the digital asset management system DigiTool to host a centralized collection of digital state government publications. The Virginia state digital repository targets three primary user groups: state agencies, depository libraries and the general public. DigiTool’s ability to create depositor profiles for individual agencies to submit their publications, its integration with the Aleph ILS, and product support by ExLibris were primary factors in its selection. As a smaller institution, however, LVA lacked the internal resources to take full advantage of DigiTool’s full set of features. The process of cataloging a heterogeneous collection of state documents also proved to be a challenge within DigiTool. This article takes a retrospective look at what worked, what did not, and what could have been done to improve the experience.

Wrangling Electronic Resources: A Few Good Tools
Brandy Klug

There are several freely available tools today that fill the needs of librarians tasked with maintaining electronic resources and that assist with tasks such as editing MARC records and maintaining web sites that contain links to electronic resources. This article gives a tour of a few tools the author has found invaluable as an Electronic Resources Librarian.

CONFERENCE REPORT: Code4Lib 2010
Birong Ho, Banurekha Lakshminarayanan, and Vanessa Meireles

Conference reports from the 5th Code4Lib Conference, held in Asheville, NC, from February 22 to 25, 2010. The Code4Lib conference is a collective volunteer effort of the Code4Lib community of library technologists. Included are three brief reports on the conference from the recipients of conference scholarships.
