blog.ecorrado.us

Ramblings about library technology, open source software, and other adventures!

 

RDA and transforming to a new bibliographic framework 2011 June 3

Filed under: libraries,technology — ecorrado @ 17:06:56

I haven’t had the opportunity to work much with RDA records yet; however, I’ve been following some e-mail lists, blogs, and other commentaries where people have been discussing their experiences with it. The Library of Congress, the National Library of Medicine (NLM), and the National Agricultural Library (NAL) organized testing to evaluate whether or not they will implement RDA.

Out of this testing experience (which is still being analyzed), the Library of Congress issued “Transforming our Bibliographic Framework: A Statement from the Library of Congress” on May 13. According to the statement, “Spontaneous comments from participants in the US RDA Test show that a broad cross-section of the community feels budgetary pressures but nevertheless considers it necessary to replace MARC 21 in order to reap the full benefit of new and emerging content standards.” Therefore, the Library of Congress is going to investigate, among other things, replacing MARC 21.

From what I have heard of the RDA testing, I think this makes sense. The general feel I get is that RDA by itself is not enough of a change to make libraries expend the resources necessary to implement it. Sure, there are some improvements over AACR2, but there are also many things I read about that are not improvements. This is especially true if you agree with the Taiga Forum 6’s 2011 Provocative Statement #2 that libraries will need to participate in radical cooperation. RDA offers a bit too much flexibility to ensure that bibliographic records created by one library will fit well for other libraries. For example, the Rule of 3 is gone, which on the surface is an improvement since it allows for more than 3 authors to be included as main or added entries. However, as discussions on the RDA-L list have pointed out, it requires only the first author and illustrators of children’s books as author main or added entry. Local choices are great if you are only working for the local and not “radically cooperating.”

I won’t go through the list of complaints (and, to be fair, some compliments) about RDA I’ve seen, as you can find them yourselves. My takeaway, though, is that RDA on top of our existing bibliographic infrastructure is probably not going to make a monumental improvement for our patrons, while at the same time it will be costly to implement (especially retroactively). RDA might be better than AACR2, but is it better enough that migrating to it is worth the time and costs? I am not so sure. Maybe simple changes to AACR2 would be just as good and more practical?

Some people I talk to think moving to RDA is a necessary first step that will make more significant or radical changes easier in the future. I, however, have an underlying fear that if libraries implement RDA in the current environment they will be stuck with it for a long time and it will actually make it harder to implement something different in the future. I hope the others are right and I am wrong, since I believe that in the short to medium term, RDA will be implemented on top of our existing bibliographic infrastructure – for better or worse.

If we replace our underlying bibliographic infrastructure with something else, say something based on RDF or some other standard model for data interchange, and change to RDA as well, we might actually get a significant change that will help expose our bibliographic data to the greater world of linked data while at the same time making it easier for libraries to take advantage of linked data.

One thing that the Library of Congress needs to take into account in this process is the economic reality of implementing something new. I don’t see this specifically mentioned in the issues they plan on addressing. I assume that it will be part of the underlying discussions, but I would like to see it more prominently mentioned. Part of this is also involving vendors as well as open source developers of systems such as Evergreen and Koha. If LoC makes a change, it will affect libraries throughout the US (and probably the world). If the systems libraries use can’t function within this new bibliographic framework, it will be a difficult and extremely expensive transition.

I think this is something librarians, especially those in systems and cataloging, should follow closely. I know I will be doing so.

 
 

MARC is better than Dublin Core 2011 June 1

Filed under: conferences,libraries,technology — ecorrado @ 14:06:33

During my presentation on Digital Preservation: Context & Content (slides) at ELAG 2011 last week, I made the statement that MARC is better than Dublin Core. This may have been a bit of a provocative statement, but I thought it was relevant to my presentation and the conference in general. I felt that someone had to say it, especially since there was a whole workshop on MARC Must Die and a number of other presenters were gleefully awaiting the day we were done with MARC. For example, with a great deal of support from the audience, Anders Söderbäck said “we the participants of #elag2011 hold these truths to be self-evident, that MARC must die…”.

Probably not surprisingly, my statement set off a mini-barrage of messages on the conference Twitter feed. Since the conference was almost over (my presentation was the second to last) and it wasn’t core to what I was talking about, I didn’t have time to explain or expand on my position. I know that some of the people who responded to my statement on Twitter were not at the conference, and at least a few, I am pretty sure, weren’t watching the live stream. Because of this, I wanted to take this time to put the statement in context and explain why I said I think MARC is better than Dublin Core. I understand people may not agree with me and this post won’t change that, but that doesn’t mean I need to agree with the bandwagon that wants to kill something that has been pretty successful for the last 40 or so years.

Before going any further, since I’m not sure it was clear to everyone commenting on Twitter, I should point out that by MARC I mean MARC 21+AACR2 (which is the common usage of the term in the USA), but I imagine the same statements would likely apply to any version of MARC plus whatever set of rules you want to apply. Similarly, by Dublin Core, I mean Simple and/or Qualified Dublin Core metadata along with the Dublin Core Metadata Element Set (DCMES) (i.e. descriptive fields). I know that there are other aspects of the Dublin Core Metadata Initiative, but for the purposes of this discussion I don’t believe they are germane [1]. I am focusing on how Dublin Core can be used to describe objects. After all, that is why librarians use metadata – to describe things. No matter how easy it is for machines (or humans) to parse a metadata record, it would not be very useful if the standard does not make it possible to adequately describe, in a consistent way, whatever it is that one is trying to describe. I should also point out that while I love theory and research, in this case I am mostly concerned with the practical.

The statement came out of my experiences thus far with using Dublin Core for digital preservation at Binghamton University. Before we started on this, I was familiar with Dublin Core but never really had to work closely with it on a large scale, so I didn’t have a strong opinion of it. I am not a cataloger, but as a systems librarian, I feel it is necessary to follow developments in cataloging, and I also have to work with MARC records on a fairly regular basis. Thus, I realize that MARC has its issues, but please don’t kill it until we have something better, and at this point I don’t believe we do. [2]

In short, my problem with Dublin Core is that it does not allow for the granularity and consistency that I believe is necessary to adequately describe a mixed set of objects for long term preservation and access. Mixed sets are important here: if you are doing a long term preservation project that includes a diverse set of objects, I believe it is important that there is some consistency across collections. This is especially true if they are going to be managed or searched together. Librarians often comment on the need to break down silos or at least tie them together for discovery. The metadata needs to be adequate to do this. Maybe if you are a national library you can have multiple digital preservation solutions, but at a mid-sized university library that approach is problematic and most likely not realistic. This is doubly so if you consider that one of the main components of preservation is ensuring access in the future (i.e. you are not talking about a dark archive). This is not really a new or unique criticism, but I think it is often overlooked and/or too easily dismissed. Even one of the people who objected to my saying MARC was better than Dublin Core, Corey Harper, admitted this was a valid criticism in his article, “Dublin Core Metadata Initiative: Beyond the Element Set,” published in the Winter 2010 issue of Information Standards Quarterly.
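
To make the granularity point a bit more concrete, here is a minimal Python sketch of what tends to happen when MARC fields are crosswalked into Simple Dublin Core. The MARC_TO_DC table is a simplified mapping I made up for illustration (it is not any official crosswalk), and the record values are invented. The point is only that several distinct MARC tags end up in the same Dublin Core element, so the distinction between, say, a main entry and an added entry is no longer recoverable.

```python
from collections import defaultdict

# Hypothetical, simplified crosswalk from MARC 21 tags to Simple Dublin Core
# elements. An assumption for illustration only, not an official mapping.
MARC_TO_DC = {
    "100": "creator",   # main entry, personal name
    "700": "creator",   # added entry, personal name
    "710": "creator",   # added entry, corporate name
    "245": "title",     # title statement
    "246": "title",     # varying form of title
    "600": "subject",   # subject, personal name
    "650": "subject",   # subject, topical term
    "651": "subject",   # subject, geographic name
}


def to_simple_dc(marc_fields):
    """Collapse (tag, value) pairs into Simple Dublin Core elements."""
    dc = defaultdict(list)
    for tag, value in marc_fields:
        element = MARC_TO_DC.get(tag)
        if element is not None:  # unmapped tags are silently dropped
            dc[element].append(value)
    return dict(dc)


# Invented sample record: (MARC tag, field value)
sample_record = [
    ("100", "Doe, Jane"),
    ("245", "An example title"),
    ("246", "Example title"),
    ("600", "Roe, Richard"),
    ("651", "Example City (N.Y.)"),
    ("700", "Poe, Pat"),
]

dc_record = to_simple_dc(sample_record)
distinct_marc_tags = {tag for tag, _ in sample_record}

# Six distinct MARC tags collapse into three Dublin Core elements.
print(len(distinct_marc_tags), "MARC tags ->", len(dc_record), "DC elements")
print(dc_record["creator"])  # main and added entries are now indistinguishable
```

Going the other way (from the collapsed record back to the original fields) is not possible without extra information, which is the practical problem when the collapsed record is what you are preserving.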

A couple of tweeters brought up DCAP (Dublin Core Application Profiles), which in theory could be used to allow for the use of additional (or alternative) metadata fields to address some of my issues with how well Dublin Core describes particular objects. However, as Corey Harper mentioned in a tweet, “I understand that DCAP infrastructure lacking, but…” (ellipsis in the original). That “but” is not something that can be ignored. If the infrastructure isn’t there, it is a big issue – practice over theory. Even if the infrastructure weren’t lacking, I am not sure how well it would address my criticisms. Even without DCAP I can add local qualifiers or elements for my application (and have, in fact, done so), but as the Dublin Core Metadata Initiative warns, “Nevertheless, designers should employ additional qualifiers with both caution and the understanding that interoperability could suffer as a result.” I don’t see how the use of multiple DCAPs would not end up leading to similar interoperability issues and result in a “Least Common Denominator” situation on the discovery end of things. Without discovery, you don’t have access, and without access you don’t have preservation.
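
As a rough illustration of the “Least Common Denominator” worry, the Python sketch below takes two invented records that use different local qualifiers and looks at which elements they share for cross-collection searching. The dotted names (such as date.created), both records, and the helper functions dumb_down and shared_elements are all my own assumptions for the example; they are not taken from any particular application profile or system.

```python
def dumb_down(record):
    """Reduce qualified names like 'date.created' to their base element."""
    simple = {}
    for name, values in record.items():
        base = name.split(".", 1)[0]
        simple.setdefault(base, []).extend(values)
    return simple


def shared_elements(*records):
    """Return the element names that every record has in common."""
    common = set(records[0])
    for rec in records[1:]:
        common &= set(rec)
    return common


# Invented record following a hypothetical photograph profile
photo_record = {
    "title": ["Campus aerial view"],
    "date.created": ["1965"],
    "date.digitized": ["2010"],
    "coverage.spatial": ["Example City (N.Y.)"],
}

# Invented record following a hypothetical thesis profile
thesis_record = {
    "title": ["An example thesis"],
    "date.issued": ["2009"],
    "description.abstract": ["An example abstract."],
}

# With local qualifiers intact, only 'title' is searchable across both.
qualified_overlap = shared_elements(photo_record, thesis_record)

# After dumbing down, 'date' is shared too, but its meaning is now ambiguous.
simple_overlap = shared_elements(dumb_down(photo_record), dumb_down(thesis_record))

print(sorted(qualified_overlap))
print(sorted(simple_overlap))
print(dumb_down(photo_record)["date"])  # created and digitized dates mixed together
```

Either the discovery layer searches only what the profiles literally share, or it flattens the qualifiers and loses what they meant. Neither is great for a mixed collection.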

Lastly, Michael Giarlo asked “But then is anyone actually putting DCMES up against MARC? Seems a category error to me.” I don’t think it is a category error at all. Both are metadata formats/standards that libraries are using to describe objects in their collections. Perhaps one might argue the category is overly broad, but I think they are obviously in the same category. Comparing the two is only natural and is, in fact, quite useful, I think. DCMES may be easier to teach and for computer programmers to program, but in my experience it is nowhere near as useful when it comes to actually describing an item – which, as I said earlier, is the goal in the first place. Maybe some technologists value interoperability over description, but I am not ready to go there. We need something better, not something different.

As I said earlier in the post, I doubt this will change anyone’s mind, but hopefully it explains why I said that MARC is better than Dublin Core.

[1] Truthfully, I am a bit confused about why this was an issue on Twitter. Wikipedia and even the official “Using Dublin Core” document Diane Hillmann created for DCMI just use the term “Dublin Core” to describe the metadata standard, so this is pretty common usage.

[2] I do not mean to imply that anyone is making the argument that Dublin Core should completely replace MARC, but the MARC must die contingent is relevant to this particular discussion of MARC versus Dublin Core. At some point maybe I’ll make a post about some of the more complete alternatives to MARC being discussed.