blog.ecorrado.us

Ramblings about library technology, open source software, and other adventures!

 

MARC is better than Dublin Core 2011 June 1

Filed under: conferences,libraries,technology — ecorrado @ 14:06:33

During my presentation on Digital Preservation: Context & Content (slides) at ELAG 2011 last week, I made the statement that MARC is better than Dublin Core. This may have been a bit of a provocative statement, but I thought it was relevant to my presentation and the conference in general. I felt that someone had to say it, especially since there was a whole workshop on MARC Must Die and a number of other presentations were gleefully awaiting the day we were done with MARC. For example, with a great deal of support from the audience, Anders Söderbäck said “we the participants of #elag2011 hold these truths to be self-evident, that MARC must die…”.

Probably not surprisingly, my statement set off a mini-barrage of messages on the conference Twitter feed. Since the conference was almost over (my presentation was the second to last) and it wasn’t core to what I was talking about, I didn’t have time to explain or expand on my position. I know that some of the people who responded to my statement on Twitter were not at the conference, and at least a few, I am pretty sure, weren’t watching the live stream. Because of this, I wanted to take this time to put the statement in context and explain why I said I think MARC is better than Dublin Core. I understand people may not agree with me and this post won’t change that, but that doesn’t mean I need to agree with the bandwagon that wants to kill something that has been pretty successful for the last 40 or so years.

Before going any further, since I’m not sure it was clear to everyone commenting on Twitter, I should point out that by MARC I mean MARC 21+AACR2 (which is the common usage of the term in the USA), but I imagine the same statements would likely apply to any version of MARC plus whatever set of rules you want to apply. Similarly, by Dublin Core, I mean Simple and/or Qualified Dublin Core metadata along with the Dublin Core Metadata Element Set (DCMES) (i.e. the descriptive fields). I know that there are other aspects of the Dublin Core Metadata Initiative, but for the purposes of this discussion I don’t believe they are germane [1]. I am focusing on how Dublin Core can be used to describe objects. After all, that is why librarians use metadata – to describe things. No matter how easy it is for machines (or humans) to parse a metadata record, it would not be very useful if the standard does not make it possible to adequately describe, in a consistent way, whatever it is that one is trying to describe. I should also point out that, while I love theory and research, in this case I am mostly concerned with the practical.

The statement came out of my experiences thus far with using Dublin Core for digital preservation at Binghamton University. Before we started on this, I was familiar with Dublin Core but never really had to work closely with it on a large scale, so I didn’t have a strong opinion of it. I am not a cataloger, but as a systems librarian, I feel it is necessary to follow developments in cataloging, and I also have to work with MARC records on a fairly regular basis. Thus, I realize that MARC has its issues, but please don’t kill it until we have something better, and at this point I don’t believe we do. [2]

In short, my problem with Dublin Core is that it does not allow for the granularity and consistency that I believe is necessary to adequately describe a mixed set of objects for long term preservation and access. The mixed set is important here: if you are doing a long term preservation project that includes a disparate set of objects, I believe it is important that there is some consistency across collections. This is especially true if they are going to be managed or searched together. Librarians often comment on the need to break down silos or at least tie them together for discovery. The metadata needs to be adequate to do this. Maybe if you are a national library you can have multiple digital preservation solutions, but at a mid-sized university library that approach is problematic and most likely not realistic. This is doubly so if you consider that one of the main components of preservation is ensuring access in the future (i.e. you are not talking about a dark archive). This is not really a new or unique criticism, but I think it is often overlooked and/or too easily dismissed. Even one of the people who objected to my saying MARC was better than Dublin Core, Corey Harper, admitted this was a valid criticism in his article, “Dublin Core Metadata Initiative: Beyond the Element Set”, published in the Winter 2010 issue of Information Standards Quarterly.
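To make the granularity point a little more concrete, here is a rough sketch in Python. Everything in it is an assumption made for illustration: the record is invented, the tag-to-element mapping (`TAG_TO_DC`) is a simplified stand-in rather than any official crosswalk, and `to_simple_dc` is a hypothetical helper. It only shows the general shape of what happens when tagged, subfielded data is flattened into a handful of unqualified elements.

```python
# Illustrative sketch only. The record, the mapping, and the helper name are
# assumptions for this example, not an official MARC-to-Dublin-Core crosswalk.

# A MARC-like record as (tag, {subfield code: value}) pairs.
marc_record = [
    ("100", {"a": "Doe, Jane,", "d": "1950-", "e": "author."}),
    ("245", {"a": "An example title :", "b": "a subtitle /",
             "c": "Jane Doe ; edited by John Roe."}),
    ("650", {"a": "Metadata", "x": "Standards."}),
    ("700", {"a": "Roe, John,", "e": "editor."}),
]

# Hypothetical, simplified mapping from MARC tags to simple DC elements.
TAG_TO_DC = {
    "100": "creator",
    "245": "title",
    "650": "subject",
    "700": "creator",
}


def to_simple_dc(record):
    """Flatten a MARC-like record into unqualified Dublin Core elements."""
    dc = {}
    for tag, subfields in record:
        element = TAG_TO_DC.get(tag)
        if element is None:
            continue  # fields with no mapping are dropped entirely
        # Subfield codes are discarded; the values collapse into one string.
        value = " ".join(subfields.values())
        dc.setdefault(element, []).append(value)
    return dc


simple_dc = to_simple_dc(marc_record)
for element, values in simple_dc.items():
    print(element, values)
```

In this sketch the main entry (100) and the added entry (700) both end up as plain `creator` strings, and the separate name, date, and role subfields are no longer individually addressable. Qualified Dublin Core or local qualifiers can recover some of that, which is where the consistency and interoperability questions come in.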

A couple of tweeters brought up DCAP (Dublin Core Application Profiles), which in theory could be used to allow for the use of additional (or alternative) metadata fields to address some of my issues with how well Dublin Core describes particular objects. However, as Corey Harper mentioned in a tweet, “I understand that DCAP infrastructure lacking, but…” (ellipsis in the original). But the “but” is not something that can be ignored. If the infrastructure isn’t there, it is a big issue – practice over theory. Even if the infrastructure weren’t lacking, I am not sure how well it would address my criticisms. Even without DCAP I can add local qualifiers or elements for my application (and have, in fact, done so), but as the Dublin Core Metadata Initiative warns, “Nevertheless, designers should employ additional qualifiers with both caution and the understanding that interoperability could suffer as a result.” I don’t see how the use of multiple DCAPs would not end up leading to similar interoperability issues and result in a “Least Common Denominator” situation on the discovery end of things. Without discovery, you don’t have access, and without access you don’t have preservation.

Lastly, Michael Giarlo asked “But then is anyone actually putting DCMES up against MARC? Seems a category error to me.” I don’t think it is a category error at all. Both are metadata formats/standards that libraries are using to describe objects in their collections. Perhaps one might argue the category is overly broad, but I think they are obviously in the same category. Comparing the two is only natural and is, in fact, I think, quite useful. DCMES may be easier to teach and for computer programmers to program against, but in my experience it is nowhere near as useful when it comes to actually describing an item – which, as I said earlier, is the goal in the first place. Maybe some technologists value interoperability over description, but I am not ready to go there. We need something better, not something different.

As I said earlier in the post, I doubt this will change anyone’s mind, but hopefully it explains why I said that MARC is better than Dublin Core.

[1] Truthfully I am a bit confused why this was an issue on Twitter. Wikipedia and even the official “Using Dublin Core” document Diane Hillmann created for DCMI just use the term “Dublin Core” to describe the metadata standard so this is pretty common usage.

[2] I do not mean to imply that anyone is making the argument that Dublin Core should completely replace MARC, but the MARC must die contingent is relevant to this particular discussion of MARC versus Dublin Core. At some point maybe I’ll make a post about some of the more complete alternatives to MARC being discussed.

 

1 Comment for this post

 
Bruce Says:

Ultimately the question of “better” needs context.

“Better” for whom, for example? It’s clear you implicitly mean “librarians”, but I would just point out that MARC only seems to make sense to library people, who have been trained in its use. It is almost entirely ignored outside the library world, in favor of simpler data representations: from simple DC in an HTML head, to microformats, to DC as RDFa. I have heard prominent microformat people even proclaim DC too complex/esoteric (hence you see no hDC microformat).

And by “DC” exactly what representation of it? It seems to me quite useful in the context of RDF, where you can use the basic terms for broad description, and then supplement it with other vocabulary terms for more specific purposes.

I guess you can tell where I stand on MARC ;-)