Burn it all down

29 May 2015

Imagine a colleague. She’s been hardworking, loyal and dependable her whole working life. Unfortunately, other than a late-career attempt to retrain, she more or less performs the same functions in the same way she did when she joined the library way back in 1968. That was the same year COBOL was standardised, just 16 years after Grace Hopper created the very first compiler. Your colleague had already been working for thirteen years before MS-DOS was released, and in her fourteenth year of employment the Commodore 64 became the hottest thing in computing. Her career predates nearly every programming language, every metadata standard, and every email ever sent. That’s right, I’m talking about MAchine Readable Cataloguing - MARC. If MARC started its career aged 18 it would be retiring next year aged 65, but it’s been redundant for years already.

Sometimes it feels as if the whole of librarianship itself is winding down to retirement. Whilst there are some amazing people doing innovative, urgently needed work, they always seem to be doing so against a background of indifference, inertia and change aversion. MARC is an easy target, yet here we are, 20 years after Dublin Core, 13 years after JSON, 8 years after the stabilisation of RDFa - still unwilling to step away from such a manifestly outdated and limiting metadata standard. Instead of embracing these exciting developments, we have wasted 18 years arguing about whether and how to make incremental, contradictory, and minimally valuable changes to MARC.
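To make the contrast concrete, here is a rough sketch - the record itself is invented - of the same basic description locked inside a MARC title statement versus expressed as Dublin Core style JSON, the kind of thing any web developer could produce and consume without ever hearing of a subfield code.

```python
import json

# A MARC title statement, as it is conventionally keyed:
#   245 10 $a Burning the catalogue : $b a polemic / $c A. Librarian.
# The meaning is locked up in numeric tags, indicators and subfield
# codes that only library systems (and some librarians) can interpret.

# The same information as a Dublin Core style JSON document (invented
# values) - self-describing, and readable by any modern web toolchain.
record = {
    "dc:title": "Burning the catalogue : a polemic",
    "dc:creator": "A. Librarian",
    "dc:date": "2015",
}

print(json.dumps(record, indent=2))
```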

Waiting for Godot, the Library of Congress

The best we seem to be able to do as an industry now is to wait for two American organisations - the Library of Congress and OCLC - to decide on our behalf what linked library data standards should look like. Never mind that one is a library for a specific set of legislators and the other is largely indistinguishable from a commercial software vendor. Whenever I discuss RDF or linked open data with colleagues, the usual response is “Oh, you mean BIBFRAME”. Well, yes and no. Yes, BIBFRAME is an interesting project. And yes, the work of the Bib Extend group (supported to a large degree by OCLC) in preparing proposals for Schema.org is extremely valuable and necessary. But these are just two examples of how linked data and RDF can be used.
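As a rough illustration of what ‘just two examples’ means - the identifier and values below are invented, and this is not official BIBFRAME or Bib Extend output - a book can be described in schema.org-flavoured JSON-LD with nothing more exotic than a dictionary:

```python
import json

# A minimal schema.org description of a book as JSON-LD. Any of the
# general-purpose linked data vocabularies could be swapped in here;
# the point is the shared model, not the brand name on the vocabulary.
book = {
    "@context": "https://schema.org",
    "@type": "Book",
    "@id": "https://example.org/items/12345",  # invented identifier
    "name": "Burning the catalogue",
    "author": {"@type": "Person", "name": "A. Librarian"},
    "datePublished": "2015",
    "inLanguage": "en",
}

print(json.dumps(book, indent=2))
```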

There is a long history of this sort of thing. Libraries across the English-speaking world almost universally use Library of Congress Subject Headings to index their collections. This is such a standard that most librarians don’t ever stop to ponder how bizarre it is. Are librarians serving the needs of the United States legislature the best people to decide how to describe the subject of a book in a rural Australian library?

The incapacity of librarians to move quickly to take advantage of the enormous benefits of linked data reveals some serious cultural and technical problems in the profession. We wait for someone else to make the rules, someone else to make the software, someone else to set the standards. Except they have to be library rules, library software and library standards. Heaven forbid we leverage knowledge and systems from a wider field. This passive and closed thinking causes all sorts of problems for libraries.

A box of hammers

Take, for example, the way libraries write tender specifications for library software. Invariably, requirements are listed in terms of the particular technology or protocol the software must use, rather than in terms of the functionality it must provide. What protocol must it use? The one we already use, of course. Thus, publishers and e-book platforms supply metadata as MARC records to libraries (often for a fee), and software vendors ensure their products comply with SIP2, even though they are all too aware of how outdated, insecure and generally unfit for purpose it is. If instead we asked vendors to provide efficient and flexible metadata transfer abilities, and secure and rich circulation data services, they might well come to us with products that look quite different.
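Purely as a hypothetical - this is an imagined service, not any vendor’s actual API - a requirement for ‘secure and rich circulation data services’ might be met by something as plain as an authenticated JSON endpoint over HTTPS, rather than by mandating SIP2 and its plain-text, pipe-delimited messages:

```python
import json

# A hypothetical response from an imagined circulation service
# (the endpoint and field names are invented for illustration).
loan = {
    "patron": "https://example.org/patrons/9876",
    "item": "https://example.org/items/12345",
    "loaned_at": "2015-05-29T10:15:00+10:00",
    "due_at": "2015-06-19T23:59:59+10:00",
    "renewals_remaining": 2,
}

print(json.dumps(loan, indent=2))
```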

Consider the way WorldCat or Libraries Australia allows libraries to use a central shared repository of catalogue records to streamline copy cataloguing. OCLC is in fact ramping up its focus on this service, with its WorldShare library management platform enabling what amounts to two-click cataloguing by leveraging WorldCat. This is largely achieved through the magic of ‘the cloud’ - which is to say, the magic of sticking everything in the same multi-tenant data centre controlled by a corporation for a fee.

This is one way to do things, and not necessarily a terrible idea[1], but in many ways it is an old way of thinking about the problem of sharing files between separate organisations with separate infrastructure. The multi-tenant data centre solves the problem by simply sticking all the infrastructure in one place and managing it together. This suits many librarians because we can now pretend to be forward-thinking, tech-savvy managers by outsourcing all responsibility for our technology needs. Through the magic of tender documentation this is paid for by inflating the price of library management software to account for data hosting and management. But there is still a central server hosting the ‘source of truth’ for catalogue records, and a bunch of what are effectively client servers separately storing copies of the original record, sometimes tweaked for local preferences.

So, 47 years later, what do we have? A bunch of libraries with their own catalogues, consisting almost entirely of MARC records, created using rules designed for the unique needs of one particularly unusual library, and which they all got from the same central repository. More or less the same thing we had at the beginning. This is insane.

A pipeline to nowhere

You may be tempted to think that either the problem or the solution is Library School™. You might think it is a problem because universities are teaching the wrong things - they need to teach more technical and practical skills. You might think it is actually the solution, because what libraries need is an influx of shiny new talent - young, tech-savvy, new-hire messiahs. Either way, you’d be thinking this is a ‘pipeline problem’, where the solution is simply to get more people with the right skills flowing down the pipeline into the profession, and then all our problems will be solved.

That’s not going to happen.

Libraries have a ‘pipeline problem’ in the same way the software industry does.[2] That is to say, we don’t have a pipeline problem, we have a cultural problem. There are many wonderful library and information science degree courses in Australia, the US and elsewhere. The problem is that when these qualified librarians enter the workforce, they’re usually not allowed to do anything interesting, and there are few managers with the ability to mentor them in a way that nurtures those technical skills.

Part of the problem is the separation between ‘technical’ and ‘professional’ roles. In Australia and Canada[3] at least, there are two official levels of library worker - librarians and library technicians. The library technician qualification is sometimes referred to as a ‘para-professional’ level, and stems from a time when libraries needed staff to assist cataloguing librarians with some of the tasks that fitted somewhere between cataloguing and basic administration. Those times are more or less over, and in the meantime virtually nothing has been done to update our conception of what a library technician’s skill set should be in the digitally networked age. This is not to say I don’t respect the deep knowledge and experience of library technicians and cataloguers. People who are so at one with MARC that they prefer to read MARC files in the raw are impressive. But they’re impressive in the same way Irving Finkel reading cuneiform tablets is impressive.

At the same time, circular arguments about librarians needing to learn to code burst forth every now and then. The heart of the problem is that libraries can be incredibly hierarchical places to work, and librarians always have to be at the top of the pyramid. So we keep the library technicians but don’t teach them anything actually technical, and refuse to hire actual technical people - you know, IT graduates. People who don’t need to learn to code because they already know how. People who would make brilliant and useful contributions to our libraries, but have an Information Technology degree instead of an Information Management degree.

The age of library technicians and cataloguers is over, and shiny new librarians can’t solve our problems if they’re not supported by bold leaders and managers. What we need now are metadata managers. If this is going to happen, librarians - and specifically, library leaders and managers - need to build a culture that takes responsibility for our own destiny and our own decisions. All too often I hear librarians complaining that they would do things differently if only. If only IT would let us. If only our library software vendor would provide that feature. If only we had more money. If only there were an agreed new linked library metadata standard.

JFDI, and other acronyms

Things happen when people make them happen. Standards are rarely standards from birth. Things become a standard when there is agreement that they are generalisable and the best fit for purpose.[4] Standards also don’t have to mean beige sameness. TCP/IP and HTML are fairly straightforward, yet power billions of diverse and rich web experiences daily.

When librarians in New Zealand and Georgia, USA, were dissatisfied with commercial library software, they built Koha and Evergreen, respectively. When librarians from a range of organisations needed a multi-purpose repository and digital asset management system, they cooperated to create Project Hydra. When existing discovery platforms weren’t cutting it, Blacklight was born. Things happen when people make them happen.

Others are now building on these successes. Oslo Public Library has taken the plunge and is moving to RDF linked data as their core metadata format, abandoning MARC entirely. What I find exciting about this is that Oslo have decided they simply cannot keep waiting for an agreed international standard. Instead they are forging ahead, building a system flexible enough that they should be able to tweak it later to fit any standard that appears.
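As a very rough sketch of what ‘RDF as the core format’ can mean in practice - using rdflib and Dublin Core terms purely for illustration, not Oslo’s actual data model - a catalogue record becomes a handful of statements about a web-addressable resource, ready to be re-expressed in whatever vocabulary eventually wins out:

```python
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import DCTERMS, RDF

# Invented URIs and vocabulary choices, for illustration only.
EX = Namespace("https://example.org/")
g = Graph()
work = URIRef("https://example.org/works/1")

g.add((work, RDF.type, DCTERMS.BibliographicResource))
g.add((work, DCTERMS.title, Literal("Burning the catalogue")))
g.add((work, DCTERMS.creator, EX["people/a-librarian"]))
g.add((work, DCTERMS.issued, Literal("2015")))

# Serialise as Turtle; a later standard is just another vocabulary
# mapped over the same graph of statements.
print(g.serialize(format="turtle"))
```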

It’s time for bold ideas. For librarians who see a problem and start working to fix it. For librarians who learn how to build the technology they want, as they’re building it. For librarians who reconceptualise how we share metadata. For librarians who won’t accept the world as it is.

It’s time to do things because we can. It’s time to take risks. It’s time to reach out beyond librarianship and library standards.

It’s time to burn it all down.


[1] Setting aside some complex privacy and data sovereignty issues.

[2] You could run a whole academic sub-genre publishing articles just about the tech industry’s diversity ‘pipeline problem’.

[3] Though not the US, I recently learned.

[4] Or sometimes just because of the network effect.