A measured approach

As has become traditional, I’m posting again in February after a long break in the second half of last year. Hopefully in 2015 I can break my bad habit and actually continue with regular blog content all the way through the year.

I’ve spent much of the last few months obsessing over stats and analytics from the Boroondara library websites, as I developed a brief for developers to help us with a major overhaul. The experience has reinforced the advice from Matthew Reidsma to regularly analyse the way people use your website, and to test and make changes immediately and incrementally. A lot of the recommendations I’ve made at Boroondara are as much about the way we produce website content as they are about the design of the sites. For example, I’ve discovered that visitors using mobile devices are most likely to visit on weekends, whilst visitors on desktop are most likely to visit on Monday and least likely to visit on Sunday. Do we need to change our posting schedules? Does this difference reflect different users, or just the same visitors using different devices across the week? These are questions we would not have even thought to ask until we saw the data - and this is just one simple example. Something more intriguing (and, in hindsight, obvious) was my discovery that visits to our Storytimes page were more than 50% more likely to come from mobiles (about 46%, compared to 30% for visits to all pages). It’s pretty easy to construct a story of busy parents at the local park checking their phone to see if the library has a storytime today - but we hadn’t really considered this behaviour until now (and of course, there could be any number of alternative explanations for the difference).

What became quite clear is that we should have been doing more than simply looking at total hits and visits each month, and should have been looking much more deeply into the analytics for both our catalogue and our general website. I won’t be doing this at Boroondara, because I finished up there last week, but if anyone at Brimbank Libraries is reading this - be prepared to become obsessed with user tracking and analytics!

Coincidentally, I recently read John O’Nolan’s post about onboarding stats at Ghost. I’ve read lots of stuff from UX experts and library website experts emphasising the usefulness of things like A/B testing and ongoing analysis of usage data, but until now I’ve never fully appreciated what they’re saying. Perhaps it’s because the Ghost Foundation is a non-profit, but I found O’Nolan’s post helped me to see how we can (carefully) use usage data to help library members get more from us. That is, libraries have the ability to actually use analytics to ‘improve the user experience’. Using data to manipulate users to act against their own best interests, as too many commercial services seem to do, isn’t the only possibility.

A couple of simple examples

Email notices

Pretty much every library sends notifications to members in one form or another. Mostly these are emails. Whilst I am stunned by the fact that several major library management systems are still only capable of sending plaintext emails and not HTML formatted emails, at this point I am going to assume you are sending HTML formatted email notices.

Ever wondered whether the wording of your notices is effective? Perhaps if you used a different subject heading, or made your email text more friendly, members would have fewer overdue loans. Wouldn’t it be great to test your theory scientifically? A/B testing is the way to do this. Web companies do this all the time. True A/B testing is random - on a given day a website might randomly show different users different configurations of the front page, for example. They can then test which configuration (‘configuration A’ or ‘configuration B’) resulted in more sales, or newsletter sign-ups, or whatever.
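To make the ‘random’ part a little more concrete, here is a minimal sketch in Python of one common way to split people into two groups. It is only an illustration: the member IDs, the experiment name and the `assign_variant` function are all made up for this example and don’t come from any library system. It hashes an identifier so that the same member always lands in the same group, while the split across all members comes out roughly even. Only the group label would ever need to appear in a notice or a tracking code, not the member ID itself.

```python
import hashlib


def assign_variant(member_id: str, experiment: str = "overdue-notice-wording") -> str:
    """Return 'A' or 'B' for a member, pseudo-randomly but repeatably.

    Both `member_id` and `experiment` are hypothetical inputs. Including the
    experiment name means a member can fall into a different group in a
    different experiment.
    """
    digest = hashlib.sha256(f"{experiment}:{member_id}".encode("utf-8")).digest()
    # Use the first byte of the hash as a coin toss: even -> A, odd -> B.
    return "A" if digest[0] % 2 == 0 else "B"


# Invented member IDs, purely for demonstration.
members = [f"member-{n}" for n in range(1000)]

groups = {"A": [], "B": []}
for member in members:
    groups[assign_variant(member)].append(member)

print(len(groups["A"]), len(groups["B"]))
```

The two counts printed at the end should be close to each other, which is all you need before sending wording A to one group and wording B to the other.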

It all sounds very hard and complicated, but you can fairly easily use an analytics program like Piwik to create ‘campaigns’ and associated tracking codes. All this does is add some extra parameters to the URLs you use, which your analytics system recognises when visitors arrive via a URL containing them. You could use campaign tracking codes by sending out two batches of email notices (perhaps on two consecutive Tuesdays, for example), each with a different tracking code on the ‘click here to renew these items’ link. By comparing the number of hits on your login page from each tracking code to the number of notices sent out using it, you can measure the effectiveness of different approaches to subject headings, wording and layout.
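As a rough sketch of what that looks like in practice, the Python below adds campaign parameters to a renewal link and compares response rates for two batches. The parameter names `pk_campaign` and `pk_kwd` are, as far as I know, the defaults Piwik looks for, but check your own installation. Everything else here is an assumption for illustration: the URL, the campaign and keyword names, the helper functions, and especially the visit and notice counts, which are invented.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_campaign(url: str, campaign: str, keyword: str) -> str:
    """Append Piwik-style campaign parameters to a URL, keeping any existing query."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query += [("pk_campaign", campaign), ("pk_kwd", keyword)]
    return urlunsplit(parts._replace(query=urlencode(query)))


def response_rate(tracked_visits: int, notices_sent: int) -> float:
    """Share of notices that led to a tracked visit to the login page."""
    if notices_sent == 0:
        return 0.0
    return tracked_visits / notices_sent


# Hypothetical login page and campaign names.
RENEW_URL = "https://library.example.org/login"
link_a = add_campaign(RENEW_URL, "overdue-notices", "wording-a")
link_b = add_campaign(RENEW_URL, "overdue-notices", "wording-b")
print(link_a)
print(link_b)

# Invented numbers: tracked visits would come from your analytics reports,
# notices sent from your library management system.
rate_a = response_rate(tracked_visits=42, notices_sent=600)
rate_b = response_rate(tracked_visits=66, notices_sent=600)
print(f"A: {rate_a:.1%}  B: {rate_b:.1%}")
```

Comparing rates rather than raw hits matters because the two batches will rarely contain exactly the same number of notices.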

What do mobile visitors want to do?

An even simpler example comes from some of the analysis I’ve recently been doing. I had a feeling that visitors on mobile devices might show different browsing behaviour to those on desktops, but I didn’t really know. Because browsers tend to broadcast what type of browser they are, what device they are installed on, and the size of their screen, it’s pretty easy to track what type of device visitors are using. By creating a segment (about 15 seconds in your favourite analytics software), you can determine if visitors from mobile (or tablets, for that matter) behave differently from desktop users.

What I discovered was that nearly half of all mobile visitors to our website visited the Opening Hours page - making mobile users about three times more likely than desktop users to be looking for our opening hours. This has obvious ramifications for any mobile optimisation of our website - clearly opening hours need to be pretty close to the first thing they see. Of course, by claiming your branches’ Google Maps pages you can ensure that your opening hours are available right there in Google before users even hit your site. Since we’re in the business of providing information and experiences, rather than selling stuff through our websites, we’re in the fortunate position that it doesn’t actually matter if people get the information they need (in this case “Is the library open?”) without visiting our website at all.
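Your analytics software will do this comparison for you once you have set up a segment, but the underlying arithmetic is simple enough to sketch in Python. The visit records below are invented toy data, and the record layout (`device`, `pages`) and the `share_visiting` function are assumptions made up for this example, not any analytics tool’s export format.

```python
from collections import defaultdict


def share_visiting(visits, page):
    """For each device type, return the share of visits that included `page`."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for visit in visits:
        totals[visit["device"]] += 1
        if page in visit["pages"]:
            hits[visit["device"]] += 1
    return {device: hits[device] / totals[device] for device in totals}


# Toy data only: four mobile visits and six desktop visits.
visits = [
    {"device": "mobile", "pages": ["/", "/opening-hours"]},
    {"device": "mobile", "pages": ["/opening-hours"]},
    {"device": "mobile", "pages": ["/storytimes"]},
    {"device": "mobile", "pages": ["/", "/catalogue"]},
    {"device": "desktop", "pages": ["/", "/opening-hours"]},
    {"device": "desktop", "pages": ["/catalogue"]},
    {"device": "desktop", "pages": ["/", "/events"]},
    {"device": "desktop", "pages": ["/catalogue", "/events"]},
    {"device": "desktop", "pages": ["/"]},
    {"device": "desktop", "pages": ["/catalogue"]},
]

shares = share_visiting(visits, "/opening-hours")
ratio = shares["mobile"] / shares["desktop"]
print(shares)
print(f"Mobile visits are {ratio:.1f} times as likely to include opening hours")
```

Note that this counts visits rather than people, which is usually what an analytics segment reports as well.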

It might strike you as obvious that people visiting a library website using a smartphone probably want to know whether the library is open, but with hard data you can actually test such intuitions. There were plenty of other ‘obvious’ assumptions that I found to be false when checking our website analytics properly. None of the things I have just described are difficult or even particularly clever. There are smart librarians who use and understand these tools in much more sophisticated ways than I ever have. Given the state of most library websites, however, it seems doubtful that these sorts of techniques are anywhere close to mainstream in libraries today.


At this point, some of you are probably yelling at your screen “I thought you were supposed to be interested in user privacy, you hypocrite!” Indeed, I am very interested in user privacy. Whilst working on our website project I have also been busy tightening up the privacy and security of our existing catalogue. The conclusion I have come to, however, is that we can genuinely protect the privacy of library members and visitors whilst still collecting a lot of useful aggregate data. The important thing is to always consider the consequences of tracking, collecting and storing any particular piece of data before you do anything, and to let those consequences decide whether you collect it, rather than how useful or interesting it might be.

There are a couple of practices we need to be particularly careful to avoid:

Linking web and search analytics to identified library members

Whilst it may be possible to make a link between a tracked website user and a registered member by matching data such as their IP address, this still takes time and requires a targeted effort aimed at a specific person. If, on the other hand, you set up your web analytics in a way that can easily identify search terms used by a specific user (and, therefore, vice-versa) you make it possible to provide lists of search terms associated with a specific person, or lists of specific people associated with particular search terms. It would be so easy to track actual members’ search terms and general website use that you could probably do it accidentally.

This is also worth thinking about with regard to how you track individual website users. Piwik, for example, includes ‘Visitor profiles’, which track users over time based on their IP address. This makes me very uncomfortable, especially coming from software that prides itself on being great for privacy. There are a couple of ways to reduce the privacy problems caused by this. Firstly, Piwik can be set up to simply ignore the last one, two or three bytes of an IP address. This makes it impossible to track usage geographically to particular suburbs or cities, but usually you won’t care much about that. The other feature Piwik recommends administrators use is archiving. The archive function stores usage data in aggregate in tables, then deletes the actual logs. This means you get to use old data for aggregate reports, but when the men in dark suits come knocking you don’t have any personally identifiable data to give them.
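To show roughly what those two protections amount to, here is a small Python sketch. It is not Piwik’s actual code, and the function names, log layout and sample addresses (taken from ranges reserved for documentation) are all invented for this example. The first function zeroes the last one, two or three bytes of an IPv4 address; the second reduces raw log rows to daily page counts so that the rows themselves can be thrown away.

```python
import ipaddress
from collections import Counter


def mask_ipv4(address: str, bytes_to_mask: int = 2) -> str:
    """Zero out the last 1, 2 or 3 bytes of an IPv4 address."""
    if not 1 <= bytes_to_mask <= 3:
        raise ValueError("bytes_to_mask must be 1, 2 or 3")
    ip = ipaddress.IPv4Address(address)
    # Build a 32-bit mask whose lowest (8 * bytes_to_mask) bits are zero.
    mask = (0xFFFFFFFF << (8 * bytes_to_mask)) & 0xFFFFFFFF
    return str(ipaddress.IPv4Address(int(ip) & mask))


def aggregate(log_rows):
    """Count visits per (date, page), discarding who made them."""
    return Counter((row["date"], row["page"]) for row in log_rows)


# Hypothetical raw log rows; addresses are from documentation-only ranges.
raw_log = [
    {"date": "2015-02-02", "page": "/opening-hours", "ip": "198.51.100.57"},
    {"date": "2015-02-02", "page": "/opening-hours", "ip": "203.0.113.9"},
    {"date": "2015-02-02", "page": "/storytimes", "ip": "198.51.100.57"},
]

print(mask_ipv4("198.51.100.57", 1))
print(mask_ipv4("198.51.100.57", 2))

daily_counts = aggregate(raw_log)
raw_log.clear()  # the identifiable rows are gone; only the counts remain
print(daily_counts)
```

The important design point is the order of operations: aggregate first, then delete, so that nothing identifiable is left sitting around once the reports have been built.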

Using third parties who can see your data

The reason I have been mentioning Piwik so much, and the reason for their claim to be good for privacy, is that Piwik is a software program you run yourself, not a cloud service. When you use Google Analytics, it’s not just you who has access to that data. Google can track users across the web using the JavaScript embedded in at least half of the world’s websites. There’s a reason Google Analytics is free of charge. The same is true for Facebook tracking pages with ‘Like us on Facebook’ buttons, Twitter with ‘Tweet this’ buttons and so on.

It’s all very well to have policies and statements about the freedom to read and how you protect member loan records, but the world has moved on. The library user who doesn’t use online services at all is almost extinct. Privacy statements are one thing, but privacy practice is another entirely. As a general rule, if the data isn’t stored on-site, someone else probably has access to it. If you didn’t pay anything for the service, you can guarantee that. Eric Hellman provided a stark illustration last year of how many people and organisations have access to your users’ data if you don’t pay attention to security and privacy. Following on the heels of the Adobe Digital Editions debacle in October, it should be obvious to even the most obstinately clueless that libraries need to ask a lot more questions when third parties are providing services on our behalf.

The future

I’d like to see libraries take more action to protect user privacy and collect more and better data. I truly think it is possible for us to do both - but only if we are careful and thoughtful about how we go about it. Jason Griffey announced an exciting new project over the weekend, called ‘Measure the Future’. Led by Griffey and other library stars Gretchen Caserotti and Jenica Rogers, along with educator Jeff Branson, the project seeks to build a ‘Google Analytics for your library building’, tracking physical use of libraries just as we can track digital use. Built on open hardware and software by librarians, this has huge promise - but we need to be mindful of the same privacy concerns we have always expressed with regard to reading habits, and have started to neglect as reading moves increasingly to digital environments.

Currently most libraries seem to be (accidentally) providing a huge hoard of private user data to virtually anyone who wants it, but not actually using any of it themselves. If we are to credibly claim to be defenders of intellectual freedom and responsive to our communities, we need to use data more cleverly - and protect member privacy while we do so.