The Future of HE Libraries and Rights Clearance

21 12 2016

Very excitingly, my book chapter on “Using Technology to Make More Digital Content Available to All” has now been published in “The End of Wisdom? The Future of Libraries in a Digital Age”.


“The future of HE libraries will include taking a much more active part in helping its institution to navigate through the difficulty of rights clearance in order that they can publish content with as open a licence as is possible whilst conforming with the risk appetite of their institution. Traditionally, the response to this issue has been that it has been felt that it is not pragmatic to do this for various tranches of content which therefore remain closed and whose rights status are essentially deemed to be unquantifiable.

 The IOE has seen a vast increase in rights handling work in the last few years and the trend seems set for this to increase. Examples of this are the move towards OA for publication of research outputs expected for the next REF, digital archives, retrospective digitisation of theses, and preservation of Official Publications in education from the Web. This gives an institution a strategic imperative to increase its resources to deal with this work and where better to do so than the library in which much expertise exists already?”

The chapter then discusses some innovative techniques we used to help us with this problem. As I stated to begin with, this is only to whet your appetite. More will be revealed later.

A librarian’s view of an archive conundrum – GTCE

8 04 2014

In 2012, the General Teaching Council for England (GTCE) was disbanded and we were given a disk drive full of documents together with a spreadsheet of associated metadata. Archivists are (for not very surprising reasons) being deluged with born-digital material, much of it inaccessible because it simply can’t be appraised at such a volume by a limited number of staff. For this reason, much of it simply has to be made closed access. We wondered whether there was any way in which EPrints could help. The archive system in use (CALM) is not itself a digital repository but is more suited to describing print collections, in the same way that many legacy library management systems are. Was it possible to use EPrints to allow for some level of discovery to take place in CALM and then (in cases where it was permitted) to allow for appraisal and ultimately to direct the end user to the full text document, possibly via the request functionality so that copyright consent could be obtained and the document released?

One of the assumptions we made at the start was that both we and the donor organisation had a shared understanding of the terminology in use here. For example, the metadata supplied by GTCE assigned each of the documents a category of “open” or “closed” access. It became apparent that the traditional archival definition of closed access (not available for access) was not what had been meant by GTCE. For example, it sometimes referred to a password-protected PDF for which no password had been supplied, and which was therefore inaccessible in that form even though not necessarily unreleasable. As an aside, this points to a digital preservation issue which was outside the scope of our project.

From the technical side, it wasn’t too hard to agree on a field mapping and ingest the metadata and documents. We used a basic Dublin Core scheme, largely because there was no time to devise anything more complex. There was an issue over reconciliation of filenames and paths which caused some problems, but once resolved it was all eminently doable.
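For anyone facing a similar reconciliation problem, here is a minimal sketch of the kind of check that would have helped us. It is not the script we used: the function names, and the assumption that mismatches come from path separators, letter case and stray slashes, are mine for the purposes of illustration. It compares the paths listed in a metadata spreadsheet against the paths actually found on disk.

```python
from pathlib import Path


def normalise(path_text):
    """Reduce a path to a comparable key (assumed sources of mismatch only)."""
    # Treat Windows and POSIX separators alike, ignore case and stray slashes.
    return path_text.strip().replace("\\", "/").strip("/").lower()


def list_files(root):
    """Return every file under root as a path relative to root."""
    root = Path(root)
    return [p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()]


def reconcile(listed_paths, disk_paths):
    """Compare paths from the spreadsheet with paths found on disk.

    Returns (matched, missing, orphaned):
      matched  - dict mapping each listed path to the disk path it matches
      missing  - listed paths with no file on disk
      orphaned - disk paths that no metadata row refers to
    """
    listed = {normalise(p): p for p in listed_paths}
    on_disk = {normalise(p): p for p in disk_paths}
    matched = {listed[k]: on_disk[k] for k in listed.keys() & on_disk.keys()}
    missing = sorted(listed[k] for k in listed.keys() - on_disk.keys())
    orphaned = sorted(on_disk[k] for k in on_disk.keys() - listed.keys())
    return matched, missing, orphaned
```

In use, something like reconcile(paths_from_spreadsheet, list_files("/path/to/drive")) gives three lists to work through before any ingest is attempted, which is rather easier than discovering the mismatches one failed upload at a time.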

The trouble began when we started to review the uploaded documents prior to release. It was then that the access issues described above began to become clear. In one case, it became apparent that EPrints’ full text search indexing was designed in a way which meant that a document would (even if not released) have been open to searching for personal names, potentially picking up those documents in which data protection issues existed. Therefore, making the document itself unavailable was not sufficient protection. This highlights a limitation when using EPrints for digital archives, though to be fair that was not what EPrints was originally designed for. In the event, the project taught us some valuable lessons and was a useful and practical way of introducing ourselves to digital archives. It is a field which is still in its infancy, but there are some direct correlations with some of the work that we have been doing in libraries. For example, DERA was created to preserve an at-risk digital collection of published official documents. The areas of overlap are ingestion, where we concluded that a more controlled form of this (perhaps with some filename verification) would have made things more straightforward, and a requirement to use more digital preservation techniques. Watch this space!
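To make the indexing problem concrete, here is a toy illustration. This is emphatically not how EPrints builds its index; the record layout (“id”, “access”, “text”) and the function names are assumptions of mine. The point is simply that the access decision has to be applied when the index is built, not only when the file is served.

```python
from collections import defaultdict


def build_index(documents):
    """Map each word to the ids of documents that are cleared for release."""
    index = defaultdict(set)
    for doc in documents:
        if doc["access"] != "open":
            # Closed documents contribute nothing to the index, so a search
            # for a personal name cannot reveal that they exist.
            continue
        for raw_word in doc["text"].lower().split():
            word = raw_word.strip(".,;:()\"'")
            if word:
                index[word].add(doc["id"])
    return index


def search(index, term):
    """Return the sorted ids of indexed documents containing the term."""
    return sorted(index.get(term.lower(), set()))
```

If the check on “access” is left out of build_index, a search for a name will return the closed document’s record even though the file itself cannot be downloaded, which is essentially the situation we found ourselves in.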

New POPE has been appointed

6 03 2014

Preservation of Official Publications in Education (POPE) is our JISC-funded project, which started on 1st November. This four-month project will visit all those nasty dead links in MARC21 856 fields which relate to Official Publications, and attempt to trace and ingest the digital copies into our DERA EPrints system. We are using the Open Government Licence to do so. Now this is an example of common sense in IPR which leads to real benefit to the authors (those for whom copyright is said to exist in the first place). Whoever the authors are that contributed to educational policy development under the Labour Administration, they can rest assured that their work will be saved for posterity and will allow historical research to be conducted into that Government’s views on education in the UK.
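For the technically curious, the dead-link hunt can be sketched roughly as follows. This is an illustration under stated assumptions rather than the project’s actual tooling: I assume the URLs have already been pulled out of subfield $u of each 856 field into a plain dictionary (the “id” and “856u” keys are my own invention), and I use only the Python standard library to test each link.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url, or None if it cannot be fetched."""
    try:
        request = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # The server answered, but with an error such as 404 or 410.
        return error.code
    except (OSError, ValueError):
        # Unreachable host, timeout, or a malformed URL.
        return None


def find_dead_links(records, get_status=fetch_status):
    """List (record id, url, status) for every link that appears to be dead.

    Each record is assumed to look like {"id": ..., "856u": [url, ...]}.
    """
    dead = []
    for record in records:
        for url in record.get("856u", []):
            status = get_status(url)
            if status is None or status >= 400:
                dead.append((record["id"], url, status))
    return dead
```

One caveat: some servers refuse HEAD requests even when the page is perfectly healthy, so anything flagged this way is a candidate for a second look rather than a confirmed casualty.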

Contrast this with the general position on IPR that UK (and sometimes EU) legislation imposes upon us. I went to a fascinating meeting at JISC the other week in which it was explained that text mining in order to extract metadata which gives alternate access points to content is probably illegal unless explicit permission is granted from the rightsholder. This sits in direct opposition to the great concepts being developed by organisations such as the Resource Discovery Taskforce which encourage us to repurpose metadata as linked data and allow new access applications to be built by other communities. More critically for us at the moment, this stifles innovation and may harm our digital economy when other jurisdictions have a more measured view of the purpose of this.

My vote would be that we librarians should support the British Library’s attempts to get text mining added to the list of exceptions in copyright law, which would allow it to take place. Otherwise, we will remain chained to our single-point-of-view, silo-based, fragmented search systems with islands of open data here and there.