Information professionals and linked data

A survey of information professionals in libraries, archives and museums set out to discover their attitudes to using linked data.

Last year a questionnaire was distributed to libraries, archives and museums with the aim of exploring the benefits and challenges of using Linked Data as perceived by information professionals. The results of our survey, as well as a potential solution to some of the challenges identified, are discussed below.

The semantic web and linked data

The Web contains a vast amount of information presented in the form of webpages linked together via hyperlinks. In order to find specific resources on the Web, people use search engines, which rank webpages by relevance to a keyword search. While this works to great effect, computers, unlike humans, have very little understanding of the meaning of the data on these webpages, nor do they understand how the pages relate to each other.

The Semantic Web[1] (SW) is an extension of the current Web in which information is given well-defined meaning, for example, indicating whether a data entity refers to a geographic location, a person’s name, or a book title. Linked Data[2] (LD) involves creating identifiers for these entities and then linking them together by meaningfully describing how they are related, for instance, describing how Jane Austen is the author of the novel Pride and Prejudice. These entities can then be linked to any number of other related resources, such as publishers, illustrators, plays and films, creating a web of related data.
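To make the idea concrete, the Jane Austen example can be written as a handful of subject-predicate-object statements ("triples"). The following is only a minimal sketch in plain Python: the example.org identifiers and the basedOn link type are hypothetical placeholders invented for illustration, while dcterms:creator is borrowed from the Dublin Core vocabulary as one plausible way to express authorship.

```python
from typing import List, NamedTuple


class Triple(NamedTuple):
    """One Linked Data statement: subject, predicate (link type), object."""
    subject: str
    predicate: str
    obj: str


# Hypothetical namespace and identifiers, used here for illustration only.
EX = "http://example.org/"
AUSTEN = EX + "person/jane-austen"
NOVEL = EX + "book/pride-and-prejudice"
FILM = EX + "film/pride-and-prejudice-adaptation"
BASED_ON = EX + "vocab/basedOn"  # hypothetical link type

# Dublin Core "creator" property, assumed here to express authorship.
DCT_CREATOR = "http://purl.org/dc/terms/creator"

graph = [
    Triple(NOVEL, DCT_CREATOR, AUSTEN),  # the novel was created by Jane Austen
    Triple(FILM, BASED_ON, NOVEL),       # a film adaptation is based on the novel
]


def to_ntriples(triples: List[Triple]) -> str:
    """Serialise triples whose three parts are all identifiers as N-Triples lines."""
    return "\n".join(
        f"<{t.subject}> <{t.predicate}> <{t.obj}> ." for t in triples
    )


def related(resource: str, triples: List[Triple]) -> List[Triple]:
    """Return every statement in which the resource is the subject or the object."""
    return [t for t in triples if resource in (t.subject, t.obj)]


print(to_ntriples(graph))
```

Calling related(NOVEL, graph) returns both statements, which is the "web of related data" in miniature: starting from the novel, a program can reach both its author and its adaptation.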

Why should information professionals be interested?

From the perspective of libraries, archives and museums (LAMs), publishing metadata as per SW standards has the potential to greatly enhance information discovery through the interlinking of related resources across institutions. Additionally, by freeing metadata from databases and sharing it on the SW, LAM data could be accessed by SW search engines, increasing resource visibility on a global scale. Despite these benefits, relatively few LAMs have invested in LD projects, and the majority of these display limited interlinking across datasets and institutions.

Our research

We conducted a survey of information professionals (IPs) in order to investigate their position in relation to LD, with a particular focus on exploring the reasons behind the slow uptake of LD in LAMs and the noted lack of resource interlinking. The survey was completed by 185 librarians, archivists, metadata cataloguers and LAM/LD researchers. These participants represented a variety of institutions including: Academic Libraries (56%), Research Institutions (7%), Public Libraries (7%), Special Libraries (6%), Archives (6%), National Libraries (5%), Museums (4%), and Special Archives (1%). Participants also represented 20 different countries, with the majority coming from Ireland (28%), the USA (23%) and the UK (20%). Most participants had at least some prior knowledge of the SW (84%) and LD (90%).

Key findings

Over 80% of participants indicated that publishing and consuming LD would benefit the LAM domain. The main benefits mentioned include improved data discoverability and accessibility, as well as easier metadata sharing and resource interlinking across institutions.

Over 60% of participants indicated that LAMs face multiple barriers to using LD, particularly in the areas of LD tooling, data integration, data interlinking, and resource quality. Participants mentioned that tools are often technologically complex and unsuitable for the needs and workflows of LAMs. With regard to data integration, participants indicated that mapping between the different controlled vocabularies used across different datasets poses a significant challenge.

A more in-depth exploration of the interlinking issue highlighted the processes of ontology and link type selection (determining and describing the relationship between two entities) as areas of particular difficulty. Participants also highlighted concerns regarding the quality, especially in relation to authority control, and the reliability of many currently published LD resources.
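As an illustration of why link type selection is hard, the sketch below shows the kind of decision a cataloguer has to make when connecting a local record to a record in another dataset. The rule itself is a hypothetical simplification and was not proposed by survey participants; owl:sameAs and skos:closeMatch are real properties, but using a name and birth year as the matching criteria is an assumption made purely for this example.

```python
from typing import Optional

# Real link types from the OWL and SKOS vocabularies.
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
SKOS_CLOSE_MATCH = "http://www.w3.org/2004/02/skos/core#closeMatch"


def choose_link_type(local: dict, remote: dict) -> Optional[str]:
    """Pick a link type for two person records (hypothetical rule of thumb).

    - same name and same birth year: treat as the same entity
    - same name only: treat as a close, but unconfirmed, match
    - otherwise: do not link
    """
    same_name = local["name"].casefold() == remote["name"].casefold()
    same_birth = (
        local.get("born") is not None and local.get("born") == remote.get("born")
    )
    if same_name and same_birth:
        return OWL_SAME_AS
    if same_name:
        return SKOS_CLOSE_MATCH
    return None


local_record = {"name": "Jane Austen", "born": 1775}
remote_record = {"name": "jane austen", "born": 1775}
print(choose_link_type(local_record, remote_record))
```

Even this toy rule shows the trade-off: a strong link type such as owl:sameAs asserts that two identifiers denote exactly the same thing, so choosing it on weak evidence feeds directly into the quality and reliability concerns participants raised.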

When asked about potential solutions to the aforementioned challenges, 89% of participants rated the idea of an LD interlinking tool designed specifically for LAMs as useful. One reason given was that a bespoke tool could increase the number of LAMs using LD if designed with the workflows and requirements of such institutions in mind. Participants also stated that LAM-specific tooling could help overcome the technical knowledge gap currently experienced by IPs when using LD.

Future directions

Due to the positive response toward designing an LD interlinking tool specifically for LAMs, our future research will focus on the development of a framework to support IPs during the interlinking process. This framework will include interlinking guidelines for LAMs as well as a user interface tailored to the needs, workflows, and expertise of IPs.

Conclusion

Nowadays, with the Web often being the first and only place where people search for information, it is of great importance that LAMs make their data available online, where it can be found by search engines and interact with other information resources. LD offers a means for LAMs to achieve this. Additionally, as experts in the field of metadata creation and knowledge discovery, IPs are well positioned to play a key role in the evolution of the SW. Therefore, we believe that facilitating and supporting IPs in using LD will be of great benefit to the SW as a whole. The results of our survey offer an insight into the barriers IPs experience when working with LD. We hope that LAMs can use this information to address the identified challenges and to develop potential solutions to them.

A more in-depth discussion of our survey and its results can be found in: McKenna, L., Debruyne, C., & O'Sullivan, D. (2018). Understanding the Position of Information Professionals with regards to Linked Data: A Survey of Libraries, Archives and Museums. In JCDL’18: The 18th ACM/IEEE Joint Conference on Digital Libraries, June 3–7, 2018, Fort Worth, TX, USA.[3]

______________________________________________________________

Lucy McKenna is completing her PhD under the supervision of Declan O’Sullivan and Christophe Debruyne in the ADAPT Centre, Trinity College Dublin. Funded by Science Foundation Ireland, ADAPT is a dynamic, multi-institutional research centre focused on developing next-generation digital technologies. Lucy's research is in the area of Linked Data for libraries, archives and museums. Lucy obtained a Master's in Library and Information Studies from University College Dublin in 2015.
_____________________________________________________________________

[1] https://www.w3.org/standards/semanticweb/

[2] https://www.w3.org/standards/semanticweb/data

[3] https://dl.acm.org/citation.cfm?id=3197041
