We are used to thinking of data as sitting beneath information and knowledge in a pyramid of value. But the concept of data has grown significantly in importance in the last five years, driven by public speculation about the power and risks of big data. In parallel, data-related roles are becoming increasingly important in professional practice. Librarianship is becoming more of a data profession.
The Research Data Management example – familiar tasks
One obvious starting point for thinking about this impact in the academic library context is to analyse the specific case of Research Data Management (RDM). This is an area where dealing with data is clearly central. But is this familiar territory, simply reinventing or extending what librarians already do, or is a new set of competencies required?
The most direct way to understand what RDM means to academic librarians in practice is to think of some of the key tasks that are involved in delivering a research data service. These tasks would include:
- helping a researcher to find pre-existing data sources relevant to their research
- running a training or awareness session
- reviewing metadata associated with a potential deposit into a data repository
- investigating what researchers need in terms of support
- inputting to the creation of a data policy framework
- offering advice on a Data Management Plan (DMP) for a project proposal
Nearly all of these activities have strong continuities with what we already expect to do as librarians.
What could be more natural than for a librarian to find themselves helping someone search for and evaluate a source, albeit data rather than a published text? Although it may require knowledge of data sources, data archives and licence conditions, overall this aspect of RDM feels like a very familiar role that a librarian would have the requisite skills for (Gregory et al., 2018).
Adding something about RDM to an existing user training session would fit naturally with the information literacy focus of academic librarianship. There are continuities with open access advocacy, and there are learning outcomes that would fit into existing information literacy sessions for researchers.
Reviewing a potential deposit to the data repository is again familiar territory. It is a fairly standard library task rooted in collection management principles and based on an understanding of the importance of metadata and standards. It might well be combined with the role of monitoring metadata related to outputs in the repository.
Gathering requirements from researchers about their support needs should come naturally to a user service-focused profession like librarianship. Librarians are used to gathering data from interviews, focus groups and surveys to discover what services users need, and then designing services or procuring systems to meet these needs. RDM takes us deeper into the research process as an aspect of scholarly communication, but a strong interest in user behaviour is a good starting point for carrying through this task.
Data policy is about creating a governance structure within which data is valued and managed. Contributing to the development of such a policy is probably a fairly familiar task, one requiring a good understanding of the wider policy context, such as relevant institutional and national policies.
Helping a researcher write their DMP could be the most unfamiliar of all the tasks listed above. While the skills needed to do it effectively are the same as for any advice service, this particular support requires a fairly deep knowledge of funder requirements; of relevant standards; of local data management processes (e.g. around data storage); as well as a feel for the research process.
Thus much of what RDM is about could be considered familiar territory: it involves acquiring new knowledge, but much of the role itself is familiar. There are plenty of ways that roles in RDM build on skills and knowledge that most librarians already have (Cox and Verbaan, 2018).
RDM – less familiar tasks
There are RDM-supporting tasks that are less familiar. These could include roles in data curation, data carpentry, data integrity, data analysis and visualisation, and embedded roles in research project teams. These sit at the specialist or cutting edge of research data services (RDS) and would include:
- Data curation - the long-term digital preservation of datasets. Although traditionally an aspect of library work, this is more the territory of archives
- Data carpentry - understanding how to manipulate and transform data, preparatory to analysis
- Data integrity - data quality, reproducibility and open science
- Embedded roles - working directly with a research team, breaking out of the library to work with researchers on a daily basis
- Analysis and visualisation - supporting or undertaking these tasks, or selecting and supporting computational tools for analysis
It remains to be seen whether academic libraries will start to see these roles as standard tasks. Institutions will probably vary depending on how research-intensive they are, among other factors.