Why should universities care about identifiers?

17 August 2012 - 3:19pm — Talat Chaudhri

Why do identifiers matter for research?

Imagine that you are a senior manager in an institution within the UK Higher Education sector with responsibilities for research: you have read some basic details about unique researcher identifiers and perhaps institutional identifiers. However, it may not be immediately apparent just how important these issues are: on the face of it, they may seem a relatively superficial or trivial organisational matter. Clearly, any such strategic decision-maker will long have been aware of the demands of the Research Excellence Framework (REF) and its predecessor the Research Assessment Exercise (RAE), in which successful reporting of the best research outputs of university departments is crucial to the on-going funding of the institution. This is particularly central to the work of research-led universities, which form an increasingly competitive sector: even universities that formerly focussed more on teaching than research are increasingly aware of the need to drive up standards of quality research in order to secure additional funding.

The reality of unique identification in research

However, as anyone who has actually engaged with the business of research reporting to any degree will tell you, it is far from a superficial or trivial matter to carry out such an exercise without thinking very carefully about how researchers are identified. Moreover, identifying the research groups, departments, projects and institutions that they may have belonged to at different times, all of which may have been re-organised on many occasions, is a considerable challenge that raises technical as well as organisational issues.

Perhaps the biggest problem of all derives from the scale of research reporting. On such a massive scale, it has to be done in a systematic way across higher education institutions in order to be useful. Any lack of a systematic approach in collecting the information at the institutional level will inevitably result in higher costs when the information is later processed into a useful form, for example by governmental organisations such as HESA and the Research Councils (RCUK) relevant to each area of academic study. Research reporting may be carried out for a variety of reasons, for example:

The need to produce statistics at a national and at an institutional level in order to gauge how successfully different parts of the research community are performing in comparison to each other and to similar institutions internationally, which may be a determinant of how funding is allocated.
The production of good, widely accessible information about the work of academic researchers and research groups for the purposes of future research, both in identifying research as a basis for future work and in guiding individuals and groups as to who they might work with in future and who their competitors may be, and in creating wider bibliographic information for a whole range of purposes related to future publications.
Open Access, an increasing requirement imposed by funders where research is publicly funded.
Accountability in the use of public funds for research.

It is precisely the lack of a national approach to providing consistent metadata about individuals and groups connected with research that raises costs, creates inefficiencies and frustrates the development of new software functionality. This makes the jobs of research managers more difficult and ultimately reduces the funds available for research and their best use within the sector. It is therefore the business of senior managers of academic research to care about identifiers.

Researcher identifiers: a crucial first step

Before any wider metadata about research may be considered, the most fundamental issue is identifying the individuals who carry out research. Until this happens consistently on a national level, there is little point in addressing the subsequent issue of consistently identifying the groups and institutions engaged in research. It is also important to consider any national approach in terms of interoperability with other international approaches wherever possible: while, on the one hand, funders and statistics agencies can only hope to mandate national identifier schemes, at the same time it is clear that research collaboration is cross-institutional and international in scope, in some cases including researchers from numerous countries in one project or even in the production of one individual paper, data set or other research activity. This is the approach that has been taken by the JISC, together with RCUK, HESA and other partners in setting up the Research Identifiers Task and Finish Group, which is due to report in October 2012.

One emerging candidate with cross-sector and international support is the ORCID researcher identifier scheme, whose rapid development in 2011-12 is scheduled to culminate in a public launch in October 2012. There are, of course, existing, widely-used but relatively simple identifiers such as the HESA researcher identifier, and identifiers provided through commercial providers' web interfaces, but thus far these have not provided dependable unique identification. All such identifiers could be linked to a system like ORCID that is designed on interoperable principles and is not dependent on any particular software platform or web interface. An alternative approach is taken by the ISNI number: whereas ORCID seeks to offer individual researchers and institutions the ability to manage their data on a distributed model, ISNI represents a centrally moderated, bibliographic approach led by national libraries and other similar institutions with national and strategic responsibilities. It remains to be seen whether these different approaches are in competition or whether they will offer different but complementary functionality within the sector, and much may depend on how software vendors implement them.

Current Research Information Systems (CRIS)

It is not simply a matter of tracking publications and other related outputs, for example in institutional repositories. This part of the equation is by now relatively well established in the UK HE sector, although it continues to develop: the issues surrounding Open Access, for example, have not been fully resolved. This, however, is just at the level of the final outputs of research and does not provide anything like sufficient insight into the processes of research, the projects and groups carrying it out, the staff involved or the costs. Traditionally, this information has been gathered in a very long-winded process that is individual to each institution's particular workflows and processes (although there are obviously great similarities of approach between them), often a partly paper-based exercise that has been migrated to an extremely varied range of systems and databases, few of which are interoperable or complete. Many departments may be involved in the process apart from the institution's research office and the department in which the researchers are based, but perhaps the most significant would be the finance office, the human resources department and the library, to name just the key players. It will be necessary to keep some information confidential, e.g. personal staff information, salaries and so forth, to share some information internally and with research funders, and to publish other information, e.g. in a research repository that forms the institution's "shop window" of public outputs, library databases and so forth. The term Research Information Management (RIM) has emerged to cover all of these information gathering and information processing activities.

In order to do this systematically, more sophisticated research information management software has been developed, often known as Current Research Information Systems (CRIS). The market in the HE sector is currently led, in terms of the number of institutions adopting the software, by PURE, produced by ATIRA; other major players are Symplectic Elements, and CONVERIS, produced by AVEDAS. A more recent entrant to this market is Thomson Reuters' Research in View. There are currently no open source products, although a JISC-funded modular approach by the Research Management and Administration Service (RMAS) project may have an increasing impact in this area, depending on subsequent adoption by HE institutions. It is not an overstatement to say that HE institutions are currently in a rush towards early adoption of CRIS software, motivated by the need to use research data to compete with each other for funding opportunities.

Next steps: organisational identifiers

In the next 2-3 years, it is likely that the matter of unique researcher identification will be resolved through the emergence of a dominant standard that has sufficient take-up and leverage in the UK and international HE sector to facilitate the work of research institutions and funders. Following this, there will be organisational structures associated with research that will require unique identification, often on a multi-layered basis: for example, a project may be based at several institutions, perhaps internationally, and its staff may be in various departments or similar units whose names have changed or which have been merged or de-merged at various times, all of which will require careful date and time stamping to make the information reliable for the period that it covers. There will be issues related to copyright, commercialisation and spin-off companies that make the precise provenance of research critical to the future success of academic research and development. Standards for organisational identifiers are therefore the next important issue on the horizon. As with researcher identification standards, research managers and senior managers with strategic responsibility for research will need to keep abreast of this rapidly developing area.
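To make the point about date stamping more concrete, the following is a minimal sketch in Python of how dated affiliation records might be held and queried. It is an illustration only and does not reflect any existing standard or CRIS product: the class name Affiliation, the function units_on, the identifiers "R1", "dept-history" and "school-humanities", and the dates are all hypothetical, invented for this example.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class Affiliation:
    """One dated link between a researcher and an organisational unit.

    All identifiers here are hypothetical placeholders, not real scheme values.
    """
    researcher_id: str
    unit_id: str
    start: date
    end: Optional[date] = None  # None means the affiliation is still current

    def covers(self, day: date) -> bool:
        # True if the affiliation was in force on the given day (inclusive)
        return self.start <= day and (self.end is None or day <= self.end)


def units_on(records: List[Affiliation], researcher_id: str, day: date) -> List[str]:
    """Return the unit identifiers a researcher belonged to on a given day."""
    return sorted(
        r.unit_id
        for r in records
        if r.researcher_id == researcher_id and r.covers(day)
    )


# Assumed scenario: a department is merged into a larger school in August 2009,
# so the same researcher has two consecutive affiliation records.
RECORDS = [
    Affiliation("R1", "dept-history", date(2005, 1, 1), date(2009, 7, 31)),
    Affiliation("R1", "school-humanities", date(2009, 8, 1)),
]

print(units_on(RECORDS, "R1", date(2008, 5, 1)))
print(units_on(RECORDS, "R1", date(2010, 1, 1)))
```

Because each record carries its own start and end dates, a report for a given period can be attributed to the unit that existed at the time, rather than to whatever the unit is called today.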
