Searching for author names in online systems is fraught with difficulty because there are so many possible variations for each name. The Names Project promises to provide a much-needed solution for the many institutional repositories which are being developed.
Universities and research institutes around the world have enthusiastically embraced the idea of creating digital repositories of their faculties' work. In the UK, the OpenDOAR directory (www.opendoar.org) reports 167 operational repositories, and the growth in the number of such repositories is illustrated in Figure 1.
In some institutions the process of submitting an item to the repository is entirely controlled by repository staff members, but in over 80% of repositories any member of university staff can submit an item, which can lead to inconsistent author names. With JISC funding, the British Library and Mimas (at The University of Manchester) started the Names Project in July 2007 to investigate the feasibility of setting up a Names Authority Service, using Zetoc data as its basis.
Allowing as many people as possible to supply materials can encourage engagement with the repository, but one consequence of this approach is that the information provided will not be as consistent as it might be if submission were the responsibility of a smaller number of individuals. An author whose name is Alexandra Nicole Rose, for example, might be listed in the following ways:
Rose, A. N.
Rose, Alexandra N.
Rose, Alexandra Nicole
Rose, Alexandra
Used as a search term, each of these variant forms can return a different set of materials for that author, making it difficult to retrieve all the research by a single individual in one operation.
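To make the problem concrete, the following is a minimal sketch in Python of one way the variant forms above could be recognised as potentially the same person. The functions `name_key` and `compatible` are hypothetical names introduced for illustration; this is not the Names Project's matching method, only an assumption about how a simple "surname plus initials" comparison might work.

```python
def name_key(name: str) -> tuple[str, str]:
    """Reduce 'Surname, Forenames' to (lower-case surname, forename initials).

    Hypothetical helper: assumes the 'Surname, Forenames' layout shown above.
    """
    surname, _, forenames = name.partition(",")
    # Treat full stops as separators so 'A. N.' and 'Alexandra N.' both split cleanly.
    initials = "".join(part[0].upper() for part in forenames.replace(".", " ").split())
    return surname.strip().lower(), initials


def compatible(a: str, b: str) -> bool:
    """True if the surnames match and one set of initials is a prefix of the other."""
    (surname_a, initials_a), (surname_b, initials_b) = name_key(a), name_key(b)
    return surname_a == surname_b and (
        initials_a.startswith(initials_b) or initials_b.startswith(initials_a)
    )


variants = [
    "Rose, A. N.",
    "Rose, Alexandra N.",
    "Rose, Alexandra Nicole",
    "Rose, Alexandra",
]

# Every variant is compatible with the first one under this rule.
all_match = all(compatible(variants[0], v) for v in variants)
```

A comparison like this only shows that two forms could refer to the same person. It cannot, on its own, tell apart two different people who share a name.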
There are related issues in some repositories when two different people have similar names. At the University of Birmingham, for example, there are two people called Andrew J. Schofield: one a professor of theoretical physics and the other a senior lecturer in the School of Psychology.
In the July 2010 survey, repository managers were asked about these name-related problems in the retrieval and description of resources in repositories. Eighty per cent of respondents reported that at least one of these issues was having an impact on their work.
A developer joined the team in early 2008, allowing work to begin on building a prototype to test the feasibility of a name authority system which would be useful for repositories and other services.
The project partners, the British Library and Mimas, already had a successful record of jointly providing the Zetoc service for the UK academic community (http://zetoc.mimas.ac.uk/). Zetoc gives access to information about the millions of journal articles and conference proceedings which have been added to the holdings of the British Library since 1993. Consequently, it holds the names of individuals who have contributed to those materials, many of whom are currently active researchers: the sort of people who may be depositing copies of their work into institutional and subject-based repositories.
The fit between the contents of Zetoc and the aims of the Names Project was too good to overlook. The project team decided to use the information in Zetoc to create skeleton name authority records. From a test Zetoc data sample, for example, the following information has been grouped together about an individual with the name E. L. Jones (Zetoc contains only initials and surnames):
Fig. 3: Sample Names record for E. L. Jones
From this we can see that the automated matching process has matched E. L. Jones with two journal articles and a conference paper. From these, the author's fields of interest can also be identified, as can her co-authors. This example illustrates a potential problem with the source data: the name L. R. Prosnitz appears as a collaborator, as does L. P. Prosnitz. It is possible that these are two different people, but more likely that there is a mistake in the source data.
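The grouping and the data-quality check described above can be sketched roughly as follows. This is an illustrative assumption, not the project's actual matching process, which is not described here: the titles and subjects are placeholders, authors are grouped on the exact name string, and `build_skeletons`, `split_name` and `suspicious_pairs` are hypothetical names. Only the three personal names come from the example above.

```python
from collections import defaultdict
from itertools import combinations

# Placeholder citation rows: (title, type, subject, author names).
# Titles and subjects are invented purely for illustration.
records = [
    ("Article A", "journal article", "Subject X", ["E. L. Jones", "L. R. Prosnitz"]),
    ("Article B", "journal article", "Subject X", ["E. L. Jones", "L. P. Prosnitz"]),
    ("Paper C", "conference paper", "Subject Y", ["E. L. Jones"]),
]


def build_skeletons(rows):
    """Group works, subjects and co-authors under each author name string."""
    skeletons = defaultdict(lambda: {"works": [], "subjects": set(), "coauthors": set()})
    for title, kind, subject, authors in rows:
        for author in authors:
            rec = skeletons[author]
            rec["works"].append((title, kind))
            rec["subjects"].add(subject)
            rec["coauthors"].update(a for a in authors if a != author)
    return skeletons


def split_name(name):
    """Split 'E. L. Jones' into ('Jones', ['E', 'L']); assumes a one-word surname."""
    *initials, surname = name.replace(".", " ").split()
    return surname, initials


def suspicious_pairs(names):
    """Pairs with the same surname whose initials differ in exactly one position."""
    pairs = []
    for a, b in combinations(sorted(names), 2):
        (surname_a, init_a), (surname_b, init_b) = split_name(a), split_name(b)
        differences = sum(x != y for x, y in zip(init_a, init_b))
        if surname_a == surname_b and len(init_a) == len(init_b) and differences == 1:
            pairs.append((a, b))
    return pairs


jones = build_skeletons(records)["E. L. Jones"]
flagged = suspicious_pairs(jones["coauthors"])  # candidate errors for human review
```

Flagged pairs are only candidates for review: as noted above, two near-identical names may belong to two different people.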
Once a complete analysis of the Zetoc data has been carried out, the Names Project will have millions of these basic records, covering researchers all over the world. The next step will be to improve the data, for example by adding full first names and institutional affiliation information.
Amanda Hill is project manager for the JISC-funded Names Project (http://names.mimas.ac.uk/). She is an archivist who has worked for a range of university and local authority archives. Between 2001 and 2007 she ran the Archives Hub service for Mimas at The University of Manchester (www.archiveshub.ac.uk). In 2007 she emigrated from the UK to Canada, where she now works as a consultant (www.hillbraith.com) with a strong interest in improving and widening access to institutional resources of all types.