Geo-Crosswalk – a gazetteer service and server for the UK

Presentation at the NKOS Workshop, JCDL 2002, on Digital gazetteers: integration into distributed digital library services, July 18, 2002

Andy Corbett¹, James Reid¹, David Medyckyj-Scott¹, Cressida Chappell²

¹ EDINA, University of Edinburgh, Edinburgh, Scotland. Email: a.corbett@ed.ac.uk

² History Data Service, University of Essex, England.

EDINA and the History Data Service are currently being funded by the UK Joint Information Systems Committee (JISC) to develop a prototype UK gazetteer service. The project follows on from a very successful scoping study carried out in 2001.

The aim of the project is to demonstrate the practicability of providing a full service to enhance geographic searching within the UK tertiary education electronic information environment. The resulting service will give researchers and teaching staff access to an online gazetteer for reference, and will act as a place names authority supplying authoritative place names for use as index terms when indexing learning objects. The novel aspects of the proposal, however, are that the gazetteer service will also:

1)    act as a ‘middleware’ server for other information services, such as those found in Digital Libraries, that wish to use geographic searching as part of their own service, without having to deal with the complexity of the wide variety of geographies that exist in the UK

2)    assist in the semi-automatic geographic indexing of descriptions of information objects through geo-parsing functions.
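As an illustration of the second point, the following is a minimal sketch of a geo-parsing step in Python. It is not the project's implementation: the tiny gazetteer, its identifiers and the function name geoparse are hypothetical, and a real geo-parser would also need to handle multi-word names and ambiguous matches. The output is a list of candidate place references that an indexer could confirm or reject, which is what makes the indexing semi-automatic.

```python
import re

# Hypothetical miniature gazetteer: lower-cased name -> identifier.
# The identifiers are invented for illustration only.
GAZETTEER = {
    "edinburgh": "place:1",
    "fife": "place:2",
    "auchterderran": "place:3",
}


def geoparse(text, gazetteer=GAZETTEER):
    """Return (matched text, identifier, character offset) for each known name."""
    hits = []
    # Scan single words only; multi-word names are out of scope for this sketch.
    for match in re.finditer(r"[A-Za-z][A-Za-z'-]*", text):
        key = match.group(0).lower()
        if key in gazetteer:
            hits.append((match.group(0), gazetteer[key], match.start()))
    return hits


# Candidate index terms for a human indexer to review.
candidates = geoparse("Parish records for Auchterderran in Fife, Scotland.")
```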

The project thus brings together many of the aspects (protocols, data modelling, geo-parsing) to be addressed by this workshop.

The gazetteer will be based upon the Alexandria Digital Library (ADL) Gazetteer Content Standard, with an emphasis on ensuring that geographic features are represented by their appropriate geometry. This means that settlements are represented as areas, rivers as lines and so on. The gazetteer will contain large amounts of different types of boundary data, thereby permitting ‘cross-walking’ from one geography to another. Although the focus is on near-contemporary data, the database is also designed to handle historic data and thereby provide a link to another UK project which is looking at the historical aspect.
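The idea of cross-walking can be sketched as follows. This is a simplified illustration under stated assumptions, not the project's data model: footprints are reduced to bounding boxes rather than true areas or lines, and the Feature class, the crosswalk function and all the sample names and coordinates are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    name: str
    geography: str  # the geography a feature belongs to, e.g. "settlement" or "parish"
    bbox: tuple     # simplified footprint: (min_x, min_y, max_x, max_y), arbitrary units


def overlaps(a, b):
    """True if two bounding boxes intersect (touching edges count)."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def crosswalk(feature, features, target_geography):
    """Names of the units in target_geography whose footprint meets the feature's."""
    return [
        f.name
        for f in features
        if f.geography == target_geography and overlaps(feature.bbox, f.bbox)
    ]


# Invented sample data: one settlement straddling two of three parishes.
town = Feature("Town A", "settlement", (2, 2, 4, 4))
parishes = [
    Feature("Parish P", "parish", (0, 0, 3, 5)),
    Feature("Parish Q", "parish", (3.5, 0, 8, 5)),
    Feature("Parish R", "parish", (10, 10, 12, 12)),
]
result = crosswalk(town, parishes, "parish")
```

With real boundary data the overlap test would operate on the full geometries, but the lookup pattern (start from a footprint in one geography, return the intersecting units of another) stays the same.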

The UK provides a good opportunity to test many aspects of the ADL Gazetteer Content Standard. First, UK boundaries, be they physical, administrative or political, can be very complex. They may be made up of more than one polygon, are frequently revised after comparatively short periods of use, and are derived from different sources. Second, our long history means that handling name changes (AUCHTERDERRAN in Fife, Scotland, has at least 21 alternative names or name spellings, e.g. Auchterderay, Ochtirderay, Urchan, Hurkyndorath (1059)) and changes in the ‘footprint’ or location of a place over time become important concerns.
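One way to picture the name-change problem is a place record that carries its variant names alongside the preferred form. The sketch below is illustrative only: the PlaceName and Place classes and the find_place function are hypothetical, only four of the variants quoted above are included, and treating 1059 as the year in which Hurkyndorath is attested is an assumption about the figure in parentheses.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class PlaceName:
    text: str
    year: Optional[int] = None  # assumed year of attestation, when one is given


@dataclass
class Place:
    preferred: str
    variants: list = field(default_factory=list)

    def matches(self, query):
        """Case-insensitive match against the preferred name or any variant."""
        q = query.casefold()
        if q == self.preferred.casefold():
            return True
        return any(q == v.text.casefold() for v in self.variants)


def find_place(query, places):
    """Return the first place known under the queried name, or None."""
    for place in places:
        if place.matches(query):
            return place
    return None


auchterderran = Place(
    "Auchterderran",
    [
        PlaceName("Auchterderay"),
        PlaceName("Ochtirderay"),
        PlaceName("Urchan"),
        PlaceName("Hurkyndorath", 1059),
    ],
)
found = find_place("hurkyndorath", [auchterderran])
```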

Our presentation will describe some of the implementation problems we are encountering; highlight difficulties with existing descriptive terms for dealing with boundaries and places; and posit questions for which we are struggling to find answers. For example, the variety of scales of the source data has serious implications when considering whether place X actually resides in area Y, and may make it necessary to state a confidence in the result of any search undertaken. While this is acceptable for interactive use, it is less so when communication is taking place between servers.
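To make the containment question concrete, the sketch below classifies a point against a polygon as "inside", "outside" or "uncertain", where "uncertain" means the point lies within a positional tolerance of the boundary. This is one possible approach and not a description of the project's method: the function names, the three-valued result and the idea of deriving the tolerance from the scale of the source data are all assumptions made for illustration.

```python
import math


def point_in_polygon(point, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge cross the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def segment_distance(point, a, b):
    """Shortest distance from point to the segment a-b."""
    px, py = point
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return math.hypot(px - ax, py - ay)
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / length_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def containment(point, polygon, tolerance):
    """Classify a point, allowing for positional uncertainty near the boundary."""
    n = len(polygon)
    nearest = min(
        segment_distance(point, polygon[i], polygon[(i + 1) % n]) for i in range(n)
    )
    if nearest <= tolerance:
        return "uncertain"
    return "inside" if point_in_polygon(point, polygon) else "outside"


# Invented example: a square area and a tolerance of one unit.
area_y = [(0, 0), (10, 0), (10, 10), (0, 10)]
verdict = containment((9.5, 5), area_y, tolerance=1.0)
```

A fixed vocabulary of outcomes such as this is easier for another server to act on than a free-text caveat, although whether three values are enough is exactly the kind of open question raised above.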

While the original intention was for a facility to support libraries and information services within tertiary education, interest in the project (particularly the service) from outside this community has been considerable. Stakeholders now include the national mapping agency, the national statistics agency, local and central government as well as museums, libraries and archivists. Part of the project will therefore involve managing this interest and channeling it to collectively formulate a strategy for turning the prototype into a sustainable long-term facility.

Notes

The UK Joint Information Systems Committee is the strategic advisory committee which works on behalf of the funding bodies for further and higher education in England, Scotland, Wales and Northern Ireland.

EDINA, based at Edinburgh University Data Library, is a JISC-funded national data centre. It offers the UK tertiary education and research community networked access to a library of data, information and research resources. EDINA runs a number of geo-spatial data services providing maps, gazetteer, data services and teaching resources.

The History Data Service (HDS) is part of both the UK Data Archive at the University of Essex and the Arts and Humanities Data Service. HDS collects, preserves, and promotes the use of digital resources, which result from or support historical research, learning and teaching.