RUBRIC Toolkit: Metadata Overview
Metadata is used:
to describe and provide the location of a resource to assist with discovery
to give information about its conditions of use
to govern technical and display issues
to contain long term preservation information
to maintain version information
to manage administrative aspects of the data
Much metadata is machine-generated by the repository, but it is important for repository managers to understand the extent to which metadata in their repositories can be customized for their own purposes. Refer to the section on Metadata and Entering Metadata: Guides and Tools for Repositories for more detailed information on the topic.
Resource discovery and IR interoperability are inseparable issues and both are tied to the effective use of metadata.
Choosing the Right Metadata Standard
Dublin Core is the simplest metadata schema, and it ensures interoperability for search and retrieval among repositories and their harvesters. Whilst it is not the only available standard, it is the baseline metadata standard for the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). The Open Archives Initiative (OAI) develops this protocol to promote interoperability among repositories, and OAI repositories will normally be configured to generate Dublin Core records by default.
While the simple Dublin Core schema provides a basic level of interoperability, the Open Archives Initiative additionally encourages the use of more granular schema. A repository should include other metadata schema to describe its resources and conditions of access more fully for the benefit of its users. Examples of other metadata schema include MARC and an extended version of Dublin Core known as Qualified Dublin Core. Other metadata schema are discussed in the Metadata section.
Metadata Schemes Points of Comparison is one of many useful articles that describe what to look for when comparing the range of standard schema available for specialised purposes.
Choosing a Metadata Standard For Research Discovery by UKOLN includes a checklist for selecting the right standard for your purpose.
The basic criteria to use are:
interoperability
extensibility and growth
sustainability
granularity
ease of use and existing skills
Images and videos present additional preservation, sustainability and rights issues compared with other resource types. There are specialist metadata schema for such non-text materials and these are also discussed briefly in the comprehensive Metadata section.
Metadata for Harvesting
The ARROW Discovery Service has produced a Harvesting Guide (currently being updated) that recommends different levels of metadata content for harvesting. The ARROW Discovery Service is an OAI-compliant national harvester managed by the National Library of Australia.
It liaises with a range of international service providers to manage the harvesting conditions of Australian material, including:
The ARROW Discovery Service's ‘Public Funding, Public Knowledge, Public Access’ explains the agreements with Google and the OAIster service to secure higher rankings for the university repository resources it harvests. OAIster's agreement with Yahoo! and Google is summarized at http://www.oaister.org/sru.html.
IR managers may liaise independently with the same providers or they may choose to register with the ARROW Discovery Service which can negotiate on their behalf.
The following terms are the Dublin Core elements that are used by OAI service providers for harvesting metadata records (each element is repeatable and optional). The terms in bold type are the most essential:
title
subject (includes keywords and controlled vocabularies)
description (includes abstract or other summary)
type (e.g. journal article, conference paper, thesis)
source
relation
coverage
creator
publisher
contributor
rights
date
identifier
language
format
The comprehensive Metadata section explains that all of the above elements describe the document or article deposited in the repository, with one exception: at least one identifier that is also a resolvable link (URI) will need to point to the repository's metadata page that describes and links to that resource. Otherwise a harvesting service will send users straight to the resource and bypass the repository.
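As an illustration only, the following Python sketch builds a simple Dublin Core record in the oai_dc XML form and checks that at least one identifier is an http(s) URI on the repository's own host. The host name repository.example.edu, the sample values and the helper names build_oai_dc and has_repository_identifier are assumptions made for this example and are not part of any repository software. The check cannot tell a metadata page from a direct file link on the same host, so it is a first-pass test rather than a guarantee.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Namespaces used by the oai_dc metadata format in OAI-PMH.
OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("oai_dc", OAI_DC_NS)
ET.register_namespace("dc", DC_NS)


def build_oai_dc(fields):
    """Build an oai_dc record from (element, value) pairs.

    Elements are optional and repeatable, so a list of pairs is used
    rather than a dictionary.
    """
    root = ET.Element(f"{{{OAI_DC_NS}}}dc")
    for name, value in fields:
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return root


def has_repository_identifier(record, repository_host):
    """Return True if any dc:identifier is an http(s) URI on repository_host."""
    for element in record.findall(f"{{{DC_NS}}}identifier"):
        parsed = urlparse(element.text or "")
        if parsed.scheme in ("http", "https") and parsed.netloc == repository_host:
            return True
    return False


# Hypothetical sample record; every value below is invented for illustration.
sample = build_oai_dc([
    ("title", "An example thesis"),
    ("creator", "Citizen, Jane"),
    ("type", "thesis"),
    ("identifier", "http://repository.example.edu/handle/123/456"),
])

print(ET.tostring(sample, encoding="unicode"))
print(has_repository_identifier(sample, "repository.example.edu"))
```

A record that carries only an identifier on some other host would fail the check, which is the situation in which a harvester would lead users past the repository.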
The National Science Digital Library (NSDL) advises that best practice to ensure your IR is harvested is to register with the official OAI Registry.
Connecting with the Harvesters explains who to contact and how to register.
Australian Digital Theses
Australian Digital Theses (ADT) metadata and harvesting requirements vary slightly, even if this material is co-located with other repository material. ADT will still expect to be able to harvest those theses separately from the rest of the repository archive.
The ADT harvester only processes a limited range of Dublin Core elements:
dc.title
dc.creator
dc.subject
dc.description
dc.date
dc.language
dc.publisher
dc.rights
dc.identifier
dc.type
If there are any other DC elements in an ADT record (e.g. dc.format), they will be ignored by the ADT harvester for normal processing and indexing purposes.
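The effect of this rule can be sketched in a few lines of Python. The element list is taken from the list above; the function name split_for_adt and the sample record are assumptions made for this example, and the sketch does not reproduce the ADT harvester's actual implementation.

```python
# Dublin Core elements that the ADT harvester processes, as listed above.
ADT_ELEMENTS = {
    "title", "creator", "subject", "description", "date",
    "language", "publisher", "rights", "identifier", "type",
}


def split_for_adt(fields):
    """Split (element, value) pairs into those ADT processes and those it ignores."""
    processed = [(name, value) for name, value in fields if name in ADT_ELEMENTS]
    ignored = [(name, value) for name, value in fields if name not in ADT_ELEMENTS]
    return processed, ignored


# Hypothetical record: dc.format is present but would be ignored by ADT.
record = [
    ("title", "An example thesis"),
    ("format", "application/pdf"),
    ("type", "Australasian Digital Thesis"),
]
processed, ignored = split_for_adt(record)
print(processed)
print(ignored)
```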
Open Archive searching requires that ADT records in a repository be grouped and harvested as a discrete set of items separately from the other records in the repository. Sets Guidelines for Repository Implementers on the Open Archives Initiative website provides technical details for constructing Sets.
It is recommended that the dc.type or dc.relation element be used for the SetName for ADT harvesting. Enter "Australasian Digital Thesis" as the value for dc.type or "Australasian Digital Thesis Program" as the value for dc.relation.
Example:
- dc.type Australasian Digital Thesis
- dc.relation Australasian Digital Thesis Program
For ADT to harvest a repository according to OAI-PMH (Open Archives Initiative Protocol for Metadata Harvesting) standards, repository managers will need to inform ADT of:
the URL of your server
the SetSpec of the ADT records to be harvested
the SetName for the ADT records to be harvested (i.e. dc.type Australasian Digital Thesis)
Examples:
DSpace record:
URL: http://researchspace.auckland.ac.nz/dspace-oai/request
SetSpec: hdl_2292_2
SetName: PhD Theses
VITAL record without Collections:
URL: http://repository.usq.edu.au/oaiprovider
SetSpec: Australasian Digital Thesis
SetName: Australasian Digital Thesis
VITAL record with ADT items in a Collection:
URL: http://repository.usq.edu.au/oaiprovider
SetSpec: rubric:299
SetName: Australasian Digital Thesis
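To show how a harvester would use these details, the sketch below assembles OAI-PMH request URLs from a base URL and a SetSpec. The verbs ListSets and ListRecords and the arguments metadataPrefix and set are defined by OAI-PMH; the helper name oai_request_url is an assumption made for this example. The code only builds the URLs and sends no requests, and the example servers above may no longer respond at these addresses.

```python
from urllib.parse import urlencode


def oai_request_url(base_url, verb, **arguments):
    """Build an OAI-PMH request URL; argument values are URL-encoded."""
    query = urlencode({"verb": verb, **arguments})
    return f"{base_url}?{query}"


# Base URL and SetSpec taken from the DSpace example above.
base = "http://researchspace.auckland.ac.nz/dspace-oai/request"

# Lists the sets the repository exposes, with their SetSpec and SetName.
list_sets_url = oai_request_url(base, "ListSets")

# Requests only the records in the thesis set, as simple Dublin Core.
list_records_url = oai_request_url(
    base, "ListRecords", metadataPrefix="oai_dc", set="hdl_2292_2"
)

print(list_sets_url)
print(list_records_url)
```

Running a ListSets request against your own repository before contacting ADT is one way to confirm the SetSpec and SetName values you intend to report.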
Further Guides to Metadata
Guides that have most relevance for repositories in higher education institutions:
Criteria for evaluating a metadata schema for a digital repository can be found at:
Choosing a Metadata Standard for Research Discovery (UKOLN 2006) - an up-to-date checklist of guidelines for choosing a metadata schema for a repository.
Repository Librarian and the Next Crusade: The Search for a Common Standard for Digital Repository Metadata (Goldsmith B B & Knudson, F 2006) - a comparative evaluation of MARCXML, Dublin Core, ONIX, PRISM and MODS.
“RUBRIC Toolkit: Metadata Overview” produced July 2007




