RUBRIC Toolkit: Establishing a Pilot Repository
Planning a Pilot
Developing a pilot Institutional Repository (IR) helps you plan and fine-tune policy, workflow, marketing and resourcing before a public launch.
When planning the pilot phase, consider:
The length of time allocated to the pilot phase
Ensure a deadline for migrating the pilot over to a live instance is set at the beginning of the process. Deliverables include:
customisations (branding, look and feel)
import of data
data analysis (such as quality control checking)
workflow development and training
policy development
Responsibility for the pilot
Consider technical, administrative, marketing and budgetary roles. Will separate people be involved? How will this group communicate? Will there be a hierarchy? What are the implications for their current roles? What level of authority and decision making is granted?
Software selection
System Options provides some assistance on the process of evaluating and selecting appropriate software, but you will need to establish your own institution's process for software evaluation and procurement.
Interoperability
Identify systems the IR will have a relationship with, e.g. library catalogue, research database, or an authentication system such as LDAP
define the nature of the relationships with other identified systems
outline the requirements of each system relationship (e.g. information/reporting, harvesting, exporting, importing)
determine which way the data will flow between identified systems
determine the frequency of data updates or transfers between the identified systems
what processes need to be established for data and workflow purposes?
who will be responsible for mapping the processes and how long will it take?
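The interoperability questions above can be recorded in a small structured register so that gaps are easy to spot. The following is a minimal sketch only: the class names, field names and example entries are hypothetical and are not part of the RUBRIC Toolkit.

```python
from dataclasses import dataclass
from enum import Enum


class Direction(Enum):
    """Which way data flows relative to the IR."""
    INTO_IR = "into IR"
    OUT_OF_IR = "out of IR"
    BOTH = "both"


@dataclass(frozen=True)
class SystemRelationship:
    system: str           # e.g. library catalogue, research database, LDAP
    purpose: str          # e.g. reporting, harvesting, exporting, importing
    direction: Direction  # which way the data flows
    frequency: str        # how often data is updated or transferred
    owner: str = ""       # who is responsible for mapping the process


def unassigned(relationships):
    """Return the systems whose process mapping has no owner yet."""
    return [r.system for r in relationships if not r.owner]


# Hypothetical entries for illustration only.
relationships = [
    SystemRelationship("library catalogue", "exporting",
                       Direction.OUT_OF_IR, "weekly", "metadata librarian"),
    SystemRelationship("research database", "importing",
                       Direction.INTO_IR, "nightly"),
    SystemRelationship("LDAP", "authentication",
                       Direction.INTO_IR, "on login", "systems administrator"),
]

print(unassigned(relationships))
```

Keeping the register as data rather than free text makes it simple to list which relationships still need an owner or a transfer schedule before the pilot begins.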
Sample software selection model
Multi-tiered Development & Version Control
Using Multiple Tiers
RUBRIC advises a three-tiered approach to evaluating, configuring and testing repository software, rather than maintaining a single copy of the software on a server and repeatedly patching, upgrading and reconfiguring the live system.
1. Production tier | Pilots are primarily for demonstrating the look and feel of the repository, as well as being used as a marketing tool |
2. Testing tier | Testing is where code that has been developed in the development tier is placed to ensure it functions as desired. This tier is kept as similar to the production tier as possible |
3. Development tier | Development is where technical staff learn about the software being evaluated, about piloting it and eventually about managing it in production. Every change to configuration is made first in development, and the changes are committed to a Subversion repository before any code is moved to the next tier |
Subversion revision control software provides well managed version control rather than relying on copying configuration files from machine to machine. It is not essential to use the Subversion software, but some form of similar version control is strongly recommended.
Virtualised infrastructure provides a way to easily work with tiers 1 and 2. Virtualisation is not essential for these tiers, but it is strongly recommended for the development tier as it encourages developers to treat instances of repository software as disposable and to become practised at installing and configuring software.
RUBRIC suggests installing a repository with Tier 3 and proceeding as follows:
Technical staff install the repository software on virtual machines
Development machines are used to try out new versions of software and to develop all configurations, such as colour schemes, templates and index configuration.
all configuration is kept under version control via Subversion (SVN)
all changes are made on a development machine and checked in to Subversion
Technical staff edit the configuration of the live repository
Technical staff check out the same configuration as is running in production, change it, then commit it back to SVN for testing.
Technical staff test the new configuration
Technical staff check out the changes they made on their development machines into testing and obtain independent confirmation that their changes are correct. Depending on the setup in use, this configuration may then need to be copied to a release branch. See the Subversion book by Collins-Sussman et al. (2004) for guidance.
Technical staff deploy to production
Technical staff check out the latest copy of the production-ready configuration. If there has been a mistake, it is easy for the development staff to roll back to an earlier, working version of the configuration.
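The promotion rules described above can be illustrated with a small model. This is a sketch under stated assumptions, not RUBRIC's tooling: it does not call Subversion, the class and method names are hypothetical, and revision numbers stand in for whatever identifiers the version control system assigns.

```python
class TieredConfig:
    """Tracks which configuration revision each tier is running."""

    TIERS = ("development", "testing", "production")

    def __init__(self):
        self.deployed = {tier: None for tier in self.TIERS}
        self.tested = set()           # revisions independently confirmed in testing
        self.production_history = []  # earlier production revisions, kept for roll-back

    def commit(self, revision):
        """A change is made on a development machine and checked in."""
        self.deployed["development"] = revision

    def promote_to_testing(self, revision):
        """The committed revision is checked out on the testing tier."""
        self.deployed["testing"] = revision

    def confirm(self, revision):
        """Record independent confirmation that the revision works in testing."""
        if self.deployed["testing"] != revision:
            raise ValueError("revision is not checked out in testing")
        self.tested.add(revision)

    def deploy(self, revision):
        """Deploy to production only a revision that was confirmed in testing."""
        if revision not in self.tested:
            raise ValueError("revision has not been confirmed in testing")
        current = self.deployed["production"]
        if current is not None:
            self.production_history.append(current)
        self.deployed["production"] = revision

    def roll_back(self):
        """Return production to the previous working revision."""
        if not self.production_history:
            raise ValueError("no earlier production revision recorded")
        self.deployed["production"] = self.production_history.pop()
        return self.deployed["production"]


config = TieredConfig()
for rev in (101, 102):  # hypothetical revision numbers
    config.commit(rev)
    config.promote_to_testing(rev)
    config.confirm(rev)
    config.deploy(rev)

restored = config.roll_back()
print(restored)
```

The point of the model is the ordering constraint: nothing reaches production without first being confirmed in testing, and every production deployment leaves behind a known earlier revision to return to.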
Configuration
Institutional branding can be added to the pilot IR with minor cosmetic adjustments:
logo
institutional name
contact information
help files and an FAQ
It is also advisable to review the database fields available in the software and check whether these are sufficiently comprehensive for your purposes. Information management experts in your organisation may want to review usability, consistency and interoperability standards with other systems.
Configuration Check List for IR Managers:
look and feel: is there an institutional 'look' that the repository needs to adhere to?
imparting context: such as a purpose statement
browsing: are users able to browse by all the fields necessary? If not, how do you create new browse fields?
searching: is there a simple search available? Is it obvious?
advanced searching: is the advanced search easy to use? Is it accurate?
help: is there a clear way for users to find online help?
information: is there information for users who wish to submit to the repository? Who do they contact?
This configuration table will assist with planning the configuration work and prioritising tasks.
Ease of Configuration:
configuration on a repository should be an up-front and infrequent task
configurations will probably involve staff with a high level of technical skill
configuration is easier to manage if stored in files with version control software utilised
it is not recommended that configuration work, such as customising input forms, be done on a live server
when selecting software, a graphical user interface to make configuration easier is of no benefit if the system overall is harder to manage
develop, test and deploy configuration using a multi-tiered approach
The Technical Reports section on the RUBRIC website provides further information on configuration.
Customizing Screens
When branding a pilot IR, consider the following:
simplicity keeps users happier than fancy functionality
keep wording as unambiguous as possible
different browsers (e.g. Internet Explorer, Mozilla Firefox) may display the same page differently. If you heavily customise the skin of your web interface, test that the changes work in as many different browsers as possible
suggestions for a front page to the repository include:
introduction to the repository: which group it represents, its purpose, etc
“Simple Search”
links to Browsing, Advanced Search, Help, FAQs and Contacts
It is advisable to keep good records of your customisations for two reasons:
to easily recreate the customisations as part of your disaster recovery process
to easily review and reassess requirements for customisations when you upgrade or migrate to alternative software over time
Document information about customisation to ensure you can easily recreate the look and feel in new software versions and in the event of disaster recovery.
Making DSpace Your Own (Donahue and Salo 2006) is another example to assist with software customisation.
Metadata Requirements
Data about an item being deposited to an IR must be captured on entry. Known as “metadata”, this is structured information provided about or alongside a resource to inform users what that resource is about and how they can use it. It captures descriptive information about the item (for example, the author, title, date, publisher) as well as value-added information (such as a subject classification, administrative data, rights data).
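As an illustration of capturing metadata on entry, the sketch below models a deposit record with the descriptive and value-added fields mentioned above and reports which mandatory fields are still empty. The field names and the choice of mandatory fields are assumptions made for the example; a real repository would follow its chosen metadata schema and local policy.

```python
from dataclasses import dataclass, field


@dataclass
class ItemMetadata:
    # Descriptive information about the item.
    author: str = ""
    title: str = ""
    date: str = ""
    publisher: str = ""
    # Value-added information.
    subject_classification: list = field(default_factory=list)
    rights: str = ""


# Assumption: which fields are mandatory is a local policy decision.
REQUIRED_FIELDS = ("author", "title", "date")


def missing_fields(record, required=REQUIRED_FIELDS):
    """Return the names of required fields that have not been filled in."""
    return [name for name in required if not getattr(record, name)]


# Hypothetical deposit with the author not yet supplied.
record = ItemMetadata(title="Annual research report", date="2007")
print(missing_fields(record))
```

A check of this kind could run when an item is submitted, so that incomplete records are flagged before they enter the repository rather than during later quality control.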
Every IR manager needs to know:
why metadata is important
who manages the metadata, including quality control and policy
what metadata schemas are available
how to choose a metadata schema for resource discovery purposes
when to apply the metadata
how to link to resource discovery services
The Metadata section provides a detailed overview of these aspects of metadata and its role in an IR.
Pilot to Production
To transition from the pilot system to a production system, refer to
These sections give you the tools to plan the process of going live and consider the associated management issues of a live instance.
References and Further Reading
Refer to the Further Reading section at the end of the Toolkit for bibliographic details of works referenced in this section.
“RUBRIC Toolkit: Establishing a Pilot Repository” produced May 2007





