Digital Library Technologies, Protocols and Best Practices
at UT Austin
Table of Contents
1. Content Management
a. Institutional Repository (IR) - D-SPACE
b. Proprietary Web Content Management System - STELLENT
2. Metadata
a. Dublin Core
b. METS and MODS
c. EAD
d. SFX and OpenURL
e. OAI
f. Guidelines and Best Practices
3. XML
4. Digital Collection Building
5. Digitization
6. Server/Storage Hardware/Software
7. Application Programming
8. Authentication/Authorization
a. LDAP
b. EZ Proxy
c. EID
d. Shibboleth
9. Digital Library Organizations and Standards/Guidance Bodies
1. Content Management
a. Institutional Repository (IR)
b. Proprietary Web Content Management System
1.a. Institutional Repository (IR)
The most widely deployed IR technology is D-Space, and this is
the technology that UT Austin will use in its IR implementation.
We are currently on schedule to release our D-Space instance
with Theses & Dissertations on Jan 1, 2005. This project will
use the next release of D-Space, which will incorporate a revamped
database design and enable more sophisticated handling of
metadata. See the IR Project Timeline (Aaron Choate, June 2004).
Open Source IR Systems:
D-Space <http://www.dspace.org/>
Arno <http://www.uba.uva.nl/projecten/object.cfm?objectid=1A103F4F-A900-4FCF-9BA16965AAE3D75E>
CDSWare <http://cdsware.cern.ch/>
Eprints <http://www.eprints.org/>
Fedora <http://www.fedora.info/>
i-Tor <http://www.i-tor.org/en/>
MyCoRe <http://www.mycore.de/engl/>
Other D-Space Implementors:
<http://www.lib.cam.ac.uk/dspace/>
<http://dspace.library.cornell.edu/index.jsp>
<https://ep.eur.nl/index.jsp>
<http://www.ucalgary.ca/library/dspace/>
The Institutional Repository concept has been significantly
informed by work done by the Consultative Committee for Space
Data Systems (CCSDS) through a document titled Reference Model
for an Open Archival Information System <http://ssdoo.gsfc.nasa.gov/nost/wwwclassic/documents/pdf/CCSDS-650.0-B-1.pdf>.
This document outlines the critical concepts and processes involved
in establishing a long-term digital archive. It is not a technical
reference document but a conceptual reference model, created
to define and organize critical concepts and terminology, and
it attempts to describe a life-cycle view of digital content.
The following recent survey and report on the state of institutional
repository implementation was conducted by the Publisher
and Library Learning Solutions group (PALS) and offers an excellent
summary of current activity: <http://www.palsgroup.org.uk/palsweb/palsweb.nsf/pubframe>
Interesting IR News:
<http://chronicle.com/free/2004/04/2004040901n.htm>
<http://www.oclc.org/research/projects/dspace/default.htm>
See 2.e. OAI for more information regarding
the Open Archives Initiative.
1.b. Proprietary Web Content Management System (CMS)
The University of Texas at Austin recently signed an agreement
with Software AG for use of the Stellent content management
system <http://www.stellent.com>.
While Stellent offers a diverse array of products, the University
licensed the following: Stellent Content Server, a backup server
and a development server. The content server allows for management
of metadata, workflow, versioning and other document management
features. The package UT Austin licensed also contains converters
for HTML, PDF and XML as well as developer tools for creating
and managing websites. This product suite is in the beginning
stages of deployment by a small group of campus developers.
The Library's role has been in establishing the metadata framework;
see Metadata Recommendations (Bob Stewart, June 2004, Word).
The Library received a development server license and will operate
a version of Stellent. We are currently in the planning stages
of this project and expect to begin the process of mapping workflow
in early fall '04. See the CMS Project Timeline (Rue Ramirez,
July 2004, Word).
Open Source Content Management Systems:
Apache Lenya <http://cocoon.apache.org/lenya/>
Macromedia Spectra <http://spectrasource.macromedia.com/active/>
Midgard <http://www.midgard-project.org/>
MMBase <http://www.mmbase.org/>
Mysource Matrix <http://www.squiz.net/mysource_matrix>
OpenCms <http://www.opencms.org/opencms/en/>
Red Hat CMS <http://www.redhat.com/software/rha/cms/>
Typo3 <http://typo3.org/>
Zope Content Mgmt Framework/Collaborative Portal Server and Plone <http://www.zope.org/>
Proprietary Content Management Systems:
Stellent <http://www.stellent.com/>
Interwoven <http://www.interwoven.com/>
Documentum <http://www.documentum.com/>
Merant <http://www.serena.com/>
Microsoft
Vignette <http://www.vignette.com>
2. Metadata
a. Dublin Core
b. METS and MODS
c. EAD
d. SFX and OpenURL
e. OAI
f. Guidelines and Best Practices
Metadata Registry Project <http://www.lib.utexas.edu/dls/dadg/metadata/requirements.html>
Metadata is central to what we hope to accomplish in the digital
library. Discussion to this point has focused mainly on what
the content of the catalog records should be and, to a lesser
extent, how these records will be put into a database, who will
produce them, and how they will be made available to OAI harvesters
and internet search engines.
2.a. Dublin Core
Dublin Core <http://dublincore.org/>
We began implementing Dublin Core 6 years ago and have since
implemented it in numerous digital collections projects at UT
Austin. The first implementation was in the Robert Runyon Photographic
Collection <http://www.lib.utexas.edu/dlp/project.html?project_id=runyon>
and more implementations can be found at Digital Library Projects
<http://www.lib.utexas.edu/dlp/index.html>.
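As a rough illustration of what a simple Dublin Core record can look like, the sketch below builds one with Python's standard library. The namespace is the published Dublin Core element set namespace. The `record` wrapper element, the `build_dc_record` function name, and the sample values are assumptions made for this example and are not taken from any of our collections.

```python
import xml.etree.ElementTree as ET

# Published namespace for the simple Dublin Core element set.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def build_dc_record(fields):
    """Return a <record> element with one dc:* child per value.

    `fields` maps a Dublin Core element name (for example "title")
    to a string or a list of strings, since DC elements are repeatable.
    The <record> wrapper is a hypothetical container, not part of DC.
    """
    record = ET.Element("record")
    for name, values in fields.items():
        if isinstance(values, str):
            values = [values]
        for value in values:
            child = ET.SubElement(record, f"{{{DC_NS}}}{name}")
            child.text = value
    return record


# Hypothetical sample values for a digitized photograph.
sample = build_dc_record({
    "title": "Example photograph",
    "creator": "Example, Photographer",
    "type": "Image",
    "subject": ["Streets", "Buildings"],
})
dc_xml = ET.tostring(sample, encoding="unicode")
print(dc_xml)
```

Accepting either a string or a list for each field keeps the calling code short while still allowing repeated elements such as multiple subjects.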
2.b. METS and MODS
Our goal is to produce METS and MODS metadata for all archival
content.
Metadata Encoding and Transmission Standard (METS) <http://www.loc.gov/standards/mets/>
Metadata Object Description Schema (MODS) <http://www.loc.gov/standards/mods/>
The University of Texas Libraries are in the midst of a project
to implement the Metadata Encoding and Transmission Standard (METS)
on the PCL Map Collection; see Proposal
to build a Digital Maps Collection Repository (Aaron Choate,
Spring 2004, Word). The metadata aspect of the digital library
is critical to all of our efforts. It will be the strength of
the metadata that will determine how well we provide access
to materials.
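To make the relationship between the two standards concrete, here is a minimal sketch of a METS document that wraps a MODS description. It assumes the METS namespace `http://www.loc.gov/METS/` and the MODS version 3 namespace `http://www.loc.gov/mods/v3`. The `build_mets` function, the object identifier, and the title are hypothetical, and a production record for a map would carry far more (file sections, administrative metadata, and so on).

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
MODS_NS = "http://www.loc.gov/mods/v3"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("mods", MODS_NS)


def q(ns, tag):
    """Return a namespace-qualified tag name in ElementTree notation."""
    return f"{{{ns}}}{tag}"


def build_mets(object_id, title):
    """Build a skeletal METS document with one MODS descriptive section."""
    mets = ET.Element(q(METS_NS, "mets"), {"OBJID": object_id})

    # Descriptive metadata section: MODS is embedded via mdWrap/xmlData.
    dmd = ET.SubElement(mets, q(METS_NS, "dmdSec"), {"ID": "DMD1"})
    wrap = ET.SubElement(dmd, q(METS_NS, "mdWrap"), {"MDTYPE": "MODS"})
    xml_data = ET.SubElement(wrap, q(METS_NS, "xmlData"))
    mods = ET.SubElement(xml_data, q(MODS_NS, "mods"))
    title_info = ET.SubElement(mods, q(MODS_NS, "titleInfo"))
    ET.SubElement(title_info, q(MODS_NS, "title")).text = title

    # Structural map: a single div that points back at the description.
    struct_map = ET.SubElement(mets, q(METS_NS, "structMap"))
    ET.SubElement(struct_map, q(METS_NS, "div"),
                  {"TYPE": "map", "DMDID": "DMD1"})
    return mets


# Hypothetical identifier and title, for illustration only.
mets_doc = build_mets("example-map-0001", "Example map sheet")
print(ET.tostring(mets_doc, encoding="unicode"))
```

The division of labor shown here is the general pattern: MODS describes the item, while METS packages that description together with the structure of the digital object.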
2.c. EAD
Encoded Archival Description (EAD) Format <http://www.loc.gov/standards/ead/>
We have implemented EAD in the Texas Archival Resources Online
project (TARO) <http://www.lib.utexas.edu/taro/>.
2.d. SFX and OpenURL
We implemented ExLibris' SFX and OpenURL in May 2004. Learn
more at our FAQ page <http://www.lib.utexas.edu/sfx/>.
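For readers unfamiliar with OpenURL, the sketch below shows the general idea: citation metadata is encoded as key/value pairs in the query string of a link that points at a resolver such as SFX. The keys used (`genre`, `issn`, `volume`, `issue`, `spage`, `date`, `atitle`) follow the early key/value OpenURL convention. The resolver address, the `build_openurl` function, and the citation values are all assumptions for illustration and do not describe our production configuration.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical resolver base URL; a real deployment would use its own.
RESOLVER_BASE = "https://resolver.example.edu/sfx"


def build_openurl(citation, base=RESOLVER_BASE):
    """Encode a citation dict as an OpenURL-style link.

    Keys with empty or missing values are dropped so the resolver
    only receives metadata that is actually known.
    """
    params = {key: value for key, value in citation.items() if value}
    return f"{base}?{urlencode(params)}"


# Made-up citation values.
citation = {
    "genre": "article",
    "issn": "1234-5679",
    "volume": "12",
    "issue": "3",
    "spage": "45",
    "epage": None,  # unknown, so it is left out of the link
    "date": "2004",
    "atitle": "An example article",
}
openurl = build_openurl(citation)
print(openurl)

# Decoding the link again shows what a resolver would receive.
received = parse_qs(urlsplit(openurl).query)
print(received["atitle"])
```

Because the link carries metadata rather than a fixed target address, the resolver can decide at click time which copy of the article the user is entitled to reach.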
2.e. OAI
Our goal is to make OAI metadata available for all of our
public content.
Open Archives Initiative (OAI) <http://www.openarchives.org/>
OAIster website <http://oaister.umdl.umich.edu/o/oaister/>
A description of the OAI from their website: "The essence
of the open archives approach is to enable access to Web-accessible
material through interoperable repositories for metadata sharing,
publishing and archiving. It arose out of the e-print community,
where a growing need for a low-barrier interoperability solution
to access across fairly heterogeneous repositories lead to the
establishment of the Open Archives Initiative (OAI). The OAI
develops and promotes a low-barrier interoperability framework
and associated standards, originally to enhance access to e-print
archives, but now taking into account access to other digital
materials. As it says in the OAI mission statement 'The
Open Archives Initiative develops and promotes interoperability
standards that aim to facilitate the efficient dissemination
of content.'"
D-Space is OAI conformant, which means that the materials in a
D-Space repository are available to OAI harvesters and can
therefore be discovered on a broad scale.
See 1.a. IR for more on Institutional Repository.
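The following sketch illustrates, under stated assumptions, what harvesting looks like from the harvester's side using the OAI Protocol for Metadata Harvesting: a `ListRecords` request is issued with the `oai_dc` metadata prefix, and the XML response is parsed for identifiers and titles. The repository address, the function names, and the abbreviated sample response are hypothetical. No network call is made, so the parsing step runs against the embedded sample.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespaces used in OAI-PMH 2.0 responses carrying simple Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build a ListRecords request URL for an OAI-PMH repository."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


def parse_records(xml_text):
    """Return (identifier, title) pairs from a ListRecords response."""
    root = ET.fromstring(xml_text)
    results = []
    for record in root.findall("oai:ListRecords/oai:record", NS):
        identifier = record.findtext("oai:header/oai:identifier",
                                     namespaces=NS)
        title = record.findtext(".//dc:title", namespaces=NS)
        results.append((identifier, title))
    return results


# Abbreviated, made-up response used in place of a live repository.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:repository.example.edu:123/1</identifier>
        <datestamp>2004-07-01</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example thesis title</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

print(list_records_url("https://repository.example.edu/oai/request"))
print(parse_records(SAMPLE_RESPONSE))
```

A real harvester would also need to follow resumption tokens for large result sets and handle error responses, both of which are omitted here for brevity.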
From the OAIster website: "Our goal is to create a collection
of freely available, difficult-to-access, academically-oriented
digital resources that are easily searchable by anyone."
Institutions OAIster harvests OAI-enabled metadata from:
<http://oaister.umdl.umich.edu/o/oaister/viewcolls.html>
2.f. Guidelines and Best Practices
Visual Resources Association (VRA) <http://www.vraweb.org/>
From the VRA website, "The Visual Resources Association
is a multi-disciplinary community of image management professionals
working in educational and cultural heritage environments. The
Association is committed to providing leadership in the field,
developing and advocating standards, and providing educational
tools and opportunities for its members."
The VRA has produced a set of guidelines for the cataloging
of cultural objects. Those guidelines can be found at the following
CCO URL:
<http://www.vraweb.org/CCOweb/index.html>
From the CCO website, "CCO provides guidelines
for selecting, ordering, and formatting data used to populate
catalog records. CCO is designed to promote good descriptive
cataloging, shared documentation, and enhanced end-user access.
Whether used locally to develop training manuals, or universally
as a guide to building consistent cultural heritage documentation
in a shared environment, CCO will contribute to improved documentation
and enhanced access to cultural heritage information."
3. XML
Extensible Markup Language (XML) <http://www.w3.org/XML/>
The UT Libraries use XML for the encoding of finding aids, electronic
documents and metadata. See Introducing
XML (Maria Esteva, May 2004, PowerPoint) for a general overview
of XML.
4. Digital Collection Building
This document <http://www.imls.gov/pubs/forumframework.htm>
produced by the Institute of Museum and Library Services (IMLS)
establishes a set of principles to inform decision-making
about whether or not a collection should be digitized. For example,
"Collections principle 1: A good digital collection is
created according to an explicit collection development policy
that has been agreed upon and documented before digitization
begins."
"Collections principle 2: Collections should be described
so that a user can discover important characteristics of the
collection, including scope, format, restrictions on access,
ownership, and any information significant for determining the
collection's authenticity, integrity and interpretation."
5. Digitization
The UT Austin Libraries are involved in numerous digitization
projects involving text/image, audio and video content. These
projects range in scope from exhibits to collection automation.
We work with bibliographers, faculty members and graduate students
in the production and dissemination of digital collections materials.
In addition to the production of digital collections, the Library
is involved in electronic publishing projects with the Texas
State Historical Association and the UT Press to assemble a
core set of materials on the history of Texas.
What follows in this section is a description of the hardware
and software we use in the digitization process as well as links
to various standards, guidelines and best practices.
Digital Assets Discussion Group <http://www.lib.utexas.edu/dls/dadg/resources/standards/index.html>
SCANNING EQUIPMENT FOR TEXT AND IMAGE
-Contex: Chroma TX 40 (40-inch wide-format scanner)
-Epson: 1640 XL (flatbed scanner)
-Epson: 1640 XL (flatbed scanner)
-Microtek: Artixscan 1800f (flatbed scanner with transparency adapter)
-Microtek: ScanMaker 9800XL (flatbed scanner)
-Nikon: Super Coolscan 4000 (slide and film scanner)
-QA/QC workstations
-A/V workstation
-Audio capture station
-I2S: Digibook 10000 (bound book scanner)
6. Server/Storage Hardware/Software
The UT Austin Digital Library operates on a Sun server platform
that is redundant and failsafe behind the firewall.
SERVER PLATFORM
-Apache Web servers
-EZ Proxy
-Helix Universal Server
-Sun ONE Application Servers
-SFX
DATABASE PLATFORM
-MySQL Database Server
-LDAP Directory Server
-Z39.50 Database Server
SEARCH SERVERS
-Verity K2 Enterprise
DEVELOPMENT
-Apache
-Helix Universal Server
-Sun ONE Application Servers
STORAGE
-9TB Network Attached Storage
7. Application Programming
PROGRAMMING STANDARDS
-PHP
-Java
-Web Services
8. Authentication/Authorization
a. LDAP
b. EZ Proxy
c. EID
d. Shibboleth
8.d. Shibboleth
Shibboleth <http://shibboleth.internet2.edu/>
Shibboleth is software that enables privacy-oriented
inter-institutional access management to licensed and/or
otherwise non-public information resources.
9. Digital Library Organizations and Standards/Guidance Bodies
Council on Library and Information Resources (CLIR) <http://www.clir.org/>
CLIR sponsors the Digital Library Federation and is a leading
organization in the digital library community. CLIR publishes a
series of reports and newsletters that provide important and
useful information to anyone involved in meeting the challenges
of producing, hosting, disseminating and preserving digital
material.
The Digital Library Federation (DLF) <http://www.diglib.org/>
From the DLF website, "The Digital Library Federation (DLF)
is a consortium <http://www.diglib.org/about.htm>
of libraries and related agencies that are pioneering in the
use of electronic-information technologies to extend their collections
and services. Through its members, the DLF provides leadership
for libraries broadly by -
* identifying standards and "best practices" for digital collections and network access
* coordinating leading-edge research-and-development in libraries' use of electronic-information technology
* helping start projects and services that libraries need but cannot develop individually."