Accepted Presentations for the Fedora User Group Sessions
Session 1: Design Strategies for Repositories (Strategy/Requirements)
-
Best Practices in Developing a Digital Library Repository Using Fedora
Leslie Johnston
In three years of development, the University of Virginia Library has revisited decisions it made in establishing a digital library repository using Fedora for the management and delivery of digital collections. It has revised its implementation strategy and started developing content models for new media types: video, datasets, audio, and GIS. This presentation will describe the processes used for the identification of functional requirements, inventorying of digital media, development of production standards, and the translation of the output of those processes into content models and disseminators.
-
Adventures in architecting and implementing digital repository services: a case study of the Tufts digital repository
Robert Chavez, Anoop Kumar, Nikolai Schwertner
The Digital Repository Group at Tufts University has designed and implemented a Fedora-based service-oriented digital repository to provide for long-term management and integration of both new and legacy digital collections at Tufts University. This presentation describes the current architecture of the Tufts digital repository and several associated services that facilitate the use and dissemination of repository content. The end-user services that will be discussed include an ingestion service, a naming service, a personal publication service, a social tagging service, a collection browsing service, and the general dissemination services of the repository. A discussion of the following applications will illustrate the convergence of the systems-oriented and customer-oriented approaches and demonstrate a variety of dissemination types available to applications via the Tufts digital repository service: Perseus Project digital library and Open Content Alliance content, American Antiquarian Society Early American Voting Records web environment, Tufts Fine Arts Department web-based curriculum tool, and the Visual Understanding Environment.
-
Versioning of Digital Objects in a Fedora-based Repository
Matthias Razum
eSciDoc is a shared project of the Max Planck Society and FIZ Karlsruhe, with the aim of realizing a platform for communication and publication in scientific research organizations. This presentation gives an overview of the complex versioning requirements for digital objects in the scope of the eSciDoc project and discusses our solution based on Fedora. Besides ensuring permanent access to the research results and research materials of the Max Planck Society, the result of the entire eSciDoc project is intended to support scientific collaboration in future eScience scenarios. This goal requires a shift from traditional digital library systems to a more interactive environment in which information consumers also become information producers. Collaborative authoring raises an issue familiar to software developers: versioning of digital objects. All intermediate or working versions of artifacts should become part of the repository, not just the final versions. Four major requirements pertaining to versioning have to be fulfilled by the underlying repository architecture: 1) Versioning on Object Level, 2) Fixed and Floating Object References, 3) Internal and public versions, and 4) Container objects.
Session 2: Workflow for Ingest and Validation (Technology)
-
Submission of Content to a Digital Object
Andreas Hense and Johannes Mueller
The prototype of a workflow system for the submission of content to a digital object repository is presented. It is based entirely on open-source standard components and features a service-oriented architecture. The front-end consists of Java Business Process Management (jBPM), Java Server Faces (JSF), and Java Server Pages (JSP). A Fedora Repository and a MySQL database management system serve as a back-end. The communication between front-end and back-end uses a SOAP minimal binding stub. In this presentation we present the prototype, show the software architecture, and discuss the possibilities and limitations of workflow creation by administrators.
-
Developing an Ingest Service for Fedora
Ryan Scherle and Muzaffer Ozakca
Although the Fedora architecture holds great promise for storing and managing digital collections, tools for ingesting content into a Fedora repository do not provide the combination of power and flexibility needed to ingest many types of collections. As a result, the community has not standardized on a single method of ingestion. We will review the current tools available for ingesting content into Fedora and describe the need for a tool that combines the power of content-specific tools with the flexibility of more general-purpose tools. In addition we will present the architecture of the Ingest Service being built for the digital repository at Indiana University. The Ingest Service provides a simplified API for updating objects that have been placed in the repository, which will ease the creation of tools for cataloging and automatically ingesting objects from external sources.
-
RIFF: Referential, Rule and Schema based Content Model Validation Tool for Fedora
Stephan Drescher and Toke Eskildsen
Logical preservation systems require an active evaluation of the integrity of their hosted data and implemented data models. All parts of the system have to be referenced via standards to which we have given the name contracts. RIFF aims to capture different ways of designing Fedora data objects--later called content models--and gives these a framework for validation. RIFF distinguishes between object-to-object relational checks, object type checks (e.g. schemas) and specific object content checks. The framework includes autonomous validators for those tasks, which can also be combined with each other. RIFF checks can operate either on a running Fedora installation or independently on a filesystem's FOXML data repository (for example, before ingest).
Session 3: Preservation and Archiving (Strategy/Requirements)
-
Fedora Preservation Services: A Working Group Report
Ron Jantz
A major objective of the Fedora Preservation Services Working Group (WG) is to define the requirements and architecture for preservation services that can be integrated into Fedora. We believe our work will provide capabilities for Fedora users to build trusted repositories. To accomplish our objectives, the Working Group is specifying services and technologies that can be readily integrated into the Fedora Framework. In the specification process, the WG is focused on the underlying capabilities to support digital object persistence, life cycle management, multidisciplinary collections, and management of the repository environment (e.g. storage, memory, operating system). Capabilities and features currently under consideration include checksum creation and validation, event management and messaging, and a repository history service. This presentation will provide a WG progress report with special emphasis on the concept architecture, key features, and the development roadmap for preservation services. The discussion will also cover how Fedora preservation services relate to the PREMIS data model and the audit checklist for trusted digital repositories.
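To make "checksum creation and validation" concrete, the following is a minimal sketch of the general technique, not the Working Group's design or any Fedora API. The function names `compute_checksum` and `validate_checksum` and the default choice of SHA-256 are assumptions made for illustration only.

```python
import hashlib


def compute_checksum(data: bytes, algorithm: str = "sha256") -> str:
    """Return the hex digest of `data` (hypothetical helper, not a Fedora call)."""
    return hashlib.new(algorithm, data).hexdigest()


def validate_checksum(data: bytes, expected: str, algorithm: str = "sha256") -> bool:
    """Recompute the digest and compare it with the stored value."""
    return compute_checksum(data, algorithm) == expected.strip().lower()


# Create a checksum when the content is stored, validate it later.
content = b"example datastream content"
stored = compute_checksum(content)
intact = validate_checksum(content, stored)          # unchanged content
damaged = validate_checksum(content + b"!", stored)  # altered content
```

In a preservation setting the stored digest would be kept alongside the object and rechecked periodically; a mismatch signals that the content has changed since the checksum was created.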
-
The AGU Digital Archive: A Case Study
Carter Glass
The American Geophysical Union is a leading publisher of scholarly journals in the Earth and space sciences and has published every article in a fully electronic version since 2002. As a scientific publisher, AGU has an obligation to preserve its digital content. AGU selected FEDORA as a long-term 'deep' archive for its electronic publications. AGU built tools and processes on top of the FEDORA platform. The paper will describe how AGU selected FEDORA as an archiving platform, how AGU preserves electronic article components and metadata, and AGU strategies for long-term preservation and migration. The complete AGU archive architecture will be presented as well as samples of working source code.
-
A service-oriented workflow for the ingest and preservation of complex digital objects
Mark Hedges
The Arts and Humanities Data Service (AHDS) is a body that maintains a repository for the preservation and dissemination of digital resources arising from research in arts and humanities disciplines, mostly resources resulting from publicly funded projects carried out by academics at UK institutions of higher education. These resources are very varied in size and data format, and can be of very complex structure. To facilitate the management of these collections, the AHDS is migrating its repository to a Fedora-based system. We will present the AHDS vision to develop a workflow that supports all the required pre-ingest functions, from initial deposit up to ingest into Fedora, while minimising the need for human intervention. For example, these functions will include the creation of metadata and the normalization of deposited files to formats suitable for preservation. Our approach is not to develop a monolithic tool, but a set of modular web services, each encapsulating a well-defined unit of functionality at an appropriate level of granularity, which can be configured and combined to produce workflows. Access control will be implemented for all user inputs to identify the agent who will in turn be recorded in the preservation metadata.
Session 4: Repository Exposure (Technology)
-
Simplifying Fedora Frontends with XForms and Fedora Disseminators
Matt Zumwalt
One of the primary barriers to adoption of Fedora is the difficulty in deploying custom frontends to the system. MediaShelf is currently refining techniques that leverage XForms and Fedora disseminators to address this challenge. We will present MediaShelf, LLC, which provides hosted online media archival solutions as well as consulting services focused on planning, development, and deployment of the Fedora Repository Architecture.
-
Building an Institutional Repository Interface Using EJB3 and JBoss Seam
Peter Murray
The vision of the Ohio Digital Resource Commons (DRC) is to leverage statewide economies of scale with a content repository that enables higher education and other Ohio institutions to rapidly publish and comprehensively access the wealth of research, historic and creative materials produced by Ohio's scholarly communities. The tools and services of the DRC feature an interface that appears as if the repository is at a member's institution using the institution's URLs and institutional look-and-feel branding. In reality, OhioLINK maintains the underlying hardware and software, allowing institutions to redirect resources from building their own physical systems to adding and supporting content in the institution's repository space on the DRC. The first service facet to be constructed is the DRC Institutional Repository, featuring a community-oriented interface to ingesting and presenting PDF and related content. The DRC vision also encompasses other service facets on top of the content repository, including digital libraries, an online exhibition interface, e-journal publishing, integration with collaborative learning environments, ePortfolio, and electronic records management. This session will provide an overview of the Ohio DRC vision, a demonstration of initial DRC-IR functionality, and a discussion of using EJB3/SEAM to create an interface for a FEDORA repository.
-
Federated Authentication and Authorization for Fedora
Chi Nguyen
Fedora's popularity amongst institutions is largely due to its scalability and flexibility to handle a large variety of data types. With the increased take-up rate, the need to support federated authentication and flexible authorization is becoming more and more evident. The main drivers are the need by end users to share data across institutional boundaries, and the ability to specify new and different authorization policies for repository data over time. In this presentation, we outline a new architecture for authentication and authorization in Fedora, based on open standards, that will address the new authentication and authorization requirements while preserving the need to support multiple GUIs for Fedora.
Session 5: Project Highlights (Shorts)
-
NSDL 2.0: Building a Collaborative Digital Library
Dean Krafft
NSDL (http://NSDL.org) has recently created the NSDL Data Repository (NDR) built on Fedora to serve as the premier source for science, technology, engineering and mathematics (STEM) education on the net. This repository is the foundation for creating an ecosystem of collaborative, contributory tools that fully integrate with the library. In this presentation we will discuss the challenges and significance of integrating repositories with semantic web technologies and "Web 2.0" tools. The NDR and its API support the creation, organization, discovery and dissemination of resources, sourced metadata statements about resources, arbitrary aggregations of resources and other aggregations, and agents, who serve as the providers of metadata and the selectors for aggregations. The NDR also supports arbitrary RDF-style relationships among resources and queries on those relationships. The production NDR, currently built primarily on aggregated OAI-PMH metadata records from over 125 collections, contains roughly 2 million science education resources, and currently about the same number of metadata statements about those resources. We will give an overview of the design of the NDR, showing the objects and relationships and describing the overall workflow. We will present the API, showing how it provides authenticated access to modify and query the NDR while preserving the constraints of the NDR data model. In addition to overviews of current and planned tools and services we will review applications of these tools to enhance science education, facilitate the transfer of science research to science education, and create a dynamic, living library of science, technology, engineering and mathematics education.
-
MPTStore: Implementing a fast, scalable, and stable RDBMS-backed triplestore for Fedora and the NSDL
Chris Wilper and Aaron Birkland
The National Science Digital Library (NSDL) hosts a large Fedora-based metadata repository that depends heavily on Fedora's RDF Resource Index. Currently, it holds about two million digital objects with an average of about 100 triples per object. Developers from Fedora and the NSDL have developed MPTStore, an alternative RDBMS-backed triplestore, with the explicit goal of supporting a very large, rapidly changing ResourceIndex in Fedora. We will describe MPTStore, which is based on the idea of mapped predicate tables (MPT). Each predicate in the triplestore is mapped to a single relation in a relational database. A triple, therefore, is represented as a subject and object pair within a specific predicate relation. MPTStore itself is a Java library that provides access to the triples, manages mapping metadata and triple relations, and generates SQL queries for simple and complex queries. Currently, NSDL has adopted MPTStore as the ResourceIndex in its production deployment of Fedora. So far, this move has eliminated the RI corruptions and has greatly sped up ingest and update processes with bulk concurrent loads.
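The mapped predicate table idea can be illustrated with a small sketch. This is not MPTStore's code or API (MPTStore is a Java library); it is a toy Python version using SQLite, and the class name `MappedPredicateStore`, the `predicate_map` table, the generated table names `t1`, `t2`, ... and the sample identifiers are all assumptions made for illustration.

```python
import sqlite3


class MappedPredicateStore:
    """Toy triplestore: one two-column (subject, object) table per predicate."""

    def __init__(self, conn):
        self.conn = conn
        # Mapping metadata: each predicate gets an id, which names its table.
        conn.execute(
            "CREATE TABLE IF NOT EXISTS predicate_map "
            "(id INTEGER PRIMARY KEY, predicate TEXT UNIQUE NOT NULL)"
        )

    def _table_for(self, predicate, create=False):
        row = self.conn.execute(
            "SELECT id FROM predicate_map WHERE predicate = ?", (predicate,)
        ).fetchone()
        if row is not None:
            return f"t{row[0]}"
        if not create:
            return None
        cur = self.conn.execute(
            "INSERT INTO predicate_map (predicate) VALUES (?)", (predicate,)
        )
        # Table names are built from integer ids only, never from user input.
        table = f"t{cur.lastrowid}"
        self.conn.execute(f"CREATE TABLE {table} (s TEXT NOT NULL, o TEXT NOT NULL)")
        self.conn.execute(f"CREATE INDEX {table}_s ON {table} (s)")
        return table

    def add(self, s, p, o):
        """Store the triple (s, p, o) as a row in the table mapped to p."""
        table = self._table_for(p, create=True)
        self.conn.execute(f"INSERT INTO {table} (s, o) VALUES (?, ?)", (s, o))

    def objects(self, s, p):
        """Return all objects o such that the triple (s, p, o) is stored."""
        table = self._table_for(p)
        if table is None:
            return []
        rows = self.conn.execute(f"SELECT o FROM {table} WHERE s = ? ORDER BY o", (s,))
        return [r[0] for r in rows]


store = MappedPredicateStore(sqlite3.connect(":memory:"))
store.add("demo:1", "dc:title", "Example object")
store.add("demo:1", "rel:isMemberOf", "demo:collection")
store.add("demo:2", "rel:isMemberOf", "demo:collection")
members_of = store.objects("demo:1", "rel:isMemberOf")
```

Because each predicate has its own narrow table, a query that names a predicate only touches rows for that predicate, which is one plausible reason such a layout suits a large, frequently updated index.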
-
RepoMMan: using web services and BPEL to facilitate interaction with a Fedora repository
Richard Green, Chris Awre, Ian Dolphin, and Robert Sherratt
RepoMMan is developing a standards-based, flexible workflow tool through which users can interact with Fedora, using the repository not just as a public archive for finished works, but also as a tool that can support the development of such materials through a 'My Repository' facility. Using BPEL (the Business Process Execution Language) to orchestrate web service calls to Fedora and other software, RepoMMan is developing processes to manage a user's "works-in-progress". Automated metadata generation is the second major area being addressed by the RepoMMan project: this will be a further function of the tool that is being developed and will also be managed via BPEL and web service calls.
-
Fedora Outreach and Communications: A Working Group Report
Carol Minton Morris, Eric Jansson, Stacy Pennington, David Dinham
Every project has a story, as the Fedora Outreach and Communications Group discovered by conducting the Fedora Users Interview Survey from August to October 2006. The Fedora community is a patchwork of compelling narratives with both broad and concise goals for their institutions and their projects. The purpose of the Survey was to piece narratives together to find out what distributed Fedora development means in the context of a variety of use cases. Survey responses provide a compelling framework for developing a strategic plan for Fedora outreach, communications, and marketing. This information has the potential to encourage the growth, sustainability and ongoing development of a Fedora that is not only a technical solution, but that is also a vibrant community of collaborators. In this presentation we will review the Fedora Users Interview Survey methodology, results, and significant findings related to developing a plan for increasing Fedora Outreach and Communications activities. We will conclude with suggestions for a Fedora outreach, communications, and marketing plan to increase public awareness of Fedora, particularly among institutional leaders.
Session 6: Plenary
-
Fedora Project Plenary Presentation
Sandy Payette, Co-Director, Fedora Project
As many open source projects are discovering, there are many unknowns and challenges in sustainability. The directors of the Fedora Project and the Fedora Advisory Board are currently working on the transition plan for moving Fedora into a non-profit organization named Fedora Commons. We are working on multiple strategies to facilitate startup of this new entity and to maximize the potential for sustainability over the long haul. This entails pursuing new funding opportunities, building strong relationships with related open source projects, and focusing on outreach to new communities and new potential adopters. With more funding and more invested stakeholders, we plan for the Fedora Commons to take on a more ambitious agenda that we describe as “Scholarly and Scientific Service Oriented Architecture.” In this presentation, I will discuss plans for the Fedora Project in 2007 and the vision for the new Fedora Commons organization that is emerging.
-
Risk, What Risk: Choosing Fedora for the National Science Digital Library
Kaye Howe
In a world of technical and social innovation--from the proverbial high school student who cobbles together a killer app in a basement, to institutional leaders who go out on a limb promising end-to-end solutions on tight deadlines--the National Science Digital Library's (NSDL) development strategies fit somewhere in the middle. NSDL made the decision to build a Fedora-based technical platform to enable user participation and collaboration across over 200 partner digital libraries and other science, technology, engineering, and mathematics discipline communities in support of NSDL's educational mission. Dr. Howe will discuss the Fedora vision, why it inspired her management team to "take the leap," and how she balanced inherent risk and investment against long-term benefits. She will give an overview of a unique Fedora-enabled educational application and discuss why it solves a fundamental problem for K12 teachers. And finally, she will reflect on why she believes that, in the end, you will achieve none of your goals by avoiding risks.






