Introduction to Fedora 4

David Wilcox (DuraSpace)

Tuesday, 06.09.2016, 13:30 – 15:00 & 15:30 – 17:00

Fedora is a flexible, extensible repository platform for the preservation, management and dissemination of digital content. Fedora 4, the new, revitalized version of Fedora, includes vast improvements in scalability, linked data capabilities, research data support, modularity, ease of use and more. Both new and existing Fedora users will be interested in learning about and experiencing these new features and functionality first-hand.

This tutorial will provide an introduction to and overview of Fedora 4, with a focus on the latest features. After an initial review of the most significant new features, participants will learn about best practices and community standards for modeling data in Fedora 4. Attendees will be given pre-configured virtual machines that bundle Fedora 4 with the Solr search application and a triplestore; they can install these virtual machines on their laptops and continue using them after the workshop. The virtual machines will be used in a hands-on session that gives attendees a chance to experience Fedora 4 by following step-by-step instructions. This session will demonstrate how to create and manage content in Fedora 4 in accordance with linked data best practices. Finally, participants will learn how to search and run SPARQL queries against content in Fedora using the included Solr index and triplestore.


Building Digital Library Collections with Greenstone 3

David Bainbridge (University of Waikato)

Wednesday, 07.09.2016, 13:30 – 15:00 & 15:30 – 17:00

This tutorial is designed for those who want an introduction to building a digital library using an open source software program. The course will focus on the Greenstone digital library software. In particular, participants will work with the Greenstone Librarian Interface, a flexible graphical user interface designed for developing and managing digital library collections. Attendees do not require programming expertise; however, they should be familiar with HTML and the Web, and be aware of representation standards such as Unicode, Dublin Core and XML.

The Greenstone software has a pedigree of approaching two decades, with over 1 million downloads from SourceForge. The premier version of the software has, for many years, been Greenstone 2. This tutorial will introduce users to Greenstone 3, a complete redesign and reimplementation of the original software that takes better advantage of standards and web technologies developed since Greenstone was first implemented. Written in Java, the software is more modular in design, which increases the flexibility and extensibility of Greenstone. The tutorial emphasises where Greenstone 3 goes beyond what Greenstone 2 can do. Through the hands-on practical exercises, participants will, for example, build collections in which geo-tagged metadata embedded in photos is automatically extracted and used to provide a map-based view of the collection in the digital library.


Text mining workflows for indexing archives with automatically extracted semantic metadata

Riza Batista-Navarro (University of Manchester), Axel Soto (University of Manchester)

Thursday, 08.09.2016, 9:00 – 10:30

With the vast amounts of textual data that many digital libraries hold, finding information relevant to users has become a challenge. The unstructured and ambiguous nature of the natural language in which documents are written poses a barrier to the accessibility and discovery of information. This can be alleviated by indexing documents with semantic metadata, e.g., by tagging them with the terms that indicate their “aboutness”. As manually indexing these documents is impracticable, automatic tools capable of generating semantic metadata and building search indexes have become attractive solutions. In this tutorial, we aim to demonstrate how digital library developers and managers (who do not necessarily have expertise in natural language processing and text mining) can use the Argo text mining platform to develop their own customised, modular workflows for automatic semantic metadata generation and search index construction. In this way, we provide digital library practitioners with the technical know-how needed to build semantic search indexes without any programming effort, owing to Argo’s graphical interface for workflow construction and execution. We believe that this in turn will allow various digital libraries to build search systems that enable their users to find and discover information of interest more efficiently.
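
For readers who want a concrete picture of what indexing documents with semantic metadata means, the following is a minimal illustrative sketch in Python. It is not Argo's implementation and does not use any Argo API; Argo workflows are built graphically. The vocabulary, the sample documents, and the function names (`tag_document`, `build_index`, `search`) are all hypothetical, and a simple dictionary lookup stands in for the text mining components that a real workflow would use.

```python
from collections import defaultdict

# Hypothetical controlled vocabulary: surface form -> semantic category.
# A real workflow would use trained text mining components instead.
VOCABULARY = {
    "cholera": "Disease",
    "tuberculosis": "Disease",
    "london": "Place",
    "manchester": "Place",
}


def tag_document(text):
    """Return the set of (term, category) pairs found in the text."""
    tokens = [t.strip(".,;:!?()\"'").lower() for t in text.split()]
    return {(t, VOCABULARY[t]) for t in tokens if t in VOCABULARY}


def build_index(documents):
    """Map each (category, term) pair to the ids of documents tagged with it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term, category in tag_document(text):
            index[(category, term)].add(doc_id)
    return index


def search(index, category, term):
    """Return the sorted ids of documents tagged with the term in that category."""
    return sorted(index.get((category, term.lower()), set()))


# Made-up sample documents, for illustration only.
DOCUMENTS = {
    "doc1": "An outbreak of cholera was reported in London.",
    "doc2": "Tuberculosis cases in Manchester rose sharply.",
    "doc3": "The report compares London and Manchester.",
}

INDEX = build_index(DOCUMENTS)
```

Under these assumptions, `search(INDEX, "Place", "London")` retrieves the documents tagged with that place name, while the same string queried under the "Disease" category matches nothing. That is the distinction a semantic index adds over plain keyword matching.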