Friday 15 April 2011

Tagging Places on Old Maps: The DME Scenario

Following the productive Pelagios workshop at KCL, we (DME) have been busy tweaking our infrastructure to interoperate according to the "Pelagios Principles".

For DME, the situation is slightly different from that of Pelagios' other data partners. First, our existing data model is already based on OAC, which should make our transition somewhat easier. Second, instead of owning an extensive existing data set, our "asset" is an end-user toolkit for manual annotation and semantic tagging of multimedia content (affectionately referred to as YUMA - the YUMA Universal Multimedia Annotator).

Tagging Maps with Pleiades References


In other words: instead of mapping existing place references in our data to URIs in the Pleiades namespace, our primary task was to modify YUMA in such a way that users can, from now on, tag media items with references to Pleiades places!

This has turned out to be quite straightforward. The simplest way for users to add semantic tags to their annotations in YUMA is through an auto-completion textbox: as the user starts typing, matching tags appear in a drop-down box underneath the typed text. (The screenshot above shows this for the 'Corsica' example mentioned in an earlier post by Mathieu.)
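To make the auto-completion idea a bit more concrete, here is a minimal sketch of prefix-based tag suggestion in plain Python. It is not how the YUMA Tag Server is actually implemented: the function names are made up for illustration, and the 'Corinth' entry uses a placeholder URI (only the Corsica URI is the real one from this post).

```python
from bisect import bisect_left

# Hypothetical tag vocabulary of (label, URI) pairs. The Corsica URI is the
# one used in this post; the Corinth URI is a made-up placeholder.
TAGS = [
    ("Corsica", "http://pleiades.stoa.org/places/991339"),
    ("Corinth", "http://example.org/places/placeholder"),
]


def build_index(tags):
    """Sort tags by lower-cased label so that all labels sharing a prefix
    end up next to each other."""
    return sorted((label.lower(), label, uri) for label, uri in tags)


def suggest(index, typed, limit=10):
    """Return up to `limit` (label, uri) pairs whose label starts with `typed`."""
    prefix = typed.lower()
    if not prefix:
        return []
    # Binary search for the first entry that is >= the typed prefix.
    start = bisect_left(index, (prefix,))
    hits = []
    for key, label, uri in index[start:]:
        if not key.startswith(prefix):
            break  # sorted order: no later entry can match either
        hits.append((label, uri))
        if len(hits) == limit:
            break
    return hits


index = build_index(TAGS)
print(suggest(index, "cors"))
```

Each suggestion carries its URI along with the label, so that picking an entry from the drop-down attaches a resolvable reference to the annotation rather than just a string.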

Making Pleiades place references available as tags in YUMA involved two steps:
  • First, we needed to incorporate a list of Pleiades place names (and their URIs) into our Tag Server. The Tag Server is the component that hosts tag vocabularies and provides the suggestions for the auto-completion hints. We obtained a Pleiades CSV data dump, built an index from it using Apache Lucene, and set up the index as an additional tag source.
  • Second, we needed to provide a 'Pelagios' view on our internal data to the outside world. The reason for this is that our existing RDF representation (although based on OAC) is close to, but not identical to, the Pelagios model. We therefore set up an alternative RDF output channel, mapped to a .pelagios suffix. Appending this suffix to the annotation URI will return Pelagios-compatible annotations in RDF/XML, N3 or Turtle notation, depending on content negotiation.
The 'Corsica' annotation, for example, is available at this URI: 

http://dme.ait.ac.at/yuma-server/api/annotation/618.pelagios

It will resolve to the following RDF (abbreviated for better readability, key parts red):

@prefix oac: <http://www.openannotation.org/ns/> .

<http://dme.ait.ac.at/samples/maps/oenb/AC04248667.tif>
   a  oac:Target .

<http://dme.ait.ac.at/yuma-server/api/annotation/618>
   a  oac:Annotation ;
   <http://purl.org/dc/elements/1.1/creator>
      "rsimon" ;
   <http://purl.org/dc/elements/1.1/title>
      "Corsica" ;
   <http://purl.org/dc/terms/created>
      "2011-04-11 11:12:34.295" ;
   <http://purl.org/dc/terms/modified>
      "2011-04-11 11:12:55.747" ;
   oac:hasBody
      <http://pleiades.stoa.org/places/991339> ;
   oac:hasTarget
      <http://dme.ait.ac.at/yuma-server/api/annotation/618#target> .

<http://dme.ait.ac.at/yuma-server/api/annotation/618#target>
   a  oac:ConstrainedTarget ;
   oac:constrainedBy
      <http://dme.ait.ac.at/yuma-server/api/annotation/618#ct> ;
   oac:constrains
      <http://dme.ait.ac.at/samples/maps/oenb/AC04248667.tif> .
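
For anyone who wants to request the Pelagios view programmatically, the following is a rough client-side sketch using only the Python standard library. The media types are the standard ones for RDF/XML, N3 and Turtle; whether our server matches on exactly these Accept values is an assumption here, and the function name is made up for illustration.

```python
from urllib.request import Request

# Standard RDF media types. That the server recognizes exactly these
# Accept values is an assumption of this sketch.
MEDIA_TYPES = {
    "rdfxml": "application/rdf+xml",
    "n3": "text/n3",
    "turtle": "text/turtle",
}


def pelagios_request(annotation_uri, fmt="turtle"):
    """Build (but do not send) a GET request for the Pelagios view of an
    annotation, asking for one serialization via the Accept header."""
    return Request(
        annotation_uri + ".pelagios",
        headers={"Accept": MEDIA_TYPES[fmt]},
    )


req = pelagios_request("http://dme.ait.ac.at/yuma-server/api/annotation/618")
print(req.full_url)
print(req.get_header("Accept"))
# To actually fetch the document: urllib.request.urlopen(req).read()
```

The request is only constructed, not sent, so the sketch runs without network access.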

Lessons Learned


Two concluding remarks: first, I would eventually like to see our 'YUMA OAC representation' unified with the Pelagios one. This is currently hindered a bit by the way the OAC model is organized (i.e. only one body per annotation is allowed, and there is no official specification for structured bodies). Right now, the difference is that our oac:Body is a separate node that includes all the elements that make up the user's annotation (title, text body, multiple tags, etc.), whereas in Pelagios we consider the Pleiades reference to be the oac:Body directly. (For comparison, the YUMA OAC representation of the example above is here.)
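
The difference between the two shapes can be illustrated with a small sketch that flattens a structured annotation into Pelagios-style triples. Everything here is hypothetical: the dictionary fields are not YUMA's actual schema, the function is not our server code, and for brevity the sketch points at the image directly instead of going through a constrained target as in the RDF above. It also simply refuses annotations that do not carry exactly one Pleiades tag, since OAC allows only one body.

```python
# Hypothetical in-memory form of a YUMA annotation. The field names are
# illustrative only and do not reflect YUMA's real data model.
annotation = {
    "uri": "http://dme.ait.ac.at/yuma-server/api/annotation/618",
    "title": "Corsica",
    "target": "http://dme.ait.ac.at/samples/maps/oenb/AC04248667.tif",
    "tags": ["http://pleiades.stoa.org/places/991339"],
}

PLEIADES = "http://pleiades.stoa.org/places/"


def to_pelagios_triples(ann):
    """Return (subject, predicate, object) triples in which the Pleiades
    URI itself is the annotation body."""
    places = [tag for tag in ann["tags"] if tag.startswith(PLEIADES)]
    if len(places) != 1:
        # One body per annotation: anything else needs a policy decision.
        raise ValueError("expected exactly one Pleiades tag")
    return [
        (ann["uri"], "rdf:type", "oac:Annotation"),
        (ann["uri"], "oac:hasBody", places[0]),
        (ann["uri"], "oac:hasTarget", ann["target"]),
    ]


for triple in to_pelagios_triples(annotation):
    print(triple)
```

Title, free text and any non-Pleiades tags are dropped in this view, which is exactly the information that lives inside the separate body node of the YUMA representation.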

Second, one of the benefits that we gain for YUMA from linking annotations to Linked Data resources is that we can get extra context information by dereferencing them. For example, if we link to DBpedia we can get things such as labels in different languages or alternative spellings of person names. This, in turn, provides us with a very lightweight mechanism for including (at least limited) multilingual and synonym search functionality in our system. (There's a video screencast demonstrating this here.) With our Pleiades integration, we don't quite get this yet: what's still missing is a mapping between Pleiades URIs and matching resources from e.g. GeoNames or DBpedia where possible. (However, Pleiades+ might be just the way to address this, I guess!)
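
As a rough illustration of the label-based search idea, here is a minimal sketch. The data is made up for the example: in practice the labels would come from dereferencing the linked resource, whereas here they are hard-coded under a placeholder URI, and the function names are hypothetical rather than part of YUMA.

```python
# Illustrative data only. In practice the labels would be obtained by
# dereferencing the linked resource; the URI below is a placeholder.
LABELS = {
    "http://example.org/resource/Corsica": ["Corsica", "Korsika", "Corse"],
}

# Annotation id -> URIs of the resources it is tagged with.
ANNOTATIONS = {
    "annotation/618": ["http://example.org/resource/Corsica"],
}


def build_search_index(annotations, labels):
    """Map every known label (lower-cased) to the annotations tagged with
    the resource that carries that label."""
    index = {}
    for ann_id, uris in annotations.items():
        for uri in uris:
            for label in labels.get(uri, []):
                index.setdefault(label.lower(), set()).add(ann_id)
    return index


def search(index, term):
    """Return the ids of annotations reachable via any label of their tags."""
    return sorted(index.get(term.lower(), set()))


index = build_search_index(ANNOTATIONS, LABELS)
print(search(index, "Korsika"))
```

A query in German or French then finds an annotation that was tagged using the English label, without the annotator having entered any translations.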
