Reflections on implementing Linked Open Data on participating Nordic spatial infrastructure projects (Bergen)
Øyvind Liland Gjesdal and Peder Gammeltoft
It is always interesting to try and put action behind one’s words. At the recent Nordic Spatial Humanities workshop in Bergen, we had set the goal to try and implement a common Linked Open Data (LOD) ontology for all the participating projects: Icelandic Saga Map, Mapping Saints, Norse World, Norwegian place-names, and Swedish place-name register. It was with some trepidation that the Bergen team, Henrik Askjer and the blog’s authors, Øyvind and Peder, set this goal. As you can read in the previous post, some of the participating projects were LOD-compatible, whereas others were struggling with practical implementations and theoretical concepts.
The challenge was to get people into LOD ‘thinking’ while also aligning their data to the same ontology and data model. From what we saw at the previous workshop and gathered from the discussions there, we understood that the participating institutions’ data were both complex and quite heterogeneous, with ‘spatiality’ as the one element connecting them all.
Our chosen model for the workshop was to try to map our data to CIDOC-CRM (CIDOC CRM Special Interest Group 2022) using the profile defined in the Linked Art Data Model. “The Linked Art Data Model is an application profile that can be used to describe cultural heritage resources, with a focus on artworks and museum-oriented activities. It defines common patterns and terms to ensure that the resulting data can be easily used and is based on real-world data and use cases.” (Linked Art Community 2022a) These common patterns describe, and show examples of, using the CIDOC-CRM conceptual model in practice, combined with the Getty vocabularies to describe types. For our purposes, we focused mainly on implementing the Places component (Linked Art Community 2022b). In addition, the Getty vocabularies are now also available using the linked.art profile (Getty 2022).
We also implemented the existing place types as SKOS concepts and used them in addition to, or instead of, the Getty vocabularies. The documentation from the Linked Art community has examples and illustrations of the model, which we could use when we went through the OpenRefine process for writing Resource Description Framework (RDF) output.
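To make the target shape concrete, here is a minimal sketch of a place record following the Linked Art Places pattern, with the place type pointing at a local SKOS concept rather than a Getty term. The base URIs, the identifiers and the approximate coordinates are assumptions for illustration only and are not taken from any of the workshop datasets.

```python
import json

# Hypothetical base URIs, used only for illustration.
BASE = "https://example.org/place/"
PLACE_TYPES = "https://example.org/place-type/"


def make_place(local_id, name, type_id, type_label, wkt=None):
    """Build a Linked Art style Place record as a plain dict."""
    place = {
        "@context": "https://linked.art/ns/v1/linked-art.json",
        "id": BASE + local_id,
        "type": "Place",
        "_label": name,
        # The place type refers to a (hypothetical) local SKOS concept.
        "classified_as": [
            {"id": PLACE_TYPES + type_id, "type": "Type", "_label": type_label}
        ],
        "identified_by": [{"type": "Name", "content": name}],
    }
    if wkt is not None:
        # Geometry as a WKT string (longitude first, then latitude).
        place["defined_by"] = wkt
    return place


# Approximate coordinates, for illustration only.
bergen = make_place("bergen", "Bergen", "town", "Town", wkt="POINT(5.32 60.39)")
print(json.dumps(bergen, indent=2, ensure_ascii=False))
```

The same structure can be serialised as JSON-LD or expressed as RDF triples; only the vocabulary behind `classified_as` changes when a Getty term is used instead of a local concept.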
The first step was to identify which entities were present in our data, and to decide how to name them and give them URLs. To keep our URLs persistent, we used OpenRefine’s ability to call out to Python (Jython) to create a version 3 UUID from a given seed, so that the same seed always yields the same identifier. For example, the log from our Chronicles RDF export shows this operation from our Norse World OpenRefine project:
Create new column work_uuid based on column Work by filling 2687 rows with jython:
import uuid
return str(uuid.uuid3(uuid.NAMESPACE_DNS, "http://norseworld.uu.se/work/id" + value.encode('utf-8')))
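The reason a seeded UUID keeps identifiers stable can be checked outside OpenRefine. The sketch below uses standard Python 3 rather than Jython, so the explicit `.encode('utf-8')` call is dropped; the seed prefix is copied from the log line above, while the function name and the work titles are made up for the example.

```python
import uuid

# Seed prefix copied from the OpenRefine log line above.
SEED_PREFIX = "http://norseworld.uu.se/work/id"


def work_uuid(value: str) -> str:
    """Return a name-based (version 3) UUID derived from the cell value."""
    return str(uuid.uuid3(uuid.NAMESPACE_DNS, SEED_PREFIX + value))


# Hypothetical work titles, for illustration only.
first = work_uuid("Example work")
second = work_uuid("Example work")
other = work_uuid("Another work")

print(first == second)  # True: the same seed always gives the same UUID
print(first == other)   # False: a different seed gives a different UUID
```

Because the identifier depends only on the seed, re-running the transformation on the same column reproduces the same URLs, which is what makes them usable as persistent identifiers.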
Once we had populated all the URLs we wanted to use for entities in OpenRefine, we started the RDF transformation using the RDF extension (Atescomp 2022). We added some new namespaces (crm: and skos:) and used the vocabulary import utility to import the downloadable vocabularies for CRM (version 7.1.1 is the latest version with a downloadable ontology) and SKOS. Together with the Linked Art documentation, the RDF extension then gave us simple access to our model. In addition, OpenRefine offers a preview window, which gave us a fast feedback cycle on our mappings.
Figure 1. RDF transform.
Figure 2. RDF preview.
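What such a transformation produces can be approximated in a few lines of Python. This is only a sketch of the idea, not the RDF extension's own mechanism: it turns tabular rows into N-Triples using the CIDOC-CRM class and property for places and types. The base URI, the row identifiers and the column names are assumptions, and the literal escaping is deliberately simplified.

```python
CRM = "http://www.cidoc-crm.org/cidoc-crm/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
BASE = "https://example.org/"  # hypothetical base URI


def iri(value):
    """Wrap a URI in angle brackets, as N-Triples requires."""
    return "<" + value + ">"


def literal(text):
    """Quote a plain string literal (simplified: escapes only \\ and ")."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def row_to_triples(row):
    """Map one tabular row to (subject, predicate, object) terms."""
    place = BASE + "place/" + row["id"]
    place_type = BASE + "place-type/" + row["type"]
    return [
        (iri(place), iri(RDF_TYPE), iri(CRM + "E53_Place")),
        (iri(place), iri(RDFS_LABEL), literal(row["name"])),
        (iri(place), iri(CRM + "P2_has_type"), iri(place_type)),
    ]


# Hypothetical rows standing in for an OpenRefine project.
rows = [
    {"id": "p1", "name": "Bergen", "type": "town"},
    {"id": "p2", "name": "Uppsala", "type": "town"},
]

for row in rows:
    for subject, predicate, obj in row_to_triples(row):
        print(subject, predicate, obj, ".")
```

Printing the triples for a handful of rows plays the same role as the preview window: mapping mistakes show up before the full export is run.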
Some modeling differed between datasets, and we solved this by implementing other linked.art components for some of them. For example, the Mapping Saints dataset implemented patterns from the Object component (Linked Art Community 2022c) and could also have used the Person component models.
On the last day of the workshop, we published our results to an Apache Jena Fuseki endpoint (Apache Software Foundation 2022). We then wrote some example SPARQL queries against the datasets we had created during the workshop. When our queries did not return the federated results we expected across the datasets, we found further mapping inconsistencies, which we promptly corrected and republished, giving us the desired outcomes across the datasets.
Figure 3. SPARQL Query.
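A query of this kind might look like the sketch below, which lists places with their labels and types across datasets. It assumes, purely for illustration, that each dataset was loaded into its own named graph and that the endpoint lives at a local Fuseki URL; neither detail comes from the workshop setup. The request is only constructed here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical Fuseki dataset endpoint.
ENDPOINT = "http://localhost:3030/workshop/sparql"

# Assumes one named graph per dataset; ?graph then shows where a place comes from.
QUERY = """
PREFIX crm: <http://www.cidoc-crm.org/cidoc-crm/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?graph ?place ?label ?type
WHERE {
  GRAPH ?graph {
    ?place a crm:E53_Place ;
           rdfs:label ?label ;
           crm:P2_has_type ?type .
  }
}
ORDER BY ?label
LIMIT 50
"""


def build_request(endpoint, query):
    """Build a SPARQL protocol GET request asking for JSON results."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_request(ENDPOINT, QUERY)
print(request.get_method(), request.full_url[:60])
# Against a live endpoint, urllib.request.urlopen(request) would send the query.
```

If one dataset is missing from the results of such a query, that usually points to a class or property mapped differently in that dataset, which is the kind of inconsistency described above.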
Further, we discussed that it would be helpful to point to the same things across the datasets. Most of our datasets use internal vocabularies that could be mapped to common ones, for example place types in the TGN thesaurus and Wikidata/TGN for places. It would also be beneficial to expand our experiment to the complete datasets, to republish them in an endpoint for querying, and possibly to use them in a front-end application like Sampo-UI (Ikkala et al. 2021).
We are thankful to all the participants for their enthusiasm and the work done during the workshop. We are happy that we could work through examples from dataset to modeling to publishing and querying diverse data. After the workshop, some of the participants have worked on their own datasets, and the Icelandic Saga Map already has a proof-of-concept REST API offering JSON-LD for some of its entities, based on the model from our workshop.
 Work in Norse World stands for «a text preserved in one or more sources that the data are collected from». For more information, see 'Work and related metadata'.
The computations were performed on the Norwegian Research and Education Cloud (NREC), using resources provided by the University of Bergen and the University of Oslo. Available at https://www.nrec.no.
Atescomp, 2022. AtesComp/rdf-transform. Available at https://github.com/AtesComp/rdf-transform
CIDOC CRM Special Interest Group, 2022. CIDOC-CRM. Available at https://www.cidoc-crm.org/
Apache Software Foundation, 2022. Apache Jena. Available at https://jena.apache.org/
Linked Art Community, 2022a. Model. Available at https://linked.art/model/
Linked Art Community, 2022b. Places. Available at https://linked.art/model/place/
Linked Art Community, 2022c. Object production and destruction. Available at Object Production and Destruction (linked.art)
Code For Science and Society, 2022. OpenRefine. Available at https://openrefine.org/
W3C, 2009. SKOS Simple Knowledge Organization System Reference. Available at https://www.w3.org/TR/skos-reference/
Getty, 2022. Getty vocabularies. Available at http://vocab.getty.edu/