Workshopping Nordic Spatial Data Humanities (Uppsala)
Sara Ellis-Nilsson and Alexandra Petrulevich
As referenced in the previous blogpost, the Uppsala workshop participants first reflected on questions concerning how the projects were considered to be – or to be a part of – spatial research infrastructures, as well as on the challenges that the projects have faced during their development and thereafter. It is important to note that the participating projects ranged from completed to in-progress, which influenced the survey responses and the workshop discussions.
In order to delve deeper into the ways in which the project members have worked, as well as how they have met the challenges they have faced, two overarching questions were posed to the participants during the workshop:
What are the projects' challenges on the way to LOD/FAIR? Any ideas of possible solutions?
What standards are needed in terms of common metadata, vocabularies, and ontologies?
Of course, these interconnected topics led to wide-ranging discussions, the main points of which are summarized in this blogpost. Important takeaways from the workshop were:
1) the importance and challenges of linked open data (LOD) and of applying the FAIR (Findable, Accessible, Interoperable, Reusable) data principles, and
2) the procedures involved in choosing and applying standards (vocabularies, ontologies) with regard to reliable and viable authorities.
All projects agreed that basic funding is needed to enable the application of these methods and to ensure sustainability. Most agreed that an ample variety of funding does exist for the initial phase – to create and build new digital resources – which is encouraging. However, this type of funding should be seen as merely an important first step in the lifetime of a digital resource. Projects must be maintained, and they require concrete support throughout all phases, including post-project; this support is vital for maintaining digital resources in the long term. Some Nordic research funding institutions are beginning to recognize the importance of solutions to long-term, post-project sustainability. These organizations have decided to put the responsibility on the individual projects, requiring researchers to consider and plan for this aspect in their Data Management Plans. However, the funding and personnel that are required to accomplish long-term sustainability goals – including the overarching framework for infrastructural support onsite or at the national level – are often uncertain, unsupported, and/or underdeveloped. This experience was shared across projects from all regions, not just the Nordic countries.
LOD and FAIR
Understandably, the discussion of the pros and cons of LOD and the FAIR principles was also linked to issues of long-term sustainability and the desire to ensure this for all of the projects. Some participants were more optimistic than others about the feasibility of achieving true, long-term sustainability. The question of what counts as “long term” was also raised: estimates ranged from 5 to 10 years for the actual project, with any add-ons or project enrichments increasing the time a project stayed “alive”. Having an institution that can host and maintain a resource is of course imperative for long-term sustainability. Applying the FAIR principles was likewise seen as essential to ensuring sustainability in the ongoing development and adaptation of projects.
A number of the project groups started working with LOD from day one, while others wished that they had had the time and resources to implement linked data but found that these were lacking. Indeed, it was stressed that, in order to apply LOD effectively, the method has to be included in the initial planning phase and is not easily implemented after the fact. Moreover, in some projects, there was no plan to implement LOD (even from the beginning) because there was no institutional buy-in or support, even if the project itself was interested. This latter point was raised by a number of groups, who felt that it was a struggle to get people, including developers, to understand that LOD is important for sustainable digital projects. Some also mentioned the lack of resources as an obstacle to implementing LOD.
An interesting point was made about the fact that different countries have different procedures related to LOD; Finland, for example, requires research projects to use LOD. In Norway, projects are expected to make their data open, FAIR, and compatible; however, although datasets are openly published (with identifiers forming the basis of URIs), it was also acknowledged that this work is far from complete.
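To illustrate what "identifiers forming the basis of URIs" can mean in practice, here is a minimal Python sketch. It is not taken from any of the participating projects: the namespace `https://example.org/place/`, the identifier `place-0042`, and the helper names `mint_uri` and `to_jsonld` are all hypothetical, and real projects will have their own URI policies.

```python
from urllib.parse import quote

# Hypothetical namespace, used only for illustration.
BASE_URI = "https://example.org/place/"


def mint_uri(identifier: str, base: str = BASE_URI) -> str:
    """Build a URI by appending a percent-encoded local identifier to a base namespace."""
    cleaned = identifier.strip()
    if not cleaned:
        raise ValueError("identifier must not be empty")
    # safe="" also encodes "/" so the identifier stays a single path segment.
    return base + quote(cleaned, safe="")


def to_jsonld(identifier: str, label: str) -> dict:
    """Wrap a record in a very small JSON-LD-style dictionary (illustrative only)."""
    return {
        "@context": {"rdfs": "http://www.w3.org/2000/01/rdf-schema#"},
        "@id": mint_uri(identifier),
        "rdfs:label": label,
    }


if __name__ == "__main__":
    print(to_jsonld("place-0042", "Uppsala"))
```

The design choice worth noting is that the URI is derived only from the existing identifier, so it stays stable even if labels or other metadata are later corrected.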
The workshop heard from a number of speakers who have been working with LOD, FAIR, and international collaborations within the field of spatial data humanities. In Sweden, Swedish Open Cultural Heritage (SOCH) has developed an aggregator to enable participating cultural heritage institutions to make their collections accessible via linked data. One thing they have learned from the early implementation is the need for an intuitive API (application programming interface) and an easily understood user manual. The workshop also heard about the rule developed by the Pelagios network: focus on the wider community’s needs and wishes. This approach has worked well for this network (it has existed for over 10 years); however, what works for some projects does not necessarily fit all research-based digitization projects that have a time limit and specific goals to meet. The World Historical Gazetteer’s approach is to solicit place data and historical datasets from (mostly) researchers and other users, thereby applying a community-based model of development. While these approaches are inspiring – and inspired – it was observed that other models are of course also necessary, as there is rarely a one-size-fits-all solution.
Standards, Vocabularies, Ontologies
All groups agreed that common standards are vital in order to build and establish truly sustainable infrastructures. But how do we decide which standards to use, and which authorities should be standard? It became clear that not all groups have the same prerequisites for sustainability, and this influences the extent to which they need to become interdependent with other infrastructures or integrate the resources they themselves develop with others.
All groups, however, were unanimous in their support for avoiding silos, that is, situations in which projects create something new that cannot be used or linked to by others. The discussion of practical modelling approaches focussed on, for instance, the CIDOC Conceptual Reference Model (CIDOC-CRM), which was created as a tool to share and integrate cultural heritage information. Some examples of its application and of how to model using this method were mentioned, e.g. https://linked.art and https://docs.cordh.net. In addition, we discussed Linked Open Vocabularies, a resource which lists various relevant vocabularies and ontologies.
Some groups raised the issue that there remains a need for a common understanding of the types of ontologies and vocabularies that already exist. In addition, it is essential that we indicate uncertain data, e.g. the dates of material and contexts. A standard that incorporates such uncertainty is needed, as is international agreement about these types of authorities. However, using international standards for national collections can be complicated work. Not all authorities work for all projects; thus, it was stressed that it is important to create your own if none of the existing lists work and then to share this data.
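As a concrete illustration of indicating uncertain data, one common approach is to record a date as an interval with an earliest and a latest possible year rather than as a single value. The Python sketch below assumes this interval approach; the class `UncertainDate` and its fields are hypothetical and do not correspond to any standard discussed at the workshop.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class UncertainDate:
    """A date known only to fall between two years (hypothetical model)."""

    earliest: int  # earliest possible year
    latest: int    # latest possible year
    note: Optional[str] = None  # free-text basis for the dating, if any

    def __post_init__(self) -> None:
        if self.earliest > self.latest:
            raise ValueError("earliest year must not be after latest year")

    @property
    def is_exact(self) -> bool:
        """True when the interval collapses to a single year."""
        return self.earliest == self.latest

    def overlaps(self, other: "UncertainDate") -> bool:
        """True when the two intervals share at least one possible year."""
        return self.earliest <= other.latest and other.earliest <= self.latest


if __name__ == "__main__":
    manuscript = UncertainDate(1150, 1200, note="dated on palaeographic grounds")
    building_phase = UncertainDate(1190, 1250)
    print(manuscript.overlaps(building_phase))
```

Keeping both bounds explicit means the uncertainty travels with the data when it is shared, instead of being hidden behind a single, seemingly precise year.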
An important part of the discussion, towards the end of the workshop, focussed on the need to consider minorities and the fact that standards do not necessarily work for everyone. Indeed, not all information should necessarily be fully open and accessible, which necessitates a slightly different model than that proposed by LOD, which works best with material that is CC0-licensed. This challenge of applying LOD within projects with sensitive information also raised ethical concerns. This led to the conclusion that LOD as a concept is fine, but it does not always work with concrete project designs. It was noted that sector-specific examples are needed to ease implementation in more projects and promote future work with LOD.
Thus, the need for flexibility – an important advantage of the digital – should not be lost in the work towards sustainable digital research infrastructures. This aspect includes integrating other perspectives and enabling different ways to organize, grant access to, and present data. A stable infrastructure is one that allows other projects to re-use developed tools, data, models, and platforms. Moreover, it is essential that there are opportunities to outsource the development of an LOD project if the expertise does not exist at your own institution, while maintenance should be done in-house, by the hosting institution.
Directly connected to the need for sustainability, the workshop participants stressed the necessity of digital support, e.g. via centres for digital humanities or similar, while emphasizing that these support functions need to be adjusted to allow for the development of individual expertise at one’s home institution or a partner institution. This support is also required at cultural heritage institutions. One solution to this problem, at least in the long term, involves training humanities students in how to organize a dataset and introducing them to digital tools at the undergraduate level. This will expose more researchers to the concepts and considerations needed, so that they consider from the outset how datasets should be formed in order to be processable and machine-readable. Moreover, this effort could be extended to encouraging and enabling those researchers without the skills, but with the interest, to work in digital environments.
Dissemination of information is an important piece of the puzzle and vital in enabling long-term sustainability. There are a lot of projects out there, and the workshop participants agreed that it can be a challenge to make researchers aware of the data that already exists and of the possibility of using maps actively to do research. Maps are not just illustrations! They can be used to visualize, structure, and analyse data, both assisting in the communication of results and providing a way to organize datasets. In addition, structures for passing projects on in order to keep the data living are especially important in the case of smaller projects which have no time to develop after their completion (e.g. the Icelandic Saga Map). Letting others take over or “adopt” projects also leads to their sustainability and aids in the dissemination of information.
Last, but not least, in terms of sustainability, there was agreement that the personal/human parts of sustainability, as well as the “logistical” ones (funding, technical aspects, and long-term technical requirements), are of vital importance to digital projects. Communication among researchers and developers, as well as a degree of technical training for researchers in digital projects, will help ensure success. Moreover, effective communication of results, methods, and data outside of a project is essential to the continued distribution and sharing of current knowledge.
About the advantages of the digital, including flexibility: Cohen, Daniel and Rosenzweig, Roy (2006). Digital History: A Guide to Gathering, Preserving, and Presenting the Past on the Web. University of Pennsylvania Press. Available at https://chnm.gmu.edu/digitalhistory/. Accessed 2022-09-28.
Consortium for Open Research Data in the Humanities, https://docs.cordh.net
Linked Art, https://linked.art
Linked Open Vocabularies, https://lov.linkeddata.es/dataset/lov