nodegoat Workshop series organised by the SNSF SPARK project "Dynamic Data Ingestion"

CORE Admin
Geographic visualisation of a dataset collected as part of the SNSF SPARK project 'Dynamic Data Ingestion': geographical origins of medieval scholars stored in the university history databases Projet Studium Parisiense, ASFE Bologna, Repertorium Academicum Germanicum, and Ottocentenario Università di Padova.

nodegoat has been extended with new features that allow you to ingest data from external resources. You can use this to enrich your dataset with contextual data from sources like Wikidata, or to load publications via a library API or SPARQL endpoint. This extension of nodegoat has been developed as part of the SNSF SPARK project 'Dynamic Data Ingestion (DDI): Server-side data harmonization in historical research. A centralized approach to networking and providing interoperable research data to answer specific scientific questions'. This project was initiated and is led by Kaspar Gubler of the University of Bern.

Because this feature is developed in nodegoat, it can be used by any nodegoat user. And because the Ingestion processes can be fully customised, they can be used to query any endpoint that publishes JSON data. This new feature allows you to use nodegoat as a graphical user interface to query, explore, and store Linked Open Data (LOD) from your own environment.

These newly developed functionalities build upon the Linked Data Resource feature that was added to nodegoat in 2015. This initial development was commissioned by the TIC project at Ghent University and Maastricht University. The feature was further extended in 2019 during a project of the ADVN.

Workshop Series

We will organise a series of four virtual workshops to share the results of the project and explore nodegoat's data ingestion capabilities. These workshops will take place on 28-04-2021, 05-05-2021, 12-05-2021, and 26-05-2021. All sessions take place between 14:00 and 17:00 CEST. The workshops will be held on Zoom and recorded, so you can catch up on any session you miss.

The first two sessions will provide you with a general introduction to nodegoat: in the first session you will learn how to configure your nodegoat environment, while the second session will be devoted to importing a dataset. In the third session you will learn how to run ingestion processes in order to enrich any dataset by using external data sources. The fourth session will be used to query other data sources to ingest additional data.[....]


How to store uncertain data in nodegoat: ambiguous identities

CORE Admin

This blog post is part of a series on storing uncertain data in nodegoat: 'How to store uncertain data in nodegoat', 'Incomplete source material', 'Conflicting information', 'Ambiguous identities'.

There are many entities that share a name. This is often the case for cities (e.g. Springfield), or people (e.g. Francis Bacon). When you encounter such a name in a source, the context usually provides you with enough clues to know which of the entities is meant. However, in some cases the context is too vague or the entities too similar to be certain. In these cases you need to resort to interpretation and disambiguation. This is genuine scholarly work, since you always have to interpret your sources.

This blog post describes a case in which disambiguation is needed. We will use the example of a research process that aims to reconstruct scholarly networks in the 17th and 18th centuries. In such a research process, the source material will largely consist of citations and mentions in documents.

The disambiguation process will be described by means of a snippet taken from a publication by an anonymous author in 1714 with the title 'An account of the Samaritans; in a letter to J---- M------, Esq;' (ESTC Citation No. N16222).

This blog post uses the data model that was created in the nodegoat guide 'Create your first Type', and will use elements from the guide 'Add External Identifiers', and from the guide 'Add Source References'.

To store 'mentioned' statements, you can use the Type that was created in the guide 'Add Source References' and add a new Sub-Object in which mentions can be saved. To change the model, go to Model and edit the Type 'Publication'. Switch to the tab 'Sub-Object' and create a new Sub-Object with the name 'Mention'. Set the Date to 'None' and Location to 'None'. In the tab 'Description', click the green 'add' button twice to create three Sub-Object Descriptions. Name the first 'Person', the second 'Page Number', and the third 'Notes'. Set the value type for 'Person' to 'Reference: Type' and select the Type 'Person'. Set the value type for 'Page Number' to 'Integer' and set the value type for 'Notes' to 'Text'.

These settings are not set in stone. Adjust them so that they work for your project.[....]
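The 'Mention' Sub-Object described above can be pictured as a small record attached to a 'Publication'. The following is a minimal sketch in Python of that shape, not nodegoat's own API or storage format: the class names, the `add_mention` helper, and the example values for the person ID and page number are all hypothetical and only serve to illustrate the three Descriptions and their value types.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Mention:
    """One 'Mention' Sub-Object: no Date, no Location, three Descriptions."""
    person_id: int      # 'Person': reference to an Object of the Type 'Person'
    page_number: int    # 'Page Number': Integer
    notes: str = ""     # 'Notes': Text


@dataclass
class Publication:
    """A 'Publication' Object that can hold any number of mentions."""
    title: str
    mentions: List[Mention] = field(default_factory=list)

    def add_mention(self, person_id: int, page_number: int, notes: str = "") -> Mention:
        # Hypothetical helper: append a new Mention and return it.
        mention = Mention(person_id, page_number, notes)
        self.mentions.append(mention)
        return mention


letter = Publication("An account of the Samaritans; in a letter to J---- M------, Esq;")
# person_id=1 and page_number=5 are made-up values for illustration only.
letter.add_mention(person_id=1, page_number=5, notes="Identity uncertain; see context.")
```

Keeping 'Notes' as free text leaves room to record why a particular person was chosen when the source is ambiguous.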


A Wikidata/DBpedia Geography of Violence

CORE Admin

We have taken data available in Wikidata and DBpedia on 'Military Conflicts' to create this interactive visualisation in nodegoat:


From the outside, it can be a challenge to keep up with all the developments within the ever expanding universe of wiki*/*pedia. So it's good to be reminded now and then of all the structured data that has become available thanks to their efforts:

This looks pretty neat, especially since Wikidata currently has over 947 million triples in their data store. Since battles usually have a place and a date, it would be nice to import this data into a data design in nodegoat and visualise these battles through time and space (diachronic geospatiality ftw).[....]
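The query used for the visualisation is not reproduced in this excerpt. As a rough sketch of what fetching battles with a place and a date could look like, the Python below builds a request URL for the public Wikidata SPARQL endpoint and flattens the standard SPARQL JSON result format into rows. It assumes that battles are modelled as instances (`P31`) of `Q178561`, with coordinates in `P625` and a point in time in `P585`; the function names and the `LIMIT` are our own choices, and no network request is made here.

```python
import re
from urllib.parse import urlencode

WIKIDATA_SPARQL = "https://query.wikidata.org/sparql"

# Assumption: battles are instances of Q178561 with a coordinate and a date.
BATTLE_QUERY = """
SELECT ?battle ?battleLabel ?coord ?date WHERE {
  ?battle wdt:P31 wd:Q178561 ;
          wdt:P625 ?coord ;
          wdt:P585 ?date .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
LIMIT 100
"""

# Wikidata returns coordinates as WKT literals: "Point(longitude latitude)".
POINT = re.compile(r"Point\(([-+\d.eE]+) ([-+\d.eE]+)\)")


def build_request_url(query, endpoint=WIKIDATA_SPARQL):
    """Return a GET URL that asks the endpoint for JSON results."""
    return endpoint + "?" + urlencode({"query": query, "format": "json"})


def parse_bindings(result):
    """Flatten SPARQL JSON results into dicts with lat/lon and a date string."""
    rows = []
    for binding in result["results"]["bindings"]:
        match = POINT.match(binding["coord"]["value"])
        if match is None:
            continue  # skip rows whose coordinate is not a simple point
        rows.append({
            "uri": binding["battle"]["value"],
            "label": binding.get("battleLabel", {}).get("value", ""),
            "lon": float(match.group(1)),
            "lat": float(match.group(2)),
            "date": binding["date"]["value"],
        })
    return rows
```

Each row then carries what a Sub-Object with a date and a location needs: an identifier, a label, a coordinate pair, and a date.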


Data modeling and database development for historians (slides)

CORE Admin

This week we gave a two-day workshop on data modeling and database development for historians. The workshop was part of the course Databases for young historians, which was sponsored by the Huizinga Instituut, Posthumus Instituut, Huygens-ING, and the Amsterdam Centre for Cultural Heritage and Identity (ACHI, UvA), and was hosted by Huygens-ING.

We had a great time working with a group of historians who were eager to learn how to conceptualise data models and how to set up databases. We discussed a couple of common issues that come up when historians start to think in terms of 'data':

  • How to determine the scope of your research?
  • How to deal with unknown/uncertain primary source material?
  • How to use/import 'structured' data?
  • How to reference entries in a dataset and how to deal with conflicting sources?
  • How to deal with unique/specific objects in a table/type?

These points were taken by the horns (pun intended) when every participant went on to conceptualise their data model. To get a feel for classical database software (tables, primary keys, foreign keys, forms, etc.), they set up a database in LibreOffice Base. Finally, each participant created their own data model in nodegoat and presented their model and first bits of data.[....]


Linked Data vs Curation Island

CORE Admin

You can now use nodegoat to query SPARQL endpoints like Wikidata, DBpedia, the Getty Vocabularies (AAT, ULAN, TGN), and the British Museum. Through the nodegoat graphic interface you query linked data resources and store their URIs within your dataset. This means that you can search all people in Wikidata using the string 'Rembrandt' and select the URI of your choice (e.g. ''). By doing so, you add external identifiers to your dataset and introduce a form of authority control in your data. This will help to disambiguate objects (like persons/artworks with similar names) and also enhances the interoperability of your dataset. Both these aspects make it easier to share and reuse datasets.
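nodegoat does this through its graphical interface, and the queries it sends are not shown here. To make the idea concrete, the Python sketch below builds a SPARQL query for people in Wikidata whose English label exactly matches a search string, and stores a chosen URI on a local record. It assumes that people are instances (`P31`) of `Q5` in Wikidata; the function names, the record layout, and the placeholder URI are hypothetical.

```python
def escape_literal(text):
    """Escape backslashes and double quotes for use in a SPARQL string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def person_search_query(name, lang="en", limit=20):
    """Build a query for humans (assumed: wdt:P31 wd:Q5) with this exact label."""
    return (
        "SELECT ?person ?personLabel ?personDescription WHERE {\n"
        "  ?person wdt:P31 wd:Q5 ;\n"
        '          rdfs:label "' + escape_literal(name) + '"@' + lang + " .\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "' + lang + '". }\n'
        "}\n"
        "LIMIT " + str(int(limit)) + "\n"
    )


def attach_identifier(record, uri):
    """Return a copy of the record with the URI stored as an external identifier."""
    identifiers = list(record.get("external_ids", []))
    if uri not in identifiers:
        identifiers.append(uri)
    return {**record, "external_ids": identifiers}


query = person_search_query("Rembrandt")
# The URI below is a placeholder, not a real Wikidata entity.
painter = attach_identifier({"name": "Rembrandt"}, "http://www.wikidata.org/entity/Q_EXAMPLE")
```

Two local records that share a name but carry different URIs can then be told apart by comparing their `external_ids`.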

These two advantages (data disambiguation and data interoperability) are useful for researchers who work on small(-ish) but complex datasets. Researchers who feel that 'automated' research processes are unattainable because their data is dispersed, heterogeneous, incomplete, or only available in an analogue format are more likely to rely on something like the old-fashioned card catalogue system, in which all relevant objects and their varying attributes and relations are described. Luckily, we can also use digital tools to create and maintain card catalogues (databases). For a historian who is mapping the art market of a seventeenth-century Dutch town, a database is a very powerful tool to store and analyse all objects (persons, artworks, etc.) and the relations between these objects. Still, if no external identifiers are used, this dataset is nothing but a curated island (even if the data is published!).

Curation Island

Curation & Linked Data

The process we describe here aims to connect the craftsmanship of research in the humanities to the interconnected world of massive repositories, graph databases and authority files. Other useful purposes of linked data resources for the humanities have already been described extensively, like using aggregation queries to analyse large collections, thesaurus comparison/matching, or performing automated metadata reconciliation as described by the Free Your Metadata initiative.[....]


Geographic visualisation of biographies of scholars. Tobias Winnerling (Heinrich-Heine-Universität Düsseldorf), project: "Wer Wissen schafft. Gelehrter Nachruhm und Vergessenheit 1700 – 2015".

Social Network Graph of the network around Dutch engineer Cornelis Meijer. Project: "Mapping Notes and Nodes in Networks".