How to store uncertain data in nodegoat: ambiguous identities

CORE Admin

This blog post is part of a series on storing uncertain data in nodegoat: 'How to store uncertain data in nodegoat', 'Incomplete source material', 'Conflicting information', 'Ambiguous identities'.

There are many entities that share a name. This is often the case for cities (e.g. Springfield), or people (e.g. Francis Bacon). When you encounter such a name in a source, the context usually provides you with enough clues to know which of the entities is meant. However, in some cases the context is too vague or the entities too similar to be certain. In these cases you need to resort to interpretation and disambiguation. This is genuine scholarly work, since you always have to interpret your sources.

This blog post describes a case in which disambiguation is needed. We will use the example of a research process that aims to reconstruct scholarly networks in the 17th and 18th centuries. In such a project, the source material will largely consist of citations and mentions in documents.

The disambiguation process will be described by means of a snippet taken from a publication by an anonymous author in 1714 with the title 'An account of the Samaritans; in a letter to J---- M------, Esq;' (ESTC Citation No. N16222).

This blog post uses the data model that was created in the nodegoat guide 'Create your first Type', and will use elements from the guide 'Add External Identifiers', and from the guide 'Add Source References'.

To store 'mentioned' statements, you can use the Type that was created in the guide 'Add Source References' and add a new Sub-Object in which mentions can be saved. To change the model, go to Model and edit the Type 'Publication'. Switch to the tab 'Sub-Object' and create a new Sub-Object with the name 'Mention'. Set the Date to 'None' and Location to 'None'. In the tab 'Description', click the green 'add' button twice to create three Sub-Object Descriptions. Name the first 'Person', the second 'Page Number', and the third 'Notes'. Set the value type for 'Person' to 'Reference: Type' and select the Type 'Person'. Set the value type for 'Page Number' to 'Integer' and set the value type for 'Notes' to 'Text'.

These settings are not set in stone. Adjust them so that they work for your project.
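
For readers who think in terms of data structures, the model described above can be sketched roughly as follows. This is only an illustration in Python, not nodegoat's internal representation; the class and field names are assumptions chosen to mirror the Type 'Publication', the Sub-Object 'Mention', and its three Sub-Object Descriptions.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Person:
    # Mirrors the Type 'Person'; viaf_id stays None until the person is identified.
    family_name: str
    given_name: Optional[str] = None
    viaf_id: Optional[str] = None


@dataclass
class Mention:
    # Mirrors the Sub-Object 'Mention' (Date 'None', Location 'None').
    person: Person              # value type 'Reference: Type' -> 'Person'
    page_number: Optional[int]  # value type 'Integer'
    notes: str = ""             # value type 'Text'


@dataclass
class Publication:
    # Mirrors the Type 'Publication'; only the fields needed here are shown.
    title: str
    mentions: List[Mention] = field(default_factory=list)


account = Publication("An account of the Samaritans; in a letter to J---- M------, Esq;")
account.mentions.append(Mention(Person("Hottinger"), 5, "Transcribed the chronicle."))
account.mentions.append(Mention(Person("Fabricius"), 5))
```

At this stage both people are stored by family name only, so the sketch records that a name is mentioned, not yet which person is meant.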

Hottinger, Hottinger, Hottinger, or Hottinger

Go to Data and select the Type 'Publication'. Click 'Add Publication'. Enter the details of the publication and scroll down to the Sub-Object Editor. Click on the plus sign next to 'Mention'. You can now enter which people were mentioned in this publication, add a page number, and add notes about the nature of this mention.

The snippet we will use in this example looks as follows:

An account of the Samaritans; in a letter to J---- M------, Esq;, p. 5

This Chronicle is in the Samaritan Character, tho' the Language be Arabick (...). Hottinger transcribed it, with a Design to translate and print it, but he not living to perform that, Fabricius gives us out of his Book, the Contents of several Chapters of it, which he reckoned were Forty seven. It begins with Moses, his appointing Joshua to be Commander of the Lord's People.

Two people are mentioned in this snippet: 'Hottinger' and 'Fabricius'. Let's start with the first. Click on the input field next to the label 'Person' and start typing 'Hottinger'. Since this is the first time this person occurs in the dataset, they need to be added as an Object in the Type 'Person'. You can do this by clicking the green 'new' button that is shown in the dropdown menu.

In the dialogue that appears you can enter the details of this person. We only know one name, so enter this name as a 'Family Name'. Entering only the name results in a dataset that can state that the name 'Hottinger' is mentioned in this publication. To be able to state which person is mentioned, you need to find out who the author meant when he referred to the name 'Hottinger'.

The guide 'Add External Identifiers' describes how you can use external identifiers for the identification of people. In this case, adding a VIAF ID for the person 'Hottinger' allows you to state which person the publication refers to. Follow the guide 'Add External Identifiers' to learn how to add external identifiers to your dataset.

To add a VIAF ID for the person 'Hottinger', enter his name in the input field of the Object Description 'VIAF URI'.

The results show that there are many scholars with this name, and that some of them lived around the same time. From this list, it is impossible to know which 'Hottinger' the author referred to. To get more information for each candidate, click the grey 'filter' button. Enter the name 'Hottinger' in the input field with the magnifying glass. The results show a clickable link that takes you to the page of a person on VIAF.org.

A connection to the VIAF service allows you to start the disambiguation process from within your nodegoat environment. Click the URLs to explore the biographies and publications of these four people with the name 'Hottinger'.

Most VIAF pages provide a link to Wikidata, which gives you contextual information on these people. It turns out that the first three Hottingers are all Swiss theologians: grandfather, father, and son. The fourth Hottinger is a Swiss crystallography specialist, so an unlikely person to be mentioned in a book on the Samaritans.

When you inspect the published works of the members of the Hottinger family, it turns out that only the grandfather published on Oriental languages. With this information, you can now state that the anonymous author referred to the person 'Johann Heinrich Hottinger (1620-1667)' when he mentioned the name 'Hottinger'.
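
The reasoning in this step amounts to narrowing a list of candidates by applying one criterion after another. A minimal sketch of that idea in Python follows; the labels, field names, and the function `plausible` are hypothetical, and the candidate properties are limited to what the text above reports (occupation, and whether the person published on Oriental languages).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    label: str  # illustrative label, not a VIAF heading
    occupation: str
    published_on_oriental_languages: bool


# The four candidates as characterised in the text above.
candidates = [
    Candidate("Hottinger, theologian (first generation)", "theologian", True),
    Candidate("Hottinger, theologian (second generation)", "theologian", False),
    Candidate("Hottinger, theologian (third generation)", "theologian", False),
    Candidate("Hottinger, crystallographer", "crystallography specialist", False),
]


def plausible(cands: List[Candidate]) -> List[Candidate]:
    """Keep candidates whose profile fits a mention in a work on the Samaritans."""
    theologians = [c for c in cands if c.occupation == "theologian"]
    return [c for c in theologians if c.published_on_oriental_languages]


remaining = plausible(candidates)
if len(remaining) == 1:
    print("Identified:", remaining[0].label)
else:
    print("Still ambiguous:", [c.label for c in remaining])
```

If more than one candidate remained, the mention would stay ambiguous and further criteria from the source would be needed before selecting a VIAF ID.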

Select the VIAF ID of the relevant Hottinger (44737362) and enter the other information of this person that you have gathered during this process. Click 'Save Person'.

This person is now stored in the Type 'Person', including their VIAF ID. The added person has also been entered in the input field for the mentioned person. Enter the other details to complete this Sub-Object.

Click on the plus sign next to 'Mention' to enter the mentioned 'Fabricius'. Since 'Fabricius' is not in the dataset, add this person as a new Object. When you select the VIAF ID, you will again have a number of candidates. In this case the author most likely refers to the German scholar Johann Albert Fabricius (1668-1736), who has the VIAF ID '24623353'.

Repeat these steps for the other mentions you want to store. Click 'Save Publication'.

Concluding Remarks

Storing external identifiers for the people in your dataset helps you to disambiguate them. Just as you can do this for people, you can also add external identifiers for each publication in your dataset. For the publication 'An account of the Samaritans; in a letter to J---- M------, Esq;' you could add the ESTC Citation No. N16222. Whenever you encounter titles with similar names, such an identifier helps you to disambiguate them.

You create a dataset with a lot of value by storing an external identifier for each Object. Your disambiguation processes will eventually lead to a dataset of publications with an ID (e.g. N16222), plus a relationship to the mentioned people who all have a VIAF ID (e.g. 44737362, 24623353). After you have used this data for your own historical network analysis project, you can publish the dataset in a repository. Because the books and people have been identified by means of external identifiers (ESTC Citation Numbers, VIAF IDs), other researchers can easily reuse the dataset: the results of the disambiguation processes are stored in it.
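
To make this concrete, a published record for the example in this post might look roughly like the sketch below. The JSON layout, the key names, and the helpers `viaf_uri` and `unidentified` are assumptions made for illustration and do not represent a nodegoat export format; the URI pattern used for VIAF is likewise an assumption about how the IDs are turned into links.

```python
import json
from typing import Dict, List

VIAF_BASE = "https://viaf.org/viaf/"  # assumed permalink pattern for VIAF IDs


def viaf_uri(viaf_id: str) -> str:
    """Build a VIAF URI from a bare VIAF ID."""
    return f"{VIAF_BASE}{viaf_id}"


dataset: List[Dict] = [
    {
        "publication": {
            "title": "An account of the Samaritans; in a letter to J---- M------, Esq;",
            "estc_citation_no": "N16222",
        },
        "mentions": [
            {"person": "Johann Heinrich Hottinger", "viaf_uri": viaf_uri("44737362"), "page": 5},
            {"person": "Johann Albert Fabricius", "viaf_uri": viaf_uri("24623353"), "page": 5},
        ],
    }
]


def unidentified(records: List[Dict]) -> List[str]:
    """List mentioned names that still lack an external identifier."""
    return [
        m["person"]
        for r in records
        for m in r["mentions"]
        if not m.get("viaf_uri")
    ]


print(json.dumps(dataset, indent=2))
print("Mentions without identifier:", unidentified(dataset))
```

A check such as `unidentified` can be useful before publication, since any mention without an identifier is one that other researchers would have to disambiguate again.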

Read the other blog posts in this series: 'Incomplete source material' and 'Conflicting information'.

Tobias Winnerling contributed to this blog post as part of his project 'Charting the process of getting forgotten within the humanities, 18th-20th centuries: a historical network research analysis' (MSCA Action 789672, October 2018-September 2019). Read more about this project on his website fading18-20.hypotheses.org.

