Ingest Publication Data

In this guide we will ingest publication data from the SPARQL endpoint of the Bibliothèque nationale de France. We will run a query that finds all the available publications of an author based on their VIAF identifier. We will make use of the selection of people ingested in the guide 'Ingest Biographical Data' and we will make use of the VIAF identifiers ingested in the guide 'Ingest External Identifiers'. If you already have a list of people with a VIAF identifier, you can use your own data.

Model  

To store this publication data we need to add a new Object Type. Go to Model and go to the tab 'Object Types'. Click 'Add Object Type'. Enter the name of this Type in the 'Name' field: 'Publication'.

Uncheck the 'Fixed Field' and 'In Overviews' options for the Object Name, as we will generate the Object Name based on Object Descriptions. Read the guide on creating your first Object Type to learn more about these settings.

Specify four Object Descriptions with the names 'Title', 'Author', 'Year', and 'URI'. Set the value type of the 'Title' Object Description to 'String' and set the value type of 'Author' to 'Reference: Object Type' and select 'Person'. Set the value type of the 'Year' Object Description to 'Date' and set the value type of 'URI' to 'External'. Keep the 'Overview' option checked for all Object Descriptions. Check the options 'Name' and 'Quick Search' for 'Title', 'Author' and 'Year'.

Click 'Save Type' to save your Object Type.

Linked Data Resource  

Go to Model and go to 'Linked Data'. Click 'Add Linked Data Resource' and give the resource a name like 'BNF Data Publications'. Enter the request URL 'https://data.bnf.fr/sparql?default-graph-uri=&query=' in the 'URL' input field and enter '&format=json' in the 'URL Options'. Leave the 'URL Headers' empty.

We will use this SPARQL query:

PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX dcterms: <http://purl.org/dc/terms/>
SELECT ?publication ?title ?date ?viaf_id
WHERE {
?author owl:sameAs <http://viaf.org/viaf/109228530>. 
BIND("109228530" AS ?viaf_id) .
?publication dcterms:creator|dcterms:contributor ?author .
?publication dcterms:title ?title filter (lang(?title) = "fr").
?publication dcterms:date ?date .
}

The Ingestion Process that we will create needs to be able to adjust the VIAF identifier when it runs this query for every Object of the Type 'Person'. We can facilitate this by indicating the parts that can be modified when the query runs.

We do this by using the [query] and [variable] tags. The [query] tag is used to indicate an adjustable part of the query in which one or multiple [variable] tags exist. [variable] tags are assigned a value by the Ingestion Process every time the query runs.

In order to be able to link the ingested publications to the author already present in your nodegoat environment, we need the VIAF identifier to be present in the response data as well. To achieve this we bind the provided variable to a variable that will be present in the output of the query, i.e. BIND("[variable=id]109228530[/variable]" AS ?viaf_id).

When we implement this, the query looks like this:

PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX dcterms: <http://purl.org/dc/terms/>
SELECT ?publication ?title ?date ?viaf_id
WHERE {
[query=viaf]
?author owl:sameAs <http://viaf.org/viaf/[variable=id]109228530[/variable]>. 
BIND("[variable=id]109228530[/variable]" AS ?viaf_id) .
[/query]
?publication dcterms:creator|dcterms:contributor ?author .
?publication dcterms:title ?title filter (lang(?title) = "fr").
?publication dcterms:date ?date .
}

Paste this query in the 'Query' input field. Click the green 'test' button to run the query. This action combines the values in the 'URL' field, the 'URL Options' field and the 'Query' field into a single request.
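
As a rough illustration of how the three fields might be combined into one request URL, the Python sketch below concatenates the 'URL', a percent-encoded query and the 'URL Options'. The function name build_request_url, the shortened test query, and the assumption that the query is percent-encoded and placed between the other two fields are ours; nodegoat's internal handling may differ.

```python
from urllib.parse import quote

def build_request_url(url: str, query: str, url_options: str) -> str:
    """Assumed order: 'URL' field + encoded 'Query' field + 'URL Options' field."""
    # safe="" percent-encodes every reserved character, including '?', '{' and '/'
    return url + quote(query, safe="") + url_options

url = "https://data.bnf.fr/sparql?default-graph-uri=&query="
url_options = "&format=json"
# Shortened stand-in query, used here only to keep the example readable
query = "SELECT ?s WHERE { ?s ?p ?o } LIMIT 1"

request_url = build_request_url(url, query, url_options)
print(request_url)
```

Pasting the printed URL into a browser is a quick way to check that the endpoint responds before configuring the Linked Data Resource.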

After the query has run, the results are shown in the 'Response' input field. This field allows you to inspect the results and to verify that all specified variables have been returned. By clicking the green 'use' button, the returned data populates the mapping options.

After the 'use' button is clicked, you can select the position in the returned data that contains the URI and Label. To do this, change the dropdown menu next to the label 'URI' to: {"results":{"bindings":{"[]":{"publication":{"value":""}}}}} and change the dropdown menu next to the label 'Label' to: {"results":{"bindings":{"[]":{"title":{"value":""}}}}}.
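
The dropdown values describe a path into the JSON response: 'results', then 'bindings', then every item in that list ("[]"), then the variable name, then 'value'. The Python sketch below follows the same path on a small sample response. The sample values (including the example.org URI) and the helper name values_at are invented for illustration; only the overall shape follows the standard SPARQL JSON results format.

```python
import json

# Illustrative response; the values are made up, the structure is the
# standard SPARQL JSON results layout.
sample_response = json.loads("""
{
  "head": {"vars": ["publication", "title", "date", "viaf_id"]},
  "results": {"bindings": [
    {
      "publication": {"type": "uri", "value": "http://example.org/publication/1"},
      "title": {"type": "literal", "xml:lang": "fr", "value": "Titre d'exemple"},
      "date": {"type": "literal", "value": "1862"},
      "viaf_id": {"type": "literal", "value": "109228530"}
    }
  ]}
}
""")

def values_at(response: dict, variable: str) -> list[str]:
    """Follow results -> bindings -> [] -> variable -> value."""
    return [
        binding[variable]["value"]
        for binding in response["results"]["bindings"]
        if variable in binding  # unbound variables are left out of a binding
    ]

uris = values_at(sample_response, "publication")
labels = values_at(sample_response, "title")
print(uris, labels)
```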

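To process the 'date' value we add a custom key/value pair. Enter the name for the key: 'Date'. Change the dropdown menu for this value to: {"results":{"bindings":{"[]":{"date":{"value":""}}}}}. You can create a Linked Data Conversion to convert the date to a year if you wish to store the year only. Since the Ingestion Process will later map the returned VIAF identifier to the author, add a key/value pair for 'viaf_id' in the same way and change its dropdown menu to: {"results":{"bindings":{"[]":{"viaf_id":{"value":""}}}}}.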
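
The logic of such a date-to-year conversion can be sketched as follows. This Python snippet only illustrates the idea and is not the syntax used by a Linked Data Conversion. The function name year_from_date and the sample date strings are assumptions, as the exact date formats returned by the endpoint are not specified here.

```python
import re
from typing import Optional

def year_from_date(value: str) -> Optional[str]:
    """Return the first run of four digits in the value, or None if there is none."""
    match = re.search(r"\d{4}", value)
    return match.group(0) if match else None

# Hypothetical date strings, chosen to show the different cases
for sample in ["1862", "1862-04-03", "s.d."]:
    print(sample, "->", year_from_date(sample))
```

Taking the first four-digit run is a deliberately simple rule; values that contain a date range would need a more careful rule.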

Click 'Save Linked Data Resource' to store this configuration.

Ingestion Processes  

Ensure that you have the Object Type 'Publication' as well as the Ingestion Processes enabled in your project by going to Management and selecting 'Projects'. Edit your project and enable the System Process 'Ingestion' as well as the Object Type 'Publication'.

Go to the Data section of your environment. Go to the tab 'Processes', click 'Ingestion', and click 'Add Ingestion'.

Give the Ingestion Process a name like 'Store BNF Data Publications'. Use the dropdown menu with the label 'Source' to select the Linked Data Resource 'BNF Data Publications'. Use the dropdown menu with the label 'Target' to select the Object Type 'Publication'. Since we are adding new Objects you can leave the mode of the Ingestion Process to 'Add New Objects'.

Disregard the form section 'Query / Filter External Resource By Value'.

We want to run the query for every Object of the Type 'Person' and assign the VIAF identifier of each person to the previously defined variable 'viaf: id'. To configure this, select the Object Type 'Person' next to the label 'Use' in the form section 'Query External Resource By Object Value'. You can use the blue 'filter' button to select a subset of Objects of the Type 'Person' (e.g. people that have a VIAF identifier). Use the first dropdown menu next to the label 'Query' to select the variable 'viaf: id' and use the second dropdown menu to select the Object Description 'VIAF Identifier'. With these settings in place, the Ingestion Process will run a query for every Object of the Type 'Person' and will assign the value of the 'VIAF Identifier' Object Description (e.g. '109228530') to the variable 'viaf: id'.
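
To make the substitution concrete, the Python sketch below shows one way the [query] and [variable] tags could be resolved for a single person. It is an approximation written for this guide, not nodegoat's own code: the helper fill_template, the fallback to the value written between the tags, and the identifier '123456789' are all assumptions.

```python
import re

# Matches [variable=name]default[/variable] and captures the name and the default
VARIABLE_TAG = re.compile(r"\[variable=(\w+)\](.*?)\[/variable\]", re.DOTALL)
# Matches the wrapping [query=name] and [/query] tags, with a trailing newline if present
QUERY_TAG = re.compile(r"\[/?query(?:=\w+)?\]\n?")

def fill_template(template: str, values: dict[str, str]) -> str:
    """Replace every [variable] tag by its assigned value (or its default)
    and remove the [query] wrapper tags."""
    def substitute(match: re.Match) -> str:
        name, default = match.group(1), match.group(2)
        return values.get(name, default)
    return QUERY_TAG.sub("", VARIABLE_TAG.sub(substitute, template))

template = (
    "[query=viaf]\n"
    "?author owl:sameAs <http://viaf.org/viaf/[variable=id]109228530[/variable]>.\n"
    'BIND("[variable=id]109228530[/variable]" AS ?viaf_id) .\n'
    "[/query]\n"
)

# '123456789' is a placeholder identifier, not a real person's VIAF identifier
filled = fill_template(template, {"id": "123456789"})
print(filled)
```

Because both [variable=id] tags receive the same value, the identifier used in the owl:sameAs pattern is also the one returned as ?viaf_id.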

Use the form section 'Map External Resource To Data Model' to connect the returned variables to elements in the selected Object Type 'Publication'. In this case you connect the variable 'uri' to the Object Description 'URI'. You connect the variable 'label' to the Object Description 'Title'. You connect the variable 'date' to the Object Description 'Year' and you connect the variable 'viaf_id' to the Object Description 'Author'. The referenced Object Type 'Person' appears and you can use the dropdown menu 'Element that will be used to make a Reference. If left blank, Quicksearch Descriptions will be used.' to select the 'VIAF Identifier' Object Description. This will be used to identify the people in your environment by means of the returned VIAF Identifier.  

With these settings in place, the Ingestion Process will run a query for every Object of the Type 'Person' and will add the returned data to the Object Type 'Publication'.
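
The overall flow can be sketched in Python as follows: one query per person, each returned row mapped to a publication, and the returned VIAF identifier used to look up the author. All names and data here are hypothetical stand-ins (the people list, the ingest function and the fake_run_query stub that replaces the real request); the sketch only mirrors the mapping described above.

```python
from typing import Callable

# Hypothetical stand-ins for Objects of the Type 'Person'
people = [
    {"object_id": 1, "name": "Person A", "viaf_identifier": "109228530"},
    {"object_id": 2, "name": "Person B", "viaf_identifier": None},
]

def ingest(people: list[dict], run_query: Callable[[str], list[dict]]) -> list[dict]:
    """Run one query per person with a VIAF identifier and map the rows to publications."""
    # Index used to resolve the 'Author' reference from the returned viaf_id
    by_viaf = {p["viaf_identifier"]: p["object_id"] for p in people if p["viaf_identifier"]}
    publications = []
    for viaf_id in by_viaf:
        for row in run_query(viaf_id):
            publications.append({
                "URI": row["uri"],
                "Title": row["label"],
                "Year": row["date"],
                "Author": by_viaf.get(row["viaf_id"]),  # None if no person matches
            })
    return publications

def fake_run_query(viaf_id: str) -> list[dict]:
    """Stub that stands in for the request to the SPARQL endpoint."""
    return [{
        "uri": f"http://example.org/publication/{viaf_id}",
        "label": "Titre d'exemple",
        "date": "1862",
        "viaf_id": viaf_id,
    }]

result = ingest(people, fake_run_query)
print(result)
```

In this sketch a person without a VIAF identifier is skipped, which corresponds to filtering the selection of people before running the Ingestion Process.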

Click 'Save Ingestion'.

You now see the newly created Ingestion Process listed in your overview of Ingestion Processes. Click the green 'run' button on the right side of this overview to run this Ingestion Process.

Click 'Run Ingestion' to run the Ingestion Process. The Ingestion Process runs and informs you about the results. If everything went correctly, you have now ingested new Objects of the Type 'Publication'. Open the Object Type 'Publication' to see the newly ingested Objects. Switch to the Object Type 'Person' and find a person who has a VIAF Identifier. Open this person and open the tab 'Cross-Referenced'. You will now see the ingested data that is cross-referenced to this person via the authorship reference that has been stored in the Object Description 'Author' of each publication Object.

Appendix: Query other Libraries  

You can run a similar query against the SPARQL endpoint of other libraries. The Koninklijke Bibliotheek has a SPARQL endpoint to which you can send this query:

PREFIX schema: <http://schema.org/>
SELECT DISTINCT ?publication ?title ?date ?viaf_id
WHERE {
?person schema:sameAs <http://viaf.org/viaf/109228530>.
BIND("109228530" AS ?viaf_id) .
?publication schema:author ?person.
?publication schema:name ?title.
?publication schema:publication ?publication_node.
?publication_node schema:startDate ?date.
}

You can configure a Linked Data Resource with the URL 'http://data.bibliotheken.nl/sparql?default-graph-uri=&query=' and URL Options '&format=json&timeout=0&debug=on'. Paste this query in the 'Query' input field:

PREFIX schema: <http://schema.org/>
SELECT DISTINCT ?publication ?title ?date ?viaf_id
WHERE {
[query=viaf]
?person schema:sameAs <http://viaf.org/viaf/[variable=id]109228530[/variable]>.
BIND("[variable=id]109228530[/variable]" AS ?viaf_id) .
[/query]
?publication schema:author ?person.
?publication schema:name ?title.
?publication schema:publication ?publication_node.
?publication_node schema:startDate ?date.
}

You can configure the other settings of this Linked Data Resource in the same way as the 'BNF Data Publications' Linked Data Resource. By using the same configuration, you do not need to create a new Ingestion Process. Edit the 'Store BNF Data Publications' Ingestion Process, change the Source to the Linked Data Resource that connects to the Koninklijke Bibliotheek SPARQL endpoint, and run it again.

After this has run, you will see the newly added Objects of the Type 'Publication'. These new Objects will also appear in the updated list of Cross-Referenced Objects when you open an Object of the Type 'Person'.