As a result of our cooperation with nodegoat's institutional partners, we have been able to develop a RESTful API for nodegoat.
The API provides an additional interface to query data in, and store data to, your Projects in nodegoat. We have integrated the API with nodegoat's core functionalities and have optimised it for large operations. The API can also be used to update the Data Design: you can update specific attributes of a Type, or upload a whole Data Design with multiple Type templates in one go.
You can use the Project settings to configure what parts of your data are exposed through the API. The API can be configured to require authentication or allow for public access.
In case you want to use the API with your own research data, get in touch!
We have enabled the API for a demo domain. You can access this domain by logging in to nodegoat.net with the username 'demo' and password 'demo'. The following cURL commands give you a JSON package with the information that has been entered on the French intellectual Ernest Renan. You can also click on the URL to view the output in your web browser.
You can now use nodegoat to query SPARQL endpoints like Wikidata, DBpedia, the Getty Vocabularies (AAT, ULAN, TGN), and the British Museum. Through the nodegoat graphical interface you can query linked data resources and store their URIs within your dataset. This means that you can search all people in Wikidata using the string 'Rembrandt' and select the URI of your choice (e.g. 'https://www.wikidata.org/wiki/Q5598'). By doing so, you add external identifiers to your dataset and introduce a form of authority control in your data. This helps to disambiguate objects (like persons/artworks with similar names) and also enhances the interoperability of your dataset. Both these aspects make it easier to share and reuse datasets.
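As an illustration of what such a lookup involves outside the nodegoat interface, the sketch below builds a SPARQL query for people labelled 'Rembrandt' and the corresponding request URL for the public Wikidata endpoint. It is not how nodegoat itself issues the query; the function names are hypothetical, and the exact label match is a simplifying assumption (a real search would usually be more forgiving).

```python
from urllib.parse import urlencode

# Public Wikidata SPARQL endpoint.
WIKIDATA_ENDPOINT = "https://query.wikidata.org/sparql"


def build_person_query(name: str, language: str = "en", limit: int = 10) -> str:
    """Return a SPARQL query for humans whose label exactly matches `name`."""
    # Escape backslashes and double quotes so the name stays a valid string literal.
    safe = name.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "SELECT ?person WHERE {\n"
        "  ?person wdt:P31 wd:Q5 ;\n"  # P31 = instance of, Q5 = human
        f'          rdfs:label "{safe}"@{language} .\n'
        "}\n"
        f"LIMIT {limit}"
    )


def build_request_url(query: str) -> str:
    """Return a GET URL that asks the endpoint for JSON results."""
    return WIKIDATA_ENDPOINT + "?" + urlencode({"query": query, "format": "json"})


if __name__ == "__main__":
    query = build_person_query("Rembrandt")
    print(query)
    print(build_request_url(query))
```

Each result row carries an entity URI that can be stored alongside the object in your own dataset as its external identifier.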
These two advantages (data disambiguation and data interoperability) are useful for researchers who work on small(-ish) but complex datasets. Researchers who feel that 'automated' research processes are unattainable for them, because their data may be dispersed, heterogeneous, incomplete, or only available in an analogue format, are more likely to rely on something like the old-fashioned card catalogue system in which all relevant objects and their varying attributes and relations are described. Luckily, we can also use digital tools to create and maintain card catalogues (databases). For a historian who is mapping the art market of a seventeenth-century Dutch town, a database is a very powerful tool to store and analyse all objects (persons, artworks etc.) and the relations between these objects. Still, if no external identifiers are used, this dataset is nothing but a curated island (even if the data is published!).
Curation & Linked Data
The process we describe here aims to connect the craftsmanship of research in the humanities to the interconnected world of massive repositories, graph databases and authority files. Other useful purposes of linked data resources for the humanities have already been described extensively, like using aggregation queries to analyse large collections, thesaurus comparison/matching, or performing automated metadata reconciliation as described by the Free Your Metadata initiative.[....]
The accessibility and flexibility of nodegoat allow for a collaborative and ongoing data entry and data curation process. Experience shows that data consistency becomes a challenge as soon as data entry processes become collaborative or are executed over longer periods of time. Consistency matters even more when the data structure is complex and data sources are ambiguous. To ensure uniform identification of each object within the dataset, the name of an object should be both consistent and inclusive.
Within nodegoat the name of each object can be a plain text field, generated dynamically, or a combination of the two. When generated dynamically, the object name can be built from its own definitions for consistency and can include the definitions of other named objects for inclusiveness. A rather exhaustive naming scheme for a painting could look like this:
By generating object names dynamically, changes in named objects (such as artist and city in the example of the painting) are automatically reflected in the names of the objects that refer to them.
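The idea can be sketched in a few lines of Python. This is only an illustration of the principle, not nodegoat's implementation: the `NamedObject` class, its fields, and the comma-separated format are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class NamedObject:
    """Hypothetical object whose name is generated from its definitions."""
    definitions: dict = field(default_factory=dict)  # plain values, e.g. title, year
    references: dict = field(default_factory=dict)   # links to other NamedObjects
    name_parts: list = field(default_factory=list)   # ordered keys that form the name

    def name(self) -> str:
        parts = []
        for key in self.name_parts:
            if key in self.references:
                # Inclusiveness: reuse the current name of the referenced object.
                parts.append(self.references[key].name())
            elif key in self.definitions:
                # Consistency: always derive the text from the stored definition.
                parts.append(str(self.definitions[key]))
        return ", ".join(parts)


artist = NamedObject(definitions={"name": "Rembrandt"}, name_parts=["name"])
city = NamedObject(definitions={"name": "Amsterdam"}, name_parts=["name"])
painting = NamedObject(
    definitions={"title": "The Night Watch", "year": 1642},
    references={"artist": artist, "city": city},
    name_parts=["title", "artist", "city", "year"],
)

print(painting.name())  # The Night Watch, Rembrandt, Amsterdam, 1642

# Renaming the artist changes the painting's name without touching the painting.
artist.definitions["name"] = "Rembrandt van Rijn"
print(painting.name())  # The Night Watch, Rembrandt van Rijn, Amsterdam, 1642
```

Because the name is computed on demand rather than stored, there is no second copy of the artist's name that could drift out of sync.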
Because the naming algorithm can follow relations without restriction, recursion is a potential problem. Recursion can be introduced directly (e.g. the name of a person includes the name of the person's parents) or further down the naming scheme. By limiting recursion to a single step it is possible to leverage this feature and include family ties within a person's name without running into an infinite loop.
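A minimal sketch of such a recursion limit is shown below. The `Person` class, the `depth` parameter, and the "child of" wording are hypothetical and chosen only to make the single-step rule visible; nodegoat's own algorithm may work differently.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Person:
    """Hypothetical person whose name includes the names of the parents."""
    given_name: str
    parents: List["Person"] = field(default_factory=list)

    def name(self, depth: int = 1) -> str:
        # depth counts how many relational steps may still be followed.
        if depth <= 0 or not self.parents:
            return self.given_name
        parent_names = " and ".join(p.name(depth - 1) for p in self.parents)
        return f"{self.given_name} (child of {parent_names})"


maria = Person("Maria")
jan = Person("Jan", parents=[maria])
anna = Person("Anna", parents=[jan])

print(anna.name())         # Anna (child of Jan)
print(anna.name(depth=2))  # Anna (child of Jan (child of Maria))
```

With the default of one step, the parent contributes only its own plain name, so even a circular reference in the data cannot make the name generation run forever.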
In a future blog post we will discuss the possibility to complement the dynamic generation of object names with conditional formatting.
Within nodegoat we are working on combining data management functionalities with the ability to seamlessly analyse and visualise data. nodegoat can be used as any other database application as it allows users to define, update and query multiple data models. However, as soon as data is entered into the environment, various analytical tools and visualisations become available instantly. Tools such as in-depth filtering, diachronic geographical mappings, diachronic social graphs, content driven timelines, and shortest path calculation enable a user to explore the context of each piece of data. The explorative nature of nodegoat allows users to trailblaze through data; instead of working with static ‘pushes’ – or exports – of data, data is dynamically ‘pulled’ within its context each time a query is fired. This approach produces a number of advantages, opportunities, and challenges we plan to discuss in this and future blog posts.
To kick off, let’s consider an example: the provenance of paintings. Should an art historian decide to deal with this research question within nodegoat, they will first conceptualise a data model based on the kind of data that needs to be included (e.g. persons, studios, paintings, collections, museums) and the relevant relations (e.g. created by, sold by, inherited by, exhibited in). This data model then has to be set up in nodegoat and subsequently be filled with pieces of evidence (see the nodegoat FAQ to learn more about this). As soon as the first objects have been entered and their relations have been identified, these objects can be plotted on a map or viewed in a social graph; put simply, they become part of the network. Now, a question such as ‘how is an artist connected to a specific museum via an art dealership?’ becomes tangible by using functionalities such as shortest path calculation between objects and in-depth filtering.
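To make the shortest path idea concrete, here is a small sketch using a breadth-first search over objects and their relations. The objects and relations are invented placeholders, and breadth-first search is simply one common way to find a shortest path in an unweighted network; it is not a description of nodegoat's internals.

```python
from collections import deque

# Hypothetical relations: (object, relation, object).
RELATIONS = [
    ("Artist A", "created", "Painting P"),
    ("Painting P", "sold by", "Dealer D"),
    ("Dealer D", "sold to", "Museum M"),
    ("Painting P", "exhibited in", "Gallery G"),
]


def build_graph(relations):
    """Turn relation triples into an undirected adjacency mapping."""
    graph = {}
    for source, _relation, target in relations:
        graph.setdefault(source, []).append(target)
        graph.setdefault(target, []).append(source)
    return graph


def shortest_path(graph, start, goal):
    """Return the shortest list of objects from start to goal, or None."""
    if start == goal:
        return [start]
    previous = {start: None}  # remembers how each object was reached
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour in previous:
                continue
            previous[neighbour] = node
            if neighbour == goal:
                # Walk back from the goal to the start, then reverse.
                path = [goal]
                while previous[path[-1]] is not None:
                    path.append(previous[path[-1]])
                return path[::-1]
            queue.append(neighbour)
    return None


graph = build_graph(RELATIONS)
print(shortest_path(graph, "Artist A", "Museum M"))
# ['Artist A', 'Painting P', 'Dealer D', 'Museum M']
```

The resulting path reads as an answer to the question in the text: the artist is connected to the museum through a painting that passed through the dealer.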
nodegoat runs in a web browser, making it accessible from any device connected to the internet. Working in a web-based environment allows for the implementation of collaborative projects and simultaneous access to the same dataset. Multiple users (who have been assigned varying clearance levels) can enter, update and inspect data. Using this approach, a researcher or research group can decide to design a data model in nodegoat and start entering data into this data model alone, together or with a larger group. [....]