From the outside, it can be a challenge to keep up with all the developments within the ever-expanding universe of wiki-/pedia projects. So it's good to be reminded now and then of all the structured data that has become available thanks to their efforts.
This looks pretty neat, especially since Wikidata currently has over 947 million triples in its data store. Since battles usually have a place and a date, it would be nice to import this data into a data design in nodegoat and visualise these battles through time and space (diachronic geospatiality ftw).[....]
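To make the idea concrete, here is a minimal sketch (not nodegoat's actual import mechanism) that asks the public Wikidata SPARQL endpoint for battles that have both a date and a coordinate, exactly the place-and-date pairs a diachronic geospatial visualisation needs. It relies on the standard Wikidata identifiers P31 ('instance of'), Q178561 ('battle'), P585 ('point in time'), and P625 ('coordinate location'):

```python
# Minimal sketch: fetch battles with a date and a location from Wikidata.
# Uses only the public SPARQL endpoint; not nodegoat's own import mechanism.
import requests

ENDPOINT = "https://query.wikidata.org/sparql"

QUERY = """
SELECT ?battle ?battleLabel ?date ?coords WHERE {
  ?battle wdt:P31 wd:Q178561 ;   # instance of: battle
          wdt:P585 ?date ;       # point in time
          wdt:P625 ?coords .     # coordinate location
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
LIMIT 50
"""

response = requests.get(
    ENDPOINT,
    params={"query": QUERY, "format": "json"},
    headers={"User-Agent": "battle-import-sketch/0.1"},
)
response.raise_for_status()

for row in response.json()["results"]["bindings"]:
    print(row["battleLabel"]["value"], row["date"]["value"], row["coords"]["value"])
```

Each result row carries a Wikidata URI, a timestamp, and a WKT point, which is exactly the shape of data you would map onto objects in a nodegoat data design.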
This week we gave a two-day workshop on data modelling and database development for historians. The workshop was part of the course 'Databases for young historians', which was sponsored by the Huizinga Instituut, Posthumus Instituut, Huygens-ING, and the Amsterdam Centre for Cultural Heritage and Identity (ACHI, UvA), and hosted by Huygens-ING.
We had a great time working with a group of historians who were eager to learn how to conceptualise data models and how to set up databases. We discussed a couple of common issues that come up when historians start to think in terms of 'data':
How to determine the scope of your research?
How to deal with unknown/uncertain primary source material?
How to use/import 'structured' data?
How to reference entries in a dataset and how to deal with conflicting sources?
How to deal with unique/specific objects in a table/type?
These points were taken by the horns (pun intended) when every participant went on to conceptualise their data model. To get a feel for classical database software (tables, primary keys, foreign keys, forms, etc.), they set up a database in LibreOffice Base, along the lines of the sketch below. Finally, each participant created their own data model in nodegoat and presented their model and first bits of data.[....]
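For readers who want a concrete feel for that vocabulary, here is a minimal sketch using Python's built-in sqlite3 module; the table and column names are invented for illustration, and the workshop itself used LibreOffice Base rather than code:

```python
# Minimal relational sketch of the concepts covered in the workshop:
# tables, primary keys, and foreign keys. Table/column names are invented.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled

con.executescript("""
CREATE TABLE person (
    person_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL
);

CREATE TABLE artwork (
    artwork_id INTEGER PRIMARY KEY,
    title      TEXT NOT NULL,
    year       INTEGER,              -- may be unknown/uncertain
    creator_id INTEGER REFERENCES person(person_id)
);
""")

con.execute("INSERT INTO person (person_id, name) VALUES (1, 'Rembrandt van Rijn')")
con.execute("INSERT INTO artwork (title, year, creator_id) VALUES ('The Night Watch', 1642, 1)")

# The foreign key lets us join artworks back to their creators.
for row in con.execute("""
    SELECT artwork.title, person.name
    FROM artwork JOIN person ON artwork.creator_id = person.person_id
"""):
    print(row)
```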
You can now use nodegoat to query SPARQL endpoints like Wikidata, DBpedia, the Getty Vocabularies (AAT, ULAN, TGN), and the British Museum. Through the nodegoat graphical interface you can query linked data resources and store their URIs within your dataset. This means that you can search for all people in Wikidata matching the string 'Rembrandt' and select the URI of your choice (e.g. 'https://www.wikidata.org/wiki/Q5598'). By doing so, you add external identifiers to your dataset and introduce a form of authority control in your data. This helps to disambiguate objects (like persons/artworks with similar names) and also enhances the interoperability of your dataset. Both these aspects make it easier to share and reuse datasets.
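Outside of nodegoat, a rough approximation of that lookup step could use Wikidata's public wbsearchentities API; this is a sketch of the general idea, not the call nodegoat makes internally:

```python
# Sketch of an external-identifier lookup: search Wikidata for 'Rembrandt'
# and list candidate URIs to pick from. Not nodegoat's internal mechanism.
import requests

resp = requests.get(
    "https://www.wikidata.org/w/api.php",
    params={
        "action": "wbsearchentities",
        "search": "Rembrandt",
        "language": "en",
        "format": "json",
    },
    headers={"User-Agent": "uri-lookup-sketch/0.1"},
)
resp.raise_for_status()

for hit in resp.json()["search"]:
    # e.g. Q5598 'Rembrandt' -> https://www.wikidata.org/wiki/Q5598
    print(hit["id"], hit.get("label"), hit.get("description", ""))
```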
These two advantages (data disambiguation and data interoperability) are especially useful for researchers who work on small(-ish) but complex datasets. Researchers who feel that 'automated' research processes are unattainable because their data is dispersed, heterogeneous, incomplete, or only available in an analogue format are more likely to rely on something like the old-fashioned card catalogue system, in which all relevant objects and their varying attributes and relations are described. Luckily, we can also use digital tools to create and maintain card catalogues (databases). For a historian who is mapping the art market of a seventeenth-century Dutch town, a database is a very powerful tool to store and analyse all objects (persons, artworks, etc.) and the relations between these objects. Still, if no external identifiers are used, this dataset is nothing but a curated island (even if the data is published!).
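In database terms, this kind of authority control can be as small as one extra column holding the external URI. The sketch below (again with invented table and column names) shows how two records that share the name 'Rembrandt' stay distinguishable through their identifiers:

```python
# Sketch: authority control via an external-identifier column.
# Two persons with the same name stay distinguishable through their URIs.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""
CREATE TABLE person (
    person_id    INTEGER PRIMARY KEY,
    name         TEXT NOT NULL,
    wikidata_uri TEXT UNIQUE        -- external identifier for disambiguation
)
""")
con.executemany(
    "INSERT INTO person (name, wikidata_uri) VALUES (?, ?)",
    [
        ("Rembrandt", "https://www.wikidata.org/wiki/Q5598"),  # the painter
        ("Rembrandt", None),  # a local, as-yet-unidentified namesake
    ],
)
for row in con.execute("SELECT name, wikidata_uri FROM person"):
    print(row)
```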
Curation & Linked Data
The process we describe here aims to connect the craftsmanship of research in the humanities to the interconnected world of massive repositories, graph databases, and authority files. Other useful applications of linked data resources in the humanities have already been described extensively, such as using aggregation queries to analyse large collections, comparing or matching thesauri, or performing automated metadata reconciliation as described by the Free Your Metadata initiative.[....]
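As a taste of those aggregation queries, the sketch below counts Wikidata's battles per century directly on the public endpoint, using the same identifiers as in the earlier battle sketch (the century arithmetic is naive for BCE dates):

```python
# Sketch of a linked-data aggregation query: battles per century on Wikidata.
import requests

QUERY = """
SELECT ?century (COUNT(?battle) AS ?n) WHERE {
  ?battle wdt:P31 wd:Q178561 ;   # instance of: battle
          wdt:P585 ?date .       # point in time
  BIND (FLOOR(YEAR(?date) / 100) + 1 AS ?century)
}
GROUP BY ?century
ORDER BY ?century
"""

resp = requests.get(
    "https://query.wikidata.org/sparql",
    params={"query": QUERY, "format": "json"},
    headers={"User-Agent": "aggregation-sketch/0.1"},
)
resp.raise_for_status()

for row in resp.json()["results"]["bindings"]:
    print(row["century"]["value"], row["n"]["value"])
```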