Data modeling and database development for historians (slides)

CORE Admin

This week we gave a two-day workshop on data modeling and database development for historians. The workshop was part of the course Databases for young historians, which was sponsored by the Huizinga Instituut, the Posthumus Instituut, Huygens-ING and the Amsterdam Centre for Cultural Heritage and Identity (ACHI, UvA), and hosted by Huygens-ING.

We had a great time working with a group of historians who were eager to learn how to conceptualise data models and how to set up databases. We discussed a couple of common issues that come up when historians start to think in terms of 'data':

  • How to determine the scope of your research?
  • How to deal with unknown/uncertain primary source material?
  • How to use/import 'structured' data?
  • How to reference entries in a dataset and how to deal with conflicting sources?
  • How to deal with unique/specific objects in a table/type?

These points were taken by the horns (pun intended) when every participant went on to conceptualise their own data model. To get a feel for classical database software (tables, primary keys, foreign keys, forms, etc.), they set up a database in LibreOffice Base. Finally, each participant created their own data model in nodegoat and presented their model and first bits of data.[....]
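
For readers who want to recreate this structure outside LibreOffice Base, here is a minimal sketch of the same ideas (tables, a primary key and a foreign key) using Python's built-in sqlite3 module; the table and column names are our own illustrative choices, not those used in the workshop.

```python
# Illustrative only: a tiny relational model with a primary key / foreign key
# relation, comparable to what participants set up in LibreOffice Base.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")
cur = con.cursor()

cur.executescript("""
CREATE TABLE person (
    person_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL
);
CREATE TABLE artwork (
    artwork_id INTEGER PRIMARY KEY,
    title      TEXT NOT NULL,
    creator_id INTEGER REFERENCES person(person_id)  -- foreign key
);
""")

cur.execute("INSERT INTO person (name) VALUES (?)", ("Rembrandt van Rijn",))
creator_id = cur.lastrowid
cur.execute("INSERT INTO artwork (title, creator_id) VALUES (?, ?)",
            ("The Night Watch", creator_id))

# Join the two tables through the foreign key.
for title, name in cur.execute("""
    SELECT artwork.title, person.name
    FROM artwork JOIN person ON artwork.creator_id = person.person_id
"""):
    print(title, "by", name)
```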


Linked Data vs Curation Island

CORE Admin

You can now use nodegoat to query SPARQL endpoints like Wikidata, DBpedia, the Getty Vocabularies (AAT, ULAN, TGN) and the British Museum. Through the nodegoat graphical interface you can query linked data resources and store their URIs within your dataset. This means that you can search all people in Wikidata using the string 'Rembrandt' and select the URI of your choice (e.g. 'https://www.wikidata.org/wiki/Q5598'). By doing so, you add external identifiers to your dataset and introduce a form of authority control in your data. This helps to disambiguate objects (like persons or artworks with similar names) and enhances the interoperability of your dataset. Both these aspects make it easier to share and reuse datasets.
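
nodegoat handles these lookups through its graphical interface; purely to illustrate what such a query involves, the sketch below sends a comparable SPARQL request to the public Wikidata endpoint and prints candidate URIs for the search string 'Rembrandt'. The query and script are our own rough sketch and make no claim about how nodegoat implements this.

```python
# Illustrative sketch (not nodegoat code): search Wikidata via SPARQL for
# items labelled 'Rembrandt' that are instances of 'human' (wd:Q5), and print
# candidate URIs that could be stored as external identifiers.
import requests

ENDPOINT = "https://query.wikidata.org/sparql"

QUERY = """
SELECT ?person ?personLabel WHERE {
  SERVICE wikibase:mwapi {
    bd:serviceParam wikibase:endpoint "www.wikidata.org" ;
                    wikibase:api "EntitySearch" ;
                    mwapi:search "Rembrandt" ;
                    mwapi:language "en" .
    ?person wikibase:apiOutputItem mwapi:item .
  }
  ?person wdt:P31 wd:Q5 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
LIMIT 10
"""

response = requests.get(
    ENDPOINT,
    params={"query": QUERY, "format": "json"},
    headers={"User-Agent": "curation-island-example/0.1"},
)
response.raise_for_status()

for binding in response.json()["results"]["bindings"]:
    # e.g. http://www.wikidata.org/entity/Q5598  Rembrandt
    print(binding["person"]["value"], binding["personLabel"]["value"])
```

Storing the returned entity URI, rather than just the name string, is what turns a local record into something other datasets can unambiguously link to.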

These two advantages (data disambiguation and data interoperability) are useful for researchers who work on small(-ish) but complex datasets. Researchers who feel that 'automated' research processes are unattainable for them, because their data may be dispersed, heterogeneous, incomplete, or only available in analogue form, are more likely to rely on something like the old-fashioned card catalogue, in which all relevant objects and their varying attributes and relations are described. Luckily, we can also use digital tools to create and maintain card catalogues (databases). For a historian who is mapping the art market of a seventeenth-century Dutch town, a database is a very powerful tool to store and analyse all objects (persons, artworks, etc.) and the relations between these objects. Still, if no external identifiers are used, this dataset is nothing but a curated island (even if the data is published!).


Curation Island

Curation & Linked Data

The process we describe here aims to connect the craftsmanship of research in the humanities to the interconnected world of massive repositories, graph databases and authority files. Other useful applications of linked data resources in the humanities have already been described extensively, such as using aggregation queries to analyse large collections, comparing and matching thesauri, or performing automated metadata reconciliation as described by the Free Your Metadata initiative.[....]


nodegoat as an Interactive Museum Installation: 20,000 letters visualised through time and space

CORE Admin
The installation is located in the first section of the permanent exhibition. The wooden table has a cut-out (elevated) map of Europe as its surface. The visualisation is projected by a Barco F35 projector (WQXGA resolution). Visitors can interact with the installation by means of capacitive sensors.

We have developed an interactive installation for the new GRIMMWELT museum in Kassel, Germany. The installation visualises the full correspondence network of Jacob and Wilhelm Grimm and lets visitors freely interact with it, involving a total of 20,000 letters and 1,400 correspondence partners over a timespan of 80 years. The dataset of letters has been created by the Arbeitsstelle Grimm-Briefwechsel at the Institut für deutsche Literatur of the Humboldt-Universität zu Berlin. We have developed the visualisation in cooperation with SPIN: Study Platform on Interlocking Nationalisms at the University of Amsterdam.

The installation introduces a new geographical visualisation mode in nodegoat, 'Movement', in addition to the already available line-based 'Connection' mode. The Movement mode uses WebGL rendering (GPU) to animate large collections of objects smoothly, and offers a wide range of configuration parameters to fine-tune the visualisation for various scenarios. Thanks to the open and generic nature of nodegoat, the Movement mode can now be used for any other relevant dataset.
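
The Movement mode itself is implemented in nodegoat's WebGL renderer; purely to make the underlying idea concrete, here is a minimal Python sketch of how the position of a single letter can be interpolated between two dated, geo-referenced endpoints for any moment in time. The class, dates and coordinates are invented for the example.

```python
# Conceptual sketch only (nodegoat's Movement mode runs on WebGL, not Python):
# linearly interpolate the position of a letter between sender and recipient
# for a given moment in time; an animation would call this once per frame.
from dataclasses import dataclass
from datetime import date


@dataclass
class Letter:
    sent: date
    received: date
    origin: tuple[float, float]       # (lat, lon) of the sender
    destination: tuple[float, float]  # (lat, lon) of the recipient

    def position_at(self, moment: date) -> tuple[float, float]:
        """Position of the letter at 'moment', clamped to the endpoints."""
        total = (self.received - self.sent).days or 1
        t = min(max((moment - self.sent).days / total, 0.0), 1.0)
        lat = self.origin[0] + t * (self.destination[0] - self.origin[0])
        lon = self.origin[1] + t * (self.destination[1] - self.origin[1])
        return lat, lon


# Example: a (fictional) letter from Kassel to Berlin, halfway through its journey.
letter = Letter(sent=date(1830, 5, 1), received=date(1830, 5, 9),
                origin=(51.31, 9.49), destination=(52.52, 13.40))
print(letter.position_at(date(1830, 5, 5)))
```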

This short clip shows the new visualisation mode from within nodegoat:


A high-resolution 1440p version of this clip is available here.[....]


nodegoat Workshop at the Text Encoding Initiative Conference in Lyon 26-10-2015

CORE Admin

Cheveux © Marie-Jeanne Gauthé, via http://tei2015.huma-num.fr/en/.

During this year's Text Encoding Initiative Conference in Lyon, from 26 to 31 October 2015, we will host a nodegoat workshop. The workshop will last a full day and will take place on 26 October. Register here.

In this workshop we will help participants create explorative visualisations of their own TEI data by means of nodegoat. A good example of how nodegoat can be used to create, manage, visualise, analyse and present structured data is the project on romantic nationalism by Joep Leerssen of the University of Amsterdam. The public interface of this collaborative research project can be consulted via http://romanticnationalism.net; you can also read more about it in the brochure (PDF).[....]
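
To give an idea of the kind of preprocessing that turns TEI documents into structured data a tool like nodegoat can work with, here is a minimal sketch that extracts correspondence metadata (sender, addressee, place, date) from TEI headers with Python's standard library. It assumes the TEI correspDesc/correspAction conventions; the input file name is hypothetical.

```python
# Illustrative sketch: pull correspondence metadata out of TEI headers
# (correspDesc/correspAction) so it can be loaded into a structured dataset.
# 'letters.xml' is a hypothetical input file.
import xml.etree.ElementTree as ET

TEI = {"tei": "http://www.tei-c.org/ns/1.0"}

tree = ET.parse("letters.xml")

for desc in tree.iter("{http://www.tei-c.org/ns/1.0}correspDesc"):
    record = {}
    for action in desc.findall("tei:correspAction", TEI):
        kind = action.get("type")  # usually 'sent' or 'received'
        date_el = action.find("tei:date", TEI)
        record[kind] = {
            "person": action.findtext("tei:persName", default="", namespaces=TEI),
            "place": action.findtext("tei:placeName", default="", namespaces=TEI),
            "date": date_el.get("when") if date_el is not None else "",
        }
    print(record)
```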


nodegoat Workshop at the Historical Network Conference in Lisbon on 16-9-2015

CORE Admin

nodegoat workshop at the eighth HNR workshop 'Vom Text zum Netzwerk und zurück. Über die Wechselwirkungen im historischen Forschungsprozess' ('From Text to Network and Back: On the Interactions in the Historical Research Process'), 5/6 April 2014.

During this year's Historical Network Research conference in Lisbon, 15-18 September, we will host a nodegoat workshop. The title of this workshop is: Conceptualise and Set Up a Historical Network Research Workflow. We will focus on conceptualising a data model for your own research question and explore the possibilities of storing your data in a structured way and creating interactive space/time visualisations. The workshop will last a full day and will take place on 16 September.

As nodegoat is a web-based data modeling and management tool that is equipped with functionalities to produce time-aware network analytics and visualisations, it is well suited for historical network analysis.[....]
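
As a rough illustration of what 'time-aware' network analysis can look like in code (this is not how nodegoat implements it), the sketch below builds a small correspondence network with the networkx library and restricts it to a chosen time window before computing degrees. The letters and dates are invented for the example.

```python
# Illustrative sketch (not nodegoat code): a letter network whose edges carry
# dates, filtered to a time window before analysis. Dates are invented.
from datetime import date

import networkx as nx

letters = [
    ("Jacob Grimm", "Wilhelm Grimm", date(1805, 3, 1)),
    ("Jacob Grimm", "Achim von Arnim", date(1812, 7, 15)),
    ("Wilhelm Grimm", "Achim von Arnim", date(1830, 1, 2)),
]

G = nx.MultiGraph()
for sender, recipient, sent in letters:
    G.add_edge(sender, recipient, date=sent)

# Keep only the correspondence from a chosen window, e.g. 1800-1815.
window = nx.MultiGraph()
window.add_edges_from(
    (u, v, d) for u, v, d in G.edges(data=True)
    if date(1800, 1, 1) <= d["date"] <= date(1815, 12, 31)
)

# Who is most connected within this window?
print(sorted(window.degree(), key=lambda pair: -pair[1]))
```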


Mapping Memory Landscapes in nodegoat, the Indonesian killings of 1965-66

CORE Admin

nodegoat is developed as a collaborative research environment that supports participatory research projects. To test its ability to combine various participatory roles with its ability to digest complex and heterogeneous data, we spent two weeks in Semarang, Indonesia working with a group of students to reveal an infrastructure of violence. These students interviewed survivors of state-sanctioned violence and entered the information they gathered directly into nodegoat. Based on these interviews, the students visited a number of sites and interviewed people who lived or worked on these sites. As the data came from personal accounts only, the visualisations that are produced in nodegoat can be characterised as memory landscapes. In this blog post we will describe both the process and the methodology of this project.

The Dutch Institute for War, Holocaust and Genocide Studies (NIOD) has set up a cooperation with the Universitas Katolik Soegijapranata (UNIKA) in Semarang, Indonesia, that aims to address the anti-communist/leftist violence of 1965-66 in Semarang and the following years. The project that has emerged from this cooperation, ‘Memory Landscapes and the Regime Change of 1965-66 in Semarang’, is led by Dr. Martijn Eickhoff (NIOD) and has resulted in two workshops at UNIKA in Semarang, organised by Donny Danardono. The first workshop took place in January 2013; the second was held in June 2014. During these two workshops, students from UNIKA collected data on anti-communist/leftist violence by combining oral history and anthropological site research. The data includes relations between people as well as locations connected to the events of 1965 and the following years (e.g. places of mob violence, temporary detention, interrogation, torture, murder and mass burial). [....]


Reversed Classification

CORE Admin

Working with data in the humanities, we’ve noticed that the debate on classifications is often focused on the definition of the classification and not so much on what it identifies. A well-known example is of course ‘nationality’, but a (historical) occupation or capacity, and even seemingly unproblematic classifications like ‘the nineteenth century’, pose problems as well.

Looking at data from an object-oriented perspective, using predefined classifications seems counterintuitive. Objects should define themselves by means of their varying attributes. Nodes and clusters emerge on the basis of correlation between objects.

Nevertheless, we understand the need to be able to identify these clusters in a structured manner without having to perform sequences of filters. These ‘structured clusters’ should be able to be ordered, analysed and explored. For this reason, we have taken up the challenge to equip nodegoat with a functionality that allows for the definition of these clusters by means of fuzzy filtering settings. We have defined this process as ‘reversed classification’. Although we have merely conceptualised the challenge and have yet to implement it, we want to share the ideas behind it.

In general, classifications emphasise a convention of value and vocabulary. The direction of a classification is outward, relating to the convention unidirectionally. In effect, the classification is unable to communicate or negotiate with the network it classifies. Reversing the classification opens up the convention by disclosing its parameters. Reversal allows the classification to be scrutinised and reconfigured, and to re-evaluate the objects it classifies.

Simply put: instead of identifying classifications and assigning them to objects in a dataset (like ‘sculptor’ or ‘German’), a user creates a multi-faceted filter spanning multiple datasets, specifying any number of parameters that are associated with a classification. This reverses the classifying process, as the definition of the classification is established by the exchange between the parameters of the classification and the attributes of the object. [....]
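
This functionality has not been implemented yet; purely to make the idea concrete, the sketch below expresses such a 'reversed classification' as a reusable, inspectable filter over object attributes rather than a label stored on the objects. All objects, attributes and thresholds are invented for the example.

```python
# Conceptual sketch only: a classification defined as a filter over object
# attributes ('reversed classification') instead of a label assigned to each
# object. Objects, attributes and thresholds are invented examples.
objects = [
    {"name": "A", "works": ["sculpture", "sculpture", "painting"]},
    {"name": "B", "works": ["painting"]},
    {"name": "C", "works": ["sculpture"]},
]


def reversed_classification(parameters):
    """Build a classifier from explicit, inspectable parameters."""
    def classify(obj):
        share = obj["works"].count(parameters["work_type"]) / len(obj["works"])
        return share >= parameters["min_share"]
    return classify


# 'Sculptor' is not stored on the objects; it is the outcome of the filter,
# and its parameters can be scrutinised and reconfigured at any time.
is_sculptor = reversed_classification({"work_type": "sculpture", "min_share": 0.5})

print([obj["name"] for obj in objects if is_sculptor(obj)])  # ['A', 'C']
```

Because the parameters travel with the classification rather than with the objects, changing a threshold re-evaluates every object instantly, which is the 'reversal' described above.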
