Iterative Data Modelling
Over the past few years, we have given various nodegoat workshops to groups of scholars and students. Even though the entry level of the participants varied from workshop to workshop, similar challenges emerged every time. These challenges can be grouped into the following three questions:
- What is a relational database?
- My material is very vague/ambiguous/uncertain/contradictory/unique/special, how can I use this in a database?
- How do I use the nodegoat interface?
Since most of the workshops we give are nodegoat-specific, we aim to teach participants how to do data modelling from within the nodegoat interface. Because of this, and because of the usual time constraints (often half a day), we have to leave the first two, more fundamental, questions largely untouched. To remedy this, we have written two blog posts that cover these two questions. The third question is addressed in the nodegoat video tutorials, the FAQ & forum, and in the near future the documentation.
- The first blog post "What is a Relational Database?" introduces the concept of a relational database from the point of view of a research process. Instead of starting with an abstract data model, we describe the way in which a relational database can be of use for any research process that deals with variety and complexity in sources.
- The second blog post "Formulating Ambiguity in Databases" discusses the way in which scholars in the humanities can work with 'data' and how they can overcome challenges related to vague or contradictory sources.
- The third question is dealt with in the nodegoat video tutorials.
Iterative Data Modelling as a Teaching Practice and a Research Method
While teaching about databases, we realised that it is very helpful to treat these three questions as three distinct levels of data modelling: the conceptual level, the logical level, and the interface level.
Conceptual level
Understanding what a relational database is: being able to think in an abstract manner about different kinds of information and the relationships between these different kinds of information. Once you have mastered this, you are able to conceptualise a data model.
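As a minimal sketch of what such a conceptual data model could look like when written down, the example below lists kinds of information and the relationships between them. It assumes a hypothetical project on correspondence; the names `ObjectType`, `Relationship`, and the Person/Letter/City example are illustrative assumptions and do not reflect how nodegoat stores a model.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ObjectType:
    """A kind of information in the research process (hypothetical)."""
    name: str
    attributes: List[str] = field(default_factory=list)


@dataclass
class Relationship:
    """A named link from one kind of information to another."""
    source: str
    label: str
    target: str


# Illustrative conceptual model for a correspondence project.
object_types = [
    ObjectType("Person", ["name", "date of birth"]),
    ObjectType("Letter", ["date", "summary"]),
    ObjectType("City", ["name"]),
]
relationships = [
    Relationship("Letter", "sent by", "Person"),
    Relationship("Letter", "received by", "Person"),
    Relationship("Letter", "sent from", "City"),
]


def describe(rels: List[Relationship]) -> List[str]:
    """Render each relationship as a readable line."""
    return [f"{r.source} --{r.label}--> {r.target}" for r in rels]


for line in describe(relationships):
    print(line)
```

At this level nothing is said about tables or interfaces yet; the point is only to make the kinds of information and their connections explicit.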
Logical level
Understanding how to properly define the data model in a database: being able to come up with a logical data model that is able to store the different kinds of data in your research process and addresses the ambiguities in your research data.
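To illustrate the step from a conceptual to a logical data model, here is a small sketch using Python's built-in `sqlite3` module. It assumes a hypothetical project on letters; the table and column names are our own illustration, not nodegoat's internal structure. The uncertain date of a letter is stored as a range with a note, which is one possible way (among several) to record ambiguity.

```python
import sqlite3

# In-memory database, used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE letter (
    id            INTEGER PRIMARY KEY,
    sender_id     INTEGER REFERENCES person(id),
    date_earliest TEXT,  -- lower bound of an uncertain date
    date_latest   TEXT,  -- upper bound of an uncertain date
    date_note     TEXT   -- why the date is uncertain
);
""")

# Hypothetical example data: an undated letter with an inferred date range.
conn.execute("INSERT INTO person (id, name) VALUES (1, 'Sender A')")
conn.execute(
    "INSERT INTO letter (sender_id, date_earliest, date_latest, date_note) "
    "VALUES (?, ?, ?, ?)",
    (1, "1650-01-01", "1652-12-31", "undated; range inferred from content"),
)

# The relationship between letter and person is followed with a join.
rows = conn.execute(
    "SELECT person.name, letter.date_earliest, letter.date_latest "
    "FROM letter JOIN person ON person.id = letter.sender_id"
).fetchall()
print(rows)
```

Storing a range instead of a single date keeps the uncertainty visible in the data, rather than forcing a choice at data entry.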
Interface level
Once you have a conceptual data model and you are able to create a logical data model, you are ready to translate these to an actual database application.
In our experience, it is important to acknowledge these three distinct levels in your data modelling process. This prevents you from getting stuck at the level of the interface when you are actually struggling with a conceptual question. Your own understanding of your data model matters most. The database technology used is only the means of accessing that data model.
Once you have established these three levels as distinct phases in your data modelling process, you can also start to move between these levels. By taking an iterative approach, you can 'prototype' conceptual data models in your database application to see whether they work out and are able to answer your research questions. You can also start with a simple conceptual data model and build this in your database application to learn about the functionalities of the application. You can then go back to your conceptual data model and expand it.
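The iterative approach described above can be sketched as two small steps, again using Python's built-in `sqlite3` module. This is an illustration under assumptions: the functions `iteration_one`, `iteration_two`, and `table_columns`, and the Person/City example, are hypothetical and stand in for whatever database application you use.

```python
import sqlite3


def iteration_one(conn: sqlite3.Connection) -> None:
    """First prototype: a deliberately simple model with one kind of object."""
    conn.execute(
        "CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )


def iteration_two(conn: sqlite3.Connection) -> None:
    """Second pass: expand the model after revisiting the conceptual level."""
    conn.execute(
        "CREATE TABLE city (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )
    conn.execute(
        "ALTER TABLE person ADD COLUMN birthplace_id INTEGER REFERENCES city(id)"
    )


def table_columns(conn: sqlite3.Connection, table: str) -> list:
    """Return the column names of a table, in definition order."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


conn = sqlite3.connect(":memory:")
iteration_one(conn)
before = table_columns(conn, "person")
iteration_two(conn)
after = table_columns(conn, "person")
print(before, after)
```

Each iteration leaves a working database behind, so the model can be tested against the research questions before it is expanded further.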
At the DH2017 conference in Montreal this August, we will present a paper on iterative data modelling in the humanities.