The corpus application was developed by the INT. The backend of the application is BlackLab, a Lucene-based search engine developed for corpora with token-based annotation (http://inl.github.io/BlackLab/). The web-based frontend is a further development of the corpus-frontend application developed by the INT (https://github.com/INL/corpus-frontend) in CLARIN and CLARIAH projects. Its design is inspired by the first version of the OpenSoNaR user interface by Tilburg and Radboud University (https://github.com/Taalmonsters/WhiteLab2.0).
The Corpus Middelnederlands in the current release is a collection of 372 documents from the period 1300-1550. The main sources for this corpus are the rhyming texts and prose texts from the CD-ROM Middle Dutch, compiled by the INL/INT and published in 1998 by the Flemish Standaard Uitgeverij and the Dutch Sdu. (The CD-ROM texts taken from the Corpus Gysseling have not been included here; for these texts, please consult the online Corpus Gysseling.)
This corpus is in part an extension of the Corpus Middelnederlands that was integrated in the Nederlab portal in 2018. Two completely French texts were removed. The metadata has also been corrected.
|Corpus Gysseling (book)|Corpus Gysseling (corpus application)|Corpus Middle Dutch (CD-ROM)|Corpus Middle Dutch (corpus application)|
| | |rhyming texts (1300-1550)|rhyming texts (1300-1550)|
| | |prose texts (1300-1550)|prose texts (1300-1550)|
| | | |additional texts, e.g. Hattem (1300-1550)|
The Corpus Middelnederlands presented in this application contains classical works of Middle Dutch literature like Beatrijs, Van den vos Reynaerde, the abele spelen, the stories about King Arthur or about Charlemagne, all texts from the famous Gruuthuse manuscript (including the Egidius song), but also many of the lesser known or less researched texts, such as prose adaptations of the rhyming knight’s tales (the so-called ‘chapbooks’), collections of songs such as the Antwerp Songbook, several Bible translations, hagiographies, books of prayer, chronicles, and all kinds of religious, didactic and scientific treatises, medical manuals and recipes.
A number of texts belonging to the so-called artes literature have been added to these source texts, such as the Hattem manuscript (C5) and Van der proprieteyten der dinghen ('On the Properties of Things') by Bartholomeus Anglicus – both published and made available by the Werkgroep Middelnederlandse Artesliteratuur (WEMAL) – and the Circa Instans and the Trotula.
All texts from the CD-ROM Middle Dutch included in this corpus have already been published in print, in either diplomatic or critical editions. New editions or improved versions of existing editions that appeared after the publication of the CD-ROM Middle Dutch are not included in this corpus.
The texts presented here are as complete as possible. Overlaps or duplications have been avoided, except in cases where the other versions offer additional Middle Dutch text material or different translations of the same source text. For Van Maerlant's Spiegel historiael, for instance, this means that not only the text of the De Vries and Verwijs edition is given, but also the text of all fragments that fill textual gaps in this edition. For Tondalus’ visioen, the different translations of the same Latin source are included. A copy of a text containing interesting textual variants, on the other hand, has not been included.
Unlike other historical corpora of the Dutch Language Institute (Instituut voor de Nederlandse Taal), the Corpus Middelnederlands has not yet been annotated with part of speech and lemma. To make the corpus more accessible, suggestions for query expansion are given, using the INT lexicon service with the historical computational lexicon GiGaNT-HILEX.
The current version of GiGaNT-HILEX in the lexicon service contains the lexicon modules based on the Dictionary of the Dutch Language and the Dictionary of Middle Dutch.
If you want to make use of this service, please contact Katrien Depuydt (email@example.com).
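As a rough illustration of what query expansion amounts to on a corpus without lemma annotation, the sketch below turns a set of spelling variants into a single token query in Corpus Query Language. It is a minimal sketch under stated assumptions, not the application's actual implementation: the `VARIANTS` table is a small hand-written stand-in for a lexicon lookup (the lexicon service's real interface is not shown here), the listed spellings are illustrative only, and the annotation name `word` is assumed.

```python
import re

# Hypothetical stand-in for a lexicon lookup. The spellings are illustrative
# examples only and do not come from GiGaNT-HILEX.
VARIANTS = {
    "coninc": ["coninc", "conync", "coninck", "koninc"],
}


def expand_to_cql(term, variants=VARIANTS):
    """Build a CQL token query that matches any listed spelling of `term`.

    Falls back to the term itself when no variants are known.
    The annotation name "word" is an assumption.
    """
    forms = variants.get(term, [term])
    # Remove duplicates while preserving the original order.
    unique_forms = list(dict.fromkeys(forms))
    # Escape regex metacharacters, since CQL values are regular expressions.
    alternation = "|".join(re.escape(form) for form in unique_forms)
    return f'[word="{alternation}"]'


print(expand_to_cql("coninc"))
```

The resulting pattern can be pasted into a search box that accepts Corpus Query Language; a term with no known variants simply yields a query for that single word form.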
When referring to the present website, please use the following reference:
Corpus Middelnederlands (January 2021) [Online service]. Available at the Dutch Language Institute: http://hdl.handle.net/10032/tm-a2-r9
Part of the corpus data is already available. When using the data, please use the following reference:
Corpus Middelnederlands (Version 1.0) (1998) [Data set]. Available at the Dutch Language Institute.
For BlackLab:
Software available at: https://github.com/INL/BlackLab
Does, Jesse de, Jan Niestadt and Katrien Depuydt (2017), Creating research environments with BlackLab. In: Jan Odijk and Arjan van Hessen (eds.) CLARIN in the Low Countries, pp. 151-165. London: Ubiquity Press. DOI: https://doi.org/10.5334/bbi
For the corpus frontend:
Software available at: https://github.com/INL/corpus-frontend