The MIMORE project: simultaneous searching in three related databases

MIMORE

A Microcomparative Morphosyntactic Research tool

The MIMORE tool enables researchers to investigate morphosyntactic variation in the Dutch dialects by searching three related databases with a common on-line search engine. The search results can be visualized on geographic maps and exported for statistical analysis. The three databases involved are DynaSAND, DiDDD and GTRP.

DynaSAND

The data in DynaSAND, the dynamic syntactic atlas of the Dutch dialects, were collected between 2000 and 2005 through oral interviews (fieldwork and telephone) in about 300 locations across the Netherlands, Belgium and a small part of north-west France. Dialect speakers were asked to judge and/or translate some 150 test sentences. DynaSAND makes available the full recordings and transcriptions of these interviews. Together, the DynaSAND data cover syntactic variation in the Dutch language area in the following domains: the left periphery of the clause (the complementizer system and complementizer agreement), variation in subject pronoun form depending on syntactic position, subject pronoun doubling, cliticization on YES/NO, the reflexive system, fronting constructions (Wh-clauses, relative clauses, topicalization), word order and morphological variation in verb clusters, negation and quantification.

DiDDD

The data in DiDDD (Diversity in Dutch DP Design) were collected between 2005 and 2009 through oral and written interviews in about 200 locations in the Dutch language area, with a methodology closely parallel to that of DynaSAND. The data consist of translations of and judgements on test sentences. For 30 interviews there are sound recordings that have been aligned with their transcriptions. The DiDDD data cover morphosyntactic variation within nominal groups, in particular possessives, partitives, noun ellipsis, the demonstrative system, the numeral modification system, what-for constructions, quantitative er, adjectival inflection, negation and exclamatives.

GTRP

The data in GTRP (Goeman, Taeldeman, van Reenen Project) were collected between 1979 and 2000 through oral interviews in about 600 locations in the Dutch language area. Informants were asked to translate words or short sentences. Parts of the transcriptions have been aligned with the sound recordings. The morphological data in GTRP include plural forms of nouns, diminutives, gender on nouns and adjectives, comparatives, superlatives, verbal inflection (including participles), and subject, object and possessive pronouns.

MIMORE

With the MIMORE search engine one can search these three databases simultaneously, using text strings, part-of-speech tags and syntactic variables. Researchers can combine categories and features into complex tags (see figure 1) or use predefined tags. All categories and features are linked to the ISOCAT standards. Since every sentence has a location code, the morphosyntactic phenomena found in the set of sentences returned by a search can be plotted automatically on a geographic map. More than one morphosyntactic phenomenon can be included in a single map, which makes potential correlations between these phenomena visible. There is also a user-friendly function to export the data for external use, e.g. in a statistical programme.
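The mapping and export steps rely only on each sentence carrying a location code. The following is a minimal sketch in Python, not MIMORE's actual interface or export format: the field names `location` and `phenomenon`, the location codes and the phenomenon labels are all invented for illustration. It groups search hits per location, so that co-occurring phenomena become visible, and writes the result as CSV for use in a statistical programme.

```python
import csv
import io
from collections import defaultdict

# Hypothetical search hits; location codes and labels are illustrative only.
hits = [
    {"location": "X001", "phenomenon": "subject_doubling"},
    {"location": "X001", "phenomenon": "complementizer_agreement"},
    {"location": "X002", "phenomenon": "subject_doubling"},
]

def phenomena_by_location(hits):
    """Group the phenomena attested in the hits by location code."""
    grouped = defaultdict(set)
    for hit in hits:
        grouped[hit["location"]].add(hit["phenomenon"])
    return dict(grouped)

def to_csv(grouped):
    """Serialize one row per location, phenomena sorted and joined by ';'."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["location", "phenomena"])
    for location in sorted(grouped):
        writer.writerow([location, ";".join(sorted(grouped[location]))])
    return buffer.getvalue()

grouped = phenomena_by_location(hits)
print(to_csv(grouped))
```

A location whose row lists more than one phenomenon is a candidate for the kind of correlation that a combined map would display.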

The importance of combining the three databases can be illustrated with the following example.

  • The GTRP database includes data on attributive possessive pronouns in singular and plural nominal groups and possessive pronouns in predicative position.
  • The DiDDD database provides data on possessive pronouns combined with nominal possessors, as in Piet z’n auto ‘Piet his car’ and de bakker z’n auto ‘the baker his car’, with noun ellipsis, as in de bakker z’n ‘the baker his’, and with question words, as in wie zijn auto ‘who his car’.
  • DynaSAND contains information on possessive pronouns in complex reflexives, e.g. zijn eigen ‘his own’.

By combining these data one can derive the complete possessive paradigm, including possessive inflection, for each dialect that occurs in all three databases.
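The combination described in this example can be pictured as a join on location. The sketch below is illustrative only: the dictionaries, location codes, paradigm slot names and forms are hypothetical placeholders, not actual records or field names from GTRP, DiDDD or DynaSAND.

```python
# Hypothetical per-database records: location code -> {paradigm slot: form}.
gtrp = {
    "X001": {"attributive_sg": "form_a"},
    "X002": {"attributive_sg": "form_b"},
}
diddd = {
    "X001": {"nominal_possessor": "form_c"},
    "X003": {"nominal_possessor": "form_d"},
}
dynasand = {
    "X001": {"complex_reflexive": "form_e"},
    "X002": {"complex_reflexive": "form_f"},
}

def combined_paradigms(*databases):
    """Merge paradigm slots for the locations present in every database."""
    shared = set(databases[0])
    for db in databases[1:]:
        shared &= set(db)
    paradigms = {}
    for location in sorted(shared):
        merged = {}
        for db in databases:
            merged.update(db[location])
        paradigms[location] = merged
    return paradigms

paradigms = combined_paradigms(gtrp, diddd, dynasand)
print(paradigms)
```

Only locations found in all three sources survive the intersection, which mirrors the restriction to dialects that occur in all three databases.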

Webserver

One of the deliverables of a CLARIN project is a webserver that demonstrates the tools and/or the curated data. Through this webserver, any humanities scholar with “CLARIN-permission” must be able to use the tool and/or data over the Internet from his or her own location.

Type        Link
Info        General information
Webservice  Mimore/search
Manual      documentation.pdf

Film

In 2011 a promotional film was made about the MIMORE project. This film can be seen here.

CLARIN Centre

Meertens Instituut

Project leader

Sjef Barbiers