Orð og tunga - 01.06.2014, Blaðsíða 64
in regional newspapers, thus printed texts. What is new about our
project on modern German is that we work with regionally balanced
corpora. Surprisingly, whereas corpora on earlier periods of German
up until the eighteenth century are all regionally balanced,2 there are
no such corpora for the nineteenth and the twenty-first century. So
for present-day German, we use the corpus of our Variantengrammatik
project, a joint undertaking of German, Austrian and Swiss teams
now based at the universities of Zurich, Salzburg and Graz.3 For the
nineteenth and the beginning of the twentieth century,
the Salzburg team compiled a small corpus of newspaper texts which
we refer to as the Kaiserreich ('Empire') corpus. The two corpora are,
of course, very different in size: The corpus of the Variantengrammatik
project comprises more than 600 million words, the Kaiserreich corpus
so far only 100,000 tokens.
In a view 'from below', we focus on spoken language data or written
data which are as close to speech as possible. For present-day German,
we used the so-called Pfeffer corpus with interview data mostly
from the 1960s (Pfeffer & Lohnes 1984, with a total of 670,000 words)
and we also looked at maps from our Atlas of Colloquial German (AdA)
(cf. Elspaß & Möller 2003ff.). The historical data 'from below' consist
of an 880,000-word corpus of nineteenth-century emigrant letters
(mainly based on the corpus of Elspaß 2005). Again, both corpora are
(more or less) regionally balanced.
4 Case studies
Based on the principle that variation is inherent to a modern standard
language and with regard to our corpora, our case studies focus on
the following research questions:
• How much variation did printed German allow in the nineteenth
century?
• Is (and was) the variation in non-standard German similar to or
rather different from printed standard German? Which tenden-
2 Such as the corpora of the Middle High German (1050-1350) and the Early New
High German (1350-1650) grammar or the German Manchester Corpus (1650-
1800), cf. Paul (2007), Reichmann & Wegera (1993) and Scheible et al. (2011).
3 The project is funded by the major research grant organizations of the three
countries: the Schweizerischer Nationalfonds (SNF) [100015L-134895], the Deutsche
Forschungsgemeinschaft (DFG) [EL 500/3-1] and the Austrian Science Fund (FWF) [I
716-G18].