Gripla - 2020, Page 28
Since we are specifically interested in determining whether A’s or C’s
divergent text is closer to the rest of the parallel text, this test environment
is indifferent to the question of whether A and C chapters 1–4 are,
as Erichsen suggested, abbreviations of a lost text. What matters more for
this stylometric setup is that both versions of the parallel text are of
substantial length. Once the documents are split in this manner, we arrive
at the word counts in Table 1.
Are these documents of sufficient length for stylometric purposes?
A-divergent in particular is quite short, possibly so short that any results
could be explained only by random
chance. Maciej Eder has studied the matter for a range of poetic and prose
corpora, attempting to arrive at a shortest acceptable length for reliable
stylometric authorship attribution.59 He observes that some corpora, such
as English novels, require documents to be at least 5000 words in length
before they provide acceptable results in stylometric authorship attribu-
tion. Meanwhile, results on Latin prose samples become acceptable at
2500 words.60
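To make the comparison concrete, the word counts reported in Table 1 can be checked against the two minimum lengths cited from Eder. The following is only a minimal sketch: the names WORD_COUNTS, THRESHOLDS and meets_threshold are illustrative, and applying thresholds derived from English novels and Latin prose to Old Norse saga prose is an assumption, not an established result.

```python
# Word counts as reported in Table 1.
WORD_COUNTS = {
    "A-parallel": 4002,
    "A-divergent": 3277,
    "C-parallel": 4013,
    "C-divergent": 4641,
}

# Minimum lengths cited from Eder (2015). Whether either figure
# transfers to Old Norse saga prose is an open question (assumption).
THRESHOLDS = {
    "English novels": 5000,
    "Latin prose": 2500,
}


def meets_threshold(word_counts, minimum):
    """Map each document name to True if it has at least `minimum` words."""
    return {name: count >= minimum for name, count in word_counts.items()}


for corpus, minimum in THRESHOLDS.items():
    result = meets_threshold(WORD_COUNTS, minimum)
    passing = [name for name, ok in result.items() if ok]
    print(f"{corpus} ({minimum} words): "
          f"{len(passing)} of {len(result)} documents are long enough")
```

Under these assumptions, all four documents clear the figure for Latin prose and none reaches the figure for English novels, which is why the position of saga prose between the two matters.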
It remains unclear where, precisely, we should place Old Norse saga
prose on this spectrum. From literature on the vocabulary of the Ís-
lendinga sögur, we can confidently state that saga texts have a rather small
vocabulary when compared to modern Icelandic texts.61
A STYLOMETRIC ANALYSIS OF LJÓSVETNINGA SAGA
Table 1: Document sizes
Document      Word count   Distinct terms
A-parallel    4002         1151
A-divergent   3277         913
C-parallel    4013         1159
C-divergent   4641         1280
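The two columns of Table 1 correspond to a count of running words and a count of distinct word forms. A minimal sketch of how such figures might be obtained follows. The tokenisation rule (lowercasing and splitting on runs of letters) and the function name document_size are assumptions made for illustration, not the procedure used in this study, and the sample sentence is invented.

```python
import re


def document_size(text):
    """Return (word count, distinct terms) for a text.

    Assumed tokenisation: lowercase the text and treat every
    unbroken run of Unicode letters as one word.
    """
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    return len(tokens), len(set(tokens))


# Invented sample sentence, used only to show the two counts.
sample = "Þorgeirr hét maðr. Maðr hét Guðmundr."
words, distinct = document_size(sample)
print(f"{words} words, {distinct} distinct terms")
```

A different tokenisation, or counting lemmas rather than surface forms, would give different figures for the same text.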
59 Maciej Eder, “Does Size Matter? Authorship Attribution, Small Samples, Big Problem,”
Digital Scholarship in the Humanities 30:2 (2015): 167–182.
60 Maciej Eder, “Does Size Matter?” 180.
61 The narrowness of saga vocabulary relative to, for instance, modern Icelandic texts was
demonstrated quantitatively in the late 1980s and early 1990s, as is discussed in
Örnólfur Thorsson, “Orð af orði: hefð og nýmæli í Grettlu” (Doctoral thesis, University
of Iceland, 1994): 35–36.