Orð og tunga - 01.06.2012, Blaðsíða 25
Matthew Whelpton: From human-oriented dictionaries
5 Conclusion
This paper began by setting up a contrast between the demands placed
on the traditional dictionary for human use and those placed on the
lexical resource for NLP use. Because the human user brings a vast
amount of world and cultural knowledge to the task of dictionary use,
supplemented by robust common-sense reasoning skills, the dictionary
creator can assume all sorts of semantic information as understood. A
computer, by contrast, brings nothing to the lexical semantic resource
beyond the algorithms it has been programmed with, so the creator of
an NLP resource must include a rich set of information in a systematic
and explicit manner and in a format which is suitable for algorithmic
manipulation. It is not surprising, then, to find the creators of each
of these resources treading the delicate line between the modelling of
linguistic organisation and of conceptual organisation.
As the final discussion concerning the differences between SALDO and
NSM shows, there is also a tension between potentially universal
properties of linguistic organisation and the idiosyncratic properties
of particular languages. NSM aims at a universal paraphrase language
for the conceptual primitives underlying lexical organisation in human
languages; SALDO is emphatically monolingual in its approach. The
tension between universal and particular is built into WordNet: at the
root of the WordNet hierarchies are abstract terms such as "entity",
which serve to root the forest of hyponymy hierarchies beneath them
and which are likely to be shared by wordnets for other languages; but
the bulk of the relational information represented is potentially
idiosyncratic and reflected in the distribution of lexical gaps and
the elaboration of hyponymy distinctions further down the tree.
Nevertheless, the Princeton WordNet was developed as an analysis of
English lexical semantic organisation and as such is a monolingual
resource. Similarly, DanNet was explicitly monolingual in its
methodology, basing its structure on a monolingual corpus-based
dictionary rather than on translation from the Princeton WordNet. This
monolingual emphasis is shared by both Icelandic resources presented
in this volume, which seek to characterise the lexical semantic
organisation of Icelandic in its own terms, without importing a
structure from resources developed for other languages (e.g. by
translation of WordNet or DanNet).
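
The picture of abstract root terms with language-particular elaboration
further down the tree can be made concrete with a small sketch. The
hierarchy below is a toy illustration only: its entries are invented for
the example and are not drawn from the Princeton WordNet, DanNet or
either Icelandic resource, and the names HYPERNYM, hypernym_path and
depth are hypothetical. It assumes that each term has a single hypernym,
which real wordnets do not guarantee.

```python
# Toy hyponymy hierarchy: each term maps to its hypernym (parent).
# Illustrative entries only; not taken from any actual wordnet.
HYPERNYM = {
    "entity": None,       # abstract root, the kind of term likely shared across languages
    "object": "entity",
    "animal": "object",
    "dog": "animal",      # finer distinctions lower down are where languages may differ
    "horse": "animal",
}


def hypernym_path(term, hypernym=HYPERNYM):
    """Return the chain of terms from `term` up to the root of its hierarchy."""
    path = [term]
    while hypernym[path[-1]] is not None:
        path.append(hypernym[path[-1]])
    return path


def depth(term, hypernym=HYPERNYM):
    """Return the number of hypernym steps between `term` and the root."""
    return len(hypernym_path(term, hypernym)) - 1
```

In this sketch a lexical gap would correspond to a node that one
language's table contains and another's lacks, so that the two
hierarchies agree near the root and diverge further down.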
Another important characteristic shared by all three of the resour-