Orð og tunga - 01.06.2012, page 48
Schütze, Hinrich. 1993. Word Space. In: S. J. Hanson, J. D. Cowan & C. L. Giles
(eds.). Advances in Neural Information Processing Systems, 5, pp. 895-902.
San Mateo, California: Morgan Kaufmann.
Sigrún Helgadóttir. 2004. Mörkuð íslensk málheild. In: Samspil tungu og tækni,
pp. 65-71. Reykjavík: Menntamálaráðuneytið.
Snara. http://snara.is. (30.06.2011)
Whelpton, Matthew. 2012. From human-oriented dictionaries to
computer-oriented lexical resources - trying to pin down words. (This issue).
WordNet. http://www.princeton.edu/wordnet/. (20.10.2011)
Abstract
This article describes the work on a semantic database for Icelandic language
technology. The database is being developed using a monolingual approach with
automatic methods for the extraction of semantic information from texts. Both
pattern-based and statistical methods are used, as well as a hybrid methodology.
The database already contains about 134,000 words, primarily nouns, and more than
one million relations. The number of relations might change during the last stage of
development, which consists of automatically validating the results. This will be
done, for example, by using the results of one extraction method to support or reject
the results of another.
The structure of the database is not based on hierarchies, as is, for example, the
Princeton WordNet, but rather on clusters of strongly related words and on semantic
relations that often describe common-sense knowledge and associations.
After its release at the beginning of 2012, the database will be freely available.
Lykilorð
merkingarbrunnur, orðanet, máltækni, merkingarvensl, merkingarupplýsingar
Keywords
semantic database, wordnet, language technology, semantic relations, semantic
information
Anna B. Nikulásdóttir
Háskóli Íslands
anna.b.nik@gmk.de