Íslenskt orðanet

Íslenskt orðanet (Icelandic wordnet) describes the semantic relations of Icelandic words and phrasemes, each treated as a semantically unambiguous unit. The basis of the project is a collection of phrasemes and compounds in a standardised representation, comprising more than 200,000 phrasemes of various kinds and about 100,000 compounds.


About Íslenskt orðanet
Íslenskt orðanet (Icelandic wordnet) is a research project that analyses and describes the semantic relations of Icelandic words and phrasemes. The methodology rests on the premise that semantic relations are indicated by syntagmatic relations as they appear in collocations and other word combinations. The basis of the project is a collection of phrasemes and compounds in a standardised representation, comprising more than 200,000 phrasemes of various kinds and about 100,000 compounds. The collection combines material from Stóra orðabókin um íslenska málnotkun (‘The Big Dictionary of Icelandic Usage’, Jón Hilmar Jónsson 2005) and the phraseological database of the Árni Magnússon Institute for Icelandic Studies. Data have also been gathered from the website Tímarit.is. All this material is linked to a lemma list of about 250,000 single-word and multi-word lemmas.

The semantic relations in question are of various kinds. The clearest and closest relations are synonymy and antonymy, but synonym relations vary in closeness. This difference is partly captured by distinguishing between synonyms and near-synonyms. In assessing the relations, the emphasis is on the evidence of the material, with the goal of obtaining numeric evidence of the semantic proximity and semantic relatedness of the words compared. The analysis also yields semantically homologous vocabulary, which is further sorted and placed under particular concepts and semantic fields.
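One way to picture "numeric evidence of semantic proximity" is to score two words by how many of their collocates they share. The sketch below uses the Jaccard index for this. The choice of measure, the function names and the collocate sets are all assumptions made for illustration; the text above does not say which measure the project uses, and the data here are invented.

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Share of collocates common to both sets (0.0 = none, 1.0 = identical)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Toy, invented collocate sets; NOT taken from the project's data.
collocates = {
    "gleði": {"mikill", "sannur", "einlægur", "fyllast"},
    "ánægja": {"mikill", "sannur", "einlægur", "lýsa"},
    "sorg": {"mikill", "djúpur", "þungur", "fyllast"},
}


def proximity(word1: str, word2: str) -> float:
    """Hypothetical proximity score: overlap of the two words' collocates."""
    return jaccard(collocates[word1], collocates[word2])


for pair in [("gleði", "ánægja"), ("gleði", "sorg"), ("ánægja", "sorg")]:
    print(pair, round(proximity(*pair), 3))
```

With these toy sets, the pair sharing three of five collocates scores higher than the pairs sharing fewer, which is the kind of graded contrast that could separate synonyms from near-synonyms and from merely related words.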

The lemmas are semantically unambiguous, which has a profound impact on the description of the semantic relations. For example, the arguments of verbs are taken to be part of the lemma, and verbal combinations of various kinds have independent status within the lemma list.

In most general dictionaries, individual lemmas appear as form-based units, where the entry can be divided into different senses and numbered subdivisions as appropriate. In the Icelandic wordnet, however, the focus is on the lemma as a monosemous lexical unit. This widens the scope of the lemma list compared to traditional semasiological dictionaries, and inclusion in the lemma list depends on whether a potential lemma shows clear relations to other lemmas.
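The contrast between a form-based entry and monosemous lemmas can be sketched as two small data structures. Everything in this sketch is an assumption made for illustration: the lemma strings, the English glosses, the relation label and the helper names `relate` and `has_relations` are hypothetical and are not copied from the project's database.

```python
from collections import defaultdict

# Form-based view: one headword with numbered senses (as in a general dictionary).
form_based_entry = {
    "taka": {1: "take something", 2: "notice something", 3: "receive something"},
}

# Monosemous view: each combination, arguments included, is a lemma of its own.
# The strings and glosses are illustrative, not copied from the project's data.
lemmas = {
    "taka e-ð": "take something",
    "taka eftir e-u": "notice something",
    "taka við e-u": "receive something",
    "veita e-u athygli": "pay attention to something",
}

# relations[lemma] holds (relation type, other lemma) pairs.
relations = defaultdict(set)


def relate(a: str, b: str, kind: str) -> None:
    """Record a symmetric relation between two lemmas."""
    relations[a].add((kind, b))
    relations[b].add((kind, a))


def has_relations(lemma: str) -> bool:
    """True if the lemma is linked to at least one other lemma."""
    return bool(relations.get(lemma))


relate("taka eftir e-u", "veita e-u athygli", "near-synonym")

for lemma, gloss in lemmas.items():
    print(f"{lemma!r} ({gloss}): related = {has_relations(lemma)}")
```

Keying the relations on the full lemma string, rather than on the bare verb, means a relation attaches to exactly one meaning, so no sense number is needed to disambiguate it.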

Multi-word lemmas are prominent in the lemma list of the Icelandic wordnet. Their coordinated representation makes it possible to mark the lemma strings syntactically and thereby achieve active interaction between syntactic and semantic classification.

The database Þesárus contains all the data of the project. A selected part of the data is delimited and published on the website ordanet.is as a separate entity.

Jón Hilmar Jónsson
Research professor
Árni Magnússon Institute for Icelandic Studies
Office: Neshagi 16
Work phone: +354-525-4436
Fax: +354-562-7242
e-mail: jhj@hi.is
Web page: http://www.lexis.hi.is/jhj/jhj.html/

Jónsson, Jón Hilmar. 2009a. Ordforbindelser: Grunnelementer i ordboken? LexicoNordica 16: 161-179.

Jónsson, Jón Hilmar. 2009b. Lemmatisation of Multi-word Lexical Units: Motivation and Benefits. In: Henning Bergenholtz, Sandro Nielsen & Sven Tarp (eds.). Lexicography at a Crossroads. Dictionaries and Encyclopedias Today, Lexicographical Tools Tomorrow. Pp. 165-194. Bern: Peter Lang.

Jónsson, Jón Hilmar. 2009c. Lexicographic description: An onomasiological approach on the basis of phraseology. In: Sandro Nielsen & Sven Tarp (eds.). Lexicography in the 21st Century. In honour of Henning Bergenholtz. Pp. 257-280. Amsterdam: John Benjamins Publishing Company.

Jónsson, Jón Hilmar. 2012a. Að fanga orðaforðann: orðanet í þágu orðabókar. Orð og tunga 14: 39-65.

Jónsson, Jón Hilmar. 2012b. Adverb og adverbialer: En forsømt ordklasse i ordbøkene. In: Birgit Eaker et al. (eds.). Nordiska studier i lexikografi 11. Rapport från Konferensen om lexikografi i Norden, Lund 24-27 maj 2011. Pp. 367-376. Lund.

Jónsson, Jón Hilmar. 2005. Stóra orðabókin um íslenska málnotkun. Reykjavík: JPV útgáfa.

Jónsson, Jón Hilmar and Þórdís Úlfarsdóttir. 2011. Íslenskt orðanet: Et skritt mot en allmennspråklig onomasiologisk ordbok. LexicoNordica 18: 87-109.