This page pertains to UD version 2.

UD Faroese OFT

Language: Faroese (code: fo)
Family: Indo-European, Germanic

This treebank has been part of Universal Dependencies since the UD v2.2 release.

The following people have contributed to making this treebank part of UD: Daniel Zeman, Bjartur Mortensen, Francis Tyers.

Repository: UD_Faroese-OFT
Search this treebank on-line: PML-TQ
Download all treebanks: UD 2.2

License: CC BY-SA 4.0

Genre: wiki

Questions, comments? General annotation questions (either Faroese-specific or cross-linguistic) can be raised in the main UD issue tracker. You can report bugs in this treebank in the treebank-specific issue tracker on GitHub. If you want to collaborate, please contact [ftyers (æt) hse • ru]. Development of the treebank happens outside the UD repository. If there are bugs, either the original data source or the conversion procedure must be fixed. Do not submit pull requests against the UD repository.

Annotation Source

Lemmas: annotated manually in non-UD style, automatically converted to UD
UPOS: annotated manually in non-UD style, automatically converted to UD
XPOS: annotated manually
Features: annotated manually in non-UD style, automatically converted to UD
Relations: annotated manually, natively in UD style

Description

This is a treebank of Faroese based on the Faroese Wikipedia.

The sentences were drawn from the Faroese Wikipedia. The whole Wikipedia was analysed using Trond Trosterud’s tools for Faroese,[1] and every sentence containing an unknown word was discarded.
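The filtering step can be sketched as follows. This is a minimal illustration, not the pipeline actually used: it assumes each analysed sentence is available as a list of `(token, analyses)` pairs, where an empty `analyses` list stands for a word the analyser did not recognise. The function name and the data layout are hypothetical.

```python
def keep_fully_analysed(sentences):
    """Keep only sentences in which every token has at least one analysis.

    `sentences` is assumed to be a list of sentences, each a list of
    (token, analyses) pairs; an empty analyses list marks an unknown word.
    """
    return [
        sentence
        for sentence in sentences
        if all(analyses for _token, analyses in sentence)
    ]
```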

The remaining sentences were manually annotated with Universal Dependencies relations, while the morphology and POS tags were converted to UD deterministically using a lookup table. Errors in the original morphology and disambiguation were corrected where found.
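A deterministic lookup-table conversion of this kind can be pictured as plain dictionary lookups. The sketch below is illustrative only: the source tags (`N`, `Sg`, `Nom`, and so on) and the contents of `UPOS_MAP` and `FEAT_MAP` are hypothetical placeholders, not the table used for this treebank.

```python
# Hypothetical lookup tables: source-analysis tag -> UD value.
# The entries are illustrative assumptions, not the treebank's real mapping.
UPOS_MAP = {"N": "NOUN", "V": "VERB", "A": "ADJ"}
FEAT_MAP = {
    "Sg": ("Number", "Sing"),
    "Pl": ("Number", "Plur"),
    "Nom": ("Case", "Nom"),
    "Acc": ("Case", "Acc"),
}


def convert(tags):
    """Convert a list of source tags into a (UPOS, FEATS) pair by lookup."""
    upos = "X"  # fallback when no part-of-speech tag is mapped
    feats = {}
    for tag in tags:
        if tag in UPOS_MAP:
            upos = UPOS_MAP[tag]
        elif tag in FEAT_MAP:
            name, value = FEAT_MAP[tag]
            feats[name] = value
    # CoNLL-U writes features as Name=Value pairs joined by "|",
    # sorted by name; "_" stands for an empty feature set.
    feat_str = "|".join(f"{k}={v}" for k, v in sorted(feats.items())) or "_"
    return upos, feat_str
```

Because the same input tags always yield the same output, an error found in the converted data can be traced back either to the table or to the source analysis.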

As can be expected from Wikipedia text, the treebank contains many copular sentences and very few first- or second-person forms.
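One way to check this observation on the released data is to count `cop` relations and `Person` feature values in the CoNLL-U file. The sketch below assumes the standard ten-column CoNLL-U layout; the function name is hypothetical and no counts for this treebank are claimed here.

```python
from collections import Counter


def count_cop_and_person(conllu_text):
    """Count `cop` relations and Person=... feature values in CoNLL-U text."""
    counts = Counter()
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skip multiword-token ranges and empty nodes
        feats, deprel = cols[5], cols[7]
        if deprel == "cop":
            counts["cop"] += 1
        for feat in feats.split("|"):
            if feat.startswith("Person="):
                counts[feat] += 1
    return counts
```

Reading the treebank file into a string and passing it to `count_cop_and_person` gives a `Counter` keyed by `"cop"` and by values such as `"Person=3"`.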

  1. http://gtweb.uit.no/cgi-bin/smi/smi.cgi?text=%C3%81+tunguni+eru+sm%C3%A1ar+tenn.&action=analyze&lang=fao&plang=eng

Acknowledgments

The morphological analysis and preliminary disambiguation were produced by Trond Trosterud’s finite-state morphology and constraint grammar for Faroese.

If you use this treebank in your work, please cite:

@inproceedings{tyersetal18-faroese,
author = {Francis M. Tyers and Mariya Sheyanova and Alexandra Martynova and Pavel Stepachev and Konstantin Vinogradovsky},
title = {Multi-source synthetic treebank creation for improved cross-lingual dependency parsing},
booktitle = {Proceedings of the Second Workshop on Universal Dependencies (UDW 2018)},
pages = {144--150},
year = 2018
}

Statistics of UD Faroese OFT

POS Tags

ADJ, ADP, ADV, CCONJ, DET, INTJ, NOUN, NUM, PART, PRON, PROPN, PUNCT, SCONJ, VERB, X

Features

Case, Definite, Degree, Gender, Mood, Number, NumType, Person, PronType, Reflex, Tense, VerbForm, Voice

Relations

acl, acl:cleft, acl:relcl, advcl, advmod, amod, appos, aux, aux:pass, case, cc, cc:preconj, ccomp, compound, conj, cop, csubj, dep, det, discourse, expl, flat, iobj, mark, nmod, nmod:poss, nsubj, nsubj:pass, nummod, obj, obl, orphan, parataxis, punct, root, xcomp

Tokenization and Word Segmentation

Morphology

Tags

Nominal Features

Degree and Polarity

Verbal Features

Pronouns, Determiners, Quantifiers

Other Features

Syntax

Auxiliary Verbs and Copula

Core Arguments, Oblique Arguments and Adjuncts

Here we consider only relations between verbs (parent) and nouns or pronouns (child).
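The restriction to verb parents and noun or pronoun children can be reproduced on a CoNLL-U file roughly as follows. This is a sketch under assumptions: "verb" is taken to mean UPOS `VERB` and "nouns or pronouns" to mean UPOS `NOUN` or `PRON`, which may not match the exact criteria used to generate the statistics. The function name is hypothetical.

```python
from collections import Counter


def verb_nominal_relations(conllu_text):
    """Count deprels whose head is a VERB and whose dependent is NOUN or PRON."""
    counts = Counter()
    sentence = []  # token rows of the sentence currently being read

    def flush():
        # Map token ID -> UPOS so each dependent can look up its head's tag.
        upos_by_id = {row[0]: row[3] for row in sentence}
        for row in sentence:
            child_upos, head, deprel = row[3], row[6], row[7]
            if child_upos in {"NOUN", "PRON"} and upos_by_id.get(head) == "VERB":
                counts[deprel] += 1
        sentence.clear()

    for line in conllu_text.splitlines():
        if not line.strip():
            flush()  # a blank line ends a sentence
        elif not line.startswith("#"):
            cols = line.split("\t")
            if len(cols) == 10 and cols[0].isdigit():
                sentence.append(cols)
    flush()  # handle a final sentence with no trailing blank line
    return counts
```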

Verbs with Reflexive Core Objects

Relations Overview