UD Warlpiri UFAL
Language: Warlpiri (code: wbp)
Family: Pama-Nyungan
This treebank has been part of Universal Dependencies since the UD v2.2 release.
The following people have contributed to making this treebank part of UD: Daniel Zeman.
Repository: UD_Warlpiri-UFAL
Search this treebank on-line: PML-TQ
Download all treebanks: UD 2.2
License: CC BY-SA 4.0
Genre: grammar-examples
Questions, comments? General annotation questions (either Warlpiri-specific or cross-linguistic) can be raised in the main UD issue tracker. You can report bugs in this treebank in the treebank-specific issue tracker on GitHub. If you want to collaborate, please contact [zeman (æt) ufal • mff • cuni • cz]. Development of the treebank happens directly in the UD repository, so you may submit bug fixes as pull requests against the dev branch.
Annotation | Source |
---|---|
Lemmas | annotated manually, natively in UD style |
UPOS | annotated manually, natively in UD style |
XPOS | not available |
Features | annotated manually, natively in UD style |
Relations | annotated manually, natively in UD style |
Description
A small treebank of grammatical examples in Warlpiri, taken from linguistic literature.
Acknowledgments
The initial set of example sentences is taken from Timothy Shopen (ed.) (2007): Language Typology and Syntactic Description, Volume I: Clause Structure (second edition). Cambridge University Press, Cambridge, UK. ISBN 978-0-521-58156-1
Statistics of UD Warlpiri UFAL
POS Tags
ADJ – ADP – AUX – NOUN – PRON – PROPN – PUNCT – VERB
Features
Case – Clitic – Mood – Number – Number[obj] – Person – Person[dat] – Person[obj] – Person[sdat] – PronType – Tense – VerbForm
Relations
acl – advcl – advmod – amod – aux – case – iobj – nmod:poss – nsubj – obj – obl – obl:tmod – punct – root – xcomp
Tokenization and Word Segmentation
- This corpus contains 55 sentences, 306 tokens and 314 syntactic words.
- This corpus contains 56 tokens (18%) that are not followed by a space.
- This corpus does not contain words with spaces.
- This corpus does not contain words that contain both letters and punctuation.
- This corpus contains 8 multi-word tokens. On average, one multi-word token consists of 2.00 syntactic words.
- There are 3 types of multi-word tokens. Examples: Ngarrkangkuka, Karlirna, Ngarrkarna.
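The figures above fit together arithmetically: 306 tokens, of which 8 are multi-word tokens that each expand to 2 syntactic words, give 306 − 8 + 16 = 314 words. The following is a minimal sketch of how such counts could be derived from a CoNLL-U file, assuming the standard 10-column format. The helper names (`row`, `conllu_counts`) and the sample data are hypothetical, and the placeholder forms are not taken from the treebank.

```python
def row(tok_id, form, misc="_"):
    """Build one 10-column CoNLL-U line; unused columns are left as '_'."""
    return "\t".join([tok_id, form] + ["_"] * 7 + [misc])

# Toy data with placeholder forms (NOT taken from the treebank).
SAMPLE = "\n".join([
    "# text = Xy z.",
    row("1-2", "Xy"),                  # multi-word token covering words 1 and 2
    row("1", "X"),
    row("2", "y"),
    row("3", "z", "SpaceAfter=No"),    # "z" is directly followed by "."
    row("4", "."),
    "",
    "# text = w.",
    row("1", "w", "SpaceAfter=No"),
    row("2", "."),
    "",
])

def conllu_counts(text):
    """Count sentences, surface tokens, syntactic words and multi-word tokens."""
    counts = {"sentences": 0, "tokens": 0, "words": 0, "mwts": 0, "no_space": 0}
    in_sentence = False
    covered_until = 0  # last word ID covered by the current multi-word token
    for line in text.splitlines():
        if not line.strip():           # a blank line ends the sentence
            in_sentence = False
            continue
        if line.startswith("#"):       # sentence-level comment
            continue
        cols = line.split("\t")
        if not in_sentence:
            in_sentence = True
            covered_until = 0
            counts["sentences"] += 1
        tok_id, misc = cols[0], cols[9]
        if "." in tok_id:              # empty node (e.g. "3.1"): not counted
            continue
        if "-" in tok_id:              # multi-word token range, e.g. "1-2"
            counts["mwts"] += 1
            covered_until = int(tok_id.split("-")[1])
            surface = True
        else:                          # ordinary syntactic word
            counts["words"] += 1
            surface = int(tok_id) > covered_until
        if surface:
            counts["tokens"] += 1
            if "SpaceAfter=No" in misc.split("|"):
                counts["no_space"] += 1
    return counts

print(conllu_counts(SAMPLE))
```

On the toy sample this reports 2 sentences, 5 tokens, 6 words, 1 multi-word token and 2 tokens not followed by a space. Words inside a multi-word token range count as syntactic words but not as separate surface tokens.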
Morphology
Tags
- This corpus uses 8 UPOS tags out of 17 possible: ADJ, ADP, AUX, NOUN, PRON, PROPN, PUNCT, VERB
- This corpus does not use the following tags: DET, NUM, ADV, SCONJ, CCONJ, PART, INTJ, SYM, X
- This corpus contains 5 lemmas tagged as pronouns (PRON): ngaju, ngajulu, nyuntu, rna, yali
- This corpus contains no lemmas tagged as determiners (DET).
- This corpus contains 1 lemma tagged as an auxiliary (AUX): ka
- There are 2 (de)verbal forms:
- Fin
- VERB: jarntirni, nyangu
- Inf
- VERB: ngunanjakurra, purlanyjarlarni, wajilipinyjarlarni, yinyjarlarni, ngarninjakurra, ngunanjakurraku, pantirninjakurra, wantinjakurra
Nominal Features
- Number
- Dual
- ADP: kulkurrujarra
- NOUN: wawirrijarra, yuwarlijarrarla
- Pauc
- NOUN: karlipatu
- Plur
- AUX: kalu, kalujana
- NOUN: Ngarrkapatu
- Sing
- AUX: karnangku, karna, kanpaju, karnapalangu, kanpa, kapirna, kapirnarla
- NOUN: kurdungku, maliki, Ngarrkangku, kurdu, kurduku, miyi, Karntagku, karntaku, yujuku, Ngarrka
- PRON: ngaju, ngajulurlu, nyuntuku, nyuntu, nyuntulurlu, rna, Ngajulurlurna, ngajuku, nyuntukurra
- VERB: Yungurnarla
- Case
- Abs
- ADJ: Wita
- NOUN: karli, maliki, Ngarrka, miyi, kurdu, karnta, ngapa, wawirri, yankirri, Lungkarda
- PRON: ngaju, nyuntu
- All
- NOUN: karrukurra
- PRON: nyuntukurra
- Cau
- NOUN: warrkijangka
- Cns
- NOUN: miyiwanawana
- Com
- NOUN: nantuwurlajinta
- Dat
- NOUN: kurduku, karntaku, warluku
- PRON: nyuntuku, ngajuku, yaliki
- VERB-Inf: ngunanjakurraku
- Ela
- NOUN: pirlingirli
- Erg
- ADJ: witangku
- NOUN: ngarrkangku, kurdungku, karntangku, Karntagku
- PRON: ngajulurlu, nyuntulurlu, Ngajulurlurna
- PROPN: Japanangkarlu
- Erg,Gen
- NOUN: warlkurrukurlurlu
- Gen
- NOUN: watiyakurlu
- Ins
- NOUN: kurlartarlu
- Loc
- NOUN: kardingka, karrungka, ngulyangka, wajirrkinyirla, parrjarla, yuwarlijarrarla
- Per
- NOUN: yurutuwana
Degree and Polarity
Verbal Features
- Mood
- Ind
- VERB-Fin: jarntirni, nyangu
- Tense
- Fut
- AUX: kapirna, kapirnarla
- Past
- VERB: nyangu, panturnu, Yungurnarla
- VERB-Fin: nyangu
- Pres
- AUX: ka, kaju, karnangku, karla, karna, kalu, kanpaju, karnapalangu, kalujana, kanpa
- VERB: wajilipinyi, nyanyi, jarntirni, parnkami, purlami, yinyi, marrijarrimi, milkiyirrarni, ngunami, pantirni
- VERB-Fin: jarntirni
Pronouns, Determiners, Quantifiers
- PronType
- Dem
- PRON: yaliki
- Prs
- PRON: ngaju, ngajulurlu, nyuntuku, nyuntu, nyuntulurlu, Ngajulurlurna, ngajuku, nyuntukurra
- Person
- 1
- AUX: karnangku, karna, karnapalangu, kapirna, kapirnarla
- PRON: ngaju, ngajulurlu, rna, Ngajulurlurna, ngajuku
- VERB: Yungurnarla
- 2
- AUX: kanpaju, kanpa
- PRON: nyuntuku, Ngajulurlu, nyuntu, nyuntulurlu, nyuntukurra
- 3
- AUX: kalu, kalujana
Other Features
- Clitic
- Yes
- PRON: rna
- Number[obj]
- Dual
- AUX: karnapalangu
- Plur
- AUX: kalujana
- Sing
- AUX: kaju, karnangku, kanpaju
- Person[dat]
- 3
- AUX: karla, kapirnarla, karlajinta
- VERB: Yungurnarla
- Person[obj]
- 1
- AUX: kaju, kanpaju
- 2
- AUX: karnangku
- 3
- AUX: karnapalangu, kalujana
- Person[sdat]
- 3
- AUX: karlajinta
Syntax
Auxiliary Verbs and Copula
- This corpus does not contain copulas.
- This corpus uses 1 lemma as an auxiliary (aux). Example: ka.
Core Arguments, Oblique Arguments and Adjuncts
Here we consider only relations between verbs (parent) and nouns or pronouns (child).
- nsubj
- VERB--NOUN-Abs (10)
- VERB--NOUN-Erg (15)
- VERB--PRON (2)
- VERB--PRON-Abs (6)
- VERB--PRON-Erg (8)
- VERB-Fin--NOUN-Erg (10)
- VERB-Inf--NOUN-Abs (1)
- VERB-Inf--NOUN-Dat (3)
- VERB-Inf--NOUN-Erg (2)
- obj
- VERB--NOUN-Abs (26)
- VERB--NOUN-Dat (1)
- VERB--PRON-Abs (3)
- VERB--PRON-Dat (4)
- VERB-Fin--NOUN-Abs (10)
- VERB-Inf--NOUN-Abs (6)
- iobj
- VERB--NOUN-Dat (5)
- VERB--PRON-Dat (1)
- VERB-Inf--NOUN-Dat (2)
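The pattern labels above appear to combine the parent's UPOS and VerbForm (when present) with the child's UPOS and Case (when present). The sketch below shows one way such pairs could be tallied for a single sentence. It is an illustration under that assumption, not the script that produced these statistics. The function names and the toy sentence are hypothetical, and the placeholder forms are not taken from the treebank.

```python
def row(tok_id, form, upos, feats, head, deprel):
    """One CoNLL-U row as a list of 10 columns; unused columns are '_'."""
    return [tok_id, form, "_", upos, "_", feats, head, deprel, "_", "_"]

# Toy sentence with placeholder forms (NOT taken from the treebank).
SENTENCE = [
    row("1", "w1", "NOUN", "Case=Erg", "3", "nsubj"),
    row("2", "w2", "NOUN", "Case=Abs", "3", "obj"),
    row("3", "w3", "VERB", "VerbForm=Fin", "0", "root"),
    row("4", ".", "PUNCT", "_", "3", "punct"),
]

def feature(feats, name):
    """Return the value of one feature from a FEATS string, or None."""
    for pair in feats.split("|"):
        key, _, value = pair.partition("=")
        if key == name:
            return value
    return None

def label(upos, value):
    """Format 'UPOS-Value', or just 'UPOS' when the feature is absent."""
    return f"{upos}-{value}" if value else upos

def core_argument_patterns(rows):
    """Count (relation, pattern) pairs for VERB parents with NOUN/PRON children."""
    by_id = {cols[0]: cols for cols in rows if cols[0].isdigit()}
    counts = {}
    for child in by_id.values():
        deprel = child[7]
        if deprel not in ("nsubj", "obj", "iobj"):
            continue
        if child[3] not in ("NOUN", "PRON"):
            continue
        parent = by_id.get(child[6])
        if parent is None or parent[3] != "VERB":
            continue
        pattern = (label("VERB", feature(parent[5], "VerbForm"))
                   + "--"
                   + label(child[3], feature(child[5], "Case")))
        key = (deprel, pattern)
        counts[key] = counts.get(key, 0) + 1
    return counts

for (deprel, pattern), n in sorted(core_argument_patterns(SENTENCE).items()):
    print(f"{deprel}: {pattern} ({n})")
```

For the toy sentence this yields one `nsubj` pair labeled `VERB-Fin--NOUN-Erg` and one `obj` pair labeled `VERB-Fin--NOUN-Abs`. A verb without a VerbForm feature would be labeled plain `VERB`, as in the first rows of each list above.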
Relations Overview
- This corpus uses 2 relation subtypes: nmod:poss, obl:tmod
- The following main type is never used alone; it is always subtyped: nmod
- The following 23 relation types are not used in this corpus at all: csubj, ccomp, vocative, expl, dislocated, discourse, cop, mark, appos, nummod, det, clf, conj, cc, fixed, flat, compound, list, parataxis, orphan, goeswith, reparandum, dep
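The three statements above can be cross-checked with simple set operations. The sketch below takes the relations listed in the Relations section of this page and compares them with the 37 universal relations of UD v2. The variable names are illustrative only.

```python
# The 37 universal dependency relations of UD v2.
UNIVERSAL = [
    "nsubj", "obj", "iobj", "csubj", "ccomp", "xcomp", "obl", "vocative",
    "expl", "dislocated", "advcl", "advmod", "discourse", "aux", "cop",
    "mark", "nmod", "appos", "nummod", "acl", "amod", "det", "clf", "case",
    "conj", "cc", "fixed", "flat", "compound", "list", "parataxis", "orphan",
    "goeswith", "reparandum", "punct", "root", "dep",
]

# Relations used in this treebank, as listed on this page.
USED = [
    "acl", "advcl", "advmod", "amod", "aux", "case", "iobj", "nmod:poss",
    "nsubj", "obj", "obl", "obl:tmod", "punct", "root", "xcomp",
]

# Subtyped relations carry a colon, e.g. "nmod:poss".
subtypes = sorted(r for r in USED if ":" in r)

# Main types that occur at all, with or without a subtype.
main_types_used = {r.split(":")[0] for r in USED}

# Main types that occur only in subtyped form.
always_subtyped = sorted(
    {r.split(":")[0] for r in subtypes} - {r for r in USED if ":" not in r}
)

# Universal relations that do not occur in any form.
unused = [r for r in UNIVERSAL if r not in main_types_used]

print(subtypes)
print(always_subtyped)
print(len(unused), unused)
```

The result agrees with the list above: 2 subtypes, `nmod` as the only main type that is always subtyped (`obl` also occurs on its own), and 23 unused relation types.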