Jump to content
 







Main menu
   


Navigation  



Main page
Contents
Current events
Random article
About Wikipedia
Contact us
Donate
 




Contribute  



Help
Learn to edit
Community portal
Recent changes
Upload file
 








Search  

































Create account

Log in
 









Create account
 Log in
 




Pages for logged out editors learn more  



Contributions
Talk
 



















Contents

   



(Top)
 


1 Dependency structures  





2 Function words  





3 Notes  





4 References  














Universal Dependencies







 

Edit links
 









Article
Talk
 

















Read
Edit
View history
 








Tools
   


Actions  



Read
Edit
View history
 




General  



What links here
Related changes
Upload file
Special pages
Permanent link
Page information
Cite this page
Get shortened URL
Download QR code
Wikidata item
 




Print/export  



Download as PDF
Printable version
 
















Appearance
   

 






From Wikipedia, the free encyclopedia
 


Universal Dependencies, frequently abbreviated as UD, is an international cooperative project to create treebanks of the world's languages.[1] These treebanks are openly accessible and available. Core applications are automated text processing in the field of natural language processing (NLP) and research into natural language syntax and grammar, especially within linguistic typology. The project's primary aim is to achieve cross-linguistic consistency of annotation, while still permitting language-specific extensions when necessary. The annotation scheme has it roots in three related projects: Stanford Dependencies,[2] Google universal part-of-speech tags,[3] and the Interset interlingua[4] for morphosyntactic tagsets. The UD annotation scheme uses a representation in the form of dependency trees as opposed to a phrase structure trees. At the present time (January 2022), there are just over 200 treebanks of more than 100 languages available in the UD inventory.

Dependency structures

[edit]

The UD annotation scheme produces syntactic analyses of sentences in terms of the dependencies of dependency grammar. Each dependency is characterized in terms of a syntactic function, which is shown using a label on the dependency edge. For example:[5]

First UD picture

This analysis shows that she, him, and a note are dependents of the left. The pronoun she is identified as a nominal subject (nsubj), the pronoun him as an indirect object (iobj) and the noun phrase a note as a direct object (obj) -- there is a further dependency that connects atonote, although it is not shown. A second example:

UD picture 2

This analysis identifies it as the subject (nsubj), is as the copula (cop), and for as a case marker (case), all of which are shown as dependents of the root word her, which is a pronoun. The next example includes an expletive and an oblique object:

UD picture 3

This analysis identifies there as an expletive (expl), food as a nominal subject (nsubj), kitchen as an oblique object (obl), and in as a case marker (case) -- there is also a dependency connecting thetokitchen, but it is not shown. The copula is in this case is positioned as the root of the sentence, a fact that is contrary to how the copula is analyzed in the second example just above, where it is positioned as a dependent of the root.

The examples of UD annotation just provided can of course give only an impression of the nature of the UD project and its annotation scheme. The emphasis for UD is on producing cross-linguistically consistent dependency analyses in order to facilitate structural parallelism across diverse languages. To this end, UD uses a universal POS tagset for all languages—although a given language does not have to make use of each tag. More specific information can be added to each word by means of a free morpho-syntactic feature set. The universal labels of dependency links can be specified with secondary relations, which are indicated as a secondary label behind a colon, e.g. nsubj:pass, following the "universal:extension" format.

Function words

[edit]

Within the dependency grammar community, the UD annotation scheme is controversial. The main bone of contention concerns the analysis of function words. UD chooses to subordinate function words to content words,[6] a practice that is contrary to most works in the tradition of dependency grammar.[7] To briefly illustrate this controversy, UD would produce the following structural analysis of the sentence given:

Fourth UD picture, illustrates analysis of function words

This example is taken from the article here.[8] An alternative convention for showing dependencies is now used, different from the convention above. Since the syntactic functions are not important for the point at hand, they are excluded from this structural analysis. What is important is the manner in which this UD analysis subordinates the auxiliary verb will to the content verb say, the preposition to to the pronoun you, the subordinator that to the content verb likes, and the particle to to the content verb swim.

A more traditional dependency grammar analysis of this sentence, one that is motivated more by syntactic considerations than by semantic ones, looks like this:[9]

UD picture 5

This traditional analysis subordinates the content verb say to the auxiliary verb will, the pronoun you to the preposition to, the content verb likes to the subordinator that, and the content verb swim to the participle to.

Notes

[edit]
  1. ^ de Marneffe, Marie-Catherine; Manning, Christopher D.; Nivre, Joakim; Zeman, Daniel (13 July 2021). "Universal Dependencies". Computational Linguistics. 47 (2): 255–308. doi:10.1162/coli_a_00402. S2CID 219304854.
  • ^ "Stanford Dependencies". nlp.stanford.edu. The Stanford Natural Language Processing Group. Retrieved 8 May 2020.
  • ^ Petrov, Slav (11 Apr 2011). "A Universal Part-of-Speech Tagset". arXiv:1104.2086 [cs.CL].
  • ^ "Interset". cuni.cz. Institute of Formal and Applied Linguistics (Czech Republic). Retrieved 8 May 2020.
  • ^ The three example analyses that appear in this section have been taken from the UD webpage here, examples 3, 21, and 23.
  • ^ The choice was led by Nivre (2015).
  • ^ The controversy surrounding UD and the status of function words in dependency grammar in general are discussed at length in Osborne & Gerdes (2019).
  • ^ The structure is (1b) in Osborne & Gerdes (2019) article.
  • ^ This structure is (1c) in Osborne & Gerdes (2019) article.
  • References

    [edit]
    Retrieved from "https://en.wikipedia.org/w/index.php?title=Universal_Dependencies&oldid=1184603957"

    Categories: 
    Linguistic units
    Syntax
    Dependency grammar
    Hidden categories: 
    Articles with short description
    Short description matches Wikidata
    Articles lacking in-text citations from April 2019
    All articles lacking in-text citations
     



    This page was last edited on 11 November 2023, at 13:29 (UTC).

    Text is available under the Creative Commons Attribution-ShareAlike License 4.0; additional terms may apply. By using this site, you agree to the Terms of Use and Privacy Policy. Wikipedia® is a registered trademark of the Wikimedia Foundation, Inc., a non-profit organization.



    Privacy policy

    About Wikipedia

    Disclaimers

    Contact Wikipedia

    Code of Conduct

    Developers

    Statistics

    Cookie statement

    Mobile view



    Wikimedia Foundation
    Powered by MediaWiki