udapi.core.root module¶
The Root class represents the technical root node of each tree.
- class udapi.core.root.Root(zone=None, comment='', text=None, newpar=None, newdoc=None)[source]¶
Bases: Node
Class for representing root nodes (technical roots) in UD trees.
- address()[source]¶
Full (document-wide) id of the root.
The general format of a root address is root.bundle.bundle_id + '/' + root.zone, e.g. s123/en_udpipe. If the zone is empty, the slash is omitted as well, e.g. s123. If the bundle is missing (which can occur during loading), '?' is used instead. The root's address is stored in CoNLL-U files as sent_id (in a special comment).
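The address rules above can be sketched in plain Python. The helper format_address below is hypothetical and not part of udapi; it only mirrors the documented behaviour, and it assumes that a missing bundle is passed as None.

```python
def format_address(bundle_id, zone):
    """Mimic the documented address format (illustrative helper, not udapi API)."""
    # Assumption: a missing bundle is signalled by None and rendered as '?'.
    prefix = '?' if bundle_id is None else bundle_id
    # An empty zone drops the slash as well.
    return prefix + '/' + zone if zone else prefix


print(format_address('s123', 'en_udpipe'))  # s123/en_udpipe
print(format_address('s123', ''))           # s123
```

In real code, calling root.address() is the way to get this string; the sketch is only meant to make the formatting rules concrete.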
- property bundle¶
Return the bundle which this tree belongs to.
- comment¶
- create_empty_child(**kwargs)[source]¶
Create and return a new empty node within this tree.
This root-specific implementation overrides Node.create_empty_child(). It is faster because it does not set deps and ord of the newly created node; it is up to the user to set these attributes correctly. It is used in udapi.block.read.conllu (where speed is important, so only raw_deps is set up instead of deps).
- create_multiword_token(words=None, form=None, misc=None)[source]¶
Create and return a new multi-word token (MWT) in this tree.
The new MWT can optionally be initialized using the following args.
Args:
words: a list of nodes which are part of the new MWT
form: string representing the surface form of the new MWT
misc: misc attribute of the new MWT
- property descendants¶
Return a list of all descendants of the current node.
The nodes are sorted by their ord. This root-specific implementation returns all the nodes in the tree except the root itself.
- property descendants_and_empty¶
- property document¶
- empty_nodes¶
- flatten(deprel='root')[source]¶
Flatten the tree (i.e. attach all nodes to the root) and reset all deprels.
This is equivalent to
    for node in root.descendants:
        node.parent = root
        node.deprel = 'root'
but it is faster.
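To make the effect of flattening concrete, here is a minimal sketch that uses a toy stand-in class instead of udapi's Node. ToyNode and flatten_slow are hypothetical names introduced only for this illustration; the loop is the slow equivalent quoted above, not the actual implementation of flatten().

```python
class ToyNode:
    """Minimal stand-in for a tree node (not udapi's Node class)."""

    def __init__(self, form, parent=None, deprel=None):
        self.form = form
        self.parent = parent
        self.deprel = deprel


def flatten_slow(root, descendants, deprel='root'):
    """Attach every node directly to the root and reset its deprel."""
    for node in descendants:
        node.parent = root
        node.deprel = deprel


root = ToyNode('<ROOT>')
eats = ToyNode('eats', parent=root, deprel='root')
john = ToyNode('John', parent=eats, deprel='nsubj')

flatten_slow(root, [john, eats])
print(john.parent is root, john.deprel)  # True root
```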
- get_sentence(if_missing='detokenize')[source]¶
Return either the stored root.text or (if None) root.compute_text().
Args: if_missing: What to do if root.text is None? (default=detokenize)
detokenize: use root.compute_text() to compute the sentence.
empty: return an empty string
warn_detokenize, warn_empty: same as detokenize or empty, but additionally emit a warning via logging.warning()
fatal: raise an exception
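The if_missing options can be read as a small dispatch. The sketch below re-creates that logic in plain Python under stated assumptions: get_sentence_sketch is a hypothetical function, the space-joined words are only a stand-in for root.compute_text(), and ValueError is an arbitrary choice because the documentation only says that an exception is raised.

```python
import logging


def get_sentence_sketch(text, words, if_missing='detokenize'):
    """Illustrative re-creation of the documented if_missing handling."""
    if text is not None:
        return text
    if if_missing.startswith('warn_'):
        # warn_detokenize / warn_empty: warn, then behave like the base option.
        logging.warning('Sentence text is missing')
        if_missing = if_missing[len('warn_'):]
    if if_missing == 'detokenize':
        # Stand-in for root.compute_text(): a naive join on spaces.
        return ' '.join(words)
    if if_missing == 'empty':
        return ''
    if if_missing == 'fatal':
        # Assumption: the real exception type may differ.
        raise ValueError('Sentence text is missing')
    raise ValueError('Unknown if_missing value: ' + if_missing)


print(get_sentence_sketch('Hello world.', ['Hello', 'world', '.']))  # Hello world.
print(get_sentence_sketch(None, ['Hello', 'world', '.']))            # Hello world .
```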
- is_descendant_of(node)[source]¶
Is the current node a descendant of the node given as argument?
This root-specific implementation always returns False.
- json¶
- property multiword_tokens¶
Return a list of all multi-word tokens in this tree.
- newdoc¶
- newpar¶
- property parent¶
Return dependency parent (head) node.
This root-specific implementation always returns None.
- remove(children=None)[source]¶
Remove the whole tree from its bundle.
Args: children: a string specifying what to do if the root has any children.
The default (None) is to delete them (and all their descendants). warn means to issue a warning.
- property sent_id¶
ID of this tree, stored in the sent_id comment in CoNLL-U.
- shift(reference_node, after=0, move_subtree=0, reference_subtree=0)[source]¶
Attempting to change the word order of the root raises an exception.
- text¶
- property token_descendants¶
Return all tokens (one-word or multi-word) in the tree.
That is, return a list of core.Node and core.MWT instances whose forms make up the raw sentence. Nodes that are part of multi-word tokens are skipped.
For example, with:
    1-2 vámonos _
    1 vamos ir
    2 nos nosotros
    3-4 al _
    3 a a
    4 el el
    5 mar mar
[n.form for n in root.token_descendants] will return ['vámonos', 'al', 'mar'].
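The selection rule can be illustrated without udapi. In the sketch below, token_forms is a hypothetical helper that works on plain tuples rather than Node and MWT objects: each multi-word token is emitted once, at its first word, and the words it covers are skipped.

```python
def token_forms(words, mwts):
    """Return surface token forms.

    words: list of (ord, form) pairs for the syntactic words
    mwts:  list of (first_ord, last_ord, form) triples for multi-word tokens
    """
    # Map each covered word ord to the MWT that covers it.
    covered = {}
    for first, last, form in mwts:
        for word_ord in range(first, last + 1):
            covered[word_ord] = (first, form)

    tokens = []
    for word_ord, form in words:
        if word_ord in covered:
            first, mwt_form = covered[word_ord]
            # Emit the MWT form once, at its first word; skip the rest.
            if word_ord == first:
                tokens.append(mwt_form)
        else:
            tokens.append(form)
    return tokens


words = [(1, 'vamos'), (2, 'nos'), (3, 'a'), (4, 'el'), (5, 'mar')]
mwts = [(1, 2, 'vámonos'), (3, 4, 'al')]
print(token_forms(words, mwts))  # ['vámonos', 'al', 'mar']
```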
- property zone¶
Return zone (string label) of this tree.