udapi.block.read.addtext module

read.AddText is a reader for adding word-wrapped plain text to existing trees.

class udapi.block.read.addtext.AddText(zone='', add_newpar=True, **kwargs)

Bases: BaseReader

A reader for plain-text files whose content is added to existing trees.

For example, LitBank CoNLL files are segmented into sentences and tokenized, but the SpaceAfter attributes are missing. To restore them, we need to load the original (raw) texts, which are neither tokenized nor segmented, only word-wrapped (to 70 characters per line).

Args: add_newpar: add newpar CoNLL-U annotations to the sentence following each empty line (and at the beginning of the file)
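The alignment idea described above can be illustrated with a small standalone sketch. It does not use udapi and is not the library's implementation; the function name `align_tokens` and the returned dictionary layout are hypothetical, chosen only for illustration. It assumes every token form appears verbatim in the raw text and in the same order.

```python
def align_tokens(raw_text, sentences, add_newpar=True):
    """Align tokenized sentences with word-wrapped raw text.

    Hypothetical helper (not part of udapi). `sentences` is a list of
    lists of token forms. Returns one dict per sentence with:
      "newpar": True if the sentence starts a new paragraph,
      "tokens": list of (form, space_after) pairs.
    """
    pos, end = 0, len(raw_text)
    # Skip whitespace at the very beginning of the file.
    while pos < end and raw_text[pos].isspace():
        pos += 1

    starts_paragraph = True  # the beginning of the file opens a paragraph
    result = []
    for forms in sentences:
        newpar = add_newpar and starts_paragraph
        starts_paragraph = False
        aligned = []
        for form in forms:
            if not raw_text.startswith(form, pos):
                raise ValueError(
                    f"token {form!r} does not match text {raw_text[pos:pos + 20]!r}"
                )
            pos += len(form)
            # Consume the whitespace (spaces and line breaks) after the token.
            gap_start = pos
            while pos < end and raw_text[pos].isspace():
                pos += 1
            gap = raw_text[gap_start:pos]
            # No whitespace before the next token corresponds to SpaceAfter=No.
            # The last token of the text is treated as followed by a space.
            space_after = bool(gap) or pos >= end
            # Two or more newlines in the gap mean an empty line was crossed.
            starts_paragraph = gap.count("\n") >= 2
            aligned.append((form, space_after))
        result.append({"newpar": newpar, "tokens": aligned})
    return result


if __name__ == "__main__":
    raw = "Hello, world!\nThis is\nwrapped text.\n\nNew paragraph."
    sents = [
        ["Hello", ",", "world", "!"],
        ["This", "is", "wrapped", "text", "."],
        ["New", "paragraph", "."],
    ]
    for sent in align_tokens(raw, sents):
        print(sent)
```

A single line break inside a paragraph counts as ordinary whitespace here, so word-wrapping does not affect the result; only an empty line triggers the paragraph flag for the next sentence.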

static is_multizone_reader()

Can this reader read bundles that contain more than one zone?

This implementation always returns False.

process_document(document)

Process a UD document.