udapi.block.read.text module
The Text class is a reader for word-wrapped plain-text files.
- class udapi.block.read.text.Text(rstrip='\r\n ', empty_line='new_sentence', **kwargs)
Bases: BaseReader
A reader for plain-text files with sentences on one or more lines.
Sentences are separated by one or more empty lines. Newlines within sentences are substituted by a space.
Args:
- rstrip: a set of characters to be stripped from the end of each line. Default='\r\n '. You can use rstrip='\n' if you want to preserve any space or '\r' (Carriage Return) at the end of a line, so that udpipe.Base keeps these characters in SpacesAfter. As most blocks do not expect whitespace other than a space to appear in the processed text, use this feature at your own risk.
- empty_line: how empty lines are handled. The default 'new_sentence' preserves the current behaviour (empty lines mark sentence boundaries). Use 'keep' to read the entire file content into a single sentence (tree), including empty lines. Use 'newpar' to behave like 'new_sentence' but also set root.newpar = True on each sentence.
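The behaviour described above can be illustrated with a small standalone sketch. This is not udapi's implementation: the function name `read_plain_text` and its return type (a list of `(text, newpar)` pairs) are hypothetical, chosen only to mirror the documented `rstrip` and `empty_line` arguments. In particular, how 'keep' treats line endings inside the file is an assumption here (the content is kept verbatim and only its end is stripped).

```python
def read_plain_text(text, rstrip='\r\n ', empty_line='new_sentence'):
    """Hypothetical sketch of the documented reader behaviour.

    Returns a list of (sentence_text, newpar) pairs; it does not build udapi trees.
    """
    if empty_line not in ('new_sentence', 'keep', 'newpar'):
        raise ValueError(f"unknown empty_line mode: {empty_line!r}")

    if empty_line == 'keep':
        # Assumption: the whole content becomes one sentence, inner newlines
        # and empty lines are left untouched, only the end is stripped.
        content = text.rstrip(rstrip)
        return [(content, False)] if content else []

    newpar = (empty_line == 'newpar')
    sentences = []
    current = []  # stripped lines of the sentence being collected
    for line in text.splitlines(keepends=True):
        if line.strip('\r\n') == '':
            # An empty line closes the current sentence (if there is one);
            # runs of several empty lines act as a single separator.
            if current:
                sentences.append((' '.join(current), newpar))
                current = []
        else:
            # Strip the configured characters from the end of the line.
            current.append(line.rstrip(rstrip))
    if current:
        sentences.append((' '.join(current), newpar))
    return sentences


sample = "Hello \nworld.\n\n\nSecond one.\n"
for mode in ('new_sentence', 'newpar', 'keep'):
    print(mode, read_plain_text(sample, empty_line=mode))
# With rstrip='\n' the space at the end of "Hello " is preserved.
print(read_plain_text(sample, rstrip='\n'))
```

With the default `rstrip`, the wrapped lines "Hello " and "world." are joined with a single space. Passing `rstrip='\n'` keeps the space that ended the first line, so the joined text contains two spaces at that position, which is the kind of extra whitespace the warning about rstrip refers to.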