udapi.block.tokenize package

Submodules

udapi.block.tokenize.onwhitespace module

Block tokenize.OnWhitespace

class udapi.block.tokenize.onwhitespace.OnWhitespace(zones='all')

Bases: udapi.core.block.Block

Base tokenizer: splits on whitespace and fills SpaceAfter=No.

process_tree(root)

Process a UD tree.

static tokenize_sentence(string)

A method to be overridden in subclasses.
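As a rough illustration of the behaviour described for this class (a standalone sketch, not udapi's actual source), the code below splits a sentence on whitespace and then works out, for each token, whether whitespace follows it in the original text. A token with no following whitespace is the case that the SpaceAfter=No attribute marks in CoNLL-U. The function `space_after_flags` is a hypothetical helper written for this example, and treating the end of the text as "space after" is an assumption.

```python
def tokenize_sentence(string):
    """Split on any run of whitespace (stand-in for the static method above)."""
    return string.split()


def space_after_flags(text, tokens):
    """Return one boolean per token: True if whitespace follows it in `text`.

    Hypothetical helper for this sketch. The end of the text is treated as
    "space after" (an assumption made here, not taken from udapi).
    """
    flags = []
    pos = 0
    for token in tokens:
        # Locate the token at or after the previous token's end;
        # str.index raises ValueError if the token is not found.
        start = text.index(token, pos)
        pos = start + len(token)
        flags.append(pos >= len(text) or text[pos].isspace())
    return flags


# Pure whitespace splitting: every token is followed by a space or the end.
tokens = tokenize_sentence("Hello ,  world!")
print(tokens)                                        # ['Hello', ',', 'world!']
print(space_after_flags("Hello ,  world!", tokens))  # [True, True, True]

# A finer tokenization (as a subclass might produce) yields False flags,
# which is where SpaceAfter=No would be recorded.
print(space_after_flags("Hello, world!", ["Hello", ",", "world", "!"]))
# [False, True, False, True]
```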

udapi.block.tokenize.simple module

Block tokenize.Simple

class udapi.block.tokenize.simple.Simple(zones='all')

Bases: udapi.block.tokenize.onwhitespace.OnWhitespace

Simple tokenizer: splits on whitespace and punctuation and fills SpaceAfter=No.

static tokenize_sentence(string)

A method to be overridden in subclasses.
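The override pattern implied here can be sketched without udapi installed. In the example below, `WhitespaceTokenizer` and `PunctuationTokenizer` are hypothetical stand-ins for the two classes on this page, and the regular expression is an assumed way to split on whitespace and punctuation, not necessarily the pattern udapi uses.

```python
import re


class WhitespaceTokenizer:
    """Hypothetical stand-in for a base tokenizer that splits on whitespace."""

    @staticmethod
    def tokenize_sentence(string):
        return string.split()

    def tokenize(self, text):
        # Calls whichever tokenize_sentence the concrete class provides.
        return self.tokenize_sentence(text)


class PunctuationTokenizer(WhitespaceTokenizer):
    """Hypothetical subclass: overrides only the splitting rule."""

    @staticmethod
    def tokenize_sentence(string):
        # Assumed pattern: a run of word characters, or a single character
        # that is neither a word character nor whitespace.
        return re.findall(r"\w+|[^\w\s]", string)


print(WhitespaceTokenizer().tokenize("Hello, world!"))   # ['Hello,', 'world!']
print(PunctuationTokenizer().tokenize("Hello, world!"))  # ['Hello', ',', 'world', '!']
```

Keeping the splitting rule in a single static method means a subclass only has to replace that one method, while the surrounding logic that builds nodes and sets SpaceAfter=No can stay in the base class.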
