udapi.block.tokenize.simple module

Block tokenize.Simple

class udapi.block.tokenize.simple.Simple(keep_spaces=False, **kwargs)

Bases: OnWhitespace

Simple tokenizer that splits on whitespace and punctuation and marks tokens not followed by a space with SpaceAfter=No.

static tokenize_sentence(string)

A method to be overridden in subclasses.
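
The behaviour described for this block (split on whitespace and punctuation, then record SpaceAfter=No) can be illustrated with a small standalone sketch. It is not udapi's implementation: the regular expression, the helper names tokenize_with_space_after and misc_value, and the choice to treat the final token as if a space followed it are all assumptions made for illustration.

```python
import re

# Hypothetical pattern (an assumption, not taken from udapi): a token is
# either a run of word characters or one non-word, non-space character.
TOKEN_RE = re.compile(r"\w+|[^\w\s]")


def tokenize_with_space_after(text):
    """Return a list of (form, space_after) pairs for the tokens in text."""
    pairs = []
    for match in TOKEN_RE.finditer(text):
        end = match.end()
        # Assumption: the last token is treated as followed by a space,
        # so it never receives SpaceAfter=No in this sketch.
        space_after = end >= len(text) or text[end].isspace()
        pairs.append((match.group(), space_after))
    return pairs


def misc_value(space_after):
    """Map the flag to a CoNLL-U MISC value ('_' means empty)."""
    return "_" if space_after else "SpaceAfter=No"


if __name__ == "__main__":
    for form, space_after in tokenize_with_space_after("Hello, world!"):
        print(form, misc_value(space_after), sep="\t")
```

In this sketch, "Hello" and "world" get SpaceAfter=No because a punctuation mark follows them directly, while "," and "!" do not.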