udapi.block.ud.de.addmwt module
Block ud.de.AddMwt for heuristic detection of German contractions.
According to the UD guidelines, contractions such as “am” = “an dem” should be annotated using multi-word tokens.
Note that this block should be used only for converting existing CoNLL-U files. Ideally, a tokenizer should have already split the multi-word tokens (MWTs).
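To make the idea of splitting a contraction into a multi-word token concrete, here is a minimal standalone sketch in plain Python. It is not the udapi implementation and does not use the udapi API: the names `CONTRACTIONS`, `split_contraction`, and `mwt_lines` are hypothetical, the table covers only a handful of contractions, and all annotation columns are left as `_`. It only shows how a surface form such as “am” could be represented as a CoNLL-U range line followed by its syntactic words.

```python
# Hypothetical illustration only; this is NOT how udapi implements ud.de.AddMwt.
# A few common German preposition+article contractions and their parts.
CONTRACTIONS = {
    "am": ("an", "dem"),
    "im": ("in", "dem"),
    "ins": ("in", "das"),
    "beim": ("bei", "dem"),
    "vom": ("von", "dem"),
    "zum": ("zu", "dem"),
    "zur": ("zu", "der"),
}


def split_contraction(form):
    """Return the (lowercase) syntactic words for a known contraction, else None."""
    return CONTRACTIONS.get(form.lower())


def mwt_lines(form, start_id):
    """Build CoNLL-U lines (10 tab-separated columns) for a single surface token.

    A known contraction yields a range line (e.g. "1-2") carrying the surface
    form, followed by one line per syntactic word. Any other token yields a
    single ordinary line. Columns other than ID and FORM are left as "_".
    """
    words = split_contraction(form)
    if words is None:
        return ["\t".join([str(start_id), form] + ["_"] * 8)]
    end_id = start_id + len(words) - 1
    lines = ["\t".join([f"{start_id}-{end_id}", form] + ["_"] * 8)]
    for offset, word in enumerate(words):
        lines.append("\t".join([str(start_id + offset), word] + ["_"] * 8))
    return lines


if __name__ == "__main__":
    # Print the range line and the two word lines for "am" starting at ID 1.
    for line in mwt_lines("am", 1):
        print(line)
```

A pure lookup like this ignores context and capitalization; a real heuristic would also need to fill in lemmas, tags, and dependency relations for the new words.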