udapi.block.ud.ca.addmwt module
Block ud.ca.AddMwt for heuristic detection of Catalan contractions.
According to the UD guidelines, contractions such as “del” = “de el” should be annotated using multi-word tokens.
Note that this block should be used only for converting legacy CoNLL-U files in which contractions are still single tokens. Ideally, a tokenizer should have already split the MWTs.
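To illustrate what annotating a contraction as a multi-word token means, here is a minimal standalone sketch in plain Python. It does not use the udapi API and is not the implementation of ud.ca.AddMwt: the `CONTRACTIONS` table and the `split_contractions` function are hypothetical names introduced for this example, and the lookup is purely form-based, ignoring part of speech and any ambiguity the real heuristics may handle. It produces CoNLL-U-style (ID, FORM) pairs, where a range ID such as `3-4` carries the surface form and the following rows carry the syntactic words.

```python
# Hypothetical lookup table of common Catalan preposition + article
# contractions; the rules used by the actual block may differ.
CONTRACTIONS = {
    "del": ("de", "el"),
    "dels": ("de", "els"),
    "al": ("a", "el"),
    "als": ("a", "els"),
    "pel": ("per", "el"),
    "pels": ("per", "els"),
}


def split_contractions(tokens):
    """Return CoNLL-U-style (ID, FORM) rows for a list of surface tokens.

    Each contraction yields one range row (e.g. "3-4") holding the surface
    form, followed by one row per syntactic word.
    """
    rows = []
    next_id = 1
    for form in tokens:
        parts = CONTRACTIONS.get(form.lower())
        if parts is None:
            # Ordinary token: one row, one ID.
            rows.append((str(next_id), form))
            next_id += 1
            continue
        # Multi-word token: the range row keeps the original surface form.
        last_id = next_id + len(parts) - 1
        rows.append((f"{next_id}-{last_id}", form))
        # Simplification: the parts are emitted in lowercase even if the
        # surface form was capitalized.
        for part in parts:
            rows.append((str(next_id), part))
            next_id += 1
    return rows


if __name__ == "__main__":
    for token_id, form in split_contractions(["La", "casa", "del", "poble"]):
        print(f"{token_id}\t{form}")
```

Here the surface token “del” keeps its form on the range row `3-4`, while “de” and “el” receive the word IDs 3 and 4, so the following token is renumbered to 5.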