udapi.block.ud.id.addmwt module

Block ud.id.AddMwt splits off the clitic “-nya” in Indonesian. The input must be preprocessed with MorphInd, whose output is stored in the MISC attribute MorphInd.

class udapi.block.ud.id.addmwt.AddMwt(zones='all', if_empty_tree='process', **kwargs)

Bases: udapi.block.ud.addmwt.AddMwt

Detect and mark multiword tokens (MWTs): split them into words and add the words to the tree.

multiword_analysis(node)

Return a dict with MWT info or None if node does not represent a multiword token.
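
The return contract described above (a dict for a multiword token, None otherwise) can be sketched in plain Python without importing udapi. This is only an illustration: the FakeNode class, the sketch_multiword_analysis function, the 'form' key with space-separated parts, and the bare suffix check are all assumptions, not the block's actual logic.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FakeNode:
    """Hypothetical stand-in for a udapi node; only the fields used below."""
    form: str
    misc: dict = field(default_factory=dict)


def sketch_multiword_analysis(node: FakeNode) -> Optional[dict]:
    """Return a dict describing the split, or None if there is nothing to split."""
    lowered = node.form.lower()
    # Assumption: a token is a candidate only if it ends in "nya"
    # and something is left over as the host word.
    if not lowered.endswith("nya") or len(lowered) <= 3:
        return None
    stem = node.form[:-3]
    # Assumption: the parts are given as one space-separated string.
    return {"form": f"{stem} nya"}


print(sketch_multiword_analysis(FakeNode("bukunya")))
print(sketch_multiword_analysis(FakeNode("buku")))
```

A suffix check alone would wrongly split words such as "hanya" or "tanya", which merely end in the same letters; this is presumably why the block relies on the MorphInd analysis rather than on the surface form.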

postprocess_mwt(mwt)

Distribute the MorphInd analysis to the two parts so that we can later use it to fix the lemmas of verbs.
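
One way to picture this distribution step is the standalone sketch below. It assumes that a MorphInd analysis is a string wrapped in ^ and $ in which + separates morphemes and the clitic is the last segment; the sample string, the function name sketch_distribute_morphind, and the splitting rule are hypothetical and are not taken from the block's source.

```python
from typing import Tuple


def sketch_distribute_morphind(analysis: str) -> Tuple[str, str]:
    """Split one MorphInd-style string into a host part and a clitic part."""
    core = analysis.strip()
    # Assumption: the whole analysis is wrapped in ^ ... $.
    if core.startswith("^"):
        core = core[1:]
    if core.endswith("$"):
        core = core[:-1]
    # Assumption: the clitic is the segment after the last '+'.
    host, sep, clitic = core.rpartition("+")
    if not sep:
        # No '+' found: nothing to distribute, keep everything on the host.
        return f"^{core}$", ""
    return f"^{host}$", f"^{clitic}$"


# Hypothetical analysis of a verb with a prefix and the clitic attached.
host_part, clitic_part = sketch_distribute_morphind("^meN+tulis<v>_VSA+dia<p>_PS3$")
print(host_part)
print(clitic_part)
```

Splitting at the last + rather than the first keeps any prefix morphemes with the host word, so the verb's own analysis stays intact for a later lemma fix.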