udapi.block.corefud.fixentityacrossnewdoc module

class udapi.block.corefud.fixentityacrossnewdoc.FixEntityAcrossNewdoc(zones='all', if_empty_tree='process', **kwargs)

Bases: Block

Fix the error reported by validate.py --coref: “[L6 Coref entity-across-newdoc] Same entity id should not occur in multiple documents” by making the entity IDs (eid) unique in each newdoc document.

This block relies on Udapi’s support for loading GUM-like GRP document-wide IDs, which keeps the implementation simple but makes it unnecessarily slow. After applying this block, the IDs of all entities are prefixed with the document number, e.g. “e45” in the 12th document becomes “d12.e45”. If you prefer simple eids, apply corefud.IndexClusters afterwards.
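The renaming scheme described above can be illustrated with a minimal sketch in plain Python. It does not use Udapi and is not the block's actual implementation. The helper name prefix_eids and the list-of-lists input are hypothetical, chosen only to show how a shared eid such as “e45” becomes distinct once each document's number (counted from 1) is prepended.

```python
from typing import List


def prefix_eids(docs: List[List[str]]) -> List[List[str]]:
    """Prefix every eid with 'd<N>.', where N is the 1-based document number.

    Hypothetical helper for illustration only; `docs` holds one list of
    entity IDs per newdoc document, in document order.
    """
    return [
        [f"d{number}.{eid}" for eid in eids]
        for number, eids in enumerate(docs, start=1)
    ]


# Two documents that both use the entity ID "e45".
docs = [["e1", "e45"], ["e45", "e2"]]
renamed = prefix_eids(docs)
print(renamed)  # [['d1.e1', 'd1.e45'], ['d2.e45', 'd2.e2']]
```

After the renaming, no eid occurs in more than one document, which is the condition the validator checks.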

process_document(doc)

Process a UD document.