udapi.block.read.conll2012 module

Conll2012 is a reader block for coreference annotation in the CoNLL-2012 format.

So far, this implementation has been tested only on the LitBank files (and briefly on the Portuguese Corref-PT and Summ-it++v2 corpora). LitBank does not use most of the columns, so the implementation should be improved to handle other types of CoNLL-2012 files.
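For orientation, the coreference column of a CoNLL-2012 file marks mention boundaries with bracket notation: `(1` opens a mention of cluster 1, `1)` closes it, `(1)` is a single-token mention, and `|` separates several marks on one token. The sketch below is a standalone illustration of that notation, not the code this block uses; the function name `coref_spans` is hypothetical, and it assumes numeric cluster ids and `-` or `_` for tokens outside any mention.

```python
import re


def coref_spans(coref_column):
    """Convert per-token coref cells into (cluster_id, start, end) spans.

    Hypothetical helper for illustration only. Token indices are 0-based
    and `end` is inclusive.
    """
    open_starts = {}  # cluster id -> stack of start indices of open mentions
    spans = []
    for index, cell in enumerate(coref_column):
        if cell in ('-', '_'):  # assumed markers for "no coreference here"
            continue
        for part in cell.split('|'):
            match = re.fullmatch(r'(\()?(\d+)(\))?', part)
            if match is None or not (match.group(1) or match.group(3)):
                raise ValueError(f'unexpected coref value: {part!r}')
            opens, cluster, closes = match.groups()
            if opens:
                open_starts.setdefault(cluster, []).append(index)
            if closes:
                # Close the most recently opened mention of this cluster.
                start = open_starts[cluster].pop()
                spans.append((cluster, start, index))
    return spans


column = ['(1', '1)', '-', '(2)', '(1|(3', '3)', '1)']
spans = coref_spans(column)
```

Spans are reported in the order in which they close, so nested mentions appear before the mentions that contain them.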

class udapi.block.read.conll2012.Conll2012(attributes='docname,_,ord,form,_,_,_,_,_,_,_,_,coref', emptyval='_', **kwargs)[source]

Bases: Conllu

A reader for CoNLL-2012 files.
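The default `attributes` value names the columns of each token line, with `_` standing for columns that carry no attribute. As a rough illustration of how such a specification could be applied to one line, here is a standalone sketch; it is not udapi's implementation, the helper `map_columns` is hypothetical, and it assumes whitespace-separated columns and that columns named `_` are simply skipped.

```python
DEFAULT_ATTRIBUTES = 'docname,_,ord,form,_,_,_,_,_,_,_,_,coref'


def map_columns(line, attributes=DEFAULT_ATTRIBUTES):
    """Pair the columns of one token line with the attribute names.

    Hypothetical helper for illustration only. Columns whose attribute
    name is '_' are ignored (an assumption made for this sketch).
    """
    names = attributes.split(',')
    fields = line.split()
    if len(fields) != len(names):
        raise ValueError(f'expected {len(names)} columns, got {len(fields)}')
    return {name: value for name, value in zip(names, fields) if name != '_'}


# A made-up 13-column token line; real files may use more or fewer columns.
token = map_columns('doc1 0 0 Alice _ _ _ _ _ _ _ _ (1)')
```

With the default specification, only the document name, the word index, the word form, and the coreference mark survive the mapping.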

parse_comment_line(line, root)[source]

Parse one line of CoNLL-U and fill sent_id, text, newpar, newdoc in root.

parse_node_line(line, root, nodes)[source]

read_tree()[source]

Load one (more) tree from self.filehandle and return its root.

This method must be overridden in all readers. Usually it is the only method that needs to be implemented. The implementation in this base class raises NotImplementedError.

read_tree_from_lines(lines)[source]

read_trees()[source]

Load all trees from self.filehandle and return a list of their roots.

This method may be overridden in a reader if a faster alternative to read_tree() is needed. The implementation in this base class raises NotImplementedError.
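The override contract described for read_tree() and read_trees() can be sketched with stand-in classes. Everything below is hypothetical and independent of udapi: `SketchBaseReader` only mimics a base class whose methods raise NotImplementedError, and `BlankLineReader` treats each blank-line-separated block of lines as one "tree" instead of building real tree objects.

```python
import io


class SketchBaseReader:
    """Hypothetical stand-in for a reader base class (not udapi's code)."""

    def __init__(self, filehandle):
        self.filehandle = filehandle

    def read_tree(self):
        raise NotImplementedError('subclasses must implement read_tree()')

    def read_trees(self):
        raise NotImplementedError('optional bulk alternative to read_tree()')


class BlankLineReader(SketchBaseReader):
    """Returns each blank-line-separated block as a list of lines."""

    def read_tree(self):
        # Load one more block from self.filehandle; None signals end of input.
        lines = []
        for line in self.filehandle:
            line = line.rstrip('\n')
            if line:
                lines.append(line)
            elif lines:
                break
        return lines or None

    def read_trees(self):
        # Bulk alternative: read everything at once and split into blocks.
        text = self.filehandle.read()
        return [block.splitlines() for block in text.split('\n\n') if block.strip()]


one_by_one = BlankLineReader(io.StringIO('a\nb\n\nc\n'))
first = one_by_one.read_tree()
second = one_by_one.read_tree()
third = one_by_one.read_tree()

all_at_once = BlankLineReader(io.StringIO('a\nb\n\nc\n')).read_trees()
```

In this sketch the incremental and the bulk method yield the same blocks; the bulk variant merely avoids per-call overhead.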