udapi.tool.udpipeonline module

Wrapper for UDPipe online web service.

class udapi.tool.udpipeonline.UDPipeOnline(model, server='https://lindat.mff.cuni.cz/services/udpipe/api')

Bases: object

Wrapper for UDPipe online web service.

list_models()
perform_request(params, method='process')
perform_request_urlencoded(params, method='process')

Perform a request using application/x-www-form-urlencoded to preserve LF newlines.

This avoids the CRLF normalization done by the email MIME serializer, ensuring that the content of the ‘data’ field retains Unix LF ("\n") exactly as provided.
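The LF-preserving behaviour of form-urlencoded bodies can be illustrated with the standard library alone. The following is a minimal sketch, not the wrapper's actual implementation: the helper name build_urlencoded_request, the model name "english", and the assumption that the method name is appended to the server URL are all illustrative. The request is only built, never sent.

```python
from urllib.parse import parse_qs, urlencode
from urllib.request import Request

# Default server taken from the class signature above.
SERVER = "https://lindat.mff.cuni.cz/services/udpipe/api"


def build_urlencoded_request(params, method="process", server=SERVER):
    """Build (but do not send) a form-urlencoded POST request.

    Hypothetical helper for illustration only.
    """
    # urlencode percent-encodes "\n" as %0A and never inserts "\r".
    body = urlencode(params).encode("ascii")
    return Request(
        f"{server}/{method}",  # assumption: the method name is appended to the server URL
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


# "english" is a placeholder model name.
request = build_urlencoded_request(
    {"model": "english", "data": "First line.\nSecond line.\n"}
)

# Decoding the body gives back the original text with bare LF newlines.
decoded = parse_qs(request.data.decode("ascii"))
print(decoded["data"][0] == "First line.\nSecond line.\n")
```

Because every newline travels as %0A inside the body, no serializer between the caller and the server has a chance to rewrite it as CRLF.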

process_document(doc, tokenize=True, tag=True, parse=True, resegment=False, ranges=False)

Delete all existing bundles and substitute them with those parsed by UDPipe.

segment_text(text)

Segment the provided text into sentences returned as a Python list.

tag_parse_tree(root, tag=True, parse=True)

Tag (+lemmatize, fill FEATS) and parse a tree (already tokenized).

tokenize_tag_parse_tree(root, resegment=False, tag=True, parse=True, ranges=False)

Tokenize, tag (+lemmatize, fill FEATS) and parse the text stored in root.text.

If resegment=True, the returned list of Udapi trees may contain multiple trees. If ranges=True, node.misc[TokenRange] of each token will contain character-level 0-based ranges, e.g. 0:2.
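A TokenRange value such as 0:2 can be mapped back to the original text with plain string slicing. The sketch below uses two hypothetical helpers, parse_token_range and token_surface, and assumes that the end offset is exclusive (it points at the first character after the token); it works on plain strings and does not call Udapi.

```python
def parse_token_range(value):
    """Split a 'start:end' string into a pair of integers."""
    start, end = value.split(":")
    return int(start), int(end)


def token_surface(text, token_range):
    """Return the substring of text covered by a TokenRange value.

    Assumption: start is 0-based and end is exclusive.
    """
    start, end = parse_token_range(token_range)
    return text[start:end]


text = "Hi there."
print(token_surface(text, "0:2"))  # prints "Hi"
```

In real use, the range string would come from node.misc of a token produced with ranges=True, and text would be the document text that was sent for tokenization.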