udapi.block.eval.f1 module
Block eval.F1 for evaluating differences between sentences with P/R/F1.
eval.F1 zones=en_pred gold_zone=en_gold details=0
prints something like:
predicted = 210
gold = 213
correct = 210
precision = 100.00%
recall = 98.59%
F1 = 99.29%
eval.F1 gold_zone=y attributes=form,upos focus='(?i:an?|the)_DET' details=4
prints something like:
=== Details ===
token    pred  gold  corr    prec     rec      F1
the_DET   711   213   188  26.44%  88.26%  40.69%
The_DET    82    25    19  23.17%  76.00%  35.51%
a_DET       0    62     0   0.00%   0.00%   0.00%
an_DET      0    16     0   0.00%   0.00%   0.00%
=== Totals ===
predicted = 793
gold = 319
correct = 207
precision = 26.10%
recall = 64.89%
F1 = 37.23%
This block finds differences between nodes of trees in two zones
and reports the overall precision, recall and F1.
The two zones are “predicted” (on which this block is applied)
and “gold” (which needs to be specified with the parameter gold_zone).
This block also reports the total number of nodes in the predicted zone
and in the gold zone, and the number of “correct” nodes,
that is, predicted nodes which are also in the gold zone.
By default, two nodes are considered “the same” if they have the same form,
but it is also possible to check other node attributes
(with the parameter attributes).
As usual:
precision = correct / predicted
recall = correct / gold
F1 = 2 * precision * recall / (precision + recall)
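The three formulas above can be checked against the first sample output with a minimal Python sketch. The helper name prf1 is hypothetical and not part of udapi, and returning 0.0 when a denominator is zero is an assumption about how empty input should be handled, not a statement about what the block does.

```python
def prf1(predicted, gold, correct):
    """Return (precision, recall, F1) as fractions between 0 and 1."""
    # Assumption: an empty denominator yields 0.0 instead of raising an error.
    precision = correct / predicted if predicted else 0.0
    recall = correct / gold if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Counts taken from the first sample output (zones=en_pred gold_zone=en_gold).
p, r, f = prf1(predicted=210, gold=213, correct=210)
print(f"precision = {p:.2%}")  # precision = 100.00%
print(f"recall = {r:.2%}")     # recall = 98.59%
print(f"F1 = {f:.2%}")         # F1 = 99.29%
```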
The implementation is based on finding the longest common subsequence (LCS) between the nodes in the two trees. This means that the two zones do not need to be explicitly word-aligned.
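To illustrate why no word alignment is needed, here is a small stand-alone sketch of an LCS computed with textbook dynamic programming. It is not the code udapi uses; the function name lcs and the example sentences are made up for illustration. The length of the common subsequence plays the role of the “correct” count.

```python
def lcs(pred, gold):
    """Return one longest common subsequence of two token lists."""
    n, m = len(pred), len(gold)
    # table[i][j] holds the LCS length of the suffixes pred[i:] and gold[j:].
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if pred[i] == gold[j]:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])
    # Walk the table from the start to recover the matched tokens.
    common, i, j = [], 0, 0
    while i < n and j < m:
        if pred[i] == gold[j]:
            common.append(pred[i])
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            i += 1
        else:
            j += 1
    return common


# Hypothetical sentences: the predicted zone is missing one "the".
pred = "the cat sat on mat".split()
gold = "the cat sat on the mat".split()
common = lcs(pred, gold)
print(len(pred), len(gold), len(common))  # 5 6 5
```

Here predicted = 5, gold = 6 and correct = 5, even though the tokens after the missing word sit at different positions in the two sentences.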
- class udapi.block.eval.f1.F1(gold_zone, attributes='form', focus=None, details=4, **kwargs)
Bases: BaseWriter
Evaluate differences between sentences (in different zones) with P/R/F1.
Args:
- zones: Which zone contains the “predicted” trees? Make sure that you specify just one zone. If you leave the default value “all” and the document contains more zones, the results will be mixed, which is most likely not what you want. Exception: if the document contains just two zones (predicted and gold trees), you can keep the default value “all”, because this block skips the comparison of the gold zone with itself.
- gold_zone: Which zone contains the gold-standard trees?
- attributes: Comma-separated list of attributes that are checked when deciding whether two nodes are equivalent in the LCS.
- focus: Regular expression constraining the tokens we are interested in. If more than one attribute is specified in the attributes parameter, their values are concatenated with an underscore, so focus should reflect that, e.g. attributes=form,upos focus='(a|the)_DET'. For a case-insensitive focus, use e.g. focus='(?i)the' (which is equivalent to focus='[Tt][Hh][Ee]').
- details: Also print detailed statistics for each token (matching the focus). The value of details specifies the number of tokens to include. The tokens are sorted according to the sum of their predicted and gold counts.
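The interplay of attributes, focus and details can be sketched in plain Python. This is only an illustration under stated assumptions: the node list is invented, the whole token is assumed to have to match the expression (re.fullmatch), and the ranking is assumed to be in descending order, as the sample output suggests. None of the variable names below come from udapi.

```python
import re
from collections import Counter

# Hypothetical (form, upos) pairs standing in for node attributes.
nodes = [("The", "DET"), ("cat", "NOUN"), ("an", "DET"), ("the", "PRON"), ("a", "DET")]

# With attributes=form,upos the values are joined with an underscore.
tokens = ["_".join(attrs) for attrs in nodes]

focus = re.compile(r"(?i:an?|the)_DET")
# Assumption: the whole token has to match the expression.
focused = [t for t in tokens if focus.fullmatch(t)]
print(focused)  # ['The_DET', 'an_DET', 'a_DET']

# Per-token counts copied from the detailed sample output above.
pred_counts = Counter({"the_DET": 711, "The_DET": 82})
gold_counts = Counter({"the_DET": 213, "The_DET": 25, "a_DET": 62, "an_DET": 16})

details = 4
# Rank tokens by predicted + gold count (descending) and keep the top `details`.
ranked = sorted(
    pred_counts.keys() | gold_counts.keys(),
    key=lambda t: pred_counts[t] + gold_counts[t],
    reverse=True,
)[:details]
print(ranked)  # ['the_DET', 'The_DET', 'a_DET', 'an_DET']
```

The resulting order matches the rows of the detailed table in the second sample output.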