Meghana Kshirsagar, a Ph.D. student at Carnegie Mellon, pointed out an error in Table 8 of the CL journal paper "Frame-Semantic Parsing". She found that in the experimental setting with gold frames (rows 7 and 8 of that table), SEMAFOR at inference time included the gold spans alongside the automatic candidate spans considered for argument identification, which artificially inflated the precision, recall and F1. The revised numbers are:
Naive decoding: Precision=0.78650 Recall=0.72848 Fscore=0.75638 (row 7)
Beam search decoding: Precision=0.80395 Recall=0.72839 Fscore=0.76431 (row 8)
Thus, the results show the same trend, but the absolute numbers are lower. The same problem affects the numbers in Table 9, where the trends are likewise unchanged.
None of the other results in the article are affected by this error.
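
As a quick sanity check on the revised rows, the reported F-scores can be recomputed from the reported precision and recall. This is a minimal sketch that assumes the F-score is the usual balanced F1 (the harmonic mean of precision and recall); the function name `f1` and the `revised` dictionary are illustrative and not part of SEMAFOR.

```python
def f1(precision: float, recall: float) -> float:
    """Balanced F1: the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F-score), copied from the revised rows above.
revised = {
    "row 7 (naive decoding)": (0.78650, 0.72848, 0.75638),
    "row 8 (beam search decoding)": (0.80395, 0.72839, 0.76431),
}

for name, (p, r, reported) in revised.items():
    computed = f1(p, r)
    # Inputs are rounded to five decimals, so allow a small tolerance.
    assert abs(computed - reported) < 5e-5, name
    print(f"{name}: computed F1 = {computed:.5f}, reported = {reported:.5f}")
```

Both rows agree with the reported values to within rounding under this assumption.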