All numbers are Accuracy / CWS (confidence-weighted score).
BASIC: Threshold was not optimized on the test data.
OPT: Threshold was optimized on the test data.
The results were generated using the alignments in the files /u/nlp/rte/data/byformat/align/simple/*.xml
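As a rough illustration of how the two numbers in each cell and the BASIC/OPT distinction relate, here is a minimal sketch. It assumes each example gets a real-valued entailment score, that the prediction is "yes" when the score is at or above the threshold, and that confidence is the distance of the score from the threshold. None of these details are stated above; the function names and toy data are hypothetical. CWS is computed as in the PASCAL RTE evaluation: rank examples by decreasing confidence and average the running accuracy over all ranks.

```python
from typing import List, Sequence


def accuracy(scores: Sequence[float], labels: Sequence[bool], threshold: float) -> float:
    """Fraction of examples whose thresholded score matches the gold label."""
    return sum((s >= threshold) == y for s, y in zip(scores, labels)) / len(labels)


def cws(scores: Sequence[float], labels: Sequence[bool], threshold: float) -> float:
    """Confidence-weighted score: mean of the running accuracy over the
    examples ranked by decreasing confidence.
    Assumption: confidence = |score - threshold|."""
    ranked = sorted(zip(scores, labels),
                    key=lambda sl: abs(sl[0] - threshold), reverse=True)
    correct = 0
    total = 0.0
    for rank, (s, y) in enumerate(ranked, start=1):
        if (s >= threshold) == y:
            correct += 1
        total += correct / rank
    return total / len(ranked)


def best_threshold(scores: Sequence[float], labels: Sequence[bool]) -> float:
    """Threshold maximizing accuracy on the given data. Trying every observed
    score plus one value above the maximum covers every distinct labeling."""
    candidates: List[float] = sorted(set(scores)) + [max(scores) + 1.0]
    return max(candidates, key=lambda t: accuracy(scores, labels, t))


# Toy data (hypothetical, not taken from the experiments).
train_scores, train_labels = [0.1, 0.4, 0.6, 0.9], [False, False, True, True]
test_scores, test_labels = [0.2, 0.5, 0.7, 0.8], [False, True, True, False]

# BASIC: the threshold comes from the training data only.
basic_t = best_threshold(train_scores, train_labels)
basic = (accuracy(test_scores, test_labels, basic_t),
         cws(test_scores, test_labels, basic_t))

# OPT: the threshold is tuned on the test data itself, so its accuracy is
# an optimistic upper bound on what BASIC can reach.
opt_t = best_threshold(test_scores, test_labels)
opt = (accuracy(test_scores, test_labels, opt_t),
       cws(test_scores, test_labels, opt_t))
```

Under these assumptions OPT accuracy can never fall below BASIC accuracy on the same test set, but CWS carries no such guarantee because the threshold is tuned for accuracy only.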
| Source | Train on all other sources, test on this source (OPT uses one threshold per source) | | Put all sources together, do random 10-fold CV (OPT uses one threshold per fold) | | 15-fold CV within each source only (OPT uses one threshold for the entire source, not per fold) | | Train & test on each source only | |
|---|---|---|---|---|---|---|---|---|
| | BASIC | OPT | BASIC | OPT | BASIC | OPT | BASIC | OPT |
| ALL | 0.582 / 0.616 | 0.648 / 0.676 | 0.59 / 0.598 | 0.642 / 0.669 | -- | -- | -- | -- |
| ATM Dev | 0.472 / 0.66 | 0.527 / 0.68 | 0.416 / 0.425 | 0.5 / 0.512 | 0.5 / 0.281 | 0.583 / 0.372 | 0.722 / 0.759 | 0.805 / 0.823 |
| Brandeis Dev | 0.54 / 0.617 | 0.675 / 0.722 | 0.594 / 0.626 | 0.675 / 0.668 | 0.459 / 0.483 | 0.594 / 0.449 | 0.81 / 0.906 | 0.864 / 0.93 |
| Cycorp Dev | 0.611 / 0.595 | 0.611 / 0.595 | 0.444 / 0.393 | 0.666 / 0.704 | 0.444 / 0.544 | 0.527 / 0.531 | 0.861 / 0.879 | 0.861 / 0.879 |
| LCC-H Dev | 0.6 / 0.578 | 0.657 / 0.712 | 0.657 / 0.76 | 0.6 / 0.666 | 0.4 / 0.434 | 0.571 / 0.618 | 0.885 / 0.916 | 0.885 / 0.916 |
| LCC-M Dev | 0.5 / 0.525 | 0.633 / 0.578 | 0.566 / 0.472 | 0.666 / 0.628 | 0.533 / 0.54 | 0.6 / 0.551 | 0.933 / 0.929 | 0.966 / 0.948 |
| MIT Dev | 0.5 / 0.532 | 0.7 / 0.69 | 0.466 / 0.597 | 0.433 / 0.58 | 0.5 / 0.384 | 0.6 / 0.641 | 0.933 / 0.925 | 0.966 / 0.961 |
| PARC Dev | 0.657 / 0.642 | 0.671 / 0.67 | 0.789 / 0.838 | 0.828 / 0.9 | 0.684 / 0.617 | 0.697 / 0.632 | 0.973 / 0.965 | 0.973 / 0.965 |
| Stanford Dev | 0.566 / 0.544 | 0.633 / 0.635 | 0.533 / 0.507 | 0.5 / 0.476 | 0.266 / 0.261 | 0.566 / 0.681 | 0.733 / 0.745 | 0.766 / 0.765 |
| UTD-ICSI Dev | 0.675 / 0.778 | 0.702 / 0.787 | 0.594 / 0.706 | 0.648 / 0.687 | 0.351 / 0.247 | 0.567 / 0.467 | 0.756 / 0.882 | 0.81 / 0.89 |
| Pascal Dev1 | | | | | | | | |
| Pascal Dev2 | | | | | | | | |
| Pascal Test | | | | | | | | |
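
The evaluation settings in the table differ mainly in how examples are split into training and test sets. The following is a minimal sketch of the first two splits (train on all other sources and test on the held-out one; pool everything and do random k-fold CV). It assumes each example is a dict with a `source` key; that layout, the function names, and the toy data are hypothetical and not taken from the actual experiment code.

```python
import random
from collections import defaultdict
from typing import Dict, Iterator, List, Tuple

Example = Dict[str, object]


def leave_one_source_out(
    examples: List[Example],
) -> Iterator[Tuple[str, List[Example], List[Example]]]:
    """Yield (held_out_source, train, test): train on every other source,
    test on the held-out one."""
    by_source: Dict[str, List[Example]] = defaultdict(list)
    for ex in examples:
        by_source[str(ex["source"])].append(ex)
    for held_out, test in by_source.items():
        train = [ex for src, items in by_source.items()
                 if src != held_out for ex in items]
        yield held_out, train, test


def random_k_fold(
    examples: List[Example], k: int, seed: int = 0,
) -> Iterator[Tuple[List[Example], List[Example]]]:
    """Yield (train, test) pairs for random k-fold cross-validation."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]  # near-equal fold sizes
    for i, test in enumerate(folds):
        train = [ex for j, fold in enumerate(folds) if j != i for ex in fold]
        yield train, test


# Toy data (hypothetical): three sources with four examples each.
data: List[Example] = [{"source": s, "id": f"{s}-{n}"}
                       for s in ("ATM", "MIT", "PARC") for n in range(4)]

loso_splits = list(leave_one_source_out(data))
pooled_splits = list(random_k_fold(data, k=3))
```

Cross-validation within a single source would then amount to calling `random_k_fold` on that source's examples alone, and the last setting (train and test on the same source) uses the same examples for both sets, which is consistent with its scores being the highest in the table.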