Weight Learning Results

All numbers are accuracy / CWS (confidence-weighted score).
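For reference, the two metrics can be sketched as below. This assumes the usual PASCAL RTE definition of CWS (rank the pairs by decreasing confidence, then average the precision at each rank); the function names are illustrative and are not taken from the code that produced these results.

```python
def accuracy(correct):
    """Fraction of pairs judged correctly; `correct` is a list of booleans."""
    return sum(correct) / len(correct)


def cws(correct, confidence):
    """Confidence-weighted score: mean precision-at-rank over all ranks,
    with pairs ranked by decreasing confidence."""
    order = sorted(range(len(correct)), key=lambda i: confidence[i], reverse=True)
    hits = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        hits += correct[i]      # number of correct judgments in the top `rank`
        total += hits / rank    # precision at this rank
    return total / len(order)
```

CWS rewards systems that are right on their most confident judgments, which is why it can sit above or below accuracy for the same run.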

BASIC: The decision threshold was not optimized on the test data.
OPT: The decision threshold was optimized on the test data, so these numbers are optimistic.
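The difference between the two settings can be illustrated with a small sketch. It assumes that a pair is labeled as entailed when its score reaches the threshold, and that the BASIC threshold is chosen on the training data; neither detail is stated here, and the helper names and toy data are hypothetical.

```python
def accuracy_at(scores, gold, threshold):
    """Accuracy when predicting True iff score >= threshold."""
    hits = sum((s >= threshold) == g for s, g in zip(scores, gold))
    return hits / len(scores)


def best_threshold(scores, gold):
    """Accuracy-maximizing threshold among the observed scores,
    plus one value above the maximum (i.e. predict False for everything)."""
    candidates = sorted(set(scores)) + [max(scores) + 1.0]
    return max(candidates, key=lambda t: accuracy_at(scores, gold, t))


# Hypothetical toy data, not taken from the experiments reported here.
train_scores, train_gold = [0.2, 0.4, 0.6, 0.8], [False, False, True, True]
test_scores, test_gold = [0.1, 0.5, 0.7, 0.9], [False, True, True, True]

# BASIC (assumed): threshold chosen without looking at the test labels.
basic = accuracy_at(test_scores, test_gold, best_threshold(train_scores, train_gold))
# OPT: threshold chosen on the test data itself, an upper bound for this scorer.
opt = accuracy_at(test_scores, test_gold, best_threshold(test_scores, test_gold))
```

On this toy data `basic` is 0.75 and `opt` is 1.0. OPT can never be lower than BASIC for the same scores, so the gap between the two columns shows how much is lost to threshold selection alone.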

The results were generated using the alignments in the files /u/nlp/rte/data/byformat/align/simple/*.xml

Source	Train on all other sources, test on this source (OPT uses one threshold per source)		Pool all sources, random 10-fold CV (OPT uses one threshold per fold)		15-fold CV within each source (OPT uses one threshold for the entire source, not one per fold)		Train and test on the same source
Source	BASIC	OPT	BASIC	OPT	BASIC	OPT	BASIC	OPT
ALL	0.582 / 0.616	0.648 / 0.676	0.59 / 0.598	0.642 / 0.669	--	--	--	--
ATM Dev	0.472 / 0.66	0.527 / 0.68	0.416 / 0.425	0.5 / 0.512	0.5 / 0.281	0.583 / 0.372	0.722 / 0.759	0.805 / 0.823
Brandeis Dev	0.54 / 0.617	0.675 / 0.722	0.594 / 0.626	0.675 / 0.668	0.459 / 0.483	0.594 / 0.449	0.81 / 0.906	0.864 / 0.93
Cycorp Dev	0.611 / 0.595	0.611 / 0.595	0.444 / 0.393	0.666 / 0.704	0.444 / 0.544	0.527 / 0.531	0.861 / 0.879	0.861 / 0.879
LCC-H Dev	0.6 / 0.578	0.657 / 0.712	0.657 / 0.76	0.6 / 0.666	0.4 / 0.434	0.571 / 0.618	0.885 / 0.916	0.885 / 0.916
LCC-M Dev	0.5 / 0.525	0.633 / 0.578	0.566 / 0.472	0.666 / 0.628	0.533 / 0.54	0.6 / 0.551	0.933 / 0.929	0.966 / 0.948
MIT Dev	0.5 / 0.532	0.7 / 0.69	0.466 / 0.597	0.433 / 0.58	0.5 / 0.384	0.6 / 0.641	0.933 / 0.925	0.966 / 0.961
PARC Dev	0.657 / 0.642	0.671 / 0.67	0.789 / 0.838	0.828 / 0.9	0.684 / 0.617	0.697 / 0.632	0.973 / 0.965	0.973 / 0.965
Stanford Dev	0.566 / 0.544	0.633 / 0.635	0.533 / 0.507	0.5 / 0.476	0.266 / 0.261	0.566 / 0.681	0.733 / 0.745	0.766 / 0.765
UTD-ICSI Dev	0.675 / 0.778	0.702 / 0.787	0.594 / 0.706	0.648 / 0.687	0.351 / 0.247	0.567 / 0.467	0.756 / 0.882	0.81 / 0.89
Pascal Dev1
Pascal Dev2
Pascal Test
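
The evaluation setups named in the column headers above can be sketched as follows. This is a minimal illustration, assuming each example is a (source, item) tuple; the function names, the fixed seed, and the fold assignment are assumptions rather than the procedure actually used.

```python
import random


def leave_one_source_out(examples):
    """Yield (held_out_source, train, test), holding out each source in turn."""
    sources = sorted({src for src, _ in examples})
    for held_out in sources:
        train = [ex for ex in examples if ex[0] != held_out]
        test = [ex for ex in examples if ex[0] == held_out]
        yield held_out, train, test


def random_k_fold(examples, k=10, seed=0):
    """Yield (train, test) for k random folds over the given examples."""
    shuffled = examples[:]
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]  # near-equal fold sizes
    for i, test in enumerate(folds):
        train = [ex for j, fold in enumerate(folds) if j != i for ex in fold]
        yield train, test
```

Calling `random_k_fold` on the pooled examples with `k=10` corresponds to the second setup, and calling it on a single source's examples with `k=15` corresponds to the third. The last setup trains and tests on the same examples, which is why its numbers are so much higher.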