Thanks for reading our paper and visiting this project page! If you have any questions, feel free to email us.
Dataset:
The Stanford Rare Word (RW) Similarity Dataset can now be downloaded here.
Morphologically-trained word vectors:
Based on the embeddings of Huang et al. (2012) (HSMN+csmRNN): [ embeddings (text) ] [ words (text) ] [ parameters (mat) ].
Based on the embeddings of Collobert et al. (2011) (CW+csmRNN): [ embeddings (text) ] [ words (text) ] [ parameters (mat) ].
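For readers who want to try the vectors, here is a minimal sketch of pairing the words file with the embeddings file and comparing two words by cosine similarity. It assumes a layout that this page does not confirm: one word per line in the words file, and one whitespace-separated vector per line in the embeddings file, in the same order. The function names `load_embeddings` and `cosine` are illustrative only and are not part of our release.

```python
import math
from typing import Dict, Iterable, List


def load_embeddings(word_lines: Iterable[str],
                    vector_lines: Iterable[str]) -> Dict[str, List[float]]:
    """Pair the i-th word with the i-th vector.

    Assumed layout (hypothetical, check the downloaded files):
    one word per line, one whitespace-separated vector per line.
    """
    table: Dict[str, List[float]] = {}
    for word, vector in zip(word_lines, vector_lines):
        word = word.strip()
        if not word:
            continue  # skip blank lines rather than storing an empty key
        table[word] = [float(x) for x in vector.split()]
    return table


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)
```

With the files saved locally under placeholder names such as `words.txt` and `embeddings.txt`, the two open file handles can be passed directly as `word_lines` and `vector_lines`, since Python file objects iterate line by line.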
Note:
| | WS353 | MC | RG | SCWS* | RW |
|---|---|---|---|---|---|
| HSMN+csmRNN | 64.70 | 71.73 | 65.42 | 44.10 | 22.55 |
| CW+csmRNN | 58.49 | 60.84 | 61.19 | 49.31 | 32.06 |
Citation: