RTE Quickstart

The following instructions describe how to have the RTE system annotate a set of textual entailment problems and generate entailment classifications for them.

  1. Make sure your JavaNLP account environment is properly set up. Please refer to the setup section here (TODO: create an "idiot-proof" version of the account setup instructions).
  2. Log onto an available JavaNLP machine.
  3. Either:
    1. Format your problems into a KBE format file (TODO: link to the format). The file MUST end with the extension .kbe.xml. For these instructions, the file is named $NAME.kbe.xml, with $NAME standing in for the base-name of your dataset.
    2. Copy your KBE formatted file into /u/nlp/rte/data/byformat/kbe/.
    or:
    1. Format your problems into an RTE format file (TODO: link to the format). The file MUST end with the extension .xml. For these instructions, the file is named $NAME.xml, with $NAME standing in for the base-name of your dataset.
    2. Copy your RTE formatted file into /u/nlp/rte/data/byformat/rte/.
  4. The current versions of the RTE scripts live in $JAVANLP_HOME/src/edu/stanford/nlp/rte/bin/.
  5. Execute rte-pipeline $NAME. This runs the RTEPipeline class, which annotates the sentences and deposits the results into /u/nlp/rte/data/byformat/info/. The filename will be $NAME.pipeline.info.xml.
  6. Execute rte-infer $NAME.pipeline. Note: the suffix .pipeline must be present.
  7. The following files will be written out:
    /u/nlp/rte/data/byformat/guess/$NAME.pipeline: Entailment classifications, listed by problem ID with a YES|NO result.
    /u/nlp/rte/data/byformat/guess/$NAME.pipeline.guesses.tsv: The same classifications, with feature scores.
    /u/nlp/rte/data/byformat/align/stochastic/$NAME.pipeline.align.xml: Alignments for the problems, along with other information. To render an HTML version, use the stylesheet info_to_html.xml, located at /u/nlp/rte/data/byformat/info/info_to_html.xml.
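
The steps above can be strung together roughly as follows for an RTE-format file. This is only a sketch: the function names (rte_expected_outputs, rte_quickstart, rte_render_html) and the RTE_DATA variable are hypothetical helpers, not part of the RTE scripts. It also assumes that rte-pipeline and rte-infer are executable in the scripts directory, that xsltproc is installed, and that info_to_html.xml is an XSLT stylesheet xsltproc can process.

```shell
# Hypothetical helpers; only the paths and the two script names come from
# the steps above. RTE_DATA can be overridden for testing.
RTE_DATA=${RTE_DATA:-/u/nlp/rte/data/byformat}

# Print the files that steps 5 to 7 should leave behind for dataset name $1.
rte_expected_outputs() {
    printf '%s\n' \
        "$RTE_DATA/info/$1.pipeline.info.xml" \
        "$RTE_DATA/guess/$1.pipeline" \
        "$RTE_DATA/guess/$1.pipeline.guesses.tsv" \
        "$RTE_DATA/align/stochastic/$1.pipeline.align.xml"
}

# Steps 3 to 6 for an RTE-format file such as ./mydata.xml.
# (A KBE file would need the .kbe.xml suffix stripped and the kbe/ directory.)
rte_quickstart() {
    name=$(basename "$1" .xml)
    bin="$JAVANLP_HOME/src/edu/stanford/nlp/rte/bin"
    cp -- "$1" "$RTE_DATA/rte/" || return 1
    # Assumption: the scripts can be invoked directly by path.
    "$bin/rte-pipeline" "$name" || return 1
    "$bin/rte-infer" "$name.pipeline" || return 1
    # Report any expected output file that is missing.
    rte_expected_outputs "$name" | while read -r f; do
        [ -e "$f" ] || echo "missing: $f" >&2
    done
}

# Render the alignment file for dataset name $1 to HTML in the current directory.
rte_render_html() {
    xsltproc "$RTE_DATA/info/info_to_html.xml" \
        "$RTE_DATA/align/stochastic/$1.pipeline.align.xml" > "$1.align.html"
}
```

After sourcing these definitions on a JavaNLP machine, a run would look like `rte_quickstart ./mydata.xml` followed by `rte_render_html mydata`, where mydata is a placeholder for your own $NAME.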

Other random notes

  1. Lexical resource similarity scores are cached. If you change what a lexical resource computes, you must delete its cache so that it is regenerated. At present this is done simply by deleting the resource's cache file, which is in the /u/nlp/rte/cache/ directory.
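
Clearing a cache could look like the sketch below. The helper names and the RTE_CACHE_DIR variable are hypothetical, and the cache file names are not documented here, so list the directory first to find the file belonging to the resource you changed.

```shell
# Cache directory from the note above; can be overridden for testing.
RTE_CACHE_DIR=${RTE_CACHE_DIR:-/u/nlp/rte/cache}

# List the cache files so you can identify the one to delete.
rte_list_caches() {
    ls -l "$RTE_CACHE_DIR"
}

# Delete one cache file by name. Empty names and names containing a slash
# are rejected so nothing outside the cache directory can be removed.
rte_clear_cache() {
    case $1 in
        ''|*/*)
            echo "usage: rte_clear_cache FILE_IN_CACHE_DIR" >&2
            return 2
            ;;
    esac
    rm -- "$RTE_CACHE_DIR/$1"
}
```

The cache is regenerated once the file is gone, so only the single file for the changed resource needs to be removed.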