Turns a text file into trees for use in an RNTN classifier, such as
the trees in the treebank used in the Sentiment project.
The expected input file contains one block per sentence, with blocks
separated by blank lines. The first line of a block has the main label
of the sentence followed by the full sentence. Lines after that first
line but before the blank line are treated as labeled sub-phrases. Each
of these lines should start with the label, followed by the list of
tokens the label applies to. All phrases that do not have their own
label take on the main sentence label. For example:
1 Today is not a good day.
3 good day
3 a good day

(next block starts here)
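The block layout described above can be read with a few lines of code. The following is only a minimal sketch of the file format, not the tool's own reader: the names `LabeledSentence` and `parse_blocks` are hypothetical, the second block in the sample is invented for illustration, and splitting on whitespace is an assumption (a real tokenizer would likely separate the final period from "day").

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class LabeledSentence:
    """One block of the input file (hypothetical container, not a CoreNLP type)."""
    label: str                      # main label of the sentence
    tokens: List[str]               # whitespace-split tokens of the full sentence
    phrases: List[Tuple[str, List[str]]] = field(default_factory=list)


def parse_blocks(text: str) -> List[LabeledSentence]:
    """Split the input into blocks separated by blank lines."""
    blocks: List[LabeledSentence] = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            # A blank line closes the current block.
            current = None
            continue
        label, *tokens = line.split()
        if current is None:
            # First line of a block: main label plus the full sentence.
            current = LabeledSentence(label, tokens)
            blocks.append(current)
        else:
            # Later lines: a label plus the tokens of a sub-phrase.
            current.phrases.append((label, tokens))
    return blocks


sample = """1 Today is not a good day.
3 good day
3 a good day

2 Another sentence here.
"""

for block in parse_blocks(sample):
    print(block.label, block.tokens, block.phrases)
```

Keeping the sub-phrases as token lists rather than character offsets leaves the matching of phrases to parse-tree spans to a later step.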
By default the englishPCFG parser is used. This can be changed
with the -parserModel flag. An input file must also be specified.
If a sentiment model is provided with -sentimentModel, that model
is used to prelabel the sentences. The labels given for any spans in
the input file are then used to adjust the model's labels.
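One way to picture the labeling order is as a priority rule per span: a label given in the input wins, then a label predicted by the sentiment model, then the main sentence label. The sketch below illustrates that reading only. The functions `find_span` and `resolve_labels` are hypothetical, the constituent spans and predicted labels are made up for the example, and the fallback to the main label when a model is present is an assumption rather than documented behavior.

```python
from typing import Dict, List, Optional, Tuple

Span = Tuple[int, int]  # (start, end) token offsets, end exclusive


def find_span(tokens: List[str], phrase: List[str]) -> Optional[Span]:
    """Return the first position where phrase occurs in tokens, if any."""
    n = len(phrase)
    for start in range(len(tokens) - n + 1):
        if tokens[start:start + n] == phrase:
            return (start, start + n)
    return None


def resolve_labels(
    spans: List[Span],
    main_label: str,
    given: Dict[Span, str],
    predicted: Optional[Dict[Span, str]] = None,
) -> Dict[Span, str]:
    """Pick one label per span: given label, else predicted, else main label."""
    predicted = predicted or {}
    resolved = {}
    for span in spans:
        if span in given:
            resolved[span] = given[span]
        elif span in predicted:
            resolved[span] = predicted[span]
        else:
            resolved[span] = main_label
    return resolved


# Pre-tokenized sentence (assumption: the period is its own token).
tokens = ["Today", "is", "not", "a", "good", "day", "."]
# Hypothetical constituent spans, as a parser might produce them.
spans = [(0, 7), (1, 7), (3, 6), (4, 6)]

# Labeled sub-phrases from the example block, mapped to token spans.
given = {}
for label, phrase in [("3", ["good", "day"]), ("3", ["a", "good", "day"])]:
    span = find_span(tokens, phrase)
    if span is not None:
        given[span] = label

without_model = resolve_labels(spans, "1", given)
# Invented predictions standing in for a sentiment model's output.
with_model = resolve_labels(spans, "1", given, predicted={(1, 7): "0", (4, 6): "4"})
print(without_model)
print(with_model)
```

In the second call the given label for "good day" replaces the invented prediction, which mirrors the adjustment step described above.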