edu.stanford.nlp.parser.lexparser
Class EnglishTreebankParserParams.EnglishTrain

java.lang.Object
  extended by edu.stanford.nlp.parser.lexparser.EnglishTreebankParserParams.EnglishTrain
Enclosing class:
EnglishTreebankParserParams

public static class EnglishTreebankParserParams.EnglishTrain
extends Object


Field Summary
static boolean collinsBaseNP
          Mark base NPs _and_ add a NP node if alone, as in Collins
static boolean joinJJ
          Join comparative and superlative adjectives with positive.
static boolean joinPound
          Join pound with dollar.
static boolean markCC
          Mark phrases which are conjunctions.
static boolean markCC2
          Mark phrases which are conjunctions, more like Charniak.
static boolean markContainedVP
           
static int selectiveSplitLevel
          Set the support * KL cutoff level (1-4) -- use of 2 or 3 is good.
static int sisterSplitLevel
          Set the support * KL cutoff level (1-4) for sister splitting -- don't use it, as far as we can tell so far
static boolean splitAux
          Make special tags for forms of BE and HAVE.
static boolean splitAuxBetter
          Make special tags for forms of BE and HAVE.
static boolean splitBaseNP
          Mark base NPs.
static boolean splitCC
          Annotates "and" and "or" specially - the point of this isn't so clear to cdm since these are most conjunctions.
static boolean splitCC2
          Alternative: separate off [Bb]ut and &.
static boolean splitIN
          Annotate prepositions with a S parent (putative subordinating conjunctions) differently from others (real prepositions).
static boolean splitIN2
          Annotate prepositions 3 ways: S* parent, N* parent or rest (generally predicative ADJP, VP).
static boolean splitIN3
          Annotate prepositions 6 ways: real feature engineering.
static boolean splitJJCOMP
          Put a special tag on 'adjectives with complements'.
static boolean splitMoreLess
          Specially mark the comparative/superlative words: less, least, more, most
static boolean splitNOT
          Annotates forms of "not" specially as tag "NOT".
static boolean splitNPADV
          Retain NP-ADV annotation.
static boolean splitNPNNP
          Mark NP-NNP.
static boolean splitNPTMP
          Retain NP-TMP annotation.
static boolean splitNumNP
          Mark "numeric NPs".
static boolean splitPercent
          Mark the nouns that are percent signs.
static boolean splitPoss
          Give a special tag to NPs which are possessive NPs (end in 's)
static boolean splitPPJJ
          A special test for "such" mainly ("such as Fred").
static boolean splitRB
          Split modifier (NP, AdjP) adverbs from others.
static int splitSGapped
          Mark specially S nodes with "gapped" subject (control, raising).
static boolean splitSTag
          Mark S nodes according to verbal tag.
static boolean splitTMP
          Retain all -TMP annotation.
static boolean splitTRJJ
          Put a special tag on 'transitive adjectives' with NP complement, like 'due May 15' -- it also catches 'such' in 'such as NP', which may be a good thing.
static boolean splitVP
          Add (head) tags to VPs
static boolean splitVP2
          Add (head) tags to VPs, but collapse "finite" ones together.
static boolean splitVP3
          Add only verbal tags to VPs, collapsing "finite" ones together.
static boolean unaryDT
          Mark "Intransitive" DT.
static boolean unaryIN
          Mark "Intransitive" IN.
static boolean unaryPRP
          "Intransitive" PRP.
static boolean unaryRB
          Mark "Intransitive" RB.
static boolean vpSubCat
          Pitiful attempt at marking V* preterms with their surface subcat frames.
 
Constructor Summary
EnglishTreebankParserParams.EnglishTrain()
           
 
Method Summary
static void display()
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

splitIN

public static boolean splitIN
Annotate prepositions with a S parent (putative subordinating conjunctions) differently from others (real prepositions). OK.


splitIN2

public static boolean splitIN2
Annotate prepositions 3 ways: S* parent, N* parent or rest (generally predicative ADJP, VP). Better than splitIN. Good.
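
The three-way split described here can be pictured as a function of the parent category. What follows is a minimal sketch, not the parser's implementation: the class name, the method, and the "-S", "-N" and "-X" suffixes are hypothetical, and treating the immediate parent label as the deciding node is an assumption taken from the wording above.

```java
// Hypothetical sketch of a three-way IN split keyed on the parent category.
// The suffix strings are illustrative assumptions, not the parser's actual tags.
final class SplitIn2Sketch {

    static String annotateIN(String tag, String parentCategory) {
        if (!tag.equals("IN")) {
            return tag; // only prepositions are affected
        }
        if (parentCategory.startsWith("S")) {
            return tag + "-S"; // S* parent: putative subordinating conjunction
        }
        if (parentCategory.startsWith("N")) {
            return tag + "-N"; // N* parent
        }
        return tag + "-X"; // everything else (generally predicative ADJP, VP)
    }

    public static void main(String[] args) {
        System.out.println(annotateIN("IN", "SBAR"));
        System.out.println(annotateIN("IN", "NP"));
        System.out.println(annotateIN("IN", "VP"));
    }
}
```

The two-way splitIN behavior would correspond to keeping only the first branch and leaving all other prepositions unmarked.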


splitIN3

public static boolean splitIN3
Annotate prepositions 6 ways: real feature engineering. Great.


splitPercent

public static boolean splitPercent
Mark the nouns that are percent signs. Slightly good.


joinPound

public static boolean joinPound
Join pound with dollar.


joinJJ

public static boolean joinJJ
Join comparative and superlative adjectives with positive.


splitPPJJ

public static boolean splitPPJJ
A special test for "such" mainly ("such as Fred"). A wash, so omit.


splitTRJJ

public static boolean splitTRJJ
Put a special tag on 'transitive adjectives' with NP complement, like 'due May 15' -- it also catches 'such' in 'such as NP', which may be a good thing. Matches 658 times in 2-21 training corpus. Wash.


splitJJCOMP

public static boolean splitJJCOMP
Put a special tag on 'adjectives with complements'. This acts as a general subcat feature for adjectives.


splitMoreLess

public static boolean splitMoreLess
Specially mark the comparative/superlative words: less, least, more, most


unaryDT

public static boolean unaryDT
Mark "Intransitive" DT. Good.


unaryRB

public static boolean unaryRB
Mark "Intransitive" RB. Good.


unaryPRP

public static boolean unaryPRP
"Intransitive" PRP. Wash.


unaryIN

public static boolean unaryIN
Mark "Intransitive" IN. Minutely negative.


splitCC

public static boolean splitCC
Annotates "and" and "or" specially - the point of this isn't so clear to cdm since these are most conjunctions. "But" different?


splitCC2

public static boolean splitCC2
Alternative: separate off [Bb]ut and &. Exclusive with splitCC. Marginally better than splitCC. Good.
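
Because splitCC2 is documented as exclusive with splitCC, a caller may want to guard against enabling both. The sketch below uses a stand-in class so that it compiles without the parser on the classpath. The class name and the checkExclusive() helper are hypothetical; only the two field names come from this page.

```java
// Stand-in for the two documented static flags; not the real EnglishTrain class.
final class CcFlagsSketch {
    static boolean splitCC = false;
    static boolean splitCC2 = false;

    // Hypothetical helper: reject a configuration that turns on both flags.
    static void checkExclusive() {
        if (splitCC && splitCC2) {
            throw new IllegalStateException(
                "splitCC and splitCC2 are exclusive; enable at most one");
        }
    }

    public static void main(String[] args) {
        splitCC2 = true;   // choose the alternative that separates off "but" and "&"
        checkExclusive();  // passes, since splitCC is still false
        System.out.println("flags OK");
    }
}
```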


splitNOT

public static boolean splitNOT
Annotates forms of "not" specially as tag "NOT". BAD


splitRB

public static boolean splitRB
Split modifier (NP, AdjP) adverbs from others. This does nothing if you're already doing tagPA.


splitAux

public static boolean splitAux
Make special tags for forms of BE and HAVE. Positive PCFG effect, but neutral to negative in Combo, and impossible if you use gPA.
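
As a rough illustration of giving forms of BE and HAVE their own tags, the sketch below looks a word up in two small word lists. It is not the parser's code: the class name, the "-BE" and "-HAVE" suffixes, the exact word lists, and the case-insensitive match are all assumptions made for the example. The ambiguous clitic "'s" is deliberately left unmarked.

```java
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch: mark auxiliary forms with illustrative suffixes.
final class AuxTagSketch {
    private static final Set<String> BE_FORMS = Set.of(
        "be", "is", "are", "was", "were", "am", "been", "being", "'m", "'re");
    private static final Set<String> HAVE_FORMS = Set.of(
        "have", "has", "had", "having", "'ve");

    static String annotate(String tag, String word) {
        String w = word.toLowerCase(Locale.ROOT);
        if (BE_FORMS.contains(w)) {
            return tag + "-BE";
        }
        if (HAVE_FORMS.contains(w)) {
            return tag + "-HAVE";
        }
        return tag; // anything else, including the ambiguous "'s", is unchanged
    }

    public static void main(String[] args) {
        System.out.println(annotate("VBZ", "is"));
        System.out.println(annotate("VBD", "had"));
        System.out.println(annotate("VB", "run"));
    }
}
```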


splitAuxBetter

public static boolean splitAuxBetter
Make special tags for forms of BE and HAVE. Add in "s" = "'s" and delve further to disambiguate "'s" as BE or HAVE. Theoretically good, but no practical gains.


vpSubCat

public static boolean vpSubCat
Pitiful attempt at marking V* preterms with their surface subcat frames. Bad so far.


splitVP

public static boolean splitVP
Add (head) tags to VPs


splitVP2

public static boolean splitVP2
Add (head) tags to VPs, but collapse "finite" ones together. Good.


splitVP3

public static boolean splitVP3
Add only verbal tags to VPs, collapsing "finite" ones together.


splitSTag

public static boolean splitSTag
Mark S nodes according to verbal tag. Bad.


markContainedVP

public static boolean markContainedVP

markCC

public static boolean markCC
Mark phrases which are conjunctions. Seems marginal


markCC2

public static boolean markCC2
Mark phrases which are conjunctions, more like Charniak. Not yet implemented. Need to annotate _before_ annotate children! Charniak: np or vp with two or more np/vp children, a comma, cc or conjp, and nothing else.
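
Since this option is described as not yet implemented, the following is only one possible reading of the Charniak condition quoted above, written as a predicate over category labels. The class and method names are hypothetical. Requiring at least one connector and not requiring the children to match the parent's category are both assumptions of this sketch.

```java
import java.util.List;
import java.util.Set;

// Hypothetical sketch of the quoted condition: an NP or VP whose children are
// two or more NP/VP nodes plus commas, CC or CONJP, and nothing else.
final class CharniakConjSketch {
    private static final Set<String> CONNECTORS = Set.of(",", "CC", "CONJP");

    static boolean isConjunctionPhrase(String parent, List<String> children) {
        if (!parent.equals("NP") && !parent.equals("VP")) {
            return false;
        }
        int phrasal = 0;
        int connectors = 0;
        for (String child : children) {
            if (child.equals("NP") || child.equals("VP")) {
                phrasal++;
            } else if (CONNECTORS.contains(child)) {
                connectors++;
            } else {
                return false; // any other child disqualifies the phrase
            }
        }
        return phrasal >= 2 && connectors >= 1;
    }

    public static void main(String[] args) {
        System.out.println(isConjunctionPhrase("NP", List.of("NP", "CC", "NP")));
        System.out.println(isConjunctionPhrase("NP", List.of("NP", "CC", "NP", "PP")));
    }
}
```

The check inspects a node's children by their original labels, which is consistent with the note above that the parent must be annotated before its children are.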


splitSGapped

public static int splitSGapped
Mark specially S nodes with "gapped" subject (control, raising). 1 is the basic version. 2 is a better marking of S nodes with "gapped" subject. 3 seems best on a small training set, but all of these are too similar; 4 can't be differentiated. 5 is done on the tree before empty splitting. (Bad!?)


splitNumNP

public static boolean splitNumNP
Mark "numeric NPs". Probably bad?


splitPoss

public static boolean splitPoss
Give a special tag to NPs which are possessive NPs (end in 's)
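
One simple way to picture the "ends in 's" test is to look at the tag of the NP's last child. The sketch below assumes Penn Treebank style tags, where the possessive ending carries the tag POS. The class and method names are hypothetical, and this is not claimed to be the check the parser performs.

```java
import java.util.List;

// Hypothetical sketch: an NP counts as possessive if its last child is tagged POS.
final class PossNpSketch {

    static boolean isPossessiveNP(String category, List<String> childTags) {
        return category.equals("NP")
            && !childTags.isEmpty()
            && childTags.get(childTags.size() - 1).equals("POS");
    }

    public static void main(String[] args) {
        // Tags for something like "the company 's"
        System.out.println(isPossessiveNP("NP", List.of("DT", "NN", "POS")));
        // Tags for something like "the company"
        System.out.println(isPossessiveNP("NP", List.of("DT", "NN")));
    }
}
```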


splitBaseNP

public static boolean splitBaseNP
Mark base NPs. Good.


collinsBaseNP

public static boolean collinsBaseNP
Mark base NPs _and_ add a NP node if alone, as in Collins


splitNPTMP

public static boolean splitNPTMP
Retain NP-TMP annotation. Good.


splitNPADV

public static boolean splitNPADV
Retain NP-ADV annotation.


splitNPNNP

public static boolean splitNPNNP
Mark NP-NNP. Bad.


splitTMP

public static boolean splitTMP
Retain all -TMP annotation.


selectiveSplitLevel

public static int selectiveSplitLevel
Set the support * KL cutoff level (1-4) -- use of 2 or 3 is good.


sisterSplitLevel

public static int sisterSplitLevel
Set the support * KL cutoff level (1-4) for sister splitting -- don't use it, as far as we can tell so far

Constructor Detail

EnglishTreebankParserParams.EnglishTrain

public EnglishTreebankParserParams.EnglishTrain()
Method Detail

display

public static void display()


Stanford NLP Group