public class SurfacePatternFactory extends PatternFactory
Modifier and Type | Class and Description |
---|---|
static class |
SurfacePatternFactory.Genre |
PatternFactory.PatternType
Modifier and Type | Field and Description |
---|---|
static boolean |
addPatWithoutPOS
Also add patterns without the POS restriction. One of this and
usePOS4Pattern has to be true. |
static int |
maxWindow4Pattern
Consider contexts of at most this many tokens on each side; the combined left
and right context can therefore be up to double this.
|
static int |
minWindow4Pattern
Consider contexts of at least this many tokens.
|
static int |
numMinStopWordsToAdd
If the whole (left or right) context consists only of stop words, add the
pattern only if the number of tokens is equal to or greater than this.
|
static boolean |
useCoarsePOS
Use first two letters of the POS tag
|
static boolean |
useContextNERRestriction
If the NER tag of a context token is not the background symbol,
generalize the token with its NER tag
|
static boolean |
useFillerWordsInPat
Ignore words like "a", "an", "the" when matching a pattern.
|
static boolean |
useNextContext
Consider contexts on the right of a token.
|
static boolean |
usePOS4Pattern
Use a POS tag restriction on the target term. One of this and
addPatWithoutPOS has to be true. |
static boolean |
usePreviousContext
Consider contexts on the left of a token.
|
static boolean |
useTargetParserParentRestriction
Adds the parent's tag from the parse tree to the target phrase in the patterns
|
Fields inherited from class PatternFactory: fillerWords, ignoreWordRegex, numWordsCompound, numWordsCompoundMapped, numWordsCompoundMax, useLemmaContextTokens, useNER, useStopWordsBeforeTerm, useTargetNERRestriction
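The useCoarsePOS and useContextNERRestriction descriptions can be illustrated with a minimal sketch. It does not use the library's classes: `TokenGeneralizationSketch`, `coarsePos`, `generalize`, and the background symbol value `"O"` are assumptions made for illustration only.

```java
public class TokenGeneralizationSketch {
    // Assumption: "O" stands in for the NER background symbol.
    static final String BACKGROUND_SYMBOL = "O";

    /** Keeps only the first two letters of a POS tag, as useCoarsePOS describes. */
    static String coarsePos(String posTag) {
        return posTag.length() > 2 ? posTag.substring(0, 2) : posTag;
    }

    /**
     * Replaces a context word by its NER tag unless the tag is the
     * background symbol, as useContextNERRestriction describes.
     */
    static String generalize(String word, String nerTag) {
        return BACKGROUND_SYMBOL.equals(nerTag) ? word : nerTag;
    }

    public static void main(String[] args) {
        System.out.println(coarsePos("NNS"));
        System.out.println(generalize("Paris", "LOCATION"));
        System.out.println(generalize("visited", "O"));
    }
}
```

Under these assumptions, plural and singular noun tags collapse to the same coarse tag, and a named entity in the context is matched by its type rather than its surface form.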
Constructor and Description |
---|
SurfacePatternFactory() |
Modifier and Type | Method and Description |
---|---|
static java.util.Set<SurfacePattern> |
getContext(java.util.List<CoreLabel> sent,
int i,
java.util.Set<CandidatePhrase> stopWords) |
static java.util.Map<java.lang.Integer,java.util.Set> |
getPatternsAroundTokens(DataInstance sent,
java.util.Set<CandidatePhrase> stopWords) |
static boolean |
isASCII(java.lang.String text) |
static void |
setUp(java.util.Properties props) |
Methods inherited from class PatternFactory: doNotUse, getPatternsAroundTokens, setUp
@ArgumentParser.Option(name="usePOS4Pattern") public static boolean usePOS4Pattern
Use a POS tag restriction on the target term. One of this and addPatWithoutPOS has to be true.
@ArgumentParser.Option(name="useCoarsePOS") public static boolean useCoarsePOS
@ArgumentParser.Option(name="addPatWithoutPOS") public static boolean addPatWithoutPOS
Also add patterns without the POS restriction. One of this and usePOS4Pattern has to be true.
@ArgumentParser.Option(name="minWindow4Pattern") public static int minWindow4Pattern
@ArgumentParser.Option(name="maxWindow4Pattern") public static int maxWindow4Pattern
@ArgumentParser.Option(name="usePreviousContext") public static boolean usePreviousContext
@ArgumentParser.Option(name="useNextContext") public static boolean useNextContext
@ArgumentParser.Option(name="numMinStopWordsToAdd") public static int numMinStopWordsToAdd
@ArgumentParser.Option(name="useTargetParserParentRestriction") public static boolean useTargetParserParentRestriction
@ArgumentParser.Option(name="useContextNERRestriction") public static boolean useContextNERRestriction
@ArgumentParser.Option(name="useFillerWordsInPat") public static boolean useFillerWordsInPat
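How minWindow4Pattern, maxWindow4Pattern, usePreviousContext, and useNextContext could interact is sketched below. This is not the library's implementation: the class `ContextWindowSketch`, its methods, and the field values are hypothetical, and plain strings stand in for tokens.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ContextWindowSketch {
    // Hypothetical stand-ins for the documented static fields.
    static int minWindow4Pattern = 2;
    static int maxWindow4Pattern = 4;
    static boolean usePreviousContext = true;
    static boolean useNextContext = true;

    /** Left contexts of token i, each between min and max window tokens long. */
    static List<List<String>> leftContexts(List<String> tokens, int i) {
        List<List<String>> out = new ArrayList<>();
        if (!usePreviousContext) return out;
        for (int w = minWindow4Pattern; w <= maxWindow4Pattern; w++) {
            if (i - w < 0) break; // not enough tokens to the left
            out.add(new ArrayList<>(tokens.subList(i - w, i)));
        }
        return out;
    }

    /** Right contexts of token i, each between min and max window tokens long. */
    static List<List<String>> rightContexts(List<String> tokens, int i) {
        List<List<String>> out = new ArrayList<>();
        if (!useNextContext) return out;
        for (int w = minWindow4Pattern; w <= maxWindow4Pattern; w++) {
            if (i + 1 + w > tokens.size()) break; // not enough tokens to the right
            out.add(new ArrayList<>(tokens.subList(i + 1, i + 1 + w)));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("patients", "treated", "with",
                "aspirin", "showed", "fewer", "side", "effects");
        int target = 3; // "aspirin"
        System.out.println(leftContexts(tokens, target));
        System.out.println(rightContexts(tokens, target));
    }
}
```

Because the window limit applies to each side separately, pairing the longest left context with the longest right context gives a combined context of up to twice maxWindow4Pattern tokens.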
public static void setUp(java.util.Properties props)
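A possible way to prepare the Properties object for setUp is sketched below, using the option names from the @ArgumentParser.Option annotations above as keys. The class `SurfacePatternOptionsSketch`, its helper methods, the chosen values, and the `"false"` fallbacks are assumptions for illustration; the actual defaults are not stated on this page. The call to setUp is left as a comment so the sketch compiles without the library on the classpath.

```java
import java.util.Properties;

public class SurfacePatternOptionsSketch {
    /** Builds example options; the values are illustrative, not recommended defaults. */
    static Properties exampleProps() {
        Properties props = new Properties();
        props.setProperty("usePOS4Pattern", "true");
        props.setProperty("addPatWithoutPOS", "false");
        props.setProperty("useCoarsePOS", "true");
        props.setProperty("minWindow4Pattern", "2");
        props.setProperty("maxWindow4Pattern", "4");
        props.setProperty("usePreviousContext", "true");
        props.setProperty("useNextContext", "true");
        return props;
    }

    /** Checks the documented rule that one of the two POS options has to be true. */
    static boolean posOptionsValid(Properties props) {
        // Assumption: a missing key is treated as false in this sketch.
        boolean usePos = Boolean.parseBoolean(props.getProperty("usePOS4Pattern", "false"));
        boolean addWithoutPos = Boolean.parseBoolean(props.getProperty("addPatWithoutPOS", "false"));
        return usePos || addWithoutPos;
    }

    public static void main(String[] args) {
        Properties props = exampleProps();
        if (!posOptionsValid(props)) {
            throw new IllegalArgumentException(
                    "One of usePOS4Pattern and addPatWithoutPOS has to be true");
        }
        // With the library available, the options would then be applied with:
        // SurfacePatternFactory.setUp(props);
        System.out.println("options look consistent");
    }
}
```

Checking the POS options before calling setUp surfaces a misconfiguration early instead of at pattern-generation time.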
public static java.util.Set<SurfacePattern> getContext(java.util.List<CoreLabel> sent, int i, java.util.Set<CandidatePhrase> stopWords)
public static boolean isASCII(java.lang.String text)
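This page gives no description for isASCII. Going by the name alone, one plausible reading is a check that every character of the text falls in the 7-bit ASCII range. The sketch below shows that reading; `AsciiCheckSketch` is a hypothetical class and may not match the library's actual behavior.

```java
public class AsciiCheckSketch {
    /** Returns true if every character is below 128, i.e. plain 7-bit ASCII. */
    static boolean isAscii(String text) {
        for (int k = 0; k < text.length(); k++) {
            if (text.charAt(k) >= 128) {
                return false; // found a non-ASCII character
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isAscii("aspirin"));
        System.out.println(isAscii("na\u00efve"));
    }
}
```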
public static java.util.Map<java.lang.Integer,java.util.Set> getPatternsAroundTokens(DataInstance sent, java.util.Set<CandidatePhrase> stopWords)