java.lang.Object
  edu.stanford.nlp.sequences.SeqClassifierFlags
public class SeqClassifierFlags
Flags for sequence classifiers. Documentation for general flags and flags for NER can be found in the Javadoc of NERFeatureFactory. Documentation for the flags for Chinese word segmentation can be found in the Javadoc of edu.stanford.nlp.wordseg.ChineseSegmenterFeatureFactory.
Property Name | Type | Default Value | Description |
---|---|---|---|
useQN | boolean | true | Use Quasi-Newton (L-BFGS) to find the minimum. NOTE: Set this to false when using another minimizer such as SGD. |
QNsize | int | 25 | Number of previous Quasi-Newton iterations to store. A larger value increases memory use but speeds convergence, because the Quasi-Newton optimization can approximate the second derivative more effectively. |
QNsize2 | int | 25 | Number of previous Quasi-Newton iterations to store when pruning features, after the first iteration (the first iteration uses QNsize). |
useInPlaceSGD | boolean | false | Use SGD (tweaking weights in place) to find the minimum. More efficient than the old SGD, and faster to converge than Quasi-Newton when there is a very large number of samples. Implemented for CRFClassifier. NOTE: Remember to set useQN to false. |
tuneSampleSize | int | -1 | If this number is greater than 0, specifies the number of samples to use for tuning (default is 1000). |
SGDPasses | int | -1 | If this number is greater than 0, specifies the number of SGD passes (over the entire training set) to do before giving up (default is 50). Can be smaller if the sample size is very large. |
useSGD | boolean | false | Use SGD to find the minimum (can be slow). NOTE: Remember to set useQN to false. |
useSGDtoQN | boolean | false | Use SGD (the version selected by useInPlaceSGD or useSGD) for a certain number of passes (SGDPasses) and then switch to QN. Gives the quick initial convergence of SGD with the desired convergence criterion of QN (there is some ramp-up time for QN). NOTE: Remember to set useQN to false. |
evaluateIters | int | 0 | If this number is greater than 0, evaluates on the test set every so often while minimizing. Implemented for CRFClassifier. |
evalCmd | String | | If specified (and evaluateIters is set), runs the specified command-line command during evaluation (instead of the default CoNLL-like NER evaluation). |
evaluateTrain | boolean | false | If specified (and evaluateIters is set), also evaluates on the training set (can be expensive). |
tokenizerOptions | String | (null) | Extra options to supply to the tokenizer when creating it. |
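The NOTE entries in the table imply a consistency rule: whenever one of the SGD options is switched on, useQN has to be switched off. The sketch below expresses that rule using only java.util.Properties. The class OptimizerFlagsSketch and its methods are hypothetical helpers, not part of Stanford NLP, and the value 20 for SGDPasses is an arbitrary illustration.

```java
import java.util.Properties;

/** Hypothetical helper (not part of Stanford NLP) that builds optimizer settings. */
public class OptimizerFlagsSketch {

  /** Property names from the table that select an SGD-based minimizer. */
  private static final String[] SGD_FLAGS = {"useInPlaceSGD", "useSGD", "useSGDtoQN"};

  /** Builds settings for in-place SGD followed by a switch to Quasi-Newton. */
  public static Properties sgdThenQn(int sgdPasses) {
    Properties props = new Properties();
    props.setProperty("useQN", "false");         // required by the NOTE on each SGD flag
    props.setProperty("useInPlaceSGD", "true");  // selects the SGD version
    props.setProperty("useSGDtoQN", "true");     // run SGD first, then switch to QN
    props.setProperty("SGDPasses", Integer.toString(sgdPasses));
    return props;
  }

  /** Returns false if an SGD flag is enabled while useQN is still true. */
  public static boolean isConsistent(Properties props) {
    // useQN defaults to true according to the table.
    boolean useQN = Boolean.parseBoolean(props.getProperty("useQN", "true"));
    for (String flag : SGD_FLAGS) {
      boolean enabled = Boolean.parseBoolean(props.getProperty(flag, "false"));
      if (enabled && useQN) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    Properties props = sgdThenQn(20);
    System.out.println("consistent: " + isConsistent(props));
    // With Stanford NLP on the classpath, the settings would then be handed to:
    //   SeqClassifierFlags flags = new SeqClassifierFlags(props);
  }
}
```

Running a check like isConsistent before training is a design choice of this sketch; the page itself only documents the rule in the NOTE text.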
Constructor Summary | |
---|---|
SeqClassifierFlags() | |
SeqClassifierFlags(Properties props) | Create a new SeqClassifierFlags object and initialize it using values in the Properties object. |
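One way to obtain a Properties object for the second constructor is to parse key=value text with java.util.Properties.load. The sketch below shows that step only. The class name PropsLoadingSketch, the configuration text, and the file name ner-train.tsv are made up for the example, and the constructor call is left as a comment so the snippet runs without Stanford NLP on the classpath.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Hypothetical helper (not part of Stanford NLP) that parses flag settings. */
public class PropsLoadingSketch {

  /** Parses key=value text written in java.util.Properties syntax. */
  public static Properties parse(String text) {
    Properties props = new Properties();
    try {
      props.load(new StringReader(text));
    } catch (IOException e) {
      // Reading from an in-memory string should not fail; rethrow unchecked.
      throw new UncheckedIOException(e);
    }
    return props;
  }

  public static void main(String[] args) {
    String config = String.join("\n",
        "# hypothetical training configuration",
        "trainFile = ner-train.tsv",
        "useQN = true",
        "QNsize = 25",
        "evaluateIters = 10");
    Properties props = parse(config);
    System.out.println(props.getProperty("trainFile"));
    // With Stanford NLP available, the flags object would be created with:
    //   SeqClassifierFlags flags = new SeqClassifierFlags(props);
  }
}
```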
Method Summary | |
---|---|
String | getNotNullTrueStringRep(): Note that this does *not* return the string representation of arrays, lists and enums. |
void | setProperties(Properties props): Initialize this object using values in the Properties object. |
void | setProperties(Properties props, boolean printProps): Initialize using values in the Properties object. |
String | toString(): Print the properties specified by this object. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |
Field Detail |
---|
public static final String DEFAULT_BACKGROUND_SYMBOL
public boolean useNGrams
public boolean conjoinShapeNGrams
public boolean lowercaseNGrams
public boolean dehyphenateNGrams
public boolean usePrev
public boolean useNext
public boolean useTags
public boolean useWordPairs
public boolean useGazettes
public boolean useSequences
public boolean usePrevSequences
public boolean useNextSequences
public boolean useLongSequences
public boolean useBoundarySequences
public boolean useTaggySequences
public boolean useExtraTaggySequences
public boolean dontExtendTaggy
public boolean useTaggySequencesShapeInteraction
public boolean strictlyZeroethOrder
public boolean strictlyFirstOrder
public boolean strictlySecondOrder
public boolean strictlyThirdOrder
public String entitySubclassification
public boolean retainEntitySubclassification
public boolean useGazettePhrases
public boolean makeConsistent
public boolean useWordLabelCounts
public boolean useViterbi
public int[] binnedLengths
public boolean verboseMode
public boolean useSum
public double tolerance
public String printFeatures
public boolean useSymTags
public boolean useSymWordPairs
public String printClassifier
public int printClassifierParam
public boolean intern
public boolean intern2
public boolean selfTest
public boolean sloppyGazette
public boolean cleanGazette
public boolean noMidNGrams
public int maxNGramLeng
public boolean useReverse
public boolean greekifyNGrams
public boolean useParenMatching
public boolean useLemmas
public boolean usePrevNextLemmas
public boolean normalizeTerms
public boolean normalizeTimex
public boolean useNB
public boolean useQN
public boolean useFloat
public int QNsize
public int QNsize2
public int maxIterations
public int wordShape
public boolean useShapeStrings
public boolean useTypeSeqs
public boolean useTypeSeqs2
public boolean useTypeSeqs3
public boolean useDisjunctive
public int disjunctionWidth
public boolean useDisjunctiveShapeInteraction
public boolean useDisjShape
public boolean useWord
public boolean useClassFeature
public boolean useShapeConjunctions
public boolean useWordTag
public boolean useNPHead
public boolean useNPGovernor
public boolean useHeadGov
public boolean useLastRealWord
public boolean useNextRealWord
public boolean useOccurrencePatterns
public boolean useTypeySequences
public boolean justify
public boolean normalize
public String priorType
public double sigma
public double epsilon
public int beamSize
public int maxLeft
public int maxRight
public boolean usePosition
public boolean useBeginSent
public boolean useGazFeatures
public boolean useMoreGazFeatures
public boolean useAbbr
public boolean useMinimalAbbr
public boolean useAbbr1
public boolean useMinimalAbbr1
public boolean useMoreAbbr
public boolean deleteBlankLines
public boolean useGENIA
public boolean useTOK
public boolean useABSTR
public boolean useABSTRFreqDict
public boolean useABSTRFreq
public boolean useFREQ
public boolean useABGENE
public boolean useWEB
public boolean useWEBFreqDict
public boolean useIsURL
public boolean useURLSequences
public boolean useIsDateRange
public boolean useEntityTypes
public boolean useEntityTypeSequences
public boolean useEntityRule
public boolean useOrdinal
public boolean useACR
public boolean useANTE
public boolean useMoreTags
public boolean useChunks
public boolean useChunkySequences
public boolean usePrevVB
public boolean useNextVB
public boolean useVB
public boolean subCWGaz
public String documentReader
public String map
public boolean useWideDisjunctive
public int wideDisjunctionWidth
public boolean useRadical
public boolean useBigramInTwoClique
public String morphFeatureFile
public boolean useReverseAffix
public int charHalfWindow
public boolean useWord1
public boolean useWord2
public boolean useWord3
public boolean useWord4
public boolean useRad1
public boolean useRad2
public boolean useWordn
public boolean useCTBPre1
public boolean useCTBSuf1
public boolean useASBCPre1
public boolean useASBCSuf1
public boolean usePKPre1
public boolean usePKSuf1
public boolean useHKPre1
public boolean useHKSuf1
public boolean useCTBChar2
public boolean useASBCChar2
public boolean useHKChar2
public boolean usePKChar2
public boolean useRule2
public boolean useDict2
public boolean useOutDict2
public String outDict2
public boolean useDictleng
public boolean useDictCTB2
public boolean useDictASBC2
public boolean useDictPK2
public boolean useDictHK2
public boolean useBig5
public boolean useNegDict2
public boolean useNegDict3
public boolean useNegDict4
public boolean useNegCTBDict2
public boolean useNegCTBDict3
public boolean useNegCTBDict4
public boolean useNegASBCDict2
public boolean useNegASBCDict3
public boolean useNegASBCDict4
public boolean useNegHKDict2
public boolean useNegHKDict3
public boolean useNegHKDict4
public boolean useNegPKDict2
public boolean useNegPKDict3
public boolean useNegPKDict4
public boolean usePre
public boolean useSuf
public boolean useRule
public boolean useHk
public boolean useMsr
public boolean useMSRChar2
public boolean usePk
public boolean useAs
public boolean useFilter
public boolean largeChSegFile
public boolean useRad2b
public boolean keepEnglishWhitespaces
public boolean keepAllWhitespaces
public boolean sighanPostProcessing
public boolean useChPos
public String normalizationTable
public String dictionary
public String serializedDictionary
public String dictionary2
public String normTableEncoding
public String sighanCorporaDict
public boolean useWordShapeGaz
public String wordShapeGaz
public boolean splitDocuments
public boolean printXML
public boolean useSeenFeaturesOnly
public String lastNameList
public String maleNameList
public String femaleNameList
public transient String trainFile
public transient String adaptFile
public transient String devFile
public transient String testFile
public transient String textFile
public transient boolean readStdin
public transient String outputFile
public transient String loadClassifier
public transient String loadTextClassifier
public transient String loadJarClassifier
public transient String loadAuxClassifier
public transient String serializeTo
public transient String serializeToText
public transient int interimOutputFreq
public transient String initialWeights
public transient List<String> gazettes
public transient String selfTrainFile
public String inputEncoding
public boolean bioSubmitOutput
public int numRuns
public String answerFile
public String altAnswerFile
public String dropGaz
public String printGazFeatures
public int numStartLayers
public boolean dump
public boolean mergeTags
public boolean splitOnHead
public int featureCountThreshold
public double featureWeightThreshold
public String featureFactory
public String backgroundSymbol
public boolean useObservedSequencesOnly
public int maxDocSize
public boolean printProbs
public boolean printFirstOrderProbs
public boolean saveFeatureIndexToDisk
public boolean removeBackgroundSingletonFeatures
public boolean doGibbs
public int numSamples
public boolean useNERPrior
public boolean useAcqPrior
public boolean useUniformPrior
public boolean useMUCFeatures
public double annealingRate
public String annealingType
public String loadProcessedData
public boolean initViterbi
public boolean useUnknown
public boolean checkNameList
public boolean useSemPrior
public boolean useFirstWord
public boolean useNumberFeature
public int ocrFold
public transient boolean ocrTrain
public String classifierType
public String svmModelFile
public String inferenceType
public boolean useLemmaAsWord
public String type
public String readerAndWriter
public List<String> comboProps
public boolean usePrediction
public boolean useAltGazFeatures
public String gazFilesFile
public boolean usePrediction2
public String baseTrainDir
public String baseTestDir
public String trainFiles
public String trainFileList
public String testFiles
public String trainDirs
public String testDirs
public boolean useOnlySeenWeights
public String predProp
public CoreLabel pad
public boolean useObservedFeaturesOnly
public String distSimLexicon
public boolean useDistSim
public int removeTopN
public int numTimesRemoveTopN
public double randomizedRatio
public double removeTopNPercent
public int purgeFeatures
public boolean booleanFeatures
public boolean iobWrapper
public boolean iobTags
public boolean useSegmentation
public boolean memoryThrift
public boolean timitDatum
public String serializeDatasetsDir
public String loadDatasetsDir
public String pushDir
public boolean purgeDatasets
public boolean keepOBInMemory
public boolean fakeDataset
public boolean restrictTransitionsTimit
public int numDatasetsPerFile
public boolean useTitle
public boolean lowerNewgeneThreshold
public boolean useEitherSideWord
public boolean useEitherSideDisjunctive
public boolean twoStage
public String crfType
public int featureThreshold
public String featThreshFile
public double featureDiffThresh
public int numTimesPruneFeatures
public double newgeneThreshold
public boolean doAdaptation
public boolean useInternal
public boolean useExternal
public double selfTrainConfidenceThreshold
public int selfTrainIterations
public int selfTrainWindowSize
public boolean useHuber
public boolean useQuartic
public double adaptSigma
public int numFolds
public int startFold
public int endFold
public boolean cacheNGrams
public String outputFormat
public boolean useSMD
public boolean useSGDtoQN
public boolean useStochasticQN
public boolean useScaledSGD
public int scaledSGDMethod
public int SGDPasses
public int QNPasses
public boolean tuneSGD
public StochasticCalculateMethods stochasticMethod
public double initialGain
public int stochasticBatchSize
public boolean useSGD
public double gainSGD
public boolean useHybrid
public int hybridCutoffIteration
public boolean outputIterationsToFile
public boolean testObjFunction
public boolean testVariance
public int SGD2QNhessSamples
public boolean testHessSamples
public int CRForder
public int CRFwindow
public boolean estimateInitial
public transient String biasedTrainFile
public transient String confusionMatrix
public String outputEncoding
public boolean useKBest
public String searchGraphPrefix
public double searchGraphPrune
public int kBest
public boolean useFeaturesC4gram
public boolean useFeaturesC5gram
public boolean useFeaturesC6gram
public boolean useFeaturesCpC4gram
public boolean useFeaturesCpC5gram
public boolean useFeaturesCpC6gram
public boolean useUnicodeType
public boolean useUnicodeType4gram
public boolean useUnicodeType5gram
public boolean use4Clique
public boolean useUnicodeBlock
public boolean useShapeStrings1
public boolean useShapeStrings3
public boolean useShapeStrings4
public boolean useShapeStrings5
public boolean useGoodForNamesCpC
public boolean useDictionaryConjunctions
public boolean expandMidDot
public int printFeaturesUpto
public boolean useDictionaryConjunctions3
public boolean useWordUTypeConjunctions2
public boolean useWordUTypeConjunctions3
public boolean useWordShapeConjunctions2
public boolean useWordShapeConjunctions3
public boolean useMidDotShape
public boolean augmentedDateChars
public boolean suppressMidDotPostprocessing
public boolean printNR
public String classBias
public boolean printLabelValue
public boolean useRobustQN
public boolean combo
public boolean useGenericFeatures
public boolean verboseForTrueCasing
public String trainHierarchical
public String domain
public boolean baseline
public String transferSigmas
public boolean doFE
public boolean restrictLabels
public boolean announceObjectBankEntries
public boolean usePos
public boolean useAgreement
public boolean useAccCase
public boolean useInna
public boolean useConcord
public boolean useFirstNgram
public boolean useLastNgram
public boolean collapseNN
public boolean useConjBreak
public boolean useAuxPairs
public boolean usePPVBPairs
public boolean useAnnexing
public boolean useTemporalNN
public boolean usePath
public boolean innaPPAttach
public boolean markProperNN
public boolean markMasdar
public boolean useSVO
public int numTags
public boolean useTagsCpC
public boolean useTagsCpCp2C
public boolean useTagsCpCp2Cp3C
public boolean useTagsCpCp2Cp3Cp4C
public double l1reg
public String mixedCaseMapFile
public String auxTrueCaseModels
public boolean use2W
public boolean useLC
public boolean useYetMoreCpCShapes
public boolean useIfInteger
public String exportFeatures
public boolean useInPlaceSGD
public boolean useTopics
public int evaluateIters
public String evalCmd
public boolean evaluateTrain
public transient boolean evaluateIOB
public int tuneSampleSize
public boolean usePhraseFeatures
public boolean usePhraseWords
public boolean usePhraseWordTags
public boolean usePhraseWordSpecialTags
public boolean useCommonWordsFeature
public boolean useProtoFeatures
public boolean useWordnetFeatures
public String tokenFactory
public String tokensAnnotationClassName
public transient String tokenizerOptions
public boolean useCorefFeatures
public String wikiFeatureDbFile
public boolean useNoisyNonNoisyFeature
public boolean useYear
public boolean useSentenceNumber
public boolean useLabelSource
public boolean casedDistSim
public String distSimFileFormat
public int distSimMaxBits
public boolean numberEquivalenceDistSim
public String unknownWordDistSimClass
public boolean useNeighborNGrams
public Function<String,String> wordFunction
public transient List<String> phraseGazettes
public transient Properties props
Constructor Detail |
---|
public SeqClassifierFlags()
public SeqClassifierFlags(Properties props)
Parameters:
props - The properties object used for initialization
Method Detail |
---|
public final void setProperties(Properties props)
Parameters:
props - The properties object used for initialization
public void setProperties(Properties props, boolean printProps)
Parameters:
props - The properties object used for initialization
printProps - Whether to print the properties to stderr as it works.
public String toString()
Overrides:
toString in class Object
public String getNotNullTrueStringRep()
Throws:
IllegalAccessException
IllegalArgumentException
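The two exceptions listed for getNotNullTrueStringRep() are the ones raised by java.lang.reflect.Field access, which suggests (the page does not state it) that the method walks the object's public fields reflectively. The sketch below illustrates that idea under this assumption: it reports boolean fields that are true and String or numeric fields that are not null, and skips arrays, lists and enums. NotNullTrueSketch, DemoFlags and notNullTrueRep are hypothetical names, and this is not the library's implementation.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration; not the Stanford NLP implementation. */
public class NotNullTrueSketch {

  /** Hypothetical stand-in for a flags class with public fields. */
  public static class DemoFlags {
    public boolean useQN = true;
    public boolean useSGD = false;        // false, so it is left out
    public String evalCmd = null;         // null, so it is left out
    public String inputEncoding = "UTF-8";
    public int QNsize = 25;
    public int[] binnedLengths = {1, 2};  // array, so it is left out
  }

  /** Lists "name=value" for true booleans and non-null Strings and numbers. */
  public static String notNullTrueRep(Object flags) {
    List<String> parts = new ArrayList<>();
    try {
      // getFields() returns public fields; their order is not guaranteed.
      for (Field field : flags.getClass().getFields()) {
        Class<?> type = field.getType();
        Object value = field.get(flags);
        if (type == boolean.class) {
          if ((Boolean) value) {
            parts.add(field.getName() + "=true");
          }
        } else if (type == String.class || type == int.class || type == double.class) {
          if (value != null) {
            parts.add(field.getName() + "=" + value);
          }
        }
        // Any other type (arrays, lists, enums, ...) is skipped.
      }
    } catch (IllegalAccessException e) {
      // The documented method lists this exception; the sketch wraps it instead.
      throw new IllegalStateException(e);
    }
    return String.join("\n", parts);
  }

  public static void main(String[] args) {
    System.out.println(notNullTrueRep(new DemoFlags()));
  }
}
```

Because the field order returned by reflection is unspecified, the lines of the result should not be relied on to appear in declaration order.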