java.lang.Object
  edu.stanford.nlp.sequences.SeqClassifierFlags
public class SeqClassifierFlags
Flags for sequence classifiers. Documentation for general flags and flags for NER can be found in the Javadoc of edu.stanford.nlp.ie.NERFeatureFactory.
Documentation for the flags for Chinese word segmentation can be found in the Javadoc of edu.stanford.nlp.wordseg.ChineseSegmenterFeatureFactory.
Property Name | Type | Default Value | Description |
---|---|---|---|
useQN | boolean | true | Use Quasi-Newton (L-BFGS) to find the minimum. NOTE: Set this to false when using another minimizer such as SGD. |
QNsize | int | 25 | Number of previous iterations of Quasi-Newton to store. A larger value increases memory use but speeds convergence by letting the Quasi-Newton optimization approximate the second derivative more effectively. |
QNsize2 | int | 25 | Number of previous iterations of Quasi-Newton to store when pruning features, after the first iteration (the first iteration uses QNsize). |
useInPlaceSGD | boolean | false | Use SGD (tweaking weights in place) to find the minimum. More efficient than the old SGD, and faster to converge than Quasi-Newton when there is a very large number of samples. Implemented for CRFClassifier. NOTE: Remember to set useQN to false. |
tuneSampleSize | int | -1 | If this number is greater than 0, specifies the number of samples to use for tuning (default is 1000). |
SGDPasses | int | -1 | If this number is greater than 0, specifies the number of SGD passes (over the entire training set) to do before giving up (default is 50). Can be smaller if the sample size is very large. |
useSGD | boolean | false | Use SGD to find the minimum (can be slow). NOTE: Remember to set useQN to false. |
useSGDtoQN | boolean | false | Use SGD (the version selected by useInPlaceSGD or useSGD) for a certain number of passes (SGDPasses) and then switch to QN. Gives the quick initial convergence of SGD with the desired convergence criterion of QN (there is some ramp-up time for QN). NOTE: Remember to set useQN to false. |
evaluateIters | int | 0 | If this number is greater than 0, evaluates on the test set every so often while minimizing. Implemented for CRFClassifier. |
evalCmd | String | | If specified (and evaluateIters is set), runs the specified command-line command during evaluation (instead of the default CoNLL-like NER evaluation). |
evaluateTrain | boolean | false | If specified (and evaluateIters is set), also evaluates on the training set (can be expensive). |
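The optimizer notes in the table can be illustrated with a small sketch that builds a java.util.Properties object selecting in-place SGD followed by QN. It uses only the JDK so that it runs on its own; the class name SgdFlagsExample and the helper sgdThenQnProps are hypothetical, and the property keys are the flag names from the table. Passing the result to SeqClassifierFlags(java.util.Properties) is shown only as a comment, since that step needs the Stanford NLP library on the classpath.

```java
import java.util.Properties;
import java.util.TreeSet;

// Hypothetical helper class; not part of the documented API.
class SgdFlagsExample {

    /**
     * Builds properties that select in-place SGD and then switch to QN.
     * The table notes that useQN must be set to false for the SGD variants.
     */
    static Properties sgdThenQnProps(int sgdPasses) {
        Properties props = new Properties();
        props.setProperty("useQN", "false");         // required when another minimizer is used
        props.setProperty("useInPlaceSGD", "true");  // selects the SGD version
        props.setProperty("useSGDtoQN", "true");     // SGD for SGDPasses passes, then QN
        if (sgdPasses > 0) {
            // Values that are not greater than 0 leave the documented default in effect.
            props.setProperty("SGDPasses", Integer.toString(sgdPasses));
        }
        return props;
    }

    public static void main(String[] args) {
        Properties props = sgdThenQnProps(20);
        // Print the keys in sorted order for readability.
        for (String name : new TreeSet<>(props.stringPropertyNames())) {
            System.out.println(name + "=" + props.getProperty(name));
        }
        // With the Stanford NLP jar on the classpath (an assumption of this sketch):
        //   SeqClassifierFlags flags = new SeqClassifierFlags(props);
    }
}
```

Property values are plain strings here; how each value is parsed is up to SeqClassifierFlags and is not covered by this sketch.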
Constructor Summary | |
---|---|
SeqClassifierFlags() |
SeqClassifierFlags(java.util.Properties props)
Create a new SeqClassifierFlags object and initialize it using values in the Properties object. |
Method Summary | |
---|---|
void | setProperties(java.util.Properties props)
Initialize this object using values in Properties object. |
void | setProperties(java.util.Properties props, boolean printProps)
Initialize using values in Properties file. |
java.lang.String | toString()
Print the properties specified by this object. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |
Field Detail |
---|
public static final java.lang.String DEFAULT_BACKGROUND_SYMBOL
public boolean useNGrams
public boolean conjoinShapeNGrams
public boolean lowercaseNGrams
public boolean dehyphenateNGrams
public boolean usePrev
public boolean useNext
public boolean useTags
public boolean useWordPairs
public boolean useGazettes
public boolean useSequences
public boolean usePrevSequences
public boolean useNextSequences
public boolean useLongSequences
public boolean useBoundarySequences
public boolean useTaggySequences
public boolean useExtraTaggySequences
public boolean dontExtendTaggy
public boolean useTaggySequencesShapeInteraction
public boolean strictlyZeroethOrder
public boolean strictlyFirstOrder
public boolean strictlySecondOrder
public boolean strictlyThirdOrder
public java.lang.String entitySubclassification
public boolean retainEntitySubclassification
public boolean useGazettePhrases
public boolean makeConsistent
public boolean useWordLabelCounts
public boolean useViterbi
public int[] binnedLengths
public boolean verboseMode
public boolean useSum
public double tolerance
public java.lang.String printFeatures
public boolean useSymTags
public boolean useSymWordPairs
public java.lang.String printClassifier
public int printClassifierParam
public boolean intern
public boolean intern2
public boolean selfTest
public boolean sloppyGazette
public boolean cleanGazette
public boolean noMidNGrams
public int maxNGramLeng
public boolean useReverse
public boolean greekifyNGrams
public boolean useParenMatching
public boolean useLemmas
public boolean usePrevNextLemmas
public boolean normalizeTerms
public boolean normalizeTimex
public boolean useNB
public boolean useQN
public boolean useFloat
public int QNsize
public int QNsize2
public int maxIterations
public int wordShape
public boolean useShapeStrings
public boolean useTypeSeqs
public boolean useTypeSeqs2
public boolean useTypeSeqs3
public boolean useDisjunctive
public int disjunctionWidth
public boolean useDisjunctiveShapeInteraction
public boolean useDisjShape
public boolean useWord
public boolean useClassFeature
public boolean useShapeConjunctions
public boolean useWordTag
public boolean useNPHead
public boolean useNPGovernor
public boolean useHeadGov
public boolean useLastRealWord
public boolean useNextRealWord
public boolean useOccurrencePatterns
public boolean useTypeySequences
public boolean justify
public boolean normalize
public java.lang.String priorType
public double sigma
public double epsilon
public int beamSize
public int maxLeft
public int maxRight
public boolean usePosition
public boolean useBeginSent
public boolean useGazFeatures
public boolean useMoreGazFeatures
public boolean useAbbr
public boolean useMinimalAbbr
public boolean useAbbr1
public boolean useMinimalAbbr1
public boolean useMoreAbbr
public boolean deleteBlankLines
public boolean useGENIA
public boolean useTOK
public boolean useABSTR
public boolean useABSTRFreqDict
public boolean useABSTRFreq
public boolean useFREQ
public boolean useABGENE
public boolean useWEB
public boolean useWEBFreqDict
public boolean useIsURL
public boolean useURLSequences
public boolean useIsDateRange
public boolean useEntityTypes
public boolean useEntityTypeSequences
public boolean useEntityRule
public boolean useOrdinal
public boolean useACR
public boolean useANTE
public boolean useMoreTags
public boolean useChunks
public boolean useChunkySequences
public boolean usePrevVB
public boolean useNextVB
public boolean useVB
public boolean subCWGaz
public java.lang.String documentReader
public java.lang.String map
public boolean useWideDisjunctive
public int wideDisjunctionWidth
public boolean useRadical
public boolean useBigramInTwoClique
public java.lang.String morphFeatureFile
public boolean useReverseAffix
public int charHalfWindow
public boolean useWord1
public boolean useWord2
public boolean useWord3
public boolean useWord4
public boolean useRad1
public boolean useRad2
public boolean useWordn
public boolean useCTBPre1
public boolean useCTBSuf1
public boolean useASBCPre1
public boolean useASBCSuf1
public boolean usePKPre1
public boolean usePKSuf1
public boolean useHKPre1
public boolean useHKSuf1
public boolean useCTBChar2
public boolean useASBCChar2
public boolean useHKChar2
public boolean usePKChar2
public boolean useRule2
public boolean useDict2
public boolean useOutDict2
public java.lang.String outDict2
public boolean useDictleng
public boolean useDictCTB2
public boolean useDictASBC2
public boolean useDictPK2
public boolean useDictHK2
public boolean useBig5
public boolean useNegDict2
public boolean useNegDict3
public boolean useNegDict4
public boolean useNegCTBDict2
public boolean useNegCTBDict3
public boolean useNegCTBDict4
public boolean useNegASBCDict2
public boolean useNegASBCDict3
public boolean useNegASBCDict4
public boolean useNegHKDict2
public boolean useNegHKDict3
public boolean useNegHKDict4
public boolean useNegPKDict2
public boolean useNegPKDict3
public boolean useNegPKDict4
public boolean usePre
public boolean useSuf
public boolean useRule
public boolean useHk
public boolean useMsr
public boolean useMSRChar2
public boolean usePk
public boolean useAs
public boolean useFilter
public boolean largeChSegFile
public boolean useRad2b
public boolean keepEnglishWhitespaces
public boolean keepAllWhitespaces
public boolean sighanPostProcessing
public boolean useChPos
public java.lang.String normalizationTable
public java.lang.String dictionary
public java.lang.String serializedDictionary
public java.lang.String dictionary2
public java.lang.String normTableEncoding
public java.lang.String sighanCorporaDict
public boolean useWordShapeGaz
public java.lang.String wordShapeGaz
public boolean splitDocuments
public boolean printXML
public boolean useSeenFeaturesOnly
public java.lang.String lastNameList
public java.lang.String maleNameList
public java.lang.String femaleNameList
public transient java.lang.String trainFile
public transient java.lang.String adaptFile
public transient java.lang.String devFile
public transient java.lang.String testFile
public transient java.lang.String textFile
public transient java.lang.String outputFile
public transient java.lang.String loadClassifier
public transient java.lang.String loadTextClassifier
public transient java.lang.String loadJarClassifier
public transient java.lang.String loadAuxClassifier
public transient java.lang.String serializeTo
public transient java.lang.String serializeToText
public transient int interimOutputFreq
public transient java.lang.String initialWeights
public transient java.util.List<java.lang.String> gazettes
public transient java.lang.String selfTrainFile
public java.lang.String inputEncoding
public boolean bioSubmitOutput
public int numRuns
public java.lang.String answerFile
public java.lang.String altAnswerFile
public java.lang.String dropGaz
public java.lang.String printGazFeatures
public int numStartLayers
public boolean dump
public boolean mergeTags
public boolean splitOnHead
public int featureCountThreshold
public double featureWeightThreshold
public java.lang.String featureFactory
public java.lang.String backgroundSymbol
public boolean useObservedSequencesOnly
public int maxDocSize
public boolean printProbs
public boolean printFirstOrderProbs
public boolean saveFeatureIndexToDisk
public boolean removeBackgroundSingletonFeatures
public boolean doGibbs
public int numSamples
public boolean useNERPrior
public boolean useAcqPrior
public boolean useUniformPrior
public boolean useMUCFeatures
public double annealingRate
public java.lang.String annealingType
public java.lang.String loadProcessedData
public boolean initViterbi
public boolean useUnknown
public boolean checkNameList
public boolean useSemPrior
public boolean useFirstWord
public boolean useNumberFeature
public int ocrFold
public transient boolean ocrTrain
public java.lang.String classifierType
public java.lang.String svmModelFile
public java.lang.String inferenceType
public boolean useLemmaAsWord
public java.lang.String type
public java.lang.String readerAndWriter
public java.util.List<java.lang.String> comboProps
public boolean usePrediction
public boolean useAltGazFeatures
public java.lang.String gazFilesFile
public boolean usePrediction2
public java.lang.String baseTrainDir
public java.lang.String baseTestDir
public java.lang.String trainFiles
public java.lang.String trainFileList
public java.lang.String testFiles
public java.lang.String trainDirs
public java.lang.String testDirs
public boolean useOnlySeenWeights
public java.lang.String predProp
public CoreLabel pad
public boolean useObservedFeaturesOnly
public java.lang.String distSimLexicon
public boolean useDistSim
public int removeTopN
public int numTimesRemoveTopN
public double randomizedRatio
public double removeTopNPercent
public int purgeFeatures
public boolean booleanFeatures
public boolean iobWrapper
public boolean iobTags
public boolean useSegmentation
public boolean memoryThrift
public boolean timitDatum
public java.lang.String serializeDatasetsDir
public java.lang.String loadDatasetsDir
public java.lang.String pushDir
public boolean purgeDatasets
public boolean keepOBInMemory
public boolean fakeDataset
public boolean restrictTransitionsTimit
public int numDatasetsPerFile
public boolean useTitle
public boolean lowerNewgeneThreshold
public boolean useEitherSideWord
public boolean useEitherSideDisjunctive
public boolean twoStage
public java.lang.String crfType
public int featureThreshold
public java.lang.String featThreshFile
public double featureDiffThresh
public int numTimesPruneFeatures
public double newgeneThreshold
public boolean doAdaptation
public boolean useInternal
public boolean useExternal
public double selfTrainConfidenceThreshold
public int selfTrainIterations
public int selfTrainWindowSize
public boolean useHuber
public boolean useQuartic
public double adaptSigma
public int numFolds
public int startFold
public int endFold
public boolean cacheNGrams
public java.lang.String outputFormat
public boolean useSMD
public boolean useSGDtoQN
public boolean useStochasticQN
public boolean useScaledSGD
public int scaledSGDMethod
public int SGDPasses
public int QNPasses
public boolean tuneSGD
public StochasticCalculateMethods stochasticMethod
public double initialGain
public int stochasticBatchSize
public boolean useSGD
public double gainSGD
public boolean useHybrid
public int hybridCutoffIteration
public boolean outputIterationsToFile
public boolean testObjFunction
public boolean testVariance
public int SGD2QNhessSamples
public boolean testHessSamples
public int CRForder
public int CRFwindow
public boolean estimateInitial
public transient java.lang.String biasedTrainFile
public transient java.lang.String confusionMatrix
public java.lang.String outputEncoding
public boolean useKBest
public java.lang.String searchGraphPrefix
public double searchGraphPrune
public int kBest
public boolean useFeaturesC4gram
public boolean useFeaturesC5gram
public boolean useFeaturesC6gram
public boolean useFeaturesCpC4gram
public boolean useFeaturesCpC5gram
public boolean useFeaturesCpC6gram
public boolean useUnicodeType
public boolean useUnicodeType4gram
public boolean useUnicodeType5gram
public boolean use4Clique
public boolean useUnicodeBlock
public boolean useShapeStrings1
public boolean useShapeStrings3
public boolean useShapeStrings4
public boolean useShapeStrings5
public boolean useGoodForNamesCpC
public boolean useDictionaryConjunctions
public boolean expandMidDot
public int printFeaturesUpto
public boolean useDictionaryConjunctions3
public boolean useWordUTypeConjunctions2
public boolean useWordUTypeConjunctions3
public boolean useWordShapeConjunctions2
public boolean useWordShapeConjunctions3
public boolean useMidDotShape
public boolean augmentedDateChars
public boolean suppressMidDotPostprocessing
public boolean printNR
public java.lang.String classBias
public boolean printLabelValue
public boolean useRobustQN
public boolean combo
public boolean useGenericFeatures
public boolean verboseForTrueCasing
public java.lang.String trainHierarchical
public java.lang.String domain
public boolean baseline
public java.lang.String transferSigmas
public boolean doFE
public boolean restrictLabels
public boolean announceObjectBankEntries
public boolean usePos
public boolean useAgreement
public boolean useAccCase
public boolean useInna
public boolean useConcord
public boolean useFirstNgram
public boolean useLastNgram
public boolean collapseNN
public boolean useConjBreak
public boolean useAuxPairs
public boolean usePPVBPairs
public boolean useAnnexing
public boolean useTemporalNN
public boolean usePath
public boolean innaPPAttach
public boolean markProperNN
public boolean markMasdar
public boolean useSVO
public int numTags
public boolean useTagsCpC
public boolean useTagsCpCp2C
public boolean useTagsCpCp2Cp3C
public boolean useTagsCpCp2Cp3Cp4C
public double l1reg
public java.lang.String mixedCaseMapFile
public java.lang.String auxTrueCaseModels
public boolean use2W
public boolean useLC
public boolean useYetMoreCpCShapes
public boolean useIfInteger
public java.lang.String exportFeatures
public boolean useInPlaceSGD
public boolean useTopics
public int evaluateIters
public java.lang.String evalCmd
public boolean evaluateTrain
public int tuneSampleSize
public boolean usePhraseFeatures
public boolean usePhraseWords
public boolean usePhraseWordTags
public boolean usePhraseWordSpecialTags
public boolean useProtoFeatures
public boolean useWordnetFeatures
public java.lang.String tokenFactory
public java.lang.String tokensAnnotationClassName
public boolean useCorefFeatures
public java.lang.String wikiFeatureDbFile
public boolean casedDistSim
public java.lang.String distSimFileFormat
public int distSimMaxBits
public boolean numberEquivalenceDistSim
public java.lang.String unknownWordDistSimClass
public transient java.util.List<java.lang.String> phraseGazettes
public transient java.util.Properties props
Constructor Detail |
---|
public SeqClassifierFlags()
public SeqClassifierFlags(java.util.Properties props)
Parameters:
props - The properties object used for initialization
Method Detail |
---|
public final void setProperties(java.util.Properties props)
Parameters:
props - The properties object used for initialization
public void setProperties(java.util.Properties props, boolean printProps)
Parameters:
props - The properties object used for initialization
printProps - Whether to print the properties to stderr as it works.
public java.lang.String toString()
Overrides:
toString in class java.lang.Object
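As a rough illustration of how a Properties object for setProperties might be prepared, the sketch below parses text in java.util.Properties file syntax. It relies only on the JDK; the class name FlagsFromPropertiesText, the helper parse, and the sample file contents are hypothetical. The calls on SeqClassifierFlags appear only in comments because they require the Stanford NLP library.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper class; not part of the documented API.
class FlagsFromPropertiesText {

    /** Parses text written in java.util.Properties file syntax. */
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            // A StringReader should not fail, but load() declares IOException.
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        // Hypothetical file contents; the keys are flag names documented for this class.
        String text = String.join("\n",
                "useQN = false",
                "useSGDtoQN = true",
                "useInPlaceSGD = true",
                "SGDPasses = 20",
                "evaluateIters = 10");
        Properties props = parse(text);
        System.out.println(props.getProperty("SGDPasses"));

        // With the Stanford NLP jar on the classpath (an assumption of this sketch):
        //   SeqClassifierFlags flags = new SeqClassifierFlags();
        //   flags.setProperties(props, true);  // true: print the properties to stderr
        //   System.out.println(flags);         // toString() prints the properties
    }
}
```

The two-argument setProperties is used in the comment because its printProps parameter makes it easy to confirm which properties were picked up.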