edu.stanford.nlp.international.process
Class AbstractDataset
java.lang.Object
edu.stanford.nlp.international.process.AbstractDataset
- All Implemented Interfaces:
- Dataset
- Direct Known Subclasses:
- ATBArabicDataset, FTBDataset
public abstract class AbstractDataset
- extends java.lang.Object
- implements Dataset
- Author:
- Spence Green
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
outputFileList
protected final java.util.List<java.lang.String> outputFileList
posMapper
protected Mapper posMapper
posMapOptions
protected java.lang.String posMapOptions
lexMapper
protected Mapper lexMapper
lexMapOptions
protected java.lang.String lexMapOptions
encoding
protected Dataset.Encoding encoding
pathsToData
protected final java.util.List<java.io.File> pathsToData
pathsToMappings
protected final java.util.List<java.io.File> pathsToMappings
splitFilter
protected java.io.FileFilter splitFilter
addDeterminer
protected boolean addDeterminer
removeDashTags
protected boolean removeDashTags
addRoot
protected boolean addRoot
removeEscapeTokens
protected boolean removeEscapeTokens
maxLen
protected int maxLen
morphDelim
protected java.lang.String morphDelim
customTreeVisitor
protected TreeVisitor customTreeVisitor
outFileName
protected java.lang.String outFileName
flatFileName
protected java.lang.String flatFileName
makeFlatFile
protected boolean makeFlatFile
fileNameNormalizer
protected final java.util.regex.Pattern fileNameNormalizer
treebank
protected Treebank treebank
configuredOptions
protected final java.util.Set<java.lang.String> configuredOptions
requiredOptions
protected final java.util.Set<java.lang.String> requiredOptions
toStringBuffer
protected final java.lang.StringBuilder toStringBuffer
treeFileExtension
protected java.lang.String treeFileExtension
options
protected StringMap options
- Provides subclasses with access to the dataset parameters.
AbstractDataset
public AbstractDataset()
build
public abstract void build()
- Description copied from interface: Dataset
- Generic method for loading, processing, and writing a dataset.
- Specified by:
- build in interface Dataset
setOptions
public boolean setOptions(StringMap opts)
- Description copied from interface: Dataset
- Sets options for a dataset.
- Specified by:
- setOptions in interface Dataset
- Parameters:
- opts - A map from parameter types defined in ConfigParser to values
- Returns:
- true if opts contains all required options; false otherwise.
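The return contract of setOptions can be illustrated with a minimal, self-contained sketch. It does not use the Stanford NLP classes: RequiredOptionsSketch is a hypothetical stand-in, a plain Map<String, String> replaces StringMap, and the option names "NAME" and "PATH" are invented for illustration. It only assumes that the method compares the supplied keys against a set like requiredOptions, which this page does not confirm.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in; not the Stanford NLP source.
class RequiredOptionsSketch {

    // Stand-ins for the documented requiredOptions and configuredOptions fields.
    private final Set<String> requiredOptions = new HashSet<>();
    private final Set<String> configuredOptions = new HashSet<>();

    RequiredOptionsSketch(Set<String> required) {
        requiredOptions.addAll(required);
    }

    // Records the supplied keys and reports whether every required key is present.
    // Assumption: the real method also parses and stores each option value.
    public boolean setOptions(Map<String, String> opts) {
        configuredOptions.addAll(opts.keySet());
        return configuredOptions.containsAll(requiredOptions);
    }

    public static void main(String[] args) {
        // "NAME" and "PATH" are invented option names.
        RequiredOptionsSketch dataset =
                new RequiredOptionsSketch(new HashSet<>(Arrays.asList("NAME", "PATH")));

        Map<String, String> opts = new HashMap<>();
        opts.put("NAME", "toy");
        System.out.println("complete after NAME only: " + dataset.setOptions(opts));

        opts.put("PATH", "/data/toy");
        System.out.println("complete after NAME and PATH: " + dataset.setOptions(opts));
    }
}
```

A caller would typically check the boolean result before invoking build(), since a false return signals that at least one required option is missing.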
buildSplitMap
protected StringMap buildSplitMap(java.io.File path)
getFilenames
public java.util.List<java.lang.String> getFilenames()
- Description copied from interface: Dataset
- Returns the filenames written by Dataset.build().
- Specified by:
- getFilenames in interface Dataset
- Returns:
- A collection of filenames
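The relationship between build(), the outputFileList field, and getFilenames() can be sketched as follows. This is a self-contained illustration under stated assumptions, not the library's code: DatasetSketch and ToyDataset are hypothetical stand-ins, the file names are invented, and returning an unmodifiable view is a design choice of this sketch (the real method may return the list directly).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Demo entry point: build first, then ask which files were produced.
class DatasetSketchDemo {
    public static void main(String[] args) {
        DatasetSketch dataset = new ToyDataset();
        dataset.build();
        for (String name : dataset.getFilenames()) {
            System.out.println(name);
        }
    }
}

// Hypothetical stand-in for the abstract class; not the Stanford NLP source.
abstract class DatasetSketch {
    // Mirrors the documented outputFileList field.
    protected final List<String> outputFileList = new ArrayList<>();

    // Subclasses load, process, and write the dataset here.
    public abstract void build();

    // Assumption: a read-only view is returned so callers cannot alter the record.
    public List<String> getFilenames() {
        return Collections.unmodifiableList(outputFileList);
    }
}

// Hypothetical subclass; a real one would read a treebank and write files to disk.
class ToyDataset extends DatasetSketch {
    @Override
    public void build() {
        // Pretend two files were written; the names are invented.
        outputFileList.add("toy.train.txt");
        outputFileList.add("toy.test.txt");
    }
}
```

In this sketch, getFilenames() returns an empty list until build() has run, so the order of the two calls matters.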
toString
public java.lang.String toString()
- Overrides:
- toString in class java.lang.Object
Stanford NLP Group