edu.stanford.nlp.process.treebank
Interface Dataset

All Known Implementing Classes:
AbstractDataset, ATBArabicDataset, FTBDataset

public interface Dataset

A generic interface for loading, processing, and writing a dataset. Classes that implement this interface may be specified in the configuration file using the TYPE parameter. TreebankPreprocessor will then call setOptions(StringMap), build(), and getFilenames(), in that order.
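
The call order described above can be illustrated with a minimal, self-contained sketch. Everything in it is a stand-in for illustration: the nested StringMap and Dataset types only mirror the signatures documented on this page (they are not the real edu.stanford.nlp.process.treebank classes), and InMemoryDataset, run(), and the "NAME" option key are hypothetical. The sketch also assumes the caller skips build() when setOptions returns false, which this page does not state.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DatasetLifecycleSketch {

    // Stand-in for StringMap; the real class's API is not shown on this page.
    public static class StringMap {
        private final Map<String, String> values = new HashMap<>();

        public void put(String key, String value) {
            values.put(key, value);
        }

        public String get(String key) {
            return values.get(key);
        }
    }

    // Local mirror of the three methods documented for Dataset.
    public interface Dataset {
        boolean setOptions(StringMap opts);

        void build();

        List<String> getFilenames();
    }

    // Hypothetical implementation that only records the names it would write.
    public static class InMemoryDataset implements Dataset {
        private String name;
        private final List<String> written = new ArrayList<>();

        @Override
        public boolean setOptions(StringMap opts) {
            // "NAME" is an illustrative required option, not a documented one.
            name = opts.get("NAME");
            return name != null;
        }

        @Override
        public void build() {
            // A real dataset would load, process, and write files here.
            written.add(name + ".txt");
        }

        @Override
        public List<String> getFilenames() {
            return Collections.unmodifiableList(written);
        }
    }

    // Calls the methods in the documented order: setOptions, build, getFilenames.
    // Assumption: a false return from setOptions means build() is skipped.
    public static List<String> run(Dataset dataset, StringMap opts) {
        if (!dataset.setOptions(opts)) {
            return Collections.emptyList();
        }
        dataset.build();
        return dataset.getFilenames();
    }

    public static void main(String[] args) {
        StringMap opts = new StringMap();
        opts.put("NAME", "toy");
        System.out.println(run(new InMemoryDataset(), opts)); // prints [toy.txt]
    }
}
```

Returning an unmodifiable view from getFilenames() is a design choice of this sketch, so that callers cannot alter the record of what build() wrote.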

Author:
Spence Green

Nested Class Summary
static class Dataset.Encoding
           
 
Method Summary
 void build()
          Generic method for loading, processing, and writing a dataset.
 List<String> getFilenames()
          Returns the filenames written by build().
 boolean setOptions(StringMap opts)
          Sets options for a dataset.
 

Method Detail

setOptions

boolean setOptions(StringMap opts)
Sets options for a dataset.

Parameters:
opts - A map from parameter types defined in ConfigParser to values
Returns:
true if opts contains all required options; false otherwise.

build

void build()
Generic method for loading, processing, and writing a dataset.


getFilenames

List<String> getFilenames()
Returns the filenames written by build().

Returns:
A list of the filenames written by build()


Stanford NLP Group