java.lang.Object
  edu.stanford.nlp.objectbank.ObjectBank<E>
public class ObjectBank<E>
The ObjectBank class is designed to make it easy to change the format/source of data read in by other classes and to standardize how data is read in javaNLP classes. This should make reuse of existing code (by non-authors of the code) easier because one has to just create a new ObjectBank which knows where to look for the data and how to turn it into Objects, and then use the new ObjectBank in the class. This will also make it easier to reuse code for reading in the same data.
An ObjectBank is a Collection of Objects. These objects are taken from input sources and then tokenized and parsed into the desired kind of Object. An ObjectBank requires a ReaderIteratorFactory and an IteratorFromReaderFactory. The ReaderIteratorFactory is used to get an Iterator over java.io.Readers which contain representations of the Objects. A ReaderIteratorFactory resembles a collection that takes input sources and dispenses Iterators over java.io.Readers of those sources. An IteratorFromReaderFactory is used to turn a single java.io.Reader into an Iterator over Objects. The IteratorFromReaderFactory splits the contents of the java.io.Reader into Strings and then parses them into appropriate Objects.

As an example, suppose each file in a directory contains several puzzle documents of the following form:

    <puzzle>
      <preamble> some text </preamble>
      <question> some intro text
        <answer> answer1 </answer>
        <answer> answer2 </answer>
        <answer> answer3 </answer>
        <answer> answer4 </answer>
      </question>
      <question> another question
        <answer> answer1 </answer>
        <answer> answer2 </answer>
        <answer> answer3 </answer>
        <answer> answer4 </answer>
      </question>
    </puzzle>

First you need to build a ReaderIteratorFactory which will provide java.io.Readers over all the files in your directory:
    Collection c = new FileSequentialCollection("/u/nlp/data/gre/questions/", "", false);
    ReaderIteratorFactory rif = new ReaderIteratorFactory(c);

Next you need to make an IteratorFromReaderFactory which will take the java.io.Readers vended by the ReaderIteratorFactory, split them up into documents (Strings) and then convert the Strings into Objects. In this case we want to keep everything between each set of <puzzle> tags, and we need a Function whose apply method converts each such String into a Puzzle:
    public class PuzzleParser implements Function {
      public Object apply(Object o) {
        String s = (String) o;
        ...
        Puzzle p = new Puzzle(...);
        ...
        return p;
      }
    }

Now to build the IteratorFromReaderFactory:
    IteratorFromReaderFactory rtif = new BeginEndTokenizerFactory("<puzzle>", "</puzzle>", new PuzzleParser());

Now, to create your ObjectBank you just give it the ReaderIteratorFactory and IteratorFromReaderFactory that you just created:

    ObjectBank puzzles = new ObjectBank(rif, rtif);

Now, if you get a new set of puzzles that are located elsewhere and formatted differently, you create a new ObjectBank for reading them in and use that ObjectBank instead, with only trivial changes to your code (or possibly none at all if the ObjectBank is passed in through a constructor). Or even better, if someone else wants to use your code to evaluate their puzzles, which are located elsewhere and formatted differently, they already know what they have to do to make your code work for them.
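The splitting step in the walkthrough above can be pictured with a small stand-alone sketch. It uses only the JDK and is not the library's BeginEndTokenizerFactory: the class name BeginEndSketch, the method parseBetween, and the choice to exclude the begin and end tokens from each document are all assumptions made for illustration.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical illustration; not part of edu.stanford.nlp.objectbank. */
final class BeginEndSketch {

  /** Reads the whole Reader into a String, rethrowing I/O errors unchecked. */
  static String slurp(Reader reader) {
    try {
      StringBuilder sb = new StringBuilder();
      char[] buf = new char[4096];
      int n;
      while ((n = reader.read(buf)) != -1) {
        sb.append(buf, 0, n);
      }
      return sb.toString();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /**
   * Returns one parsed object per span that starts at a literal begin token and
   * stops at the next literal end token. Assumption: the tokens themselves are
   * dropped and the span is trimmed before it reaches the parser.
   */
  static <T> List<T> parseBetween(Reader reader, String begin, String end,
                                  Function<String, T> parser) {
    Pattern p = Pattern.compile(
        Pattern.quote(begin) + "(.*?)" + Pattern.quote(end), Pattern.DOTALL);
    Matcher m = p.matcher(slurp(reader));
    List<T> out = new ArrayList<>();
    while (m.find()) {
      out.add(parser.apply(m.group(1).trim()));
    }
    return out;
  }

  public static void main(String[] args) {
    Reader in = new StringReader(
        "<puzzle> first </puzzle> noise <puzzle> second </puzzle>");
    // The parser here only upper-cases the text; a real one would build a Puzzle.
    List<String> docs = parseBetween(in, "<puzzle>", "</puzzle>", String::toUpperCase);
    System.out.println(docs);
  }
}
```

Swapping the data format then means changing only the two tokens and the parser function, which is the kind of localized change the class description is aiming for.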
Field Summary | |
---|---|
protected IteratorFromReaderFactory<E> | ifrf |
protected ReaderIteratorFactory | rif |
Constructor Summary | |
---|---|
ObjectBank(ReaderIteratorFactory rif, IteratorFromReaderFactory<E> ifrf) | This creates a new ObjectBank with the given ReaderIteratorFactory and IteratorFromReaderFactory. |
Method Summary | | |
---|---|---|
boolean | add(E o) | Unsupported Operation. |
boolean | addAll(Collection<? extends E> c) | Unsupported Operation. |
void | clear() | |
void | clearMemory() | If you are keeping the contents in memory, this will clear the memory, and they will be recomputed the next time iterator() is called. |
boolean | contains(Object o) | Can be slow. |
boolean | containsAll(Collection<?> c) | Can be slow. |
static <X> ObjectBank<X> | getLineIterator(Collection<?> filesStringsAndReaders, Function<String,X> op) | |
static <X> ObjectBank<X> | getLineIterator(Collection<?> filesStringsAndReaders, Function<String,X> op, String encoding) | |
static ObjectBank<String> | getLineIterator(File file) | |
static <X> ObjectBank<X> | getLineIterator(File file, Function<String,X> op) | |
static <X> ObjectBank<X> | getLineIterator(File file, Function<String,X> op, String encoding) | |
static ObjectBank<String> | getLineIterator(File file, String encoding) | |
static ObjectBank<String> | getLineIterator(Reader reader) | |
static <X> ObjectBank<X> | getLineIterator(Reader reader, Function<String,X> op) | |
static ObjectBank<String> | getLineIterator(String filename) | |
static <X> ObjectBank<X> | getLineIterator(String filename, Function<String,X> op) | |
boolean | isEmpty() | |
Iterator<E> | iterator() | |
void | keepInMemory(boolean keep) | Tells the ObjectBank to store all of its contents in memory so that it doesn't have to be recomputed each time you iterate through it. |
boolean | remove(Object o) | Unsupported Operation. |
boolean | removeAll(Collection<?> c) | Unsupported Operation. |
boolean | retainAll(Collection<?> c) | Unsupported Operation. |
int | size() | Can be slow. |
Object[] | toArray() | Can be slow. |
<T> T[] | toArray(T[] o) | Can be slow. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Methods inherited from interface java.util.Collection |
---|
equals, hashCode |
Field Detail |
---|
protected ReaderIteratorFactory rif
protected IteratorFromReaderFactory<E> ifrf
Constructor Detail |
---|
public ObjectBank(ReaderIteratorFactory rif, IteratorFromReaderFactory<E> ifrf)
Parameters:
rif - The ReaderIteratorFactory from which to get Readers
ifrf - The IteratorFromReaderFactory which turns java.io.Readers into Iterators of Objects
Method Detail |
---|
public static ObjectBank<String> getLineIterator(String filename)
public static <X> ObjectBank<X> getLineIterator(String filename, Function<String,X> op)
public static ObjectBank<String> getLineIterator(Reader reader)
public static <X> ObjectBank<X> getLineIterator(Reader reader, Function<String,X> op)
public static ObjectBank<String> getLineIterator(File file)
public static <X> ObjectBank<X> getLineIterator(File file, Function<String,X> op)
public static ObjectBank<String> getLineIterator(File file, String encoding)
public static <X> ObjectBank<X> getLineIterator(File file, Function<String,X> op, String encoding)
public static <X> ObjectBank<X> getLineIterator(Collection<?> filesStringsAndReaders, Function<String,X> op)
public static <X> ObjectBank<X> getLineIterator(Collection<?> filesStringsAndReaders, Function<String,X> op, String encoding)
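The page gives no description for the getLineIterator overloads, so the following is only a guess at the idea their signatures suggest: treat each line of the input as one element and optionally map it through a function. The sketch is stand-alone JDK code, not the library's implementation; the names LineIteratorSketch and lineIterator are hypothetical, and java.util.function.Function stands in for whatever Function type the library itself uses.

```java
import java.io.BufferedReader;
import java.io.Reader;
import java.io.StringReader;
import java.util.Iterator;
import java.util.function.Function;

/** Hypothetical illustration; not part of edu.stanford.nlp.objectbank. */
final class LineIteratorSketch {

  /**
   * Lazily yields op(line) for each line of the reader.
   * The reader is not closed here; the caller keeps ownership of it.
   */
  static <X> Iterator<X> lineIterator(Reader reader, Function<String, X> op) {
    return new BufferedReader(reader).lines().map(op).iterator();
  }

  public static void main(String[] args) {
    // Each line is converted to an Integer as it is read.
    Iterator<Integer> numbers =
        lineIterator(new StringReader("3\n4\n5"), Integer::valueOf);
    int sum = 0;
    while (numbers.hasNext()) {
      sum += numbers.next();
    }
    System.out.println(sum);
  }
}
```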
public Iterator<E> iterator()
Specified by: iterator in interface Iterable<E>
Specified by: iterator in interface Collection<E>
public void keepInMemory(boolean keep)
Parameters:
keep - Whether to keep contents in memory
public void clearMemory()
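The interplay of keepInMemory and clearMemory, as the method summary describes it, can be sketched as a cache in front of a source that is otherwise re-read on every iteration. This is a stand-alone illustration under stated assumptions, not the library's code: CachingBankSketch, its Supplier-based source, and the recomputations counter are hypothetical, and what the real class does with an existing cache after keepInMemory(false) is not documented here (this sketch simply ignores it).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical illustration; not part of edu.stanford.nlp.objectbank. */
final class CachingBankSketch<E> implements Iterable<E> {
  private final Supplier<Iterator<E>> source; // re-reads the data on each call
  private boolean keep = false;
  private List<E> cache = null;
  int recomputations = 0; // exposed only so callers can observe source reads

  CachingBankSketch(Supplier<Iterator<E>> source) {
    this.source = source;
  }

  /** When true, the next iteration fills a cache that later iterations reuse. */
  void keepInMemory(boolean keep) {
    this.keep = keep;
  }

  /** Drops the cache so the next iterator() call goes back to the source. */
  void clearMemory() {
    cache = null;
  }

  @Override
  public Iterator<E> iterator() {
    if (keep && cache != null) {
      return cache.iterator();
    }
    recomputations++;
    Iterator<E> fresh = source.get();
    if (!keep) {
      return fresh;
    }
    cache = new ArrayList<>();
    while (fresh.hasNext()) {
      cache.add(fresh.next());
    }
    return cache.iterator();
  }

  public static void main(String[] args) {
    CachingBankSketch<String> bank =
        new CachingBankSketch<String>(() -> Arrays.asList("a", "b").iterator());
    bank.keepInMemory(true);
    for (String s : bank) { System.out.println(s); }
    for (String s : bank) { System.out.println(s); } // served from the cache
    System.out.println("source reads: " + bank.recomputations);
  }
}
```

The trade-off is the usual one: caching saves repeated parsing at the cost of holding every element in memory at once.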
public boolean isEmpty()
Specified by: isEmpty in interface Collection<E>
public boolean contains(Object o)
Specified by: contains in interface Collection<E>
public boolean containsAll(Collection<?> c)
Specified by: containsAll in interface Collection<E>
public int size()
Specified by: size in interface Collection<E>
public void clear()
Specified by: clear in interface Collection<E>
public Object[] toArray()
Specified by: toArray in interface Collection<E>
public <T> T[] toArray(T[] o)
Specified by: toArray in interface Collection<E>
public boolean add(E o)
Specified by: add in interface Collection<E>
public boolean remove(Object o)
Specified by: remove in interface Collection<E>
public boolean addAll(Collection<? extends E> c)
Specified by: addAll in interface Collection<E>
public boolean removeAll(Collection<?> c)
Specified by: removeAll in interface Collection<E>
public boolean retainAll(Collection<?> c)
Specified by: retainAll in interface Collection<E>
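The summary marks the mutating methods as unsupported and size(), contains() and toArray() as potentially slow. One plausible reading, offered here only as an assumption, is that the collection is a read-only view whose answers come from walking a fresh iterator. The sketch below shows that pattern with java.util.AbstractCollection; ReadOnlyBankSketch and its Supplier-based source are hypothetical names and not the library's implementation.

```java
import java.util.AbstractCollection;
import java.util.Arrays;
import java.util.Iterator;
import java.util.function.Supplier;

/** Hypothetical illustration; not part of edu.stanford.nlp.objectbank. */
final class ReadOnlyBankSketch<E> extends AbstractCollection<E> {
  private final Supplier<Iterator<E>> source; // re-reads the data on each call

  ReadOnlyBankSketch(Supplier<Iterator<E>> source) {
    this.source = source;
  }

  @Override
  public Iterator<E> iterator() {
    return source.get();
  }

  /** Counts elements by walking a fresh iterator, so the cost is linear. */
  @Override
  public int size() {
    int n = 0;
    for (Iterator<E> it = iterator(); it.hasNext(); it.next()) {
      n++;
    }
    return n;
  }

  // add(E) is inherited unchanged: AbstractCollection.add always throws
  // UnsupportedOperationException. contains(Object) is also inherited and
  // scans the iterator element by element.

  public static void main(String[] args) {
    ReadOnlyBankSketch<String> bank =
        new ReadOnlyBankSketch<String>(() -> Arrays.asList("a", "b", "c").iterator());
    System.out.println(bank.size());
    System.out.println(bank.contains("b"));
    try {
      bank.add("d");
    } catch (UnsupportedOperationException e) {
      System.out.println("add is unsupported");
    }
  }
}
```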