public class ObjectBank<E>
extends java.lang.Object
implements java.util.Collection<E>, java.io.Serializable
To read a file line by line, use the static getLineIterator method.
In its simplest use, it returns an ObjectBank<String>, which is a
Collection<String>, so statements like these work:
for (String str : ObjectBank.getLineIterator(filename)) {
  System.out.println(str);
}
String[] strings = ObjectBank.getLineIterator(filename).toArray(new String[0]);
String[] gbStrings = ObjectBank.getLineIterator(filename, "GB18030").toArray(new String[0]);
More complex uses of getLineIterator let you interpret each line of a file
as an object of arbitrary type via a transformer Function.
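The transformer idea can be illustrated without the library. The sketch below is a hypothetical stand-in written against the JDK only (the class LineTransformSketch and its readLines method are assumptions made for illustration, not part of ObjectBank): it reads each line from a Reader and hands it to a Function<String,X>, which is roughly the shape of getLineIterator(reader, op).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Hypothetical stand-in for the line-plus-transformer idea; not ObjectBank itself. */
final class LineTransformSketch {

    /** Reads every line from the reader and converts it with op. */
    static <X> List<X> readLines(Reader reader, Function<String, X> op) {
        List<X> result = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(reader)) {
            String line;
            while ((line = in.readLine()) != null) {
                result.add(op.apply(line));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        // Each line is parsed into an Integer instead of being kept as a String.
        List<Integer> numbers = readLines(new StringReader("10\n20\n30\n"), Integer::parseInt);
        System.out.println(numbers); // prints [10, 20, 30]
    }
}
```

Unlike this eager sketch, an ObjectBank is a Collection whose contents are produced when you iterate over it.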
For more general uses with existing classes, you first construct a collection of sources, then a class that
will make the objects of interest from instances of those sources, and then set up an ObjectBank that can
vend those objects:
ReaderIteratorFactory rif = new ReaderIteratorFactory(Arrays.asList(new String[] { "file1", "file2", "file3" }));
IteratorFromReaderFactory corefIFRF = new MUCCorefIteratorFromReaderFactory(true);
for (Mention m : new ObjectBank(rif, corefIFRF)) {
...
}
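The two-stage arrangement above (a source of Readers, then a factory that turns each Reader into an Iterator of objects) can be sketched with JDK types alone. Everything named here (TwoStageSketch, readAll, lines) is hypothetical and only mirrors the roles of ReaderIteratorFactory and IteratorFromReaderFactory; it is not how the library implements them.

```java
import java.io.BufferedReader;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;
import java.util.function.Supplier;

/** Hypothetical sketch of "sources, then objects from each source". */
final class TwoStageSketch {

    /** Stage one supplies Readers; stage two turns each Reader into an Iterator of objects. */
    static <E> List<E> readAll(List<Supplier<Reader>> sources,
                               Function<Reader, Iterator<E>> factory) {
        List<E> out = new ArrayList<>();
        for (Supplier<Reader> source : sources) {
            Iterator<E> it = factory.apply(source.get());
            while (it.hasNext()) {
                out.add(it.next());
            }
        }
        return out;
    }

    /** An assumed stage two: one object (the trimmed line) per line of input. */
    static Iterator<String> lines(Reader r) {
        return new BufferedReader(r).lines().map(String::trim).iterator();
    }

    public static void main(String[] args) {
        // In-memory readers stand in for "file1", "file2", ...
        List<Supplier<Reader>> sources = new ArrayList<>();
        sources.add(() -> new StringReader("a\nb\n"));
        sources.add(() -> new StringReader("c\n"));
        System.out.println(readAll(sources, TwoStageSketch::lines)); // prints [a, b, c]
    }
}
```

Keeping the two stages separate is what lets you change where the data lives without touching how it is parsed, and vice versa.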
As an example of the general power of this class, suppose you have
a collection of files in the directory /u/nlp/data/gre/questions. Each file
contains several Puzzle documents which look like:
<puzzle>
  <preamble> some text </preamble>
  <question> some intro text
    <answer> answer1 </answer>
    <answer> answer2 </answer>
    <answer> answer3 </answer>
    <answer> answer4 </answer>
  </question>
  <question> another question
    <answer> answer1 </answer>
    <answer> answer2 </answer>
    <answer> answer3 </answer>
    <answer> answer4 </answer>
  </question>
</puzzle>
First you need to build a ReaderIteratorFactory which will provide java.io.Readers over all the files in your directory:
Collection c = new FileSequentialCollection("/u/nlp/data/gre/questions/", "", false);
ReaderIteratorFactory rif = new ReaderIteratorFactory(c);
Next you need to make an IteratorFromReaderFactory which will take the java.io.Readers vended by the ReaderIteratorFactory, split them up into documents (Strings), and then convert the Strings into Objects. In this case we want to keep everything between each pair of <puzzle> </puzzle> tags, so we would use a BeginEndTokenizerFactory. You would also need to write a class which implements Function and whose apply method converts the String between the <puzzle> </puzzle> tags into a Puzzle object.
public class PuzzleParser implements Function {
  public Object apply (Object o) {
    String s = (String)o;
    ...
    Puzzle p = new Puzzle(...);
    ...
    return p;
  }
}
Now to build the IteratorFromReaderFactory:
IteratorFromReaderFactory rtif = new BeginEndTokenizerFactory("<puzzle>", "</puzzle>", new PuzzleParser());
Now, to create your ObjectBank you just give it the ReaderIteratorFactory and IteratorFromReaderFactory that you just created:
ObjectBank puzzles = new ObjectBank(rif, rtif);
Now, if you get a new set of puzzles that are located elsewhere and formatted differently, you create a new ObjectBank for reading them in and use that ObjectBank instead, with only trivial changes to your code (or possibly none at all if the ObjectBank is passed in through a constructor). Or even better, if someone else wants to use your code to evaluate their puzzles, which are located elsewhere and formatted differently, they already know what they have to do to make your code work for them.
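The step of keeping everything between a begin marker and an end marker, then parsing each piece, can be sketched with java.util.regex. This is a hypothetical illustration (BeginEndSketch and extract are made-up names), and it assumes the markers are literal strings and are dropped from the result; BeginEndTokenizerFactory itself may treat its arguments differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch of begin/end document splitting followed by parsing. */
final class BeginEndSketch {

    /** Returns parser(body) for every body found between begin and end. */
    static <T> List<T> extract(String text, String begin, String end,
                               Function<String, T> parser) {
        // DOTALL lets a document span several lines; the reluctant .*? stops
        // at the first end marker, so adjacent documents stay separate.
        Pattern p = Pattern.compile(
                Pattern.quote(begin) + "(.*?)" + Pattern.quote(end), Pattern.DOTALL);
        Matcher m = p.matcher(text);
        List<T> out = new ArrayList<>();
        while (m.find()) {
            out.add(parser.apply(m.group(1)));
        }
        return out;
    }

    public static void main(String[] args) {
        String data = "<puzzle> first </puzzle>\n<puzzle> second </puzzle>";
        // String::trim stands in for a real parser such as PuzzleParser.
        System.out.println(extract(data, "<puzzle>", "</puzzle>", String::trim));
        // prints [first, second]
    }
}
```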
Modifier and Type | Class and Description |
---|---|
static class | ObjectBank.PathToFileFunction: This is handy for having getLineIterator return a collection of files for feeding into another ObjectBank. |
Modifier and Type | Field and Description |
---|---|
protected IteratorFromReaderFactory<E> | ifrf |
protected ReaderIteratorFactory | rif |
Constructor and Description |
---|
ObjectBank(ReaderIteratorFactory rif, IteratorFromReaderFactory<E> ifrf): This creates a new ObjectBank with the given ReaderIteratorFactory and IteratorFromReaderFactory. |
Modifier and Type | Method and Description |
---|---|
boolean | add(E o): Unsupported Operation. |
boolean | addAll(java.util.Collection<? extends E> c): Unsupported Operation. |
void | clear() |
void | clearMemory(): If you are keeping the contents in memory, this will clear the memory, and they will be recomputed the next time iterator() is called. |
boolean | contains(java.lang.Object o): Can be slow. |
boolean | containsAll(java.util.Collection<?> c): Can be slow. |
static <X> ObjectBank<X> | getLineIterator(java.util.Collection<?> filesStringsAndReaders, java.util.function.Function<java.lang.String,X> op) |
static <X> ObjectBank<X> | getLineIterator(java.util.Collection<?> filesStringsAndReaders, java.util.function.Function<java.lang.String,X> op, java.lang.String encoding) |
static ObjectBank<java.lang.String> | getLineIterator(java.util.Collection<?> filesStringsAndReaders, java.lang.String encoding) |
static ObjectBank<java.lang.String> | getLineIterator(java.io.File file) |
static <X> ObjectBank<X> | getLineIterator(java.io.File file, java.util.function.Function<java.lang.String,X> op) |
static <X> ObjectBank<X> | getLineIterator(java.io.File file, java.util.function.Function<java.lang.String,X> op, java.lang.String encoding) |
static ObjectBank<java.lang.String> | getLineIterator(java.io.File file, java.lang.String encoding) |
static ObjectBank<java.lang.String> | getLineIterator(java.io.Reader reader) |
static <X> ObjectBank<X> | getLineIterator(java.io.Reader reader, java.util.function.Function<java.lang.String,X> op) |
static ObjectBank<java.lang.String> | getLineIterator(java.lang.String filename) |
static <X> ObjectBank<X> | getLineIterator(java.lang.String filename, java.util.function.Function<java.lang.String,X> op) |
static ObjectBank<java.lang.String> | getLineIterator(java.lang.String filename, java.lang.String encoding) |
boolean | isEmpty() |
java.util.Iterator<E> | iterator() |
void | keepInMemory(boolean keep): Tells the ObjectBank to store all of its contents in memory so that it doesn't have to be recomputed each time you iterate through it. |
boolean | remove(java.lang.Object o): Unsupported Operation. |
boolean | removeAll(java.util.Collection<?> c): Unsupported Operation. |
boolean | retainAll(java.util.Collection<?> c): Unsupported Operation. |
int | size(): Can be slow. |
java.lang.Object[] | toArray() |
<T> T[] | toArray(T[] o): Can be slow. |
Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
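The behaviour summarised in the method table (contents recomputed on each iteration unless kept in memory, size() needing a full pass, add unsupported) can be pictured with a small JDK-only collection. This is a hypothetical sketch under those stated assumptions; RecomputingCollection is a made-up class and is not the ObjectBank source.

```java
import java.util.AbstractCollection;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical read-only collection that recomputes its contents unless told to cache. */
final class RecomputingCollection<E> extends AbstractCollection<E> {
    private final Supplier<Iterator<E>> source; // produces a fresh pass over the data
    private boolean keep = false;
    private List<E> cache = null;

    RecomputingCollection(Supplier<Iterator<E>> source) {
        this.source = source;
    }

    void keepInMemory(boolean keep) { this.keep = keep; }

    void clearMemory() { cache = null; }

    @Override
    public Iterator<E> iterator() {
        if (!keep) {
            return source.get(); // recompute on every iteration
        }
        if (cache == null) {
            cache = new ArrayList<>();
            source.get().forEachRemaining(cache::add);
        }
        return Collections.unmodifiableList(cache).iterator();
    }

    @Override
    public int size() {
        // No stored count: walking the whole iterator is why size() can be slow.
        int n = 0;
        for (Iterator<E> it = iterator(); it.hasNext(); it.next()) {
            n++;
        }
        return n;
    }
    // add(E) is inherited from AbstractCollection, which throws
    // UnsupportedOperationException.

    public static void main(String[] args) {
        int[] reads = {0}; // counts how often the source is consulted
        RecomputingCollection<String> bank = new RecomputingCollection<String>(() -> {
            reads[0]++;
            return Arrays.asList("x", "y").iterator();
        });
        bank.size();
        bank.size();             // two passes, two reads of the source
        bank.keepInMemory(true);
        bank.size();
        bank.size();             // one more read, then served from the cache
        System.out.println(reads[0]); // prints 3
    }
}
```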
protected ReaderIteratorFactory rif
protected IteratorFromReaderFactory<E> ifrf
public ObjectBank(ReaderIteratorFactory rif, IteratorFromReaderFactory<E> ifrf)
rif - The ReaderIteratorFactory from which to get Readers
ifrf - The IteratorFromReaderFactory which turns java.io.Readers into Iterators of Objects
public static ObjectBank<java.lang.String> getLineIterator(java.lang.String filename)
public static <X> ObjectBank<X> getLineIterator(java.lang.String filename, java.util.function.Function<java.lang.String,X> op)
public static ObjectBank<java.lang.String> getLineIterator(java.lang.String filename, java.lang.String encoding)
public static ObjectBank<java.lang.String> getLineIterator(java.io.Reader reader)
public static <X> ObjectBank<X> getLineIterator(java.io.Reader reader, java.util.function.Function<java.lang.String,X> op)
public static ObjectBank<java.lang.String> getLineIterator(java.io.File file)
public static <X> ObjectBank<X> getLineIterator(java.io.File file, java.util.function.Function<java.lang.String,X> op)
public static ObjectBank<java.lang.String> getLineIterator(java.io.File file, java.lang.String encoding)
public static <X> ObjectBank<X> getLineIterator(java.io.File file, java.util.function.Function<java.lang.String,X> op, java.lang.String encoding)
public static <X> ObjectBank<X> getLineIterator(java.util.Collection<?> filesStringsAndReaders, java.util.function.Function<java.lang.String,X> op)
public static ObjectBank<java.lang.String> getLineIterator(java.util.Collection<?> filesStringsAndReaders, java.lang.String encoding)
public static <X> ObjectBank<X> getLineIterator(java.util.Collection<?> filesStringsAndReaders, java.util.function.Function<java.lang.String,X> op, java.lang.String encoding)
public java.util.Iterator<E> iterator()
public void keepInMemory(boolean keep)
keep - Whether to keep contents in memory
public void clearMemory()
public boolean isEmpty()
Specified by: isEmpty in interface java.util.Collection<E>
public boolean contains(java.lang.Object o)
Specified by: contains in interface java.util.Collection<E>
public boolean containsAll(java.util.Collection<?> c)
Specified by: containsAll in interface java.util.Collection<E>
public int size()
Specified by: size in interface java.util.Collection<E>
public void clear()
Specified by: clear in interface java.util.Collection<E>
public java.lang.Object[] toArray()
Specified by: toArray in interface java.util.Collection<E>
public <T> T[] toArray(T[] o)
Specified by: toArray in interface java.util.Collection<E>
public boolean add(E o)
Specified by: add in interface java.util.Collection<E>
public boolean remove(java.lang.Object o)
Specified by: remove in interface java.util.Collection<E>
public boolean addAll(java.util.Collection<? extends E> c)
Specified by: addAll in interface java.util.Collection<E>
public boolean removeAll(java.util.Collection<?> c)
Specified by: removeAll in interface java.util.Collection<E>
public boolean retainAll(java.util.Collection<?> c)
Specified by: retainAll in interface java.util.Collection<E>