public class ObjectBank<E> extends Object implements Collection<E>, Serializable
The easiest way to use an ObjectBank, with everything left at its defaults, is the getLineIterator method.
In its simplest use, it returns an ObjectBank<String>, which is a subtype of
Collection<String>. So, statements like these work:
for (String str : ObjectBank.getLineIterator(filename)) {
  System.out.println(str);
}
String[] strings = ObjectBank.getLineIterator(filename).toArray(new String[0]);
// The same call with an explicit character encoding:
String[] gbStrings = ObjectBank.getLineIterator(filename, "GB18030").toArray(new String[0]);
More complex uses of getLineIterator let you interpret each line of a file
as an object of arbitrary type via a transformer Function.
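The transformer idea can be illustrated without the library. The following is a minimal sketch using only JDK classes; it is not the ObjectBank implementation, and the names LineMapperSketch and mapLines are hypothetical. It reads lines from a Reader and converts each one with a Function<String,X>, which is the same shape as the op parameter listed in the method summary.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Stand-in sketch (assumption): JDK classes only, not the ObjectBank code.
class LineMapperSketch {

    // Reads every line from the reader and converts it with op.
    static <X> List<X> mapLines(Reader reader, Function<String, X> op) {
        List<X> result = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(reader)) {
            String line;
            while ((line = in.readLine()) != null) {
                result.add(op.apply(line));
            }
        } catch (IOException e) {
            // Rethrow unchecked so callers do not need a throws clause.
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        // Each line is interpreted as an Integer instead of a String.
        List<Integer> numbers = mapLines(new StringReader("10\n20\n30\n"), Integer::parseInt);
        System.out.println(numbers); // prints [10, 20, 30]
    }
}
```

With the library itself, the corresponding call would presumably be one of the getLineIterator overloads that take a Function<String,X>, which returns an ObjectBank<X> rather than a List<X>.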
As an example of the general power of this class, suppose you have
a collection of files in the directory /u/nlp/data/gre/questions. Each file
contains several Puzzle documents which look like:
<puzzle>
  <preamble> some text </preamble>
  <question> some intro text
    <answer> answer1 </answer>
    <answer> answer2 </answer>
    <answer> answer3 </answer>
    <answer> answer4 </answer>
  </question>
  <question> another question
    <answer> answer1 </answer>
    <answer> answer2 </answer>
    <answer> answer3 </answer>
    <answer> answer4 </answer>
  </question>
</puzzle>
First you need to build a ReaderIteratorFactory which will provide java.io.Readers over all the files in your directory:
Collection c = new FileSequentialCollection("/u/nlp/data/gre/questions/", "", false);
ReaderIteratorFactory rif = new ReaderIteratorFactory(c);
Next you need to make an IteratorFromReaderFactory which will take the java.io.Readers vended by the ReaderIteratorFactory, split them up into documents (Strings) and then convert the Strings into Objects. In this case we want to keep everything between each set of <puzzle> </puzzle> tags, and we need a Function whose apply method converts that String into a Puzzle object:
public class PuzzleParser implements Function {
  public Object apply (Object o) {
    String s = (String)o;
    ...
    Puzzle p = new Puzzle(...);
    ...
    return p;
  }
}
Now to build the IteratorFromReaderFactory:
IteratorFromReaderFactory rtif = new BeginEndTokenizerFactory("<puzzle>", "</puzzle>", new PuzzleParser());
Now, to create your ObjectBank you just give it the ReaderIteratorFactory and IteratorFromReaderFactory that you just created:
ObjectBank puzzles = new ObjectBank(rif, rtif);
Now, if you get a new set of puzzles that are located elsewhere and formatted differently, you create a new ObjectBank for reading them in and use that ObjectBank instead, with only trivial changes to your code (or possibly none at all if the ObjectBank is passed in through a constructor). Or, even better, if someone else wants to use your code to evaluate their puzzles, which are located elsewhere and formatted differently, they already know what they have to do to make your code work for them.
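The document-splitting step in the walkthrough above can be sketched with plain JDK code. This is only an illustration of the begin/end-token idea, not the library's BeginEndTokenizerFactory; the names BeginEndSplitSketch and splitBetween are hypothetical, and the choice to drop the delimiters and ignore an unterminated trailing document is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Stand-in sketch (assumption): illustrates begin/end-token splitting only.
class BeginEndSplitSketch {

    // Applies op to every substring found strictly between begin and end.
    static <X> List<X> splitBetween(String text, String begin, String end,
                                    Function<String, X> op) {
        List<X> docs = new ArrayList<>();
        int from = 0;
        while (true) {
            int start = text.indexOf(begin, from);
            if (start < 0) {
                break; // no further documents
            }
            int bodyStart = start + begin.length();
            int stop = text.indexOf(end, bodyStart);
            if (stop < 0) {
                break; // unterminated document: this sketch ignores the tail
            }
            docs.add(op.apply(text.substring(bodyStart, stop)));
            from = stop + end.length();
        }
        return docs;
    }

    public static void main(String[] args) {
        String data = "<puzzle> first </puzzle>\n<puzzle> second </puzzle>";
        // String::trim stands in for a real parser such as PuzzleParser.
        List<String> puzzles = splitBetween(data, "<puzzle>", "</puzzle>", String::trim);
        System.out.println(puzzles); // prints [first, second]
    }
}
```

Keeping the splitter and the per-document Function separate mirrors the design described above: a differently formatted data source only needs different delimiters or a different parser, while the consuming code stays the same.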
Modifier and Type | Class and Description |
---|---|
static class | ObjectBank.PathToFileFunction: This is handy for having getLineIterator return a collection of files for feeding into another ObjectBank. |
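The idea behind a path-to-file function can be sketched with the JDK alone: each line of an index file is treated as a path and turned into a java.io.File, and the resulting collection of files can then serve as the input of a second reading step. This is an assumption-laden illustration, not the code of ObjectBank.PathToFileFunction; PathToFileSketch and PATH_TO_FILE are hypothetical names.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Stand-in sketch (assumption): not the library's PathToFileFunction.
class PathToFileSketch {

    // Converts one line of text, interpreted as a path, into a File.
    static final Function<String, File> PATH_TO_FILE = File::new;

    public static void main(String[] args) {
        // Lines as they might appear in a file that lists other files.
        List<String> lines = List.of("data/a.txt", "data/b.txt");
        List<File> files = new ArrayList<>();
        for (String line : lines) {
            files.add(PATH_TO_FILE.apply(line));
        }
        // The File objects could now be handed to a second reader.
        List<String> names = new ArrayList<>();
        for (File f : files) {
            names.add(f.getName());
        }
        System.out.println(names); // prints [a.txt, b.txt]
    }
}
```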
Modifier and Type | Field and Description |
---|---|
protected IteratorFromReaderFactory<E> | ifrf |
protected ReaderIteratorFactory | rif |
Constructor and Description |
---|
ObjectBank(ReaderIteratorFactory rif, IteratorFromReaderFactory<E> ifrf): This creates a new ObjectBank with the given ReaderIteratorFactory and IteratorFromReaderFactory. |
Modifier and Type | Method and Description |
---|---|
boolean | add(E o): Unsupported Operation. |
boolean | addAll(Collection<? extends E> c): Unsupported Operation. |
void | clear() |
void | clearMemory(): If you are keeping the contents in memory, this will clear the memory, and they will be recomputed the next time iterator() is called. |
boolean | contains(Object o): Can be slow. |
boolean | containsAll(Collection<?> c): Can be slow. |
static <X> ObjectBank<X> | getLineIterator(Collection<?> filesStringsAndReaders, java.util.function.Function<String,X> op) |
static <X> ObjectBank<X> | getLineIterator(Collection<?> filesStringsAndReaders, java.util.function.Function<String,X> op, String encoding) |
static ObjectBank<String> | getLineIterator(Collection<?> filesStringsAndReaders, String encoding) |
static ObjectBank<String> | getLineIterator(File file) |
static <X> ObjectBank<X> | getLineIterator(File file, java.util.function.Function<String,X> op) |
static <X> ObjectBank<X> | getLineIterator(File file, java.util.function.Function<String,X> op, String encoding) |
static ObjectBank<String> | getLineIterator(File file, String encoding) |
static ObjectBank<String> | getLineIterator(Reader reader) |
static <X> ObjectBank<X> | getLineIterator(Reader reader, java.util.function.Function<String,X> op) |
static ObjectBank<String> | getLineIterator(String filename) |
static <X> ObjectBank<X> | getLineIterator(String filename, java.util.function.Function<String,X> op) |
static ObjectBank<String> | getLineIterator(String filename, String encoding) |
boolean | isEmpty() |
Iterator<E> | iterator() |
void | keepInMemory(boolean keep): Tells the ObjectBank to store all of its contents in memory so that it doesn't have to be recomputed each time you iterate through it. |
boolean | remove(Object o): Unsupported Operation. |
boolean | removeAll(Collection<?> c): Unsupported Operation. |
boolean | retainAll(Collection<?> c): Unsupported Operation. |
int | size(): Can be slow. |
Object[] | toArray() |
<T> T[] | toArray(T[] o): Can be slow. |
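Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface java.util.Collection: equals, hashCode, parallelStream, removeIf, spliterator, stream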
protected ReaderIteratorFactory rif
protected IteratorFromReaderFactory<E> ifrf
public ObjectBank(ReaderIteratorFactory rif, IteratorFromReaderFactory<E> ifrf)
rif - The ReaderIteratorFactory from which to get Readers
ifrf - The IteratorFromReaderFactory which turns java.io.Readers into Iterators of Objects
public static ObjectBank<String> getLineIterator(String filename)
public static <X> ObjectBank<X> getLineIterator(String filename, java.util.function.Function<String,X> op)
public static ObjectBank<String> getLineIterator(String filename, String encoding)
public static ObjectBank<String> getLineIterator(Reader reader)
public static <X> ObjectBank<X> getLineIterator(Reader reader, java.util.function.Function<String,X> op)
public static ObjectBank<String> getLineIterator(File file)
public static <X> ObjectBank<X> getLineIterator(File file, java.util.function.Function<String,X> op)
public static ObjectBank<String> getLineIterator(File file, String encoding)
public static <X> ObjectBank<X> getLineIterator(File file, java.util.function.Function<String,X> op, String encoding)
public static <X> ObjectBank<X> getLineIterator(Collection<?> filesStringsAndReaders, java.util.function.Function<String,X> op)
public static ObjectBank<String> getLineIterator(Collection<?> filesStringsAndReaders, String encoding)
public static <X> ObjectBank<X> getLineIterator(Collection<?> filesStringsAndReaders, java.util.function.Function<String,X> op, String encoding)
public void keepInMemory(boolean keep)
keep - Whether to keep contents in memory
public void clearMemory()
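The interplay of keepInMemory and clearMemory can be sketched as a small collection that re-reads its source on every iteration unless caching is switched on. This is a hedged illustration of the behaviour described in the method summary, not ObjectBank's actual code; CachingBankSketch, its Supplier-based source, and the loads counter are all hypothetical.

```java
import java.util.AbstractCollection;
import java.util.Iterator;
import java.util.List;
import java.util.function.Supplier;

// Stand-in sketch (assumption): models the caching behaviour only.
class CachingBankSketch<E> extends AbstractCollection<E> {
    private final Supplier<List<E>> source; // hypothetical: re-reads the data
    private boolean keepInMemory = false;
    private List<E> contents = null;        // cached contents, if any
    int loads = 0;                          // how often the source was read

    CachingBankSketch(Supplier<List<E>> source) {
        this.source = source;
    }

    void keepInMemory(boolean keep) {
        this.keepInMemory = keep;
    }

    void clearMemory() {
        contents = null; // next iterator() call recomputes the contents
    }

    @Override
    public Iterator<E> iterator() {
        if (keepInMemory && contents != null) {
            return contents.iterator(); // served from memory
        }
        List<E> fresh = source.get();
        loads++;
        if (keepInMemory) {
            contents = fresh;
        }
        return fresh.iterator();
    }

    @Override
    public int size() {
        int n = 0;
        for (E ignored : this) {
            n++; // counting requires a full pass
        }
        return n;
    }

    public static void main(String[] args) {
        CachingBankSketch<String> bank = new CachingBankSketch<>(() -> List.of("a", "b"));
        bank.size();
        bank.size();
        System.out.println(bank.loads); // prints 2
        bank.keepInMemory(true);
        bank.size();
        bank.size();
        System.out.println(bank.loads); // prints 3
        bank.clearMemory();
        bank.size();
        System.out.println(bank.loads); // prints 4
    }
}
```

The trade-off is the usual one: caching avoids repeated reading and parsing at the cost of holding every object in memory at once.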
public boolean isEmpty()
Specified by: isEmpty in interface Collection<E>
public boolean contains(Object o)
Specified by: contains in interface Collection<E>
public boolean containsAll(Collection<?> c)
Specified by: containsAll in interface Collection<E>
public int size()
Specified by: size in interface Collection<E>
public void clear()
Specified by: clear in interface Collection<E>
public Object[] toArray()
Specified by: toArray in interface Collection<E>
public <T> T[] toArray(T[] o)
Specified by: toArray in interface Collection<E>
public boolean add(E o)
Specified by: add in interface Collection<E>
public boolean remove(Object o)
Specified by: remove in interface Collection<E>
public boolean addAll(Collection<? extends E> c)
Specified by: addAll in interface Collection<E>
public boolean removeAll(Collection<?> c)
Specified by: removeAll in interface Collection<E>
public boolean retainAll(Collection<?> c)
Specified by: retainAll in interface Collection<E>
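The notes "Unsupported Operation" and "Can be slow" follow naturally from a collection that is only a read-only view over a data source. The sketch below illustrates this under stated assumptions: it is not ObjectBank's code, ReadOnlyBankSketch and the passes counter are hypothetical, and an immutable List stands in for the underlying files.

```java
import java.util.AbstractCollection;
import java.util.Iterator;
import java.util.List;

// Stand-in sketch (assumption): a read-only view with iteration-based queries.
class ReadOnlyBankSketch extends AbstractCollection<String> {
    private final List<String> backing; // stands in for the data source
    int passes = 0;                     // number of iterators handed out

    ReadOnlyBankSketch(List<String> backing) {
        this.backing = backing;
    }

    @Override
    public Iterator<String> iterator() {
        passes++;
        return backing.iterator();
    }

    @Override
    public int size() {
        int n = 0;
        for (String ignored : this) {
            n++; // no stored count: every call walks all elements
        }
        return n;
    }

    @Override
    public boolean add(String s) {
        throw new UnsupportedOperationException();
    }

    @Override
    public boolean remove(Object o) {
        throw new UnsupportedOperationException();
    }

    public static void main(String[] args) {
        ReadOnlyBankSketch bank = new ReadOnlyBankSketch(List.of("x", "y", "z"));
        System.out.println(bank.size());        // prints 3
        System.out.println(bank.contains("z")); // prints true
        System.out.println(bank.passes);        // prints 2
        try {
            bank.add("w");
        } catch (UnsupportedOperationException e) {
            System.out.println("add is unsupported");
        }
    }
}
```

Here contains is inherited from AbstractCollection, which scans via iterator(), so both size and contains cost one pass over the data per call.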