edu.stanford.nlp.objectbank
Class XMLBeginEndIterator<E>
java.lang.Object
edu.stanford.nlp.util.AbstractIterator<E>
edu.stanford.nlp.objectbank.XMLBeginEndIterator<E>
- All Implemented Interfaces:
- java.util.Iterator<E>
public class XMLBeginEndIterator<E>
- extends AbstractIterator<E>
A class which iterates over the Strings occurring between the begin and end
tags of a selected element or elements. The element is specified by a regexp,
which is matched against the name of the element (i.e., excluding the angle
bracket characters) using matches().
The class ignores all other characters in the input Reader.
There are a few different ways to modify the output of the
XMLBeginEndIterator. One way is to ask it to keep internal tags;
if keepInternalTags is set, then <text>A<foo>B</text> will be printed
as A<foo>B.
Another is to tell it to keep delimiting tags; in the above example,
<text> will be kept as well.
Finally, you can ask it to keep track of the nesting depth. By default,
the iterator closes all open tags with just one close tag. This is
incorrect XML behavior, but is kept in case any code relies on it.
If countDepth is set, though, the iterator keeps track of how deeply
the selected tags are nested.
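The effect of the three flags can be illustrated with a small stand-alone sketch. This is not the Stanford implementation and does not use the library; the class name TagTextSketch, the method extract, and the simple tag-matching pattern are all hypothetical and exist only for illustration. It approximates the behavior described above on an in-memory String, and its treatment of nested selected tags under countDepth is an assumption based on that description.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-alone sketch; NOT the XMLBeginEndIterator source.
final class TagTextSketch {

  // Matches an opening or closing tag; group 1 is "/" for a close tag,
  // group 2 is the element name. Self-closing tags are not treated specially.
  private static final Pattern TAG = Pattern.compile("<(/?)([^\\s<>/]+)[^<>]*>");

  static List<String> extract(String input, String tagNameRegexp,
                              boolean keepInternalTags,
                              boolean keepDelimitingTags,
                              boolean countDepth) {
    List<String> results = new ArrayList<>();
    StringBuilder current = null; // non-null while inside a selected element
    int depth = 0;                // nesting depth of selected tags
    int pos = 0;                  // end of the previously seen tag
    Matcher m = TAG.matcher(input);
    while (m.find()) {
      boolean isClose = !m.group(1).isEmpty();
      // The regexp is matched against the element name only, via String.matches.
      boolean selected = m.group(2).matches(tagNameRegexp);
      if (current != null) {
        current.append(input, pos, m.start()); // text preceding this tag
      }
      if (selected && !isClose && current == null) {
        // Begin a new item at an outermost selected open tag.
        current = new StringBuilder();
        depth = 1;
        if (keepDelimitingTags) current.append(m.group());
      } else if (selected && isClose && current != null
                 && (!countDepth || depth == 1)) {
        // Without countDepth, the first close tag ends the item.
        if (keepDelimitingTags) current.append(m.group());
        results.add(current.toString());
        current = null;
      } else if (current != null) {
        // Any other tag inside the item counts as an internal tag.
        if (countDepth && selected) depth += isClose ? -1 : 1;
        if (keepInternalTags) current.append(m.group());
      }
      pos = m.end();
    }
    return results;
  }

  public static void main(String[] args) {
    String doc = "<text>A<foo>B</text>";
    System.out.println(extract(doc, "text", true, false, false));  // [A<foo>B]
    System.out.println(extract(doc, "text", false, false, false)); // [AB]
    System.out.println(extract(doc, "text", true, true, false));   // [<text>A<foo>B</text>]
  }
}
```

In this sketch, the input <text>a<text>b</text>c</text> yields ab when countDepth is false, because the first close tag ends the item, and abc when countDepth is true.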
- Author:
- Teg Grenager (grenager@stanford.edu)
Constructor Summary
XMLBeginEndIterator(java.io.Reader in, java.lang.String tagNameRegexp)
XMLBeginEndIterator(java.io.Reader in, java.lang.String tagNameRegexp, boolean keepInternalTags)
XMLBeginEndIterator(java.io.Reader in, java.lang.String tagNameRegexp, boolean keepInternalTags, boolean keepDelimitingTags)
XMLBeginEndIterator(java.io.Reader in, java.lang.String tagNameRegexp, boolean keepInternalTags, boolean keepDelimitingTags, boolean countDepth)
XMLBeginEndIterator(java.io.Reader in, java.lang.String tagNameRegexp, Function<java.lang.String,E> op, boolean keepInternalTags)
XMLBeginEndIterator(java.io.Reader in, java.lang.String tagNameRegexp, Function<java.lang.String,E> op, boolean keepInternalTags, boolean keepDelimitingTags)
XMLBeginEndIterator(java.io.Reader in, java.lang.String tagNameRegexp, Function<java.lang.String,E> op, boolean keepInternalTags, boolean keepDelimitingTags, boolean countDepth)
Method Summary
static IteratorFromReaderFactory<java.lang.String> | getFactory(java.lang.String tag)
Returns a factory that vends XMLBeginEndIterators, which read the contents of
the given Reader, extract the text between the begin and end tags of the
specified element, and return the result.
static IteratorFromReaderFactory<java.lang.String> | getFactory(java.lang.String tag, boolean keepInternalTags, boolean keepDelimitingTags)
static <E> IteratorFromReaderFactory<E> | getFactory(java.lang.String tag, Function<java.lang.String,E> op)
static <E> IteratorFromReaderFactory<E> | getFactory(java.lang.String tag, Function<java.lang.String,E> op, boolean keepInternalTags, boolean keepDelimitingTags)
boolean | hasNext()
static void | main(java.lang.String[] args)
E | next()
protected E | parseString(java.lang.String s)
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
XMLBeginEndIterator
public XMLBeginEndIterator(java.io.Reader in,
java.lang.String tagNameRegexp)
XMLBeginEndIterator
public XMLBeginEndIterator(java.io.Reader in,
java.lang.String tagNameRegexp,
boolean keepInternalTags)
XMLBeginEndIterator
public XMLBeginEndIterator(java.io.Reader in,
java.lang.String tagNameRegexp,
Function<java.lang.String,E> op,
boolean keepInternalTags)
XMLBeginEndIterator
public XMLBeginEndIterator(java.io.Reader in,
java.lang.String tagNameRegexp,
boolean keepInternalTags,
boolean keepDelimitingTags)
XMLBeginEndIterator
public XMLBeginEndIterator(java.io.Reader in,
java.lang.String tagNameRegexp,
boolean keepInternalTags,
boolean keepDelimitingTags,
boolean countDepth)
XMLBeginEndIterator
public XMLBeginEndIterator(java.io.Reader in,
java.lang.String tagNameRegexp,
Function<java.lang.String,E> op,
boolean keepInternalTags,
boolean keepDelimitingTags)
XMLBeginEndIterator
public XMLBeginEndIterator(java.io.Reader in,
java.lang.String tagNameRegexp,
Function<java.lang.String,E> op,
boolean keepInternalTags,
boolean keepDelimitingTags,
boolean countDepth)
parseString
protected E parseString(java.lang.String s)
hasNext
public boolean hasNext()
- Specified by: hasNext in interface java.util.Iterator<E>
- Specified by: hasNext in class AbstractIterator<E>
next
public E next()
- Specified by: next in interface java.util.Iterator<E>
- Specified by: next in class AbstractIterator<E>
getFactory
public static IteratorFromReaderFactory<java.lang.String> getFactory(java.lang.String tag)
- Returns a factory that vends XMLBeginEndIterators, which read the contents of
the given Reader, extract the text between the begin and end tags of the
specified element, and return the result.
- Parameters:
tag - The tag the XMLBeginEndIterator will match on
- Returns:
The IteratorFromReaderFactory
getFactory
public static IteratorFromReaderFactory<java.lang.String> getFactory(java.lang.String tag,
boolean keepInternalTags,
boolean keepDelimitingTags)
getFactory
public static <E> IteratorFromReaderFactory<E> getFactory(java.lang.String tag,
Function<java.lang.String,E> op)
getFactory
public static <E> IteratorFromReaderFactory<E> getFactory(java.lang.String tag,
Function<java.lang.String,E> op,
boolean keepInternalTags,
boolean keepDelimitingTags)
main
public static void main(java.lang.String[] args)
throws java.io.IOException
- Throws: java.io.IOException
Stanford NLP Group