java.lang.Object
  edu.stanford.nlp.util.AbstractIterator<E>
    edu.stanford.nlp.objectbank.XMLBeginEndIterator<E>
public class XMLBeginEndIterator<E>
A class which iterates over the Strings occurring between the begin and end
tags of a selected element or elements. The element is specified by a regexp,
matched against the name of the element (i.e., excluding the angle bracket
characters) using matches().
The class ignores all other characters in the input Reader.
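As a minimal sketch of the whole-name matching described above, the following uses only java.util.regex. The class name TagNameMatchSketch and the helper selects are hypothetical and not part of XMLBeginEndIterator; the sketch assumes the iterator's matching behaves like Matcher.matches(), so the regexp has to match the entire element name rather than a substring of it.

```java
import java.util.regex.Pattern;

// Hypothetical illustration; not part of the Stanford NLP API.
class TagNameMatchSketch {

  // Returns true only when the regexp matches the ENTIRE element name,
  // which is how Matcher.matches() behaves (unlike Matcher.find()).
  static boolean selects(String tagNameRegexp, String elementName) {
    return Pattern.compile(tagNameRegexp).matcher(elementName).matches();
  }

  public static void main(String[] args) {
    System.out.println(selects("text", "text"));       // true
    System.out.println(selects("text", "textblock"));  // false: only a prefix matches
    System.out.println(selects("text|body", "body"));  // true: alternation selects several tags
    System.out.println(selects("h[1-6]", "h3"));       // true: character class
  }
}
```

Under that assumption, selecting several elements at once is a matter of writing an alternation such as text|body as the tagNameRegexp.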
There are a few different ways to modify the output of the
XMLBeginEndIterator. One way is to ask it to keep internal tags:
if keepInternalTags is set, then
<text>A<foo/>B</text> will be printed as A<foo/>B.
Another is to tell it to keep delimiting tags; in the above example,
the enclosing <text> tags will be kept as well.
Finally, you can ask it to keep track of the nesting depth. The
ordinary behavior of this iterator is to close all open tags with just
one close tag. This is incorrect XML behavior, but it is kept in case
any code relies on it. If countDepth is set, though,
the iterator keeps track of how deeply the selected tags are nested.
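The three options above can be illustrated with a small stand-alone sketch. It is not the XMLBeginEndIterator implementation: the class BeginEndSketch, the method extract, and the simplified TAG pattern are all hypothetical, and treating a nested open tag of the selected element as an internal tag is an assumption made for the sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-alone sketch that imitates the documented options.
class BeginEndSketch {

  // Any tag: group 1 = "/" for close tags, group 2 = element name,
  // group 3 = "/" for self-closing tags. Simplified; ignores comments and CDATA.
  private static final Pattern TAG =
      Pattern.compile("<(/?)([^\\s<>/]+)[^<>]*?(/?)>");

  static List<String> extract(String xml, String tagNameRegexp,
                              boolean keepInternalTags,
                              boolean keepDelimitingTags,
                              boolean countDepth) {
    List<String> out = new ArrayList<>();
    StringBuilder current = new StringBuilder();
    int depth = 0; // 0 means we are outside any selected element
    int pos = 0;   // end of the previous tag
    Matcher m = TAG.matcher(xml);
    while (m.find()) {
      if (depth > 0) {
        current.append(xml, pos, m.start()); // text between two tags
      }
      pos = m.end();
      boolean selected = m.group(2).matches(tagNameRegexp);
      boolean close = !m.group(1).isEmpty();
      boolean selfClosing = !m.group(3).isEmpty();
      if (!selected || selfClosing) {
        // An internal tag: kept only inside a selected element, and only on request.
        if (depth > 0 && keepInternalTags) current.append(m.group());
      } else if (!close) {
        if (depth == 0) {
          depth = 1;
          if (keepDelimitingTags) current.append(m.group());
        } else {
          if (countDepth) depth++; // without countDepth, nesting is not tracked
          if (keepInternalTags) current.append(m.group());
        }
      } else if (depth > 0) {
        // Without countDepth a single close tag closes everything.
        depth = countDepth ? depth - 1 : 0;
        if (depth == 0) {
          if (keepDelimitingTags) current.append(m.group());
          out.add(current.toString());
          current.setLength(0);
        } else if (keepInternalTags) {
          current.append(m.group());
        }
      }
    }
    return out;
  }

  public static void main(String[] args) {
    String doc = "<text>A<foo/>B</text>";
    System.out.println(extract(doc, "text", false, false, false)); // [AB]
    System.out.println(extract(doc, "text", true, false, false));  // [A<foo/>B]
    System.out.println(extract(doc, "text", true, true, false));   // [<text>A<foo/>B</text>]
    String nested = "<a>x<a>y</a>z</a>";
    System.out.println(extract(nested, "a", false, false, false)); // [xy]
    System.out.println(extract(nested, "a", false, false, true));  // [xyz]
  }
}
```

In the nested example, the run without countDepth stops at the first close tag and drops z, which mirrors the "close all tags with just one close tag" behavior described above; with countDepth the element is only closed once the depth returns to zero.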
Constructor Summary | |
---|---|
XMLBeginEndIterator(Reader in, String tagNameRegexp) | |
XMLBeginEndIterator(Reader in, String tagNameRegexp, boolean keepInternalTags) | |
XMLBeginEndIterator(Reader in, String tagNameRegexp, boolean keepInternalTags, boolean keepDelimitingTags) | |
XMLBeginEndIterator(Reader in, String tagNameRegexp, boolean keepInternalTags, boolean keepDelimitingTags, boolean countDepth) | |
XMLBeginEndIterator(Reader in, String tagNameRegexp, Function<String,E> op, boolean keepInternalTags) | |
XMLBeginEndIterator(Reader in, String tagNameRegexp, Function<String,E> op, boolean keepInternalTags, boolean keepDelimitingTags) | |
XMLBeginEndIterator(Reader in, String tagNameRegexp, Function<String,E> op, boolean keepInternalTags, boolean keepDelimitingTags, boolean countDepth) | |
Method Summary | | |
---|---|---|
static IteratorFromReaderFactory<String> | getFactory(String tag) | Returns a factory that vends XMLBeginEndIterators, which read the contents of the given Reader, extract the text between the specified tags, and return the result. |
static IteratorFromReaderFactory<String> | getFactory(String tag, boolean keepInternalTags, boolean keepDelimitingTags) | |
static <E> IteratorFromReaderFactory<E> | getFactory(String tag, Function<String,E> op) | |
static <E> IteratorFromReaderFactory<E> | getFactory(String tag, Function<String,E> op, boolean keepInternalTags, boolean keepDelimitingTags) | |
boolean | hasNext() | Returns true if and only if this Tokenizer has more elements. |
static void | main(String[] args) | |
E | next() | Returns the next token from this Tokenizer. |
protected E | parseString(String s) | |
E | peek() | Returns the next token, without removing it, from the Tokenizer, so that the same token will be returned again on the next call to next() or peek(). |
List<E> | tokenize() | Returns the pieces of text in the element as a List of tokens. |
Methods inherited from class edu.stanford.nlp.util.AbstractIterator |
---|
remove |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Methods inherited from interface edu.stanford.nlp.process.Tokenizer |
---|
remove |
Constructor Detail |
---|
public XMLBeginEndIterator(Reader in, String tagNameRegexp)
public XMLBeginEndIterator(Reader in, String tagNameRegexp, boolean keepInternalTags)
public XMLBeginEndIterator(Reader in, String tagNameRegexp, Function<String,E> op, boolean keepInternalTags)
public XMLBeginEndIterator(Reader in, String tagNameRegexp, boolean keepInternalTags, boolean keepDelimitingTags)
public XMLBeginEndIterator(Reader in, String tagNameRegexp, boolean keepInternalTags, boolean keepDelimitingTags, boolean countDepth)
public XMLBeginEndIterator(Reader in, String tagNameRegexp, Function<String,E> op, boolean keepInternalTags, boolean keepDelimitingTags)
public XMLBeginEndIterator(Reader in, String tagNameRegexp, Function<String,E> op, boolean keepInternalTags, boolean keepDelimitingTags, boolean countDepth)
Method Detail |
---|
protected E parseString(String s)
public boolean hasNext()
Description copied from interface: Tokenizer
Returns true if and only if this Tokenizer has more elements.
hasNext in interface Tokenizer<E>
hasNext in interface Iterator<E>
hasNext in class AbstractIterator<E>
public E next()
Description copied from interface: Tokenizer
Returns the next token from this Tokenizer.
next in interface Tokenizer<E>
next in interface Iterator<E>
next in class AbstractIterator<E>
public E peek()
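Description copied from interface: Tokenizer
Returns the next token, without removing it, from the Tokenizer, so that the same token will be returned again on the next call to next() or peek().
peek in interface Tokenizer<E>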
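The peek() contract (return the next token without consuming it) can be sketched with a one-element buffer over any java.util.Iterator. PeekingSketch is a hypothetical helper written for illustration; it is not how XMLBeginEndIterator or AbstractIterator is implemented.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical helper, not part of the Stanford NLP API.
class PeekingSketch<E> implements Iterator<E> {
  private final Iterator<E> source;
  private E buffered;          // the token handed out by peek(), if any
  private boolean hasBuffered; // true while 'buffered' holds an unconsumed token

  PeekingSketch(Iterator<E> source) {
    this.source = source;
  }

  @Override
  public boolean hasNext() {
    return hasBuffered || source.hasNext();
  }

  // Returns the next token without removing it.
  E peek() {
    if (!hasBuffered) {
      if (!source.hasNext()) throw new NoSuchElementException();
      buffered = source.next();
      hasBuffered = true;
    }
    return buffered;
  }

  // Returns the next token and removes it, reusing the buffer if peek() filled it.
  @Override
  public E next() {
    E result = peek();
    hasBuffered = false;
    buffered = null;
    return result;
  }

  public static void main(String[] args) {
    PeekingSketch<String> it = new PeekingSketch<>(Arrays.asList("a", "b").iterator());
    System.out.println(it.peek()); // a
    System.out.println(it.peek()); // a (still not consumed)
    System.out.println(it.next()); // a
    System.out.println(it.next()); // b
  }
}
```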
public List<E> tokenize()
tokenize in interface Tokenizer<E>
public static IteratorFromReaderFactory<String> getFactory(String tag)
tag - The tag the XMLBeginEndIterator will match on
public static IteratorFromReaderFactory<String> getFactory(String tag, boolean keepInternalTags, boolean keepDelimitingTags)
public static <E> IteratorFromReaderFactory<E> getFactory(String tag, Function<String,E> op)
public static <E> IteratorFromReaderFactory<E> getFactory(String tag, Function<String,E> op, boolean keepInternalTags, boolean keepDelimitingTags)
public static void main(String[] args) throws IOException
Throws: IOException