public class AceSentenceSegmenter extends DomReader
Constructor and Description |
---|
AceSentenceSegmenter() |
Modifier and Type | Method and Description |
---|---|
static void | main(String[] args) |
static List<List<AceToken>> | tokenizeAndSegmentSentences(String filenamePrefix) |
static AceToken | wordTokenToAceToken(RobustTokenizer.WordToken wordToken, int sentence) |
Methods inherited from class DomReader: getAttributeValue, getChildByAttribute, getChildByName, getChildByNameAndAttribute, getChildrenByName, readDocument
public static List<List<AceToken>> tokenizeAndSegmentSentences(String filenamePrefix) throws IOException, SAXException, ParserConfigurationException
Parameters:
filenamePrefix - path to an ACE .sgm file (but not including the .sgm extension)
Throws:
IOException
SAXException
ParserConfigurationException
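Because filenamePrefix must omit the .sgm extension, a caller holding a full file path needs to strip it first. The following is a minimal sketch of that step only. The class SgmPrefix, the helper toFilenamePrefix, and the path corpus/sample.sgm are hypothetical names introduced for illustration and are not part of this API. The call to tokenizeAndSegmentSentences is left as a comment because it requires the library that provides AceSentenceSegmenter on the classpath.

```java
class SgmPrefix {
    /**
     * Hypothetical helper: removes a trailing ".sgm" so the result can be
     * passed as filenamePrefix. Paths without that extension are returned unchanged.
     */
    static String toFilenamePrefix(String sgmPath) {
        String suffix = ".sgm";
        if (sgmPath.endsWith(suffix)) {
            return sgmPath.substring(0, sgmPath.length() - suffix.length());
        }
        return sgmPath;
    }

    public static void main(String[] args) {
        // "corpus/sample.sgm" is an assumed example path, not a real corpus file.
        String prefix = toFilenamePrefix("corpus/sample.sgm");
        System.out.println(prefix); // prints "corpus/sample"

        // With AceSentenceSegmenter available, the prefix would be used like this:
        // List<List<AceToken>> sentences =
        //     AceSentenceSegmenter.tokenizeAndSegmentSentences(prefix);
        // The call can throw IOException, SAXException, or
        // ParserConfigurationException, so the caller must handle or declare them.
    }
}
```

The helper matches the extension case-sensitively. If a corpus uses .SGM or mixed case, that check would need adjusting.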
public static AceToken wordTokenToAceToken(RobustTokenizer.WordToken wordToken, int sentence)
public static void main(String[] args) throws IOException, SAXException, ParserConfigurationException