CoNLL2011DocumentReader (Stanford JavaNLP API)

java.lang.Object
- edu.stanford.nlp.dcoref.CoNLL2011DocumentReader

```
public class CoNLL2011DocumentReader
extends java.lang.Object
```
Reads the _conll file format from CoNLL2011. See http://conll.bbn.com/index.php/data.html.

CoNLL2011 files are in /u/scr/nlp/data/conll-2011/v0/data/ under dev and train. These directories contain *_auto_conll files (automatically generated) and *_gold_conll files (hand-labelled); by default the reader reads _gold_conll. There is also /u/scr/nlp/data/conll-2011/v0/conll.trial, which has *.conll files (the parse has _ at the end).

Column	Type	Description
1	Document ID	This is a variation on the document filename.
2	Part number	Some files are divided into multiple parts numbered as 000, 001, 002, etc.
3	Word number	
4	Word itself	
5	Part-of-Speech	
6	Parse bit	This is the bracketed structure broken before the first open parenthesis in the parse, with the word/part-of-speech leaf replaced by a *. The full parse can be created by substituting the asterisk with the "([pos] [word])" string (or leaf) and concatenating the items in the rows of that column.
7	Predicate lemma	The predicate lemma is given for the rows for which we have semantic role information. All other rows are marked with a "-".
8	Predicate Frameset ID	This is the PropBank frameset ID of the predicate in Column 7.
9	Word sense	This is the word sense of the word in Column 4.
10	Speaker/Author	This is the speaker or author name where available, mostly in Broadcast Conversation and Web Log data.
11	Named Entities	This column identifies the spans representing various named entities.
12:N	Predicate Arguments	There is one column of predicate argument structure information for each predicate mentioned in Column 7.
N	Coreference	Coreference chain information encoded in a parenthesis structure.
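The parse-bit and coreference columns are easier to follow with a small worked example. The sketch below is not part of the Stanford JavaNLP API: the class `ConllColumnsSketch` and its methods `buildParse` and `corefSpans` are hypothetical names introduced for illustration. `buildParse` follows the substitution rule described for column 6. `corefSpans` assumes the usual CoNLL convention for the last column, which the description above does not spell out: `(7` opens a mention of chain 7, `7)` closes it, `(7)` marks a one-token mention, `|` separates several marks on one token, and `-` means no mention. It also assumes well-formed input.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of the Stanford JavaNLP API.
public class ConllColumnsSketch {

    // Rebuilds the bracketed parse of one sentence from column 4 (word),
    // column 5 (part-of-speech), and column 6 (parse bit), one entry per token.
    static String buildParse(List<String> words, List<String> tags, List<String> bits) {
        StringBuilder parse = new StringBuilder();
        for (int i = 0; i < bits.size(); i++) {
            // The asterisk stands for the "([pos] [word])" leaf.
            String leaf = "(" + tags.get(i) + " " + words.get(i) + ")";
            parse.append(bits.get(i).replace("*", leaf));
        }
        return parse.toString();
    }

    // Decodes one sentence's coreference column into "chainId:start-end" strings,
    // where start and end are inclusive token indices. The encoding handled here
    // is an assumption (see the lead-in), and unbalanced marks are not handled.
    static List<String> corefSpans(List<String> corefColumn) {
        Map<String, Deque<Integer>> open = new HashMap<>();
        List<String> spans = new ArrayList<>();
        for (int i = 0; i < corefColumn.size(); i++) {
            String cell = corefColumn.get(i);
            if (cell.equals("-")) {
                continue; // no mention boundary on this token
            }
            for (String mark : cell.split("\\|")) {
                boolean starts = mark.startsWith("(");
                boolean ends = mark.endsWith(")");
                String id = mark.replace("(", "").replace(")", "");
                if (starts) {
                    open.computeIfAbsent(id, k -> new ArrayDeque<>()).push(i);
                }
                if (ends) {
                    int start = open.get(id).pop();
                    spans.add(id + ":" + start + "-" + i);
                }
            }
        }
        return spans;
    }

    public static void main(String[] args) {
        String parse = buildParse(
            Arrays.asList("The", "cat", "sleeps", "."),
            Arrays.asList("DT", "NN", "VBZ", "."),
            Arrays.asList("(TOP(S(NP*", "*)", "(VP*)", "*))"));
        System.out.println(parse); // (TOP(S(NP(DT The)(NN cat))(VP(VBZ sleeps))(. .)))

        List<String> spans = corefSpans(Arrays.asList("(3", "3)", "-", "(5)"));
        System.out.println(spans); // [3:0-1, 5:3-3]
    }
}
```

The parse bits are concatenated without separators, as the column description states, so sibling constituents appear back to back in the output.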

Author:

Angel Chang

Nested Class Summary

Nested Classes
Modifier and Type	Class and Description
`static class`	`CoNLL2011DocumentReader.CorefMentionAnnotation`
`static class`	`CoNLL2011DocumentReader.CorpusStats`
`static class`	`CoNLL2011DocumentReader.Document`
`static class`	`CoNLL2011DocumentReader.NamedEntityAnnotation`
`static class`	`CoNLL2011DocumentReader.Options` Flags

Field Summary

Fields
Modifier and Type	Field and Description
`protected java.util.List<java.io.File>`	`fileList`
`static java.util.logging.Logger`	`logger`

Constructor Summary

Constructors
Constructor and Description
`CoNLL2011DocumentReader(java.lang.String filepath)`
`CoNLL2011DocumentReader(java.lang.String filepath, CoNLL2011DocumentReader.Options options)`

Method Summary

All Methods Static Methods Instance Methods Concrete Methods
Modifier and Type	Method and Description
`void`	`close()`
`static Pair<java.lang.Integer,java.lang.Integer>`	`getMention(java.lang.Integer index, java.lang.String corefG, java.util.List<CoreLabel> sentenceAnno)`
`CoNLL2011DocumentReader.Document`	`getNextDocument()`
`static boolean`	`include(java.util.Map<Pair<java.lang.Integer,java.lang.Integer>,java.lang.String> sentenceInfo, Pair<java.lang.Integer,java.lang.Integer> mention, java.lang.String corefG)`
`static void`	`main(java.lang.String[] args)` Reads and dumps output, mainly for debugging.
`void`	`reset()`
`static void`	`usage()`
`static void`	`writeTabSep(java.io.PrintWriter pw, CoreMap sentence, CollectionValuedMap<java.lang.String,CoreMap> chainmap)`

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Field Detail

fileList

protected final java.util.List<java.io.File> fileList

logger

public static final java.util.logging.Logger logger

Constructor Detail

CoNLL2011DocumentReader

public CoNLL2011DocumentReader(java.lang.String filepath)

CoNLL2011DocumentReader

public CoNLL2011DocumentReader(java.lang.String filepath,
                               CoNLL2011DocumentReader.Options options)

Method Detail

reset
```
public void reset()
```

getNextDocument

public CoNLL2011DocumentReader.Document getNextDocument()

close
```
public void close()
```

usage
```
public static void usage()
```

getMention

public static Pair<java.lang.Integer,java.lang.Integer> getMention(java.lang.Integer index,
                                                                   java.lang.String corefG,
                                                                   java.util.List<CoreLabel> sentenceAnno)

include

public static boolean include(java.util.Map<Pair<java.lang.Integer,java.lang.Integer>,java.lang.String> sentenceInfo,
                              Pair<java.lang.Integer,java.lang.Integer> mention,
                              java.lang.String corefG)

writeTabSep

public static void writeTabSep(java.io.PrintWriter pw,
                               CoreMap sentence,
                               CollectionValuedMap<java.lang.String,CoreMap> chainmap)

main

public static void main(java.lang.String[] args)
                 throws java.io.IOException

Reads and dumps output, mainly for debugging.

Throws: java.io.IOException
