edu.stanford.nlp.trees.international
Class PunctEquivalenceClasser

java.lang.Object
  extended by edu.stanford.nlp.trees.international.PunctEquivalenceClasser

public class PunctEquivalenceClasser
extends Object

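Performs equivalence classing of punctuation according to the Penn Treebank (PTB) guidelines. Many multilingual treebanks mark all punctuation with a single POS tag, which hurts parsing because the parser can no longer distinguish, for example, sentence-final punctuation from commas or brackets.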

PTB punctuation POS tag set (12 tags, numbered as in the PTB tag list):

37. #   Pound sign
38. $   Dollar sign
39. .   Sentence-final punctuation
40. ,   Comma
41. :   Colon, semi-colon
42. (   Left bracket character
43. )   Right bracket character
44. "   Straight double quote
45. `   Left open single quote
46. ``  Left open double quote
47. '   Right close single quote
48. ''  Right close double quote

See http://www.ldc.upenn.edu/Catalog/docs/LDC95T7/cl93.html
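To make the idea of equivalence classing concrete, the following is a minimal sketch that maps a few punctuation tokens onto the PTB tags listed above. It is not the implementation of this class: the class name PtbPunctSketch, the method classOf, and the choice of which tokens belong to each class are assumptions made for illustration. Only the "empty string when no class matches" behaviour mirrors what this page documents.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for illustration; NOT the Stanford implementation.
class PtbPunctSketch {
  // Token -> PTB punctuation tag. The token membership here is an assumption.
  private static final Map<String, String> TOKEN_TO_TAG = new HashMap<>();

  static {
    TOKEN_TO_TAG.put("#", "#");
    TOKEN_TO_TAG.put("$", "$");
    // Sentence-final punctuation shares the "." tag.
    for (String t : new String[] {".", "?", "!"}) {
      TOKEN_TO_TAG.put(t, ".");
    }
    TOKEN_TO_TAG.put(",", ",");
    // Colon and semi-colon share the ":" tag.
    for (String t : new String[] {":", ";"}) {
      TOKEN_TO_TAG.put(t, ":");
    }
    TOKEN_TO_TAG.put("(", "(");
    TOKEN_TO_TAG.put(")", ")");
  }

  // Returns the tag for a known token, or an empty string otherwise.
  static String classOf(String punc) {
    return TOKEN_TO_TAG.getOrDefault(punc, "");
  }

  public static void main(String[] args) {
    System.out.println(classOf(";"));              // prints ":"
    System.out.println(classOf("word").isEmpty()); // prints "true"
  }
}
```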

Author:
Spence Green

Constructor Summary
PunctEquivalenceClasser()
           
 
Method Summary
static String getPunctClass(String punc)
          Return the equivalence class of the argument.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

PunctEquivalenceClasser

public PunctEquivalenceClasser()
Method Detail

getPunctClass

public static String getPunctClass(String punc)
Returns the equivalence class of the argument. If the argument is not contained in an equivalence class, an empty string is returned.

Parameters:
punc - the punctuation string to classify
Returns:
The class name if found. Otherwise, an empty string.
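Because an empty string signals "no class found", callers will usually want a fallback. The sketch below shows one way to handle that when retagging a token. The names RetagSketch and retag, the original tag "PUNC", and the class name "comma" are hypothetical; the actual class names returned by getPunctClass are not listed on this page. The classer is passed in as a function so the example runs without the Stanford jar on the classpath.

```java
import java.util.function.Function;

// Hypothetical helper for illustration; not part of the Stanford API.
class RetagSketch {
  // The classer is assumed to follow the documented contract:
  // it returns a class name, or an empty string if no class matches.
  static String retag(String token, String originalTag,
                      Function<String, String> classer) {
    String cls = classer.apply(token);
    // Keep the treebank's own tag when no equivalence class is found.
    return cls.isEmpty() ? originalTag : cls;
  }

  public static void main(String[] args) {
    // Stand-in classer; "comma" is an invented class name.
    Function<String, String> standIn = t -> t.equals(",") ? "comma" : "";
    System.out.println(retag(",", "PUNC", standIn)); // prints "comma"
    System.out.println(retag("@", "PUNC", standIn)); // prints "PUNC"
  }
}
```

With the library available, the method reference PunctEquivalenceClasser::getPunctClass could be passed in place of the stand-in, since its signature (String to String) matches.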


Stanford NLP Group