public class ChineseSegmenterFeatureFactory<IN extends CoreLabel> extends FeatureFactory<IN> implements Serializable
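In the feature templates, the trailing c stands for a Chinese character ("char"). The leading letter gives its position: c means current, n means next, and p means previous. For example, pc is the previous character and cc is the current character.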
| Feature | Templates |
|---|---|
| Current position clique | |
| useWord1 | CONSTANT, cc, nc, pc, pc+cc, if (As\|Msr\|Pk\|Hk) cc+nc, pc,nc |
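
To make the template notation concrete, here is a minimal sketch of how the first few useWord1 templates could be turned into feature strings. It is not the Stanford implementation: the class `TemplateSketch`, the `<pad>` symbol, the `name=value` string format, and the boolean `addCcNc` (standing in for the As|Msr|Pk|Hk condition) are all assumptions made for illustration. The `pc,nc` template is left out because the table does not say how it is combined.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class; not part of Stanford CoreNLP.
class TemplateSketch {
    // Assumed padding symbol for positions outside the sentence.
    static final String PAD = "<pad>";

    // Returns the character at index i as a string, or PAD when i is out of range.
    // Uses String.charAt, so it assumes every character fits in one UTF-16 unit.
    static String charAt(String sentence, int i) {
        if (i < 0 || i >= sentence.length()) {
            return PAD;
        }
        return String.valueOf(sentence.charAt(i));
    }

    // Builds feature strings for CONSTANT, cc, nc, pc, pc+cc and,
    // when addCcNc is true, cc+nc. The "name=value" format is an assumption.
    static List<String> word1Features(String sentence, int loc, boolean addCcNc) {
        String cc = charAt(sentence, loc);      // current char
        String nc = charAt(sentence, loc + 1);  // next char
        String pc = charAt(sentence, loc - 1);  // previous char

        List<String> features = new ArrayList<>();
        features.add("CONSTANT");
        features.add("cc=" + cc);
        features.add("nc=" + nc);
        features.add("pc=" + pc);
        features.add("pc+cc=" + pc + cc);
        if (addCcNc) {
            features.add("cc+nc=" + cc + nc);
        }
        return features;
    }

    public static void main(String[] args) {
        // Four Chinese characters written as Unicode escapes to keep the source ASCII.
        String sentence = "\u6211\u7231\u5317\u4eac";
        System.out.println(word1Features(sentence, 1, true));
    }
}
```

At the first and last positions the previous or next character does not exist, so the sketch substitutes the padding symbol rather than failing.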

Fields inherited from class FeatureFactory: cliqueC, cliqueCnC, cliqueCp2C, cliqueCp3C, cliqueCp4C, cliqueCp5C, cliqueCpC, cliqueCpCnC, cliqueCpCp2C, cliqueCpCp2Cp3C, cliqueCpCp2Cp3Cp4C, cliqueCpCp2Cp3Cp4Cp5C, flags, knownCliques

| Constructor and Description |
|---|
| ChineseSegmenterFeatureFactory() |

| Modifier and Type | Method and Description |
|---|---|
| Collection<String> | featuresC(PaddedList<IN> cInfo, int loc) |
| Collection<String> | featuresCnC(PaddedList<IN> cInfo, int loc) |
| Collection<String> | featuresCpC(PaddedList<IN> cInfo, int loc) |
| Collection<String> | getCliqueFeatures(PaddedList<IN> cInfo, int loc, Clique clique) Extracts all the features from the input data at a certain index. |
| void | init(SeqClassifierFlags flags) |

Methods inherited from class FeatureFactory: addAllInterningAndSuffixing, getCliques, getCliques, getWord

public void init(SeqClassifierFlags flags)
Overrides: init in class FeatureFactory<IN extends CoreLabel>

public Collection<String> getCliqueFeatures(PaddedList<IN> cInfo, int loc, Clique clique)
Specified by: getCliqueFeatures in class FeatureFactory<IN extends CoreLabel>
Parameters:
cInfo - The complete data set as a List of WordInfo
loc - The index at which to extract features.
clique - The particular clique for which to extract features. It should be a member of the knownCliques list.
Returns: Collection of the features calculated for the word at the specified position in info.

public Collection<String> featuresC(PaddedList<IN> cInfo, int loc)
public Collection<String> featuresCpC(PaddedList<IN> cInfo, int loc)
public Collection<String> featuresCnC(PaddedList<IN> cInfo, int loc)
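
The relationship between getCliqueFeatures and the per-clique methods can be pictured with a short sketch. This is an illustration under assumptions, not the library's code: the `CliqueDispatchSketch` class, its `Clique` enum, the plain `List<String>` input with a `<pad>` fallback (standing in for PaddedList), the feature strings, and the suffix format are all hypothetical. It assumes that getCliqueFeatures selects one feature method per clique and tags each feature with the clique name, which the inherited method name addAllInterningAndSuffixing suggests but the page does not confirm.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical stand-ins; none of these types are the Stanford CoreNLP classes.
class CliqueDispatchSketch {
    // Assumed subset of cliques, named after the featuresC/CpC/CnC methods.
    enum Clique { C, CpC, CnC }

    // Assumed padding symbol returned for out-of-range positions.
    static final String PAD = "<pad>";

    // Padded access: out-of-range indices yield PAD instead of throwing.
    static String get(List<String> chars, int i) {
        return (i < 0 || i >= chars.size()) ? PAD : chars.get(i);
    }

    // Features that look only at the current character.
    static Collection<String> featuresC(List<String> chars, int loc) {
        List<String> f = new ArrayList<>();
        f.add("cc=" + get(chars, loc));
        return f;
    }

    // Features over the previous and current characters.
    static Collection<String> featuresCpC(List<String> chars, int loc) {
        List<String> f = new ArrayList<>();
        f.add("pc+cc=" + get(chars, loc - 1) + get(chars, loc));
        return f;
    }

    // Features over the current and next characters.
    static Collection<String> featuresCnC(List<String> chars, int loc) {
        List<String> f = new ArrayList<>();
        f.add("cc+nc=" + get(chars, loc) + get(chars, loc + 1));
        return f;
    }

    // Picks the feature method for the clique and appends a clique suffix.
    static Collection<String> getCliqueFeatures(List<String> chars, int loc, Clique clique) {
        Collection<String> raw;
        switch (clique) {
            case C:
                raw = featuresC(chars, loc);
                break;
            case CpC:
                raw = featuresCpC(chars, loc);
                break;
            case CnC:
                raw = featuresCnC(chars, loc);
                break;
            default:
                throw new IllegalArgumentException("Unknown clique: " + clique);
        }
        List<String> out = new ArrayList<>();
        for (String feature : raw) {
            out.add(feature + "|" + clique.name());
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> chars = List.of("A", "B", "C");
        System.out.println(getCliqueFeatures(chars, 1, Clique.CpC));
    }
}
```

Keeping one method per clique means each clique's templates can be read and tested in isolation, while the dispatcher stays a thin lookup.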