
Document zones in text classification

As discussed in Section 6.1, documents usually have zones, such as the subject and author headers of a mail message, or the title and keywords of a research article. Text classifiers can usually benefit from making use of these zones during training and classification.
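As a minimal sketch of what making use of zones can look like in practice, the following Python fragment splits a mail message into three zones and prefixes each token with the zone it came from, so that a classifier can treat a term in the subject differently from the same term in the body. The function name `zone_features`, the choice of zones, the `zone:token` naming scheme, and the simple tokenizer are all assumptions made for illustration, not a method prescribed by the text; the sketch also assumes a single-part plain-text message.

```python
import re
from collections import Counter
from email import message_from_string


def zone_features(raw_message: str) -> Counter:
    """Count tokens per zone, keyed as 'zone:token' (hypothetical scheme)."""
    msg = message_from_string(raw_message)

    # Assumption: a single-part message, so the payload is a plain string.
    body = msg.get_payload()
    if not isinstance(body, str):
        body = ""

    zones = {
        "subject": msg.get("Subject", ""),
        "author": msg.get("From", ""),
        "body": body,
    }

    features = Counter()
    for zone, text in zones.items():
        # Illustrative tokenizer: lowercase runs of letters and digits.
        for token in re.findall(r"[a-z0-9]+", text.lower()):
            features[f"{zone}:{token}"] += 1
    return features


raw = (
    "From: alice@example.com\n"
    "Subject: Wheat futures\n"
    "\n"
    "Wheat prices rose again today.\n"
)
feats = zone_features(raw)
```

Here the term "wheat" yields two distinct features, `subject:wheat` and `body:wheat`, which a classifier can then weight independently during training and classification.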


© 2008 Cambridge University Press