Sublinear tf scaling
It seems unlikely that twenty occurrences of a term in a document truly carry twenty times the significance of a single occurrence. Accordingly, there has been considerable research into variants of term frequency that go beyond counting the number of occurrences of a term. A common modification is to use instead the logarithm of the term frequency, which assigns a weight given by

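\[
\mathrm{wf}_{t,d} =
\begin{cases}
1 + \log \mathrm{tf}_{t,d} & \text{if } \mathrm{tf}_{t,d} > 0, \\
0 & \text{otherwise.}
\end{cases}
\tag{28}
\]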
In this form, we may replace tf by some other function wf as in (28), to obtain:

\[
\text{wf-idf}_{t,d} = \mathrm{wf}_{t,d} \times \mathrm{idf}_t.
\tag{29}
\]
Equation (23) can then be modified by replacing tf-idf by wf-idf as defined in (29).
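
As a minimal sketch of the logarithmic weighting described in this section, the following Python computes a sublinear term weight (1 plus the log of the raw count for terms that occur, 0 otherwise) and multiplies it by an inverse document frequency. The function names, the choice of base-10 logarithms, and the form idf = log(N/df) are assumptions made for illustration; the text above does not fix a logarithm base.

```python
import math


def sublinear_tf(tf: int) -> float:
    """Sublinearly scaled term weight: 1 + log10(tf) if tf > 0, else 0.

    The base-10 logarithm is an assumption of this sketch.
    """
    if tf <= 0:
        return 0.0
    return 1.0 + math.log10(tf)


def idf(num_docs: int, doc_freq: int) -> float:
    """Inverse document frequency, assumed here to be log10(N / df)."""
    return math.log10(num_docs / doc_freq)


def wf_idf(tf: int, num_docs: int, doc_freq: int) -> float:
    """Product of the scaled term weight and the idf of the term."""
    return sublinear_tf(tf) * idf(num_docs, doc_freq)


if __name__ == "__main__":
    # Compare raw counts with their sublinearly scaled weights.
    for count in (0, 1, 2, 20, 100):
        print(count, round(sublinear_tf(count), 3))
```

Under these assumptions, a term occurring twenty times receives a weight of about 2.3 rather than 20, so repeated occurrences still count for more, but far less than proportionally.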
