Feature-Based Filtering: Analyzing Documents for Relevant Keywords
SMART algorithm (Salton, Cornell, ’69):
compute ratio = (frequency of word in this document) / (average frequency of word in all documents)
pick n words with highest ratios as the representation of the document
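
The ratio-and-rank procedure can be sketched in a few lines of Python. This is a minimal illustration, not the original SMART implementation. It assumes that "frequency" means a word's count divided by the document's length, that the average is taken over every document in the collection (a document without the word contributes 0), and that words are split on whitespace after lowercasing. The function name top_keywords and its parameters are hypothetical.

```python
from collections import Counter


def top_keywords(docs, doc_index, n):
    """Return the n words of docs[doc_index] with the highest frequency ratio."""
    # Relative frequency of each word within each document (assumed definition).
    freqs = []
    for doc in docs:
        words = doc.lower().split()
        counts = Counter(words)
        total = len(words)
        freqs.append({w: c / total for w, c in counts.items()})

    target = freqs[doc_index]
    ratios = {}
    for word, freq in target.items():
        # Average over all documents; a document lacking the word contributes 0.
        avg = sum(f.get(word, 0.0) for f in freqs) / len(freqs)
        ratios[word] = freq / avg

    # Highest ratio first; ties are broken alphabetically so the result is stable.
    ranked = sorted(ratios, key=lambda w: (-ratios[w], w))
    return ranked[:n]


docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "the cat chased the dog",
]
print(top_keywords(docs, 0, 3))
```

In this toy collection "mat" ranks first for the first document because it appears nowhere else, while "the" ranks last because it is common everywhere.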