Comparing raw features with processed tf-idf features on the 20 Newsgroups dataset
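
To make the two representations concrete, here is a minimal sketch in plain Python. It assumes that "raw features" means raw term counts, and it uses one common smoothed TF-IDF variant (idf = ln((1 + n) / (1 + df)) + 1, with L2-normalised rows); the actual comparison may use a different weighting. The three-document corpus and the helper names `raw_counts` and `tfidf` are hypothetical stand-ins, not 20 Newsgroups content.

```python
import math
from collections import Counter

# Hypothetical toy corpus standing in for newsgroup posts (not real dataset text).
docs = [
    "the team won the hockey game",
    "the new graphics card renders the image",
    "the hockey season starts soon",
]


def raw_counts(doc):
    """Raw term counts for one document (assumed meaning of 'raw features')."""
    return Counter(doc.split())


def tfidf(corpus):
    """Smoothed TF-IDF weights, one dict per document, each L2-normalised."""
    counts = [raw_counts(d) for d in corpus]
    n = len(corpus)
    # Document frequency: number of documents in which each term appears.
    df = Counter(term for c in counts for term in c)
    # Smoothed inverse document frequency (an assumed variant, see lead-in).
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1 for t in df}
    rows = []
    for c in counts:
        weighted = {t: tf * idf[t] for t, tf in c.items()}
        norm = math.sqrt(sum(w * w for w in weighted.values()))
        rows.append({t: w / norm for t, w in weighted.items()})
    return rows


counts = [raw_counts(d) for d in docs]
weights = tfidf(docs)

# Compare the two representations for the first document.
for term in sorted(counts[0]):
    print(f"{term:8s} count={counts[0][term]}  tfidf={weights[0][term]:.3f}")
```

In the first document, "team" and "hockey" both have a raw count of 1, but "team" receives the larger TF-IDF weight because it occurs in fewer documents. The word "the", which appears in every document, keeps its raw count of 2 but loses much of its relative advantage after weighting.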