I followed the Hadoop MapReduce Cookbook to build a Mahout Naive Bayes classification model for the 20news dataset. The book is a bit old and I am now on Mahout 0.13, so I had to change a few things. The relevant commands I ran to get the final test result were, in order:
1. hadoop fs -put 20_newsgroups/* 20news-all
2. mahout seqdirectory -i 20news-all -o 20news-seq
3. mahout seq2sparse -i 20news-seq -o 20news-vector
4. mahout split -i 20news-vector/tfidf-vectors -tr 20news-train-vectors -te 20news-test-vectors -rp 40 -ow -seq -xm sequential
5. mahout trainnb -i 20news-train-vectors -o model -li labelindex
6. mahout testnb -i 20news-train-vectors -m model -l labelindex -o
After that, I got the result:
Everything is fine.
My question is: can I classify a string of text, for example "The situation in the Middle East continues to remain volatile, something … xyz …..", or a file containing that string, using a mahout command and the model I created in step 5?
NOTE: I want the output to be the topic the text is classified under, for example sci.electronics.
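To make the desired input and output concrete, here is a toy sketch in plain Python, not Mahout. The two labels are real 20news group names, but the word counts, the `WORD_COUNTS` table, and the `classify` function are made up for illustration and have nothing to do with the model from step 5. It only shows the shape of what I am after: a string goes in, a single topic label comes out.

```python
import math
from collections import Counter

# Hypothetical toy "model": per-label word counts (made-up numbers, not from 20news).
WORD_COUNTS = {
    "talk.politics.mideast": Counter({"middle": 8, "east": 9, "situation": 4, "volatile": 2}),
    "sci.electronics": Counter({"circuit": 9, "voltage": 7, "resistor": 5, "situation": 1}),
}
VOCAB = {word for counts in WORD_COUNTS.values() for word in counts}


def classify(text: str) -> str:
    """Return the label with the highest multinomial Naive Bayes log-score."""
    tokens = [t.strip('.,"').lower() for t in text.split()]
    tokens = [t for t in tokens if t in VOCAB]  # ignore words the toy model never saw
    best_label, best_score = None, -math.inf
    for label, counts in WORD_COUNTS.items():
        total = sum(counts.values())
        # Uniform prior over labels, so only the smoothed likelihood term is summed.
        score = sum(
            math.log((counts[t] + 1) / (total + len(VOCAB)))  # Laplace smoothing
            for t in tokens
        )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


print(classify("The situation in the Middle East continues to remain volatile"))
# prints: talk.politics.mideast
```

What I am looking for is the Mahout equivalent of that last call, run against the model and labelindex produced above.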