Apache Mahout is an Apache project to produce free implementations of distributed or otherwise scalable machine learning algorithms on the Hadoop platform.
This installation of mahout-distribution-0.4-src was done on the following versions of Linux, Java and Hadoop.
UBUNTU 12.04 LTS
Install Maven using the command below.
apt-get install maven2
After installation, check the Maven version using the command mvn --version
Then set the Maven environment variables.
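The exact variables are not shown here, so the following is only a sketch of a typical setup. It assumes the maven2 package put Maven in /usr/share/maven2 (check with readlink -f "$(which mvn)"), and the names M2_HOME and M2 are the conventional ones, not something confirmed by this post.

```shell
# Assumed install location of the maven2 package; adjust if yours differs.
export M2_HOME=/usr/share/maven2
# Directory holding the mvn launcher script.
export M2="$M2_HOME/bin"
# Put Maven's bin directory at the front of the search path.
export PATH="$M2:$PATH"
```

To make the settings permanent, the same three lines can be appended to ~/.bashrc of the hduser account.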
I have hduser as a dedicated Hadoop system user, and my Hadoop is installed in the /home/hduser/hadoop folder. I am going to install Mahout in the /home/hduser folder, so change to that directory and execute the commands below.
Download Mahout using wget.
[Of the many archives on the download page, pick the source one: mahout-distribution-0.4-src.tar.gz]
Extract the tar file.
sudo tar xzf mahout-distribution-0.4-src.tar.gz
Rename the extracted folder to mahout.
sudo mv mahout-distribution-0.4-src mahout
Now go to the mahout folder and build it with Maven.
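The build command itself is not shown in this copy of the post. For a Maven source distribution the usual step is mvn install, so the sketch below assumes that is what was meant. MAHOUT_HOME and build_mahout are names made up for this example.

```shell
# Assumed location of the renamed source tree (from the mv step above).
MAHOUT_HOME=${MAHOUT_HOME:-/home/hduser/mahout}

# Runs in a subshell, so the caller's working directory is left unchanged.
build_mahout() (
    cd "$MAHOUT_HOME" || exit 1
    # The top-level pom.xml tells Maven what to build; stop early without it.
    if [ ! -f pom.xml ]; then
        echo "no pom.xml in $MAHOUT_HOME" >&2
        exit 1
    fi
    # Assumption: standard Maven build; compiles, tests and installs the modules.
    mvn install
)
```

Calling build_mahout starts the build. The first run downloads many dependencies and runs the unit tests, so it can take a long time; passing -DskipTests to mvn install is the standard Maven way to skip the tests.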
That’s it, now we can run the Mahout examples.
After every example you have to clear the output directory and the tmp directory listed below; otherwise the next run gives an error.
/home/hduser/output [specified by the --output option]
/home/hduser/temp [specified by hadoop.tmp.dir in the core-site.xml file]
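The cleanup can be scripted. This is a minimal sketch that assumes both paths are on the local filesystem, as listed above, and that Hadoop recreates them on the next run. The function name clear_run_dirs is made up for this example.

```shell
# Remove every directory passed as an argument.
clear_run_dirs() {
    for dir in "$@"; do
        # ${dir:?} aborts on an empty path; -- guards against names starting with a dash.
        rm -rf -- "${dir:?empty path}"
    done
}

# Paths taken from the list above.
clear_run_dirs /home/hduser/output /home/hduser/temp
```

If the output directory lives in HDFS rather than on local disk, rm does not reach it and it has to be removed with the hadoop fs command instead.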