Mahout Development Environment with Maven and Eclipse (1)

I’m reading the MEAP edition of “Mahout in Action”, but it doesn’t explain how to set up a development environment for Mahout.
So I worked one out while testing the book’s sample code, and wrote down the steps here.

Install

I tested these steps on Windows Server 2008 x64.
Install the following packages.

  • Cygwin
  • Java SDK 6u23 x64
  • Eclipse 3.6 (Helios) SR1 x64
  • Maven 3.0.2
  • (Hadoop 0.21.0)

Hadoop is not used in this article.

Maven

I’m not very familiar with Maven, so I read up on it first.

Maven 3 still uses the “Maven 2 repository” (the .m2 directory)! :P

Source of Mahout

Use the source distribution of Mahout rather than the binary one, so that you can reference the source code in Eclipse.

I used Mahout 0.4, but 0.5-SNAPSHOT may be better, since Mahout’s API is still changing.

First, start Eclipse and create a workspace. Here I assume it is “C:\Users\shuyo\workspace”.
Extract the Mahout source under the workspace. Here that gives “C:\Users\shuyo\workspace\mahout-distribution-0.4”.
Convert the Mahout Maven project into Eclipse projects with the following commands.

cd C:\Users\shuyo\workspace\mahout-distribution-0.4
mvn eclipse:eclipse
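
mvn eclipse:eclipse needs a pom.xml in the current directory, and a reader reported in the comments that it fails with “It requires a project with an existing pom” when the binary zip was extracted instead of the source. The following is only a minimal sketch for Cygwin’s bash. The function names has_pom and convert_to_eclipse are my own hypothetical helpers and are not part of Maven or Mahout, and I assume that mvn is on the PATH.

```shell
# Hypothetical helper (not part of Maven or Mahout): succeeds only if the
# given directory contains a pom.xml, which mvn eclipse:eclipse requires.
has_pom() {
  [ -f "$1/pom.xml" ]
}

# Hypothetical wrapper: run the conversion only when a pom.xml is present.
convert_to_eclipse() {
  if has_pom "$1"; then
    # Run in a subshell so the caller's current directory is unchanged.
    (cd "$1" && mvn eclipse:eclipse)
  else
    echo "No pom.xml in $1; is this the source distribution?" >&2
    return 1
  fi
}
```

With Cygwin’s default drive mapping, it could be called as convert_to_eclipse /cygdrive/c/Users/shuyo/workspace/mahout-distribution-0.4.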

Next, set Eclipse’s classpath variable M2_REPO to the Maven 2 local repository. The following command is supposed to do it.

mvn -Declipse.workspace=<path-to-eclipse-workspace> eclipse:add-maven-repo

However, “Maven – Guide to using Eclipse with Maven 2.x” notes “Issue: The command does not work”, so set the variable in Eclipse directly.

  • Open Window > Preferences > Java > Build Path > Classpath Variables from Eclipse’s menu.
  • Press “New”, then enter “M2_REPO” as the Name and the Maven 2 repository path as the Path (the default is .m2/repository under your user directory).
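
If you are not sure what to enter as the Path, the default location can be printed from Cygwin’s bash. This is only a sketch: default_m2_repo is a hypothetical helper name, and it assumes the default location, so a custom localRepository in Maven’s settings.xml is not taken into account.

```shell
# Hypothetical helper: print the default Maven local repository path.
# Assumes the default location (.m2/repository under the user directory).
default_m2_repo() {
  # USERPROFILE is the Windows user directory (also visible from Cygwin);
  # fall back to HOME when it is unset or empty.
  printf '%s/.m2/repository\n' "${USERPROFILE:-$HOME}"
}
```

Calling default_m2_repo prints a path that can be pasted into the Path field. The result uses a forward slash before .m2, so you may want to adjust the separators by hand on Windows.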

If M2_REPO is not set, the following errors are reported.

The project cannot be built until build path errors are resolved
Unbound classpath variable: 'M2_REPO/junit/junit/3.8.1/junit-3.8.1.jar' in project '********'
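
The variable appears in this error because the .classpath files generated by mvn eclipse:eclipse refer to jars through M2_REPO. To see which generated projects depend on it, something like the following sketch may help. list_m2_repo_users is a hypothetical helper name, and I assume the find and grep commands that come with Cygwin.

```shell
# Hypothetical helper: list the generated .classpath files under a directory
# that refer to the M2_REPO classpath variable.
list_m2_repo_users() {
  find "$1" -name .classpath -type f -exec grep -l 'M2_REPO' {} +
}
```

For example, list_m2_repo_users /cygdrive/c/Users/shuyo/workspace/mahout-distribution-0.4 prints one .classpath path per project that uses the variable.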

Finally, import the converted Eclipse projects of Mahout.

  • Open File > Import > General > Existing Projects into Workspace from Eclipse’s menu.
  • Select C:\Users\shuyo\workspace\mahout-distribution-0.4 as the root directory, then select all of the projects that are listed.

Continued on the next post.

This entry was posted in Development, Eclipse, Java, Machine Learning, Mahout, Maven.

10 Responses to Mahout Development Environment with Maven and Eclipse (1)

  1. Pingback: Mahout Development Environment with Maven and Eclipse (2) | Shuyo's Weblog

  2. Pingback: Hadoop Development Environment with Eclipse | Shuyo's Weblog

  3. Laila M. says:

    when I tried this it showed..

    cd C:\Users\shuyo\workspace\mahout-distribution-0.4
    mvn eclipse:eclipse

    Cannot execute mojo: eclipse. It requires a project with an existing pom.

    what shall i do?
    I tried doing the same but changed the directory into “mahout-collections-1.0″ (another zip that i download from mahout) and it worked successfully, I think because it contained pom file.

    I need your reply urgently, I searched google for 2 hrs and no results :(

    Thanks alot

    • shuyo says:

      “shuyo” is my user name :P
      You should change the current directory according to your environment.

      • Laila M. says:

        I know :) .. I meant I tried it using my directory but it didn’t work, anyway I fixed the problem I discovered that I downloaded the binary zip instead of the source :S .. but thanks alot for the reply, I’ll complete the next steps hope it’ll work :)..

  4. shuyo says:

    Mmm, I see. But I can’t reproduce it…
    As the message says, is there a pom.xml in the current directory?

  5. divya says:

    I don’t have the “all projects” folder in my Mahout 0.4 distribution. Is it something I should create, or what should be done? Please help. I am completely new to Mahout and Eclipse.

  6. Jason Gowans says:

    Extremely helpful post. Thank you!

  7. Respected Sir, I followed your steps completely, but I am facing some errors at the end. When I import the mahout-distribution-0.7 project into my workspace (/home/saeed/workspace/mahout-distribution-0.7),
    some files that other packages require are missing (OpenIntLongHashMap.java, math.function.IntObjectProcedure.java, math.set.OpenIntHashSet.java, etc.). What can I do, sir?
    My Mahout is 0.7.

  8. Excellent post, thanks. Sometimes simple things take the longest, and this is so detailed you just cannot make mistakes :)

    Just a typo in
    “mvn -Declipse.workspace= eclipse:add-maven-repo”

    It worked for me removing the space after “=” :
    “mvn -Declipse.workspace=eclipse:add-maven-repo”
