I have a big passion for big-data processing and machine-learning algorithms, so I wanted to explore the most famous recommendation engine in the open-source world, Apache Mahout. The result is an experimental machine-learning project built on a number of cutting-edge, scalable open-source projects such as Apache Mahout™ and Redis. I’ve called it “Next”. The project’s goal is to provide recommendation services using collaborative filtering techniques, which focus on the associations between users and items. The model I had in mind was to offer user-based recommendations as an application service provider, so I chose a suitable HTTP API and highly extensible underlying components.
Recommendation algorithms have been made famous by giants such as Amazon, Google, Yahoo, Netflix and YouTube. They use this kind of knowledge to make suggestions based on what users read, bought, watched, liked or commented on. The most widely used recommendation technique among them is known as collaborative filtering.
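To make the idea concrete, here is a minimal sketch of user-based collaborative filtering in plain Python. It is not how Mahout or “Next” implements it: the Jaccard similarity, the function names and the sample data are all assumptions chosen to keep the example short.

```python
from collections import defaultdict


def jaccard(a, b):
    """Overlap between two item sets, from 0.0 (disjoint) to 1.0 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(history, user, how_many=3):
    """Rank items the user has not seen, weighted by how similar their owners are."""
    seen = history.get(user, set())
    scores = defaultdict(float)
    for other, items in history.items():
        if other == user:
            continue
        weight = jaccard(seen, items)
        if weight == 0.0:
            continue  # users with nothing in common contribute nothing
        for item in items - seen:
            scores[item] += weight
    # Highest score first; ties are broken by item name for a stable order.
    ranked = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return [item for item, _ in ranked[:how_many]]


# Hypothetical interaction data: user -> set of posts they have read.
history = {
    "alice": {"post-1", "post-2", "post-3"},
    "bob": {"post-1", "post-2", "post-4"},
    "carol": {"post-5"},
}

print(recommend(history, "alice"))
```

Alice and Bob share two posts, so Bob’s remaining post is suggested to Alice, while Carol has nothing in common with her and is ignored.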
Who Needs It?
From a practical point of view, using “Next” is much easier than using Mahout directly. It lets front-end developers who don’t know much about back-end services or machine learning techniques find out what an anonymous user’s taste is while she interacts with a website. This means making simple websites and web applications more intelligent.
How It Works
The first thing website owners need to do is feed the engine with the data observed so far. The engine keeps answering requests while it analyzes other received events. Meanwhile, the website can ask for recommendations transparently and show the items the current user (anonymous or authenticated) might be interested in.
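One way to picture that flow is a small in-memory engine that buffers incoming events and answers from the last model it built. This is only a sketch under my own assumptions: the class and method names are hypothetical, the model is a simple “viewed together” count rather than Mahout’s algorithms, and `rebuild()` is assumed to be called from a single background worker.

```python
import threading
from collections import Counter, defaultdict


class SketchEngine:
    """Buffers events and serves recommendations from the last built model."""

    def __init__(self):
        self._lock = threading.Lock()
        self._pending = []                # events received but not analyzed yet
        self._history = defaultdict(set)  # user -> items, touched only by rebuild()
        self._model = {}                  # item -> Counter of items seen with it
        self._seen = {}                   # user -> items, as of the last rebuild

    def feed(self, user, item):
        """Record one observed event; only appends, so callers return quickly."""
        with self._lock:
            self._pending.append((user, item))

    def rebuild(self):
        """Fold pending events into a fresh model, then swap it in."""
        with self._lock:
            batch, self._pending = self._pending, []
        for user, item in batch:
            self._history[user].add(item)
        model = defaultdict(Counter)
        for items in self._history.values():
            for a in items:
                for b in items:
                    if a != b:
                        model[a][b] += 1
        seen = {user: set(items) for user, items in self._history.items()}
        with self._lock:
            self._model, self._seen = model, seen

    def recommend(self, user, how_many=3):
        """Answer from the current snapshot, even while new events arrive."""
        with self._lock:
            model, seen = self._model, self._seen.get(user, set())
        votes = Counter()
        for item in seen:
            for other, count in model.get(item, {}).items():
                if other not in seen:
                    votes[other] += count
        ranked = sorted(votes.items(), key=lambda pair: (-pair[1], pair[0]))
        return [item for item, _ in ranked[:how_many]]


engine = SketchEngine()
for user, item in [("u1", "a"), ("u1", "b"), ("u2", "a"),
                   ("u2", "c"), ("u3", "a"), ("u3", "b")]:
    engine.feed(user, item)
engine.rebuild()
print(engine.recommend("u2"))
```

Because readers only ever see a finished snapshot, a request that arrives in the middle of a rebuild is answered from the previous model instead of waiting for the new one.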
I’ve used Apache Mahout, Redis and a number of other open-source tools. Redis serves as both the cache server and the database. I needed high performance on the database side, so Redis was one of the best options. The solution is already fairly scalable, and it can be hosted on Hadoop for more scalability and processing power.
My plan is simple. I am going to extend the engine to satisfy a number of business demands and to make existing websites more intelligent. Helping users find what they are really looking for is an exciting goal. Meanwhile, I will be able to present the project as my master’s thesis.
Three big blogs are feeding “Next” right now, and a number of non-profit organizations will start using it very soon. After a big refactoring, and once comprehensive documentation is in place, I might make “Next” open-source.