Wrapping a Big Data warehouse platform with Spring/REST

Three months ago, a client asked me to provide them with a Hadoop-based data warehouse platform. They had experience with OLAP and RDBMSs. Meanwhile, they were working on a JavaScript-based data visualizer. They had noticed that they needed to migrate to the big data world.

So what I was supposed to build was a Hadoop-based platform for preparing, processing and dynamically warehousing terabyte-size text files, and giving them a relational shape at the end.

I knew Hadoop 2.2.0 rocks. So I decided to use Hadoop 2.2.0 because of its big improvements: the flexible YARN computation model and advanced recovery techniques.

After setting up Hadoop, I deployed HBase as the columnar database of the big data world. My plan was to run Hive on top of HBase. I got it working, but soon found that Hive runs much faster without HBase. So I left HBase out in this case.

The UI developer was very interested in talking to the service in JSON/REST. I have a lot of respect for REST too. So I developed a REST service that uses a HiveServer2 client to talk to HiveServer2 on a node of the cluster. It works like a charm. I think Spring Boot makes it easier than ever to build services based on a microservice architecture.
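
To give an idea of what such a wrapper does, here is a minimal sketch in plain JDK Java, not the actual code of my service. It assumes the Hive JDBC driver is on the classpath and that HiveServer2 is reachable at a URL like jdbc:hive2://<host>:10000/default (host and database are placeholders). The class name HiveJsonSketch and the methods query and toJson are made up for illustration.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: reads rows from HiveServer2 over JDBC and renders them as JSON.
final class HiveJsonSketch {

    // Runs a query and copies each row into an ordered column -> value map.
    // jdbcUrl is assumed to look like "jdbc:hive2://<host>:10000/default".
    static List<Map<String, Object>> query(String jdbcUrl, String sql) throws SQLException {
        List<Map<String, Object>> rows = new ArrayList<>();
        try (Connection conn = DriverManager.getConnection(jdbcUrl);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(sql)) {
            ResultSetMetaData meta = rs.getMetaData();
            while (rs.next()) {
                Map<String, Object> row = new LinkedHashMap<>();
                for (int i = 1; i <= meta.getColumnCount(); i++) {
                    row.put(meta.getColumnLabel(i), rs.getObject(i));
                }
                rows.add(row);
            }
        }
        return rows;
    }

    // Renders the rows as a JSON array of objects, keeping column order.
    static String toJson(List<Map<String, Object>> rows) {
        StringBuilder sb = new StringBuilder("[");
        for (int r = 0; r < rows.size(); r++) {
            if (r > 0) sb.append(',');
            sb.append('{');
            int c = 0;
            for (Map.Entry<String, Object> e : rows.get(r).entrySet()) {
                if (c++ > 0) sb.append(',');
                sb.append(quote(e.getKey())).append(':').append(value(e.getValue()));
            }
            sb.append('}');
        }
        return sb.append(']').toString();
    }

    // Numbers and booleans are written bare, null as null, everything else as a string.
    private static String value(Object v) {
        if (v == null) return "null";
        if (v instanceof Number || v instanceof Boolean) return v.toString();
        return quote(v.toString());
    }

    // Escapes quotes, backslashes and control characters for a JSON string.
    private static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (char ch : s.toCharArray()) {
            switch (ch) {
                case '"': sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (ch < 0x20) sb.append(String.format("\\u%04x", (int) ch));
                    else sb.append(ch);
            }
        }
        return sb.append('"').toString();
    }
}
```

In a Spring Boot app, a controller method could simply return the list of rows and leave the JSON step to the framework; the hand-written toJson is only there to keep the sketch free of dependencies.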

This little middle server would be a good place for implementing business and security policies.

Apache Sqoop is also part of the platform. Fortunately, Sqoop provides a cool REST API, so I wrapped it with my Spring app.

I’ve developed a number of Linux bash scripts for preparing data. The Spring app wraps them too, within its REST API.
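
Wrapping a script behind a REST endpoint mostly comes down to starting a process and handing back its exit code and output. The following is a rough sketch of that idea, not my actual code. ScriptRunnerSketch, Result and the allow-list of commands are hypothetical names; the allow-list is one simple way the middle server could enforce a security policy, so that callers can only start scripts the server already knows about.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: runs an allow-listed command and captures its output.
final class ScriptRunnerSketch {

    // Exit code plus combined stdout/stderr of one run.
    static final class Result {
        final int exitCode;
        final String output;

        Result(int exitCode, String output) {
            this.exitCode = exitCode;
            this.output = output;
        }
    }

    private final Set<String> allowedCommands;

    ScriptRunnerSketch(Set<String> allowedCommands) {
        this.allowedCommands = allowedCommands;
    }

    Result run(String command, List<String> args, long timeoutSeconds)
            throws IOException, InterruptedException {
        // Refuse anything that is not explicitly allowed.
        if (!allowedCommands.contains(command)) {
            throw new IllegalArgumentException("Command not allowed: " + command);
        }
        // Arguments are passed as separate argv entries, so no shell interpolation happens.
        List<String> cmd = new ArrayList<>();
        cmd.add(command);
        cmd.addAll(args);

        // Output goes to a temp file so a chatty script cannot block on a full pipe.
        Path out = Files.createTempFile("script-output", ".log");
        try {
            Process process = new ProcessBuilder(cmd)
                    .redirectErrorStream(true)
                    .redirectOutput(out.toFile())
                    .start();
            if (!process.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
                process.destroyForcibly();
                throw new IOException("Timed out after " + timeoutSeconds + "s: " + command);
            }
            String output = new String(Files.readAllBytes(out), StandardCharsets.UTF_8);
            return new Result(process.exitValue(), output);
        } finally {
            Files.deleteIfExists(out);
        }
    }
}
```

A REST handler would then map a request to one of the allowed script paths, call run with the request parameters as arguments, and return the exit code and output as JSON.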

What we have now is a very scalable, automated, dynamic data warehouse platform based on Hadoop 2.2.0, which front-end tools can use as simply as an RDBMS.

Running the app as a service just by calling it from the terminal is pretty cool. No Tomcat to install, no JSP container and no configuration. All you need to do is tell it to ‘run’.
