Putting SQL back in NoSQL and Hadoop with Apache Phoenix


By now most of you have probably noticed that Hadoop, HBase, and big data in general are turning up everywhere. But with all that power comes a lot of complexity. Most of you will agree that even the most seasoned Java developers find operations like joins quite hard in the bare-bones MapReduce world. Pig and Hive do come to the rescue, but it is still not as easy as in the SQL world. So what's the solution?

Apache Phoenix is a top-level Apache project with the motto "putting the SQL back in NoSQL". It provides the power of standard SQL and JDBC APIs with full ACID transaction capabilities, plus the flexibility of late-bound, schema-on-read capabilities from the NoSQL world, by leveraging HBase as its backing store. It lets you work the way you do with SQL, while Phoenix does all the heavy lifting of compiling your queries into native HBase scans and returning the results in the good old ResultSet.

Phoenix gives you two ways to connect to the HBase database:

  1. The JDBC API
  2. The sqlline.py Python script

The JDBC API

Phoenix provides a JDBC API that can be used like any other JDBC API (except that you are not supposed to pool connections): you create a Connection, create a Statement object, execute it, and get back a ResultSet. Simple, isn't it? Sample code can be found below.
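Here is a minimal sketch of that flow. The ZooKeeper host in the JDBC URL (`localhost`) and the `users` table are assumptions for illustration; you will also need the Phoenix client jar on your classpath and a running HBase cluster.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Statement;

public class PhoenixJdbcExample {
    public static void main(String[] args) throws Exception {
        // The Phoenix JDBC URL takes your ZooKeeper quorum after "jdbc:phoenix:".
        // "localhost" here is an assumption -- point it at your own quorum.
        try (Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost")) {

            // DDL works through a plain Statement, just like regular JDBC.
            try (Statement stmt = conn.createStatement()) {
                stmt.execute("CREATE TABLE IF NOT EXISTS users "
                        + "(id BIGINT NOT NULL PRIMARY KEY, name VARCHAR)");
            }

            // Phoenix uses UPSERT rather than separate INSERT/UPDATE statements.
            try (PreparedStatement ps =
                         conn.prepareStatement("UPSERT INTO users VALUES (?, ?)")) {
                ps.setLong(1, 1L);
                ps.setString(2, "Alice");
                ps.executeUpdate();
            }
            // Phoenix connections do not auto-commit by default.
            conn.commit();

            // Query and walk the good old ResultSet.
            try (Statement stmt = conn.createStatement();
                 ResultSet rs = stmt.executeQuery("SELECT id, name FROM users")) {
                while (rs.next()) {
                    System.out.println(rs.getLong("id") + " -> " + rs.getString("name"));
                }
            }
        }
    }
}
```

Note that no explicit `Class.forName` driver registration is needed with modern JDBC, and per the Phoenix docs you should create a fresh Connection per unit of work rather than pooling, since Phoenix connections are cheap thin objects over the underlying HBase connection.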

 

 

The sqlline.py utility

It's a wonderful little utility that lives in the 'bin' directory of the Phoenix installation. All you have to do is navigate to bin and run the command.

Firing it up is as easy as executing the script with your ZooKeeper URL, and then you can just fire off SQL.
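For example (the ZooKeeper host and port below are assumptions; substitute your own quorum):

```shell
cd /path/to/phoenix/bin

# Start the interactive SQL shell against your ZooKeeper quorum
./sqlline.py localhost:2181

# At the prompt you can type plain SQL, e.g.:
#   SELECT * FROM users LIMIT 10;
# sqlline also has its own meta-commands, such as:
#   !tables    -- list tables
#   !quit      -- exit the shell
```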

That's the general introduction to Apache Phoenix; be sure to check out the real deal at https://phoenix.apache.org

 
