Putting SQL back in NoSQL and Hadoop with Apache Phoenix


By now most of you have probably noticed that Hadoop, HBase, and big data in general are turning up everywhere. But with all that power comes a lot of complexity. Most of you will agree that even the most seasoned Java developers find operations like joins quite hard in the bare-bones MapReduce world. Pig and Hive do come to the rescue, but it is still not as easy as in the SQL world. So what's the solution?

Apache Phoenix is a top-level Apache project with the motto "putting the SQL back in NoSQL". It provides the power of standard SQL and JDBC APIs with full ACID transaction capabilities, plus the flexibility of late-bound, schema-on-read capabilities from the NoSQL world, by leveraging HBase as its backing store. It lets you work the way you do with SQL, while Phoenix does all the heavy lifting of compiling your queries into native HBase scans and returning the results in the good old ResultSet.

Phoenix gives you two ways to connect to the HBase database:

  1. The JDBC API
  2. The sqlline.py Python script

The JDBC API

Phoenix provides a JDBC API that can be used like any other JDBC API (except that you are not supposed to pool connections): you create a Connection, create a Statement object, execute it, and get back a ResultSet. Simple, isn't it? Sample code can be found below.
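Here is a minimal sketch of that flow. The ZooKeeper host in the JDBC URL (`localhost`) and the `users` table are assumptions for illustration; you will also need the Phoenix client jar on your classpath and a running HBase cluster.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Statement;

public class PhoenixJdbcExample {
    public static void main(String[] args) throws Exception {
        // The Phoenix JDBC URL takes your ZooKeeper quorum after "jdbc:phoenix:".
        // "localhost" here is an assumption -- point it at your own quorum.
        try (Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost")) {

            // DDL works through a plain Statement, just like regular JDBC.
            try (Statement stmt = conn.createStatement()) {
                stmt.execute("CREATE TABLE IF NOT EXISTS users "
                        + "(id BIGINT NOT NULL PRIMARY KEY, name VARCHAR)");
            }

            // Phoenix uses UPSERT rather than separate INSERT/UPDATE statements.
            try (PreparedStatement ps =
                         conn.prepareStatement("UPSERT INTO users VALUES (?, ?)")) {
                ps.setLong(1, 1L);
                ps.setString(2, "Alice");
                ps.executeUpdate();
            }
            // Phoenix connections do not auto-commit by default.
            conn.commit();

            // Query and walk the good old ResultSet.
            try (Statement stmt = conn.createStatement();
                 ResultSet rs = stmt.executeQuery("SELECT id, name FROM users")) {
                while (rs.next()) {
                    System.out.println(rs.getLong("id") + " -> " + rs.getString("name"));
                }
            }
        }
    }
}
```

Note that no explicit `Class.forName` driver registration is needed with modern JDBC, and per the Phoenix docs you should create a fresh Connection per unit of work rather than pooling, since Phoenix connections are cheap thin objects over the underlying HBase connection.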

 

 

The sqlline.py utility

It's a wonderful little utility that lives in the 'bin' directory of the Phoenix installation. All you have to do is navigate to bin and run the command.

Firing it up is as easy as executing the script with your ZooKeeper URL, and then you can just fire off SQL.
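For example (the ZooKeeper host and port below are assumptions; substitute your own quorum):

```shell
cd /path/to/phoenix/bin

# Start the interactive SQL shell against your ZooKeeper quorum
./sqlline.py localhost:2181

# At the prompt you can type plain SQL, e.g.:
#   SELECT * FROM users LIMIT 10;
# sqlline also has its own meta-commands, such as:
#   !tables    -- list tables
#   !quit      -- exit the shell
```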

That's the general introduction to Apache Phoenix; be sure to check out the real deal at https://phoenix.apache.org

 
