If you are a Hive noob like me, this post may be of use to you. I just wanted to install Hadoop/Hive on my Ubuntu box (a Dell), so that I can run Hive commands (hive -e "") from Eclipse before I commit my Python scripts.
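For what it's worth, here is a minimal sketch of how a Python script might call hive -e. The function names (build_hive_command, run_hive_query) are made up for illustration, and it assumes the hive binary is on the PATH of whatever launches the script (Eclipse included).

```python
import subprocess


def build_hive_command(query, hive_bin="hive"):
    """Return the argument list for: hive -e "<query>"."""
    return [hive_bin, "-e", query]


def run_hive_query(query, hive_bin="hive"):
    """Run one query through the hive CLI and return its stdout as text.

    Raises subprocess.CalledProcessError if hive exits with a non-zero code.
    """
    result = subprocess.run(
        build_hive_command(query, hive_bin),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Example call (needs a working Hive install, so it is left commented out):
# print(run_hive_query("SHOW TABLES;"))
```

Passing the command as a list rather than one shell string means the query does not need extra quoting.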
Installing Hadoop and Hive seems to be pretty straightforward, unless you are someone who always makes the wrong choices (like me!). I edited the wrong config file and had to spend considerable time figuring this simple thing out.
Following the above instructions, I was able to install Hive, except for the following step.
As mentioned in the above post, copying the lib directory from hive-0.12.0.tar.gz to the $HIVE_HOME directory (/opt/hive in my case) solved the problem.
By default, Hive stores its metadata in a Derby database. It's apparently good enough, but strangely it writes the derby.log file and the metastore_db directory in whichever directory we start the Hive shell from. There might be ways to fix this, but I decided to get rid of Derby and use MySQL instead.
Configuring Hive with MySQL:
If you follow the instructions above, you should be fine. Just make sure you edit the correct hive-site.xml file.
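To give an idea of what ends up in hive-site.xml, here is a rough sketch that generates the four metastore connection properties. The property names are the standard javax.jdo.option.* ones; the JDBC URL, database name, user and password are placeholders I made up, and build_hive_site is a hypothetical helper, not something that ships with Hive. The MySQL JDBC connector jar also has to be in Hive's lib directory for the driver class to load.

```python
import xml.etree.ElementTree as ET


def build_hive_site(jdbc_url, user, password, driver="com.mysql.jdbc.Driver"):
    """Return hive-site.xml content pointing the metastore at MySQL."""
    config = ET.Element("configuration")
    props = {
        "javax.jdo.option.ConnectionURL": jdbc_url,
        "javax.jdo.option.ConnectionDriverName": driver,
        "javax.jdo.option.ConnectionUserName": user,
        "javax.jdo.option.ConnectionPassword": password,
    }
    for name, value in props.items():
        prop = ET.SubElement(config, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(config, encoding="unicode")


# Placeholder values, replace with your own database, user and password.
xml_text = build_hive_site(
    "jdbc:mysql://localhost/metastore?createDatabaseIfNotExist=true",
    "hiveuser",
    "hivepassword",
)
print(xml_text)
```

The output is a single unindented line of XML; it is valid, but you may want to pretty-print it before pasting it into the real file.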
On my box,
hduser@learningbox:~$ locate hive-site.xml
As mentioned in the above blog post, editing /opt/hive/conf/hive-site.xml gets the job done. Unless you edited another file, like I did.
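A small check like the one below might have saved me the trouble. It is only a sketch: find_hive_sites and active_hive_site are names I made up, and it assumes Hive picks up its config from $HIVE_CONF_DIR if that is set, and from $HIVE_HOME/conf otherwise.

```python
import os


def find_hive_sites(root):
    """Return every hive-site.xml found under root, sorted by path."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if "hive-site.xml" in filenames:
            matches.append(os.path.join(dirpath, "hive-site.xml"))
    return sorted(matches)


def active_hive_site(hive_home, env=None):
    """Return the hive-site.xml path Hive is assumed to read.

    Assumption: HIVE_CONF_DIR wins if set, else <hive_home>/conf is used.
    """
    env = os.environ if env is None else env
    conf_dir = env.get("HIVE_CONF_DIR") or os.path.join(hive_home, "conf")
    return os.path.join(conf_dir, "hive-site.xml")


# Example: list the candidates and mark the one that should be edited.
# for path in find_hive_sites("/opt"):
#     marker = "<- edit this one" if path == active_hive_site("/opt/hive") else ""
#     print(path, marker)
```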
Creating a table with a JSON SerDe and inserting data into it turned out to be frustrating. The solution that worked was to create the table, move the data to HDFS, and add the partition.
I also faced this issue in Hive 0.13.
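The three steps (create the table, move the data to HDFS, add the partition) can be sketched roughly as below. Everything specific here is an assumption for illustration: the table name, columns, paths and partition column are made up, I am assuming an external table, and the SerDe class shown (org.openx.data.jsonserde.JsonSerDe) is just one commonly used JSON SerDe, not necessarily the right one for your setup. Its jar would also need to be registered with ADD JAR first.

```python
def create_table_sql(table, columns, partition_col, location, serde_class):
    """Build a CREATE EXTERNAL TABLE statement that uses a JSON SerDe."""
    cols = ",\n  ".join(f"{name} {ctype}" for name, ctype in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {table} (\n  {cols}\n)\n"
        f"PARTITIONED BY ({partition_col} STRING)\n"
        f"ROW FORMAT SERDE '{serde_class}'\n"
        f"LOCATION '{location}';"
    )


def hdfs_put_commands(local_file, partition_dir):
    """Build the hadoop fs commands that copy a local file into HDFS."""
    return [
        ["hadoop", "fs", "-mkdir", "-p", partition_dir],
        ["hadoop", "fs", "-put", local_file, partition_dir],
    ]


def add_partition_sql(table, partition_col, value, partition_dir):
    """Build the ALTER TABLE statement that registers the partition."""
    return (
        f"ALTER TABLE {table} ADD IF NOT EXISTS "
        f"PARTITION ({partition_col}='{value}') LOCATION '{partition_dir}';"
    )


# Hypothetical table, columns and paths, used only to show the order of steps.
serde = "org.openx.data.jsonserde.JsonSerDe"  # assumption, swap in your SerDe
part_dir = "/data/events/dt=2014-06-01"
print(create_table_sql("events", [("id", "INT"), ("payload", "STRING")],
                       "dt", "/data/events", serde))
print(hdfs_put_commands("events.json", part_dir))
print(add_partition_sql("events", "dt", "2014-06-01", part_dir))
```

The SQL strings can be passed to hive -e and the command lists to subprocess; nothing here touches Hive or HDFS by itself.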