Building govpedia.in

Let's begin!

Consuming tweet stream:

I need to consume the public streams from Twitter using their streaming API.

https://dev.twitter.com/streaming/overview

Twitter provides an HTTP client, the Hosebird Client (hbc), for listening to the streaming API.

https://github.com/twitter/hbc

The SampleStreamExample is good enough for the first cut. I just added a properties file instead of passing the credentials on the command line.
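Here is a minimal sketch of the properties-file idea, using only the standard library. The class name and the four key names (consumerKey, consumerSecret, token, secret) are placeholders I picked for illustration, not something hbc requires; the loaded values would then be handed to the example in place of the command-line arguments.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper: loads Twitter credentials from a properties source.
class TwitterCredentials {
    // Assumed key names; rename them to match your own properties file.
    static final String[] REQUIRED_KEYS = {"consumerKey", "consumerSecret", "token", "secret"};

    // Reads the properties and fails fast if any credential is missing or blank.
    static Properties load(Reader source) {
        Properties props = new Properties();
        try {
            props.load(source);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        for (String key : REQUIRED_KEYS) {
            String value = props.getProperty(key);
            if (value == null || value.trim().isEmpty()) {
                throw new IllegalArgumentException("Missing credential: " + key);
            }
        }
        return props;
    }

    public static void main(String[] args) {
        // In real use this would be a FileReader over the properties file.
        String sample = "consumerKey=ck\nconsumerSecret=cs\ntoken=t\nsecret=s\n";
        Properties props = load(new StringReader(sample));
        System.out.println("Loaded " + props.size() + " credential entries");
    }
}
```

Keeping the file out of version control is the main reason to prefer this over hard-coded or command-line credentials.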

https://github.com/twitter/hbc/tree/master/hbc-example/src/main/java/com/twitter/hbc/example

Now we have to parse the JSON streamed by the above code. Or, as it turns out, we can index the JSON documents directly in Elasticsearch.
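To make "index directly" concrete, here is a rough sketch that wraps a raw tweet string in an HTTP request to Elasticsearch without parsing it first. It uses the JDK's own java.net.http classes (Java 11 or later) rather than any Elasticsearch client. The class name, the index name "tweets" and the localhost URL are my assumptions, and the /_doc endpoint is what recent Elasticsearch versions expect; older versions put a type name in that position instead.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper: builds (but does not send) an index request for one tweet.
class TweetIndexRequest {
    // POSTing to /<index>/_doc lets Elasticsearch generate the document id.
    static HttpRequest build(String baseUrl, String index, String tweetJson) {
        URI uri = URI.create(baseUrl + "/" + index + "/_doc");
        return HttpRequest.newBuilder(uri)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(tweetJson))
                .build();
    }

    public static void main(String[] args) {
        // Stand-in for one message taken from the stream; real tweets carry many more fields.
        String tweetJson = "{\"id_str\":\"1\",\"text\":\"hello\"}";
        HttpRequest request = build("http://localhost:9200", "tweets", tweetJson);
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Sending it would be a matter of HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()) against a running node. Note that the sample stream also carries non-tweet messages such as delete notices, so some filtering may still be needed before indexing.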

Indexing tweets in Elasticsearch:

Install Elasticsearch!

https://intercityup.com/blog/installing-elasticsearch-mac-os-x-10-9-mavericks-development.html

Preliminary tests in Elasticsearch.

http://www.elasticsearchtutorial.com/elasticsearch-in-5-minutes.html

Java client for Elasticsearch:

There seem to be a number of Java clients available for talking to Elasticsearch. Though the native client is highly recommended, at first look I like the Jest client.

https://www.elastic.co/blog/found-java-clients-for-elasticsearch

Jest client: https://github.com/searchbox-io/Jest/tree/master/jest
