Netcat => Spark Streaming => ElasticSearch

In this tutorial, words typed into a Unix Netcat channel are streamed through Spark and written live into Elasticsearch as (word, count) documents in an Elasticsearch index. I am assuming here that the Spark-to-Elasticsearch ingestion creates the index automatically during the ingestion process.

The steps I followed:

  1. Install Elasticsearch (brew install elasticsearch)
  2. Install the mobz/head plugin (https://github.com/mobz/elasticsearch-head)
  3. Open a Netcat channel in a Unix terminal and type some words
  4. Launch a Spark program that imports org.apache.spark.streaming._
    1. Read the socketTextStream through the Spark streaming context on port 9999
    2. flatMap the lines into words
    3. For each RDD in the DStream, convert it to a DataFrame and register a temp table
    4. Save each DataFrame to the Elasticsearch index
  5. Open http://localhost:9200/_plugin/head (a default local install serves plain HTTP) and keep refreshing the page to see the latest word and count columns in the ES index.

Initially, no index is present:

Screen Shot 2016-06-09 at 10.15.41 PM

Create the program in IntelliJ and run it.
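A minimal sketch of such a program, following the steps listed above, might look like this. It is not the exact listing used for the screenshots: the object name, the WordRecord case class, the tokenize helper, the 10-second batch interval, and the wordcount/words index/type name are all assumptions. It also assumes Spark 1.6 with the elasticsearch-spark (elasticsearch-hadoop) connector on the classpath.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.sql.SQLContext
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.elasticsearch.spark.sql._ // adds saveToEs to DataFrame

// Hypothetical row type: one record per word read from the socket.
case class WordRecord(word: String)

object NetcatToElasticsearch {

  // Split a line on whitespace and drop empty tokens.
  def tokenize(line: String): Seq[String] =
    line.split("\\s+").filter(_.nonEmpty).toSeq

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setMaster("local[2]") // at least 2 threads: one receiver, one worker
      .setAppName("NetcatToElasticsearch")
      .set("es.nodes", "localhost")
      .set("es.port", "9200")
      .set("es.index.auto.create", "true") // let the connector create the index

    // Assumed batch interval of 10 seconds.
    val ssc = new StreamingContext(conf, Seconds(10))

    // Step 4.1: read lines from the Netcat channel on port 9999.
    val lines = ssc.socketTextStream("localhost", 9999)

    // Step 4.2: flatMap the lines into words.
    val words = lines.flatMap(tokenize)

    // Steps 4.3 and 4.4: per batch, build a DataFrame and save it to Elasticsearch.
    words.foreachRDD { rdd =>
      if (!rdd.isEmpty()) {
        val sqlContext = SQLContext.getOrCreate(rdd.sparkContext)
        import sqlContext.implicits._

        val wordsDF = rdd.map(WordRecord(_)).toDF()
        // Registered so the batch can also be queried with SQL if desired.
        wordsDF.registerTempTable("words")

        // Produces two columns: word and count.
        val countsDF = wordsDF.groupBy("word").count()

        // "wordcount/words" is an assumed index/type name.
        countsDF.saveToEs("wordcount/words")
      }
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```

In this sketch each batch appends its own (word, count) documents, so the document count in the index grows with every batch rather than being updated in place.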

Once the program is running in IntelliJ, you will see a streaming context launched locally, and the job keeps running in the Run console.

Open Netcat, type some words, and press Enter:

$ nc -lk 9999

Iteration 1:

(23 documents should be stored)

Screen Shot 2016-06-09 at 10.20.18 PM

Now look at the words in Elasticsearch; the index has been created automatically.

Screen Shot 2016-06-09 at 10.21.38 PM

Screen Shot 2016-06-09 at 10.22.11 PM

Iteration 2:

Type some other words and press Enter.

Screen Shot 2016-06-09 at 10.23.40 PM

Now look at the Elasticsearch index again:

The document count has increased to 49, as shown below.

Screen Shot 2016-06-09 at 10.24.38 PM

Screen Shot 2016-06-09 at 10.25.31 PM

This was all possible with the magic of the Spark, DataFrame, and ES APIs. In the same way, you can write to Cassandra, MongoDB, HBase, Oracle, MySQL, or any other destination database or system.
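As a rough illustration of writing to another destination, the sketch below appends a DataFrame to a relational table through Spark's DataFrameWriter.jdbc. The object name, helper names, and every connection detail are hypothetical. A matching JDBC driver would need to be on the classpath, and stores such as Cassandra, MongoDB, or HBase would each need their own connector instead.

```scala
import java.util.Properties
import org.apache.spark.sql.{DataFrame, SaveMode}

// Hypothetical helper object; not part of the tutorial's program.
object JdbcSinkSketch {

  // Build the JDBC connection properties from placeholder credentials.
  def connectionProps(user: String, password: String): Properties = {
    val props = new Properties()
    props.setProperty("user", user)
    props.setProperty("password", password)
    props
  }

  // Append the rows of a (word, count) DataFrame to the given table.
  def saveCounts(countsDF: DataFrame, url: String, table: String,
                 user: String, password: String): Unit =
    countsDF.write
      .mode(SaveMode.Append)
      .jdbc(url, table, connectionProps(user, password))
}
```

Inside a foreachRDD block, a call could look like JdbcSinkSketch.saveCounts(countsDF, "jdbc:mysql://localhost:3306/test", "word_counts", "user", "secret"), where the URL, table name, and credentials are placeholders.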

Now you can stop the Spark streaming job. I hope you enjoyed this tutorial.

 

 
