Using NodeBuilder to instantiate node based Elasticsearch client and Visualizing data by singhpratyush

elasticsearch · @singhpratyush · Dec 5 '17 (edited)


https://i2.wp.com/blog.fossasia.org/wp-content/uploads/2017/05/elasticsearch-logo-1200x625.png?resize=825%2C510&ssl=1

As [elastic.co](https://www.elastic.co/) mentions, Elasticsearch is a distributed, RESTful search and analytics engine capable of solving a growing number of use cases. But in many setups, it is not possible to manually install an Elasticsearch node on a machine. To handle these kinds of scenarios, Elasticsearch provides the [`NodeBuilder`](https://www.elastic.co/guide/en/elasticsearch/client/java-api/2.0/node-client.html) module, which can be used to spawn an Elasticsearch node programmatically. Let’s see how.

## Getting Dependencies

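In order to get the ES Java API, we need the core `elasticsearch` artifact in our dependencies, alongside `securesm`. `NodeBuilder` is only available up to the 2.x releases (it was removed in 5.0), so a 2.x version is assumed here; `2.0.0` matches the documentation linked in this post and can be swapped for another 2.x release.
```
compile group: 'org.elasticsearch', name: 'elasticsearch', version: '2.0.0'
compile group: 'org.elasticsearch', name: 'securesm', version: '1.0'
```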

The required packages will be fetched the next time we run `gradle build`.

## Configuring Settings
In the Elasticsearch Java API, Settings are used to configure the node(s). To create a node, we first need to define its properties.

```java
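import org.elasticsearch.common.settings.Settings;

// The Builder constructor is not public in the 2.x API, so use the factory method
Settings.Builder settings = Settings.settingsBuilder();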
settings.put("cluster.name", "cluster_name");

// Configuring HTTP details
settings.put("http.enabled", "true");
settings.put("http.cors.enabled", "true");
settings.put("http.cors.allow-origin", "/https?:\\/\\/localhost(:[0-9]+)?/");  // Allow requests from localhost (a value wrapped in slashes is treated as a regex)
settings.put("http.port", "9200");

// Configuring TCP and host
settings.put("transport.tcp.port", "9300");
settings.put("network.host", "localhost");

// Configuring node details
settings.put("node.data", "true");
settings.put("node.master", "true");

// Configuring index
settings.put("index.number_of_shards", "8");
settings.put("index.number_of_replicas", "2");
settings.put("index.refresh_interval", "10s");
settings.put("index.max_result_window", "10000");

// Defining paths
settings.put("path.conf", "/path/to/conf/");
settings.put("path.data", "/path/to/data/");
settings.put("path.home", "/path/to/data/");
settings.build();  // Build with the assigned configurations
```
There are many more settings that can be tuned in order to get the desired node configuration.
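
The `http.cors.allow-origin` value is the least obvious of these settings, so it can help to check the expression on its own before starting a node. The following is a minimal sketch using only `java.util.regex`; it assumes that Elasticsearch matches the expression against the whole `Origin` header, and the class name `CorsOriginCheck` and the sample origins are made up for illustration.

```java
import java.util.regex.Pattern;

class CorsOriginCheck {
    // The localhost expression from the settings above, without the wrapping slashes
    static final Pattern ALLOWED_ORIGIN =
            Pattern.compile("https?:\\/\\/localhost(:[0-9]+)?");

    // Returns true only if the entire origin string matches the expression
    static boolean isAllowed(String origin) {
        return ALLOWED_ORIGIN.matcher(origin).matches();
    }

    public static void main(String[] args) {
        // Hypothetical origins, chosen only to exercise the expression
        String[] origins = {
            "http://localhost",
            "https://localhost:9100",
            "http://example.com",
            "http://localhost.evil.com"
        };
        for (String origin : origins) {
            System.out.println(origin + " -> " + isAllowed(origin));
        }
    }
}
```

A browser-based frontend served from another localhost port would pass this check, while any other host would be rejected.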

## Building the Node and Getting Clients
The Java API makes it very simple to launch an Elasticsearch node. This example makes use of the settings that we just defined.
```java
Node elasticsearchNode = NodeBuilder.nodeBuilder().local(false).settings(settings).node();
```

A piece of cake, isn’t it? Let’s get a client now, on which we can execute our queries.

```java
Client elasticsearchClient = elasticsearchNode.client();
```

## Shutting Down the Node
```java
elasticsearchNode.close();
```
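
Closing the node is easy to miss when an application can exit through several paths. One option, sketched here rather than taken from the Elasticsearch documentation, is to register the close call as a JVM shutdown hook and guard it so it runs at most once. The class `ShutdownHookSketch` and the helper `closeOnce` are hypothetical names, and a plain `Runnable` stands in for `elasticsearchNode::close` so that the sketch runs without Elasticsearch on the classpath.

```java
import java.util.concurrent.atomic.AtomicBoolean;

class ShutdownHookSketch {
    // Wraps a close action so that repeated calls run it only once
    static Runnable closeOnce(Runnable closeAction) {
        AtomicBoolean closed = new AtomicBoolean(false);
        return () -> {
            if (closed.compareAndSet(false, true)) {
                closeAction.run();
            }
        };
    }

    public static void main(String[] args) {
        // Stand-in for elasticsearchNode::close (assumption for this sketch)
        Runnable closeNode = closeOnce(() -> System.out.println("node closed"));

        // The hook runs when the JVM exits normally or is interrupted
        Runtime.getRuntime().addShutdownHook(new Thread(closeNode, "es-node-shutdown"));

        System.out.println("working with the node...");
    }
}
```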

## The loklak Server
A nice example of the module in use can be seen at [`ElasticsearchClient.java`](https://github.com/loklak/loklak_server/blob/development/src/org/loklak/data/ElasticsearchClient.java) in the [loklak project](https://loklak.org/). It reads the settings from a configuration file and builds the node from them.
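
The idea of driving the node from a configuration file can be sketched with `java.util.Properties`. This is not loklak's actual code: the `es.` key prefix, the class `NodeSettingsLoader` and the inline sample configuration are assumptions made for illustration. Each resulting entry could then be handed to `settings.put(key, value)`.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

class NodeSettingsLoader {
    // Collects every "<prefix>key=value" entry, keyed by "key" with the prefix stripped
    static Map<String, String> withPrefix(Properties props, String prefix) {
        Map<String, String> result = new TreeMap<>();
        for (String name : props.stringPropertyNames()) {
            if (name.startsWith(prefix)) {
                result.put(name.substring(prefix.length()), props.getProperty(name));
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        // Inline stand-in for a configuration file (hypothetical keys)
        String config = "es.cluster.name=cluster_name\n"
                + "es.http.port=9200\n"
                + "es.node.data=true\n"
                + "server.port=8080\n";
        Properties props = new Properties();
        props.load(new StringReader(config));

        // Only the "es."-prefixed entries are meant for the node
        Map<String, String> nodeSettings = withPrefix(props, "es.");
        nodeSettings.forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

Keeping node settings under a common prefix lets the same file hold unrelated application settings without passing them to Elasticsearch.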

## Visualisation using elasticsearch-head
So by now, we have an Elasticsearch client which is capable of doing all sorts of operations on the node. But how do we visualise the data that is being stored? Writing code and running it every time to check results is tedious and significantly slows down the development/debugging cycle.

To overcome this, we have a web frontend called [`elasticsearch-head`](https://github.com/mobz/elasticsearch-head) which lets us execute Elasticsearch queries and monitor the cluster.
To run `elasticsearch-head`, we first need to have `grunt-cli` installed –

```
$ sudo npm install -g grunt-cli
```
Next, we will clone the repository using git and install dependencies –

```
$ git clone https://github.com/mobz/elasticsearch-head.git
$ cd elasticsearch-head
$ npm install
```
Next, we simply need to run the server and open the indicated address in a web browser –

```
$ grunt server
```

At the top, enter the address at which `elasticsearch-head` can reach the cluster and click `Connect`.

https://i1.wp.com/gitlab.com/singh.pratyush96/website-static-server/raw/master/elastic1.png?w=660&ssl=1

Upon connecting, the dashboard appears, showing the status of the cluster –

https://i0.wp.com/gitlab.com/singh.pratyush96/website-static-server/raw/master/elastic2.png?w=660&ssl=1

The dashboard shown above is from the loklak project (more about it later in this post).

There are 5 major sections in the UI –
1. **Overview**: Shown in the screenshot above, it gives details about the indices and shards of the cluster.
2. **Index**: Gives an overview of all the indices. It also allows adding new indices from the UI.
3. **Browser**: Gives a browser window for all the documents in the cluster. It looks something like the screenshot below. The left pane allows us to set the filter (index, type and field), and the table is sortable. But we can’t always find what we are looking for by browsing manually, so we have the following two sections.

https://i0.wp.com/gitlab.com/singh.pratyush96/website-static-server/raw/master/elastic3.png?w=660&ssl=1

4. **Structured Query**: Gives a dead simple UI that can be used to make a well-structured request to Elasticsearch. This is the query we need to get the indexed Tweets from @gsoc:

https://i1.wp.com/gitlab.com/singh.pratyush96/website-static-server/raw/master/elastic4.png?w=660&ssl=1

5. **Any Request**: Gives an advanced console that allows executing any query allowed by the Elasticsearch API.
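
The requests composed in the last two sections are ordinary HTTP calls, so they can also be sent from code. Here is a rough sketch using only the JDK (Java 9 or later for `readAllBytes`); the field name `screen_name`, the class `AnyRequestSketch` and the URL passed on the command line are assumptions, since the actual index layout depends on the application.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;

class AnyRequestSketch {
    // Builds a term query body; field and value are assumed to need no JSON escaping
    static String termQuery(String field, String value) {
        return "{\"query\":{\"term\":{\"" + field + "\":\"" + value + "\"}}}";
    }

    // POSTs a JSON body to the given URL and returns the response body as text
    static String post(String url, String json) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "application/json");
        conn.setDoOutput(true);
        try (OutputStream out = conn.getOutputStream()) {
            out.write(json.getBytes(StandardCharsets.UTF_8));
        }
        try (InputStream in = conn.getInputStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical field name; adjust to the mapping of the index being queried
        String body = termQuery("screen_name", "gsoc");
        System.out.println(body);

        // Only contacts a node when a search URL is given,
        // e.g. http://localhost:9200/<index>/_search
        if (args.length > 0) {
            System.out.println(post(args[0], body));
        }
    }
}
```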

## A little about the loklak project and Elasticsearch

> loklak is a server application which is able to collect messages from various sources, including Twitter. The server contains a search index and a peer-to-peer index sharing interface. All messages are stored in an Elasticsearch index.

Source: [github/loklak/loklak_server](https://github.com/loklak/loklak_server)

The project uses Elasticsearch to index all the data that it collects. It uses `NodeBuilder` to create an Elasticsearch node and work with the index. It is flexible enough to join an existing cluster instead of creating a new one, just by changing the configuration file.

## Conclusion
This blog post explains how `NodeBuilder` can be used to create Elasticsearch nodes and how they can be configured using Elasticsearch `Settings`.

It also demonstrates the installation and basic usage of `elasticsearch-head`, which is a great tool to visualize data and check queries against an Elasticsearch cluster.

The official [Elasticsearch documentation](https://www.elastic.co/guide/en/elasticsearch/client/java-api/2.0/index.html) is a good source of reference for its Java API and all other aspects.

> Originally posted at FOSSASIA blog - [Using NodeBuilder to instantiate node based Elasticsearch client and Visualising data](https://blog.fossasia.org/using-nodebuilder-to-instantiate-node-based-elasticsearch-client-and-visualizing-data/)
