Boost performance with Elastic Search

Source: https://www.codeproject.com/Tips/1233845/Boost-performance-with-Elastic-Search


9 Mar 2018
A few tips and tricks for improving Elasticsearch performance

Introduction

In this article I share my findings on boosting Elasticsearch performance when the application runs in an on-premise datacentre and the Elasticsearch instance is on an AWS managed service.

Background

While working on a recent project, I was required to use Elasticsearch to store complex nested data. The Elasticsearch instance was in the cloud (AWS), and my application was deployed in an on-premise datacentre.

As a result, a few of my APIs performed very poorly, for the following reasons:

  1. Searching in complex nested index
  2. Network latency due to large amount of data transfer

How performance was improved

To overcome these performance issues, I took the following three-part approach:

Filter your response

Elasticsearch provides two types of filtering: response filtering and source filtering. Both achieve the same goal, which is to remove unwanted fields and values from the response.

The key difference is that response filtering (the filter_path parameter) trims fields when the response is sent, whereas source filtering (the _source option) is applied while the query executes.

Response filtering request sample

GET /_search?q=elasticsearch&filter_path=took,hits.hits._id,hits.hits._score

Response filtering response sample

{
  "took" : 3,
  "hits" : {
    "hits" : [
      {
        "_id" : "0",
        "_score" : 1.6375021
      }
    ]
  }
}
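The article shows a sample only for response filtering. As a rough sketch of how the two kinds of filtering could be combined in one request, the Python snippet below builds the request path and body without sending anything. The index name `articles`, the fields `title` and `author.name`, and the helper `build_filtered_search` are illustrative assumptions, not part of the original project.

```python
import json
from urllib.parse import urlencode


def build_filtered_search(index, query_text, response_paths, source_fields):
    """Return (path, body) for a search that uses both kinds of filtering.

    Hypothetical helper: it only builds the request, it does not send it.
    """
    # filter_path is a query-string parameter that trims the JSON response.
    params = urlencode({"filter_path": ",".join(response_paths)}, safe=",*")
    path = f"/{index}/_search?{params}"
    # _source in the body limits which document fields come back with each hit.
    body = {
        "query": {"match": {"title": query_text}},  # "title" is an assumed field
        "_source": source_fields,
    }
    return path, body


path, body = build_filtered_search(
    "articles",                                       # assumed index name
    "elasticsearch",
    ["took", "hits.hits._id", "hits.hits._source"],   # response filtering
    ["title", "author.name"],                         # source filtering
)
print("GET", path)
print(json.dumps(body, indent=2))
```

With source filtering, fewer document fields are returned per hit; with response filtering, the envelope around the hits is also cut down to the listed paths, which matters when the response has to cross a slow link.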

Make your InnerHits intelligent

Elasticsearch can return the matching records of a nested index through inner hits. However, using inner hits blindly can make Elasticsearch process data that is not needed. It is therefore recommended to fetch only the required fields from inner hits by using doc values.
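A minimal sketch of such a query body, written as a Python dictionary, might look like the following. It assumes a nested field called `comments` with `author` and `created_at` sub-fields that have doc values; these names and the helper function are hypothetical. The `inner_hits` options used here (`_source`, `docvalue_fields`, `size`) are standard Elasticsearch options.

```python
import json


def nested_query_with_lean_inner_hits(path, match_field, match_value,
                                      wanted_fields, size=3):
    """Build a nested query whose inner hits return only doc-value fields."""
    return {
        "query": {
            "nested": {
                "path": path,
                "query": {"match": {match_field: match_value}},
                "inner_hits": {
                    # Skip loading and parsing _source for each inner hit.
                    "_source": False,
                    # Read only these fields, straight from doc values.
                    "docvalue_fields": wanted_fields,
                    # Cap the number of inner hits returned per parent document.
                    "size": size,
                },
            }
        }
    }


# All field names below are assumptions made for illustration.
body = nested_query_with_lean_inner_hits(
    path="comments",
    match_field="comments.text",
    match_value="latency",
    wanted_fields=["comments.author", "comments.created_at"],
)
print(json.dumps(body, indent=2))
```

The fields listed in `docvalue_fields` must have doc values enabled in the mapping, which rules out analyzed `text` fields.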

Columnize the required field

Most fields in Elasticsearch are indexed, which makes them searchable. In complex nested documents, however, you may also need to sort on, aggregate, or retrieve individual nested fields, and this is where doc values come into the picture.

Doc values store the same values as the document source, but in a column-oriented layout on disk, which improves sorting, aggregation, and similar operations on nested documents. They are built only at document index time.
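As an illustration only, a mapping for a nested field with doc values might be sketched as below. It assumes the typeless mapping format of Elasticsearch 7 or later and a hypothetical `comments` field; it is not the mapping from the original project. Doc values are already on by default for `keyword` and `date` fields, so the explicit `doc_values` entries are there only to make the intent visible.

```python
import json


def nested_mapping():
    """Return an index-creation body with a nested field (hypothetical names)."""
    return {
        "mappings": {
            "properties": {
                "title": {"type": "text"},
                "comments": {
                    "type": "nested",
                    "properties": {
                        # Analyzed text fields do not support doc values.
                        "text": {"type": "text"},
                        # keyword and date fields are stored column-wise.
                        "author": {"type": "keyword", "doc_values": True},
                        "created_at": {"type": "date", "doc_values": True},
                    },
                },
            }
        }
    }


mapping = nested_mapping()
comment_fields = mapping["mappings"]["properties"]["comments"]["properties"]
# List the nested sub-fields that can be fetched through docvalue_fields.
columnar = sorted(name for name, spec in comment_fields.items()
                  if spec.get("doc_values"))
print(json.dumps(mapping, indent=2))
print(columnar)
```

Because doc values are written at index time, adding them to a field of an existing index generally means reindexing the documents.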

Points of Interest

Using the methods described above, I saw a performance improvement of 70%, and CPU utilization dropped from 70-80% to 35-40%.