Saturday 8 November 2014

ElasticSearch on AWS with AutoScaling Groups and Spot Instances

One of the most powerful features of ElasticSearch is its ability to scale horizontally in many different ways: routing, sharding, and time / pattern based index creation and querying. It is a robust storage solution that can start out small and cheap and then grow and scale as load and volume rise.

Having single-handedly implemented the ElasticSearch cluster where I work, I'll go over a few points of interest. (The design is pretty straightforward for people who have worked with ElasticSearch. If you want to know how to set it up, RTFM.)


As marked in the diagram:
  1. First of all, use a single security group across all your data nodes and master nodes for the purpose of discovery. This "shields" your nodes from other nodes or malicious attempts to join the cluster and is part of your security. Open port 9300 to and from this security group itself (see the discovery sketch after this list).
  2. A split of "core" data nodes and "spot" data nodes. Basically you have a small set of "core" data nodes that guarantee the safety of the data. A set of spot instances is then added to the cluster to boost performance.
    • Set rack_id and cluster.routing.allocation.awareness.attributes!!!
      I don't like stating the obvious, but THIS IS CRITICAL. Set the "core" nodes to use one rack_id and the spot instances another (see the elasticsearch.yml sketch after this list). This forces the replication to store at least 1 complete set of the data on "core" nodes. Also, install the kopf plugin and MAKE SURE THAT IS THE CASE. Spot instances are just that, spot instances. They can all disappear in the blink of an eye.
    • Your shard and replica count directly determines the maximum number of machines you can "Auto Scale" to: each data node needs at least one shard copy to host, so with, say, 5 shards and 1 replica there are 10 shard copies and at most 10 data nodes that can usefully hold them.
    • You can update the instance size specified in the launch configuration and terminate the instances one by one to "vertically" scale the servers, should you run into the horizontal scaling limit imposed by the shard and replica count (see the CLI sketch after this list).
    • This is an incredibly economical setup. Taking r3.2xlarge instances as an example, even a 3-year heavy reservation costs $191 / month, while a spot instance costs around $60 / month. It is the ability to leverage spot instances that makes all managed hosting of ElasticSearch look like daylight robbery.
      For $4k / month, you can easily scale all the way up to 2TB+ of memory, 266+ cores and 10TB of SSD at 30k IOPS (assuming 50% of the $4k monthly fee is spent on spot instances and 25% on SSD). By comparison, you get 60GB of RAM and 780GB of storage from Bonsai.
  3. Set up your master nodes in a separate, dedicated Auto Scaling Group with at least 2 servers in it (see the master node settings after this list). This prevents the cluster from falling apart should any of the master nodes in the Auto Scaling Group get recycled.
    Another reason these dedicated master nodes are necessary is the volatility of the core and spot data nodes: because they sit in Auto Scaling Groups, shard allocation and the cluster state get reshuffled frequently, and you want that coordination handled by nodes that stay put.
    They don't need to be particularly beefy.
  4. Front the master nodes with an Elastic Load Balancer
    • Open port 9200 for HTTP. User queries will then be evenly distributed across the master nodes. You can use a separate ASG for a set of non-data (client) nodes dedicated to query handling if there is a high volume of query traffic. Optionally set up SSL here.
    • Open port 9300 for TCP. Logstash instances can then join the cluster using this endpoint as the "discovery point" (see the logstash sketch after this list). Otherwise you will not have a stable address to put in the logstash ElasticSearch output when using the node protocol.
  5. Configure an S3 bucket to snapshot to and restore from, using the master nodes as the gateway (see the snapshot sketch after this list).
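
To make point 1 concrete, here is a minimal sketch of elasticsearch.yml using the shared security group as the discovery filter. It assumes the cloud-aws plugin is installed; the cluster name, region and group name are placeholders.

    # elasticsearch.yml on every data and master node
    cluster.name: my-es-cluster
    cloud.aws.region: eu-west-1
    # discover peers through the EC2 API instead of multicast
    discovery.type: ec2
    # only consider instances carrying the shared security group
    discovery.ec2.groups: elasticsearch-cluster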
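
For the rack_id bullet in point 2, a sketch of the allocation awareness settings; "core" and "spot" are just labels.

    # elasticsearch.yml on the "core" data nodes
    node.rack_id: core
    cluster.routing.allocation.awareness.attributes: rack_id

    # elasticsearch.yml on the spot data nodes
    node.rack_id: spot
    cluster.routing.allocation.awareness.attributes: rack_id

With two rack_id values and at least 1 replica, ElasticSearch spreads the copies of each shard across both groups, which is what keeps a complete set on the "core" side.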
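
The vertical scaling trick in point 2 boils down to swapping the launch configuration and recycling instances; a sketch with the aws CLI, where every name and ID is a placeholder.

    # create a new launch configuration with a bigger instance type
    aws autoscaling create-launch-configuration \
        --launch-configuration-name es-core-data-r3-xlarge \
        --image-id ami-xxxxxxxx \
        --instance-type r3.xlarge \
        --security-groups sg-xxxxxxxx

    # point the ASG at it, then terminate the old instances one at a
    # time, waiting for the cluster to go green again in between
    aws autoscaling update-auto-scaling-group \
        --auto-scaling-group-name es-core-data \
        --launch-configuration-name es-core-data-r3-xlarge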
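
The dedicated masters in point 3 are just nodes that hold no data; a sketch of the relevant settings (the minimum_master_nodes value should be a majority of your master-eligible nodes).

    # elasticsearch.yml on the dedicated master nodes
    node.master: true
    node.data: false

    # elasticsearch.yml on the data nodes (core and spot)
    node.master: false
    node.data: true

    # on every node: require a quorum before electing a master
    discovery.zen.minimum_master_nodes: 2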
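
For the 9300 listener in point 4, a sketch of a Logstash 1.x elasticsearch output joining the cluster through the ELB with the node protocol; the ELB hostname and cluster name are placeholders.

    output {
      elasticsearch {
        host     => "internal-es-masters-1234567890.eu-west-1.elb.amazonaws.com"
        port     => 9300
        protocol => "node"
        cluster  => "my-es-cluster"
      }
    }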
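
And for point 5, registering an S3 snapshot repository through the ELB (again relying on the cloud-aws plugin; the bucket, region and repository names are placeholders).

    # register the repository once
    curl -XPUT 'http://your-elb-hostname:9200/_snapshot/s3_backup' -d '{
      "type": "s3",
      "settings": { "bucket": "my-es-snapshots", "region": "eu-west-1" }
    }'

    # then take snapshots, e.g. from cron
    curl -XPUT 'http://your-elb-hostname:9200/_snapshot/s3_backup/snapshot_1?wait_for_completion=true'
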
The cluster I'm currently running has 2x t2.small for masters, 2x r3.large for "core" nodes and 4x r3.large for spot instances. It holds 200GB of index and 260 million records, and growing, without breaking a sweat. It has a long, long way to go before it hits its scaling limit of 12x r3.8xlarge, which should be good for many terabytes of index and billions of rows. Fingers crossed.

Comments:

  1. John, great info. If you are interested in some well-paid contracting work, please contact me at http://dreamingwell.com/contact. I'm in need of an expert in ElasticSearch on AWS, specifically for planning a large-scale roll out.

  2. Great post! I'm doing something similar, and I'm wondering whether you actually utilize any ASG policy for the spot instance group? Or is it a fixed-size group the same way as the core ASG?

  3. Nice job. Can you tell us what tool you used for designing the architecture picture?

  4. Hi, which tool did you use to make this 3D diagram? I really need it.

  5. I know what you mean; unfortunately I used Illustrator and the free symbol pack. You'll find that if you just Google around a bit.
