Easily Deploying Elasticsearch on AWS with Terraform and Packer
After finding myself deploying Elasticsearch in the cloud for many clients who asked for help, I figured I should find a way to automate and simplify the process. It is now here for you to enjoy.
Using hosted Elasticsearch solutions is costly, and some of them really aren't worth the premium you pay (looking at you, AWS Elasticsearch). And so I went looking for a solution that would allow me to deploy and manage my own (or my clients') clusters easily, without hassle, and without paying any premiums whatsoever.
The solution I came up with was to use Packer and Terraform - two great HashiCorp products - to create immutable machine images and then deploy them to a cluster of EC2 machines. Infrastructure as Code FTW.
The entire work, with documentation, can be found here: https://github.com/synhershko/elasticsearch-cloud-deploy/. High-level documentation follows.
Creating immutable images with Packer
"Packer is a tool for creating machine and container images for multiple platforms from a single source configuration". In other words, you can easily define the steps to execute on a base image, and then run it everywhere to create images that you can later deploy.
In my solution, I created two images. The first is an image for an Elasticsearch node, installed on the latest Ubuntu. The second is an image with Kibana and Kopf installed; it is based on the first image and will later be used as an external and internal gateway to the cluster.
More details and instructions for running this can be found in the README. You need those AMIs in order to proceed to the next step.
Deploying an Elasticsearch cluster with Terraform
Terraform is great at describing complex infrastructure easily, and then separating planning from execution. It is significantly easier to write and maintain than CloudFormation.
Once you have created the machine images with Packer, all that is left for you to do is edit some configuration (e.g. machine sizes, number of nodes, AWS region and availability zones, key name and a VPC) and you are set to go.
Running terraform plan and then terraform apply will create the cluster for you, as well as a load balancer for the client nodes, auto scaling groups for all nodes, IAM roles and security groups. Everything will be set up using best practices, although your mileage may vary and you might want to fork my work and adapt it to your use case.
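If you prefer to script the two steps, the following is a minimal Python sketch, not part of the repository. It assumes the terraform binary is on your PATH and that workdir points at the directory holding the Terraform configuration; the function names and the plan file name cluster.tfplan are made up for illustration. Saving the plan to a file and then applying that file means that what gets applied is exactly what was reviewed.

```python
import subprocess
from typing import List


def terraform_commands(plan_file: str = "cluster.tfplan") -> List[List[str]]:
    """Return the plan and apply invocations in the order they should run.

    The plan file name is a hypothetical default, not something the
    repository prescribes.
    """
    return [
        # Write the execution plan to a file so it can be reviewed first.
        ["terraform", "plan", f"-out={plan_file}"],
        # Apply exactly the saved plan.
        ["terraform", "apply", plan_file],
    ]


def deploy(workdir: str, plan_file: str = "cluster.tfplan") -> None:
    """Run plan and then apply inside workdir, stopping at the first failure."""
    for cmd in terraform_commands(plan_file):
        # check=True raises CalledProcessError on a non-zero exit code.
        subprocess.run(cmd, cwd=workdir, check=True)
```

Note that applying a saved plan file does not ask for confirmation, so in practice you would want to inspect the plan output between the two steps rather than run them back to back.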
The recommended configuration is to have exactly 3 master nodes, at least 2 data nodes, and at least 1 client node. This is supported out of the box. We also support a single-node mode, mostly for experimentation, though it might also be usable for very small deployments. Note that single-node mode is the default configuration; see the full instructions for what needs to be changed to deploy the recommended configuration.
Full details and instructions are here.
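To make the recommended topology concrete, here is a small Python sketch that checks a set of node counts against it. It is only an illustration: the function and parameter names are hypothetical and do not correspond to variables in the repository.

```python
from typing import List


def check_topology(masters: int, data: int, clients: int) -> List[str]:
    """Return a list of deviations from the recommended topology.

    An empty list means the counts match the recommendation:
    exactly 3 master nodes, at least 2 data nodes, at least 1 client node.
    """
    problems = []
    if masters != 3:
        problems.append(f"expected exactly 3 master nodes, got {masters}")
    if data < 2:
        problems.append(f"expected at least 2 data nodes, got {data}")
    if clients < 1:
        problems.append(f"expected at least 1 client node, got {clients}")
    return problems


if __name__ == "__main__":
    # Print each deviation for a deliberately undersized layout.
    for problem in check_topology(masters=1, data=1, clients=0):
        print(problem)
```

A check like this only covers the multi-node layout; the single-node mode is a separate configuration and would not pass it by design.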
Client nodes with Kibana and Kopf
Once deployed, the cluster is fully configured and available via the client nodes. The client nodes also expose Kibana instances and Kopf UI on top of the cluster, so everything is fully visible and ready for use.
Those client nodes are also the ones your apps need to talk to (internally, of course). They are password protected (make sure to change the password!), though you might want to remove that and rely on security groups and VPNs instead, removing public IP access completely. I have discussed security concerns elsewhere before.
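As an illustration of an app talking to a client node, the Python sketch below builds a request to Elasticsearch's _cluster/health endpoint. It assumes the password protection is HTTP Basic authentication, which this post does not spell out; the base URL, username and password are placeholders, and the function name is hypothetical.

```python
import base64
import urllib.request


def build_health_request(base_url: str, username: str, password: str) -> urllib.request.Request:
    """Build a GET request for the cluster health endpoint.

    Assumes HTTP Basic authentication in front of the client node;
    adjust this if your deployment protects the endpoint differently.
    """
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    request = urllib.request.Request(f"{base_url.rstrip('/')}/_cluster/health")
    request.add_header("Authorization", f"Basic {token}")
    return request


# Usage (placeholder address and credentials, replace with your own):
#   req = build_health_request("http://client-node.internal:8080", "user", "pass")
#   with urllib.request.urlopen(req, timeout=10) as response:
#       print(response.read().decode("utf-8"))
```

If you drop the password and rely on security groups and VPNs instead, the Authorization header simply goes away and the rest stays the same.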
Future work
The work is done, but not at all complete. As I continue to support clients, I'll be improving on this and adding more features along the way. Pull requests and other forms of feedback welcome!
Among the things on my roadmap are:
- SSL for client nodes. Certificate and secret management is kind of a PITA so I skipped it for now. I will be adding native support for it soon.
- X-Pack installation and configuration.
- Log shipping of Elasticsearch's own logs, as well as cluster monitoring support (via X-Pack and others).
- Support for deployments on Azure and Google Cloud Platform as well.
Feel free to share your experience, report issues and request features here: https://github.com/synhershko/elasticsearch-cloud-deploy/issues.