Deploying Elasticsearch 6.x on Azure with Terraform
Terraform is my go-to tool for repeatable and easy infrastructure deployments. I've previously shared how I deploy Elasticsearch on AWS with Terraform and Packer; since posting that I have used it to deploy many clusters, and it has also been picked up by quite a few others.
Our offerings at BigData Boutique are cloud-agnostic, and as such we also help projects deployed on other clouds. Today we will be looking at deploying a full Elasticsearch cluster, using best practices end-to-end, on Microsoft Azure.
You can find all relevant code + documentation here: https://github.com/synhershko/elasticsearch-cloud-deploy/. This entire Terraforming project supports deploying both Elasticsearch 5.x and 6.x clusters.
Feel free to share your experience, report issues and request features here: https://github.com/synhershko/elasticsearch-cloud-deploy/issues.
Creating immutable images with Packer
To launch machines on the cloud quickly, without waiting for lengthy installs during provisioning, and also to avoid snowflake servers, I opted for generating images of the servers we deploy and then just provisioning machines with those images loaded into them. This is a general practice I use, and here it is really easy to see how it makes a difference.
"Packer is a tool for creating machine and container images for multiple platforms from a single source configuration". In other words, you can easily define the steps to execute on a base image, and then run it everywhere to create images that you can later deploy.
In my solution, I created two images. The first is an image for an Elasticsearch node, installed on the latest Ubuntu. The second is an image with Kibana, Grafana and Cerebro installed; it is based on the first image and will later be used as an external and internal gateway to the cluster.
More details and instructions for running this can be found in the README. You need to create those images in order to proceed to the next step.
Deploying an Elasticsearch cluster with Terraform
Terraform is great at describing complex infrastructure easily and in a repeatable way. I find Terraform so much easier to use for deploying and amending infrastructure - especially on Azure, which for many people tends to be more UI-oriented.
Once you have created the machine images with Packer, all that is left for you to do is edit some configuration (e.g. machine sizes, number of nodes, Azure location, SSH keys to use) and you are all set.
Running terraform plan and then terraform apply will create the cluster for you, using scale-sets and load balancing for the client nodes, and the necessary network interfaces. Everything will be set up using best practices, although your mileage may vary and you might want to fork my work and adapt it to your use case.
The recommended configuration is to have exactly 3 master nodes, at least 2 data nodes, and at least 1 client node (and it's easy to add more for higher availability). This is supported out-of-the-box. We also support a single-node mode, mostly for experimentation, though it might also be usable for very small deployments.
Elastic's X-Pack is deployed on the cluster out of the box with monitoring enabled but security disabled - you should enable and set up X-Pack Security for any production deployment.
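To make the recommended layout concrete, here is a minimal Python sketch that checks a planned node count against it and computes the master quorum that Zen discovery in Elasticsearch 5.x/6.x expects (floor(n / 2) + 1). The Topology class and function names are my own illustration and are not taken from the repository or its Terraform variables.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Topology:
    """Planned node counts (hypothetical helper, not the repo's variables)."""
    masters: int
    data_nodes: int
    clients: int


def minimum_master_nodes(masters: int) -> int:
    """Quorum of master-eligible nodes for Zen discovery: floor(n / 2) + 1."""
    return masters // 2 + 1


def check_topology(t: Topology) -> List[str]:
    """Return deviations from the recommended layout (empty list if none)."""
    problems = []
    if t.masters != 3:
        problems.append(f"expected exactly 3 master nodes, got {t.masters}")
    if t.data_nodes < 2:
        problems.append(f"expected at least 2 data nodes, got {t.data_nodes}")
    if t.clients < 1:
        problems.append(f"expected at least 1 client node, got {t.clients}")
    return problems


recommended = Topology(masters=3, data_nodes=2, clients=1)
print(check_topology(recommended))
print(minimum_master_nodes(recommended.masters))
```

With 3 masters the quorum is 2, so the cluster can lose one master-eligible node and still elect a master.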
Full details and instructions are here.
Client nodes with Kibana, Grafana and Cerebro
Once deployed, the cluster is fully configured, and accessible via the deployed client nodes. The client nodes also expose Kibana instances and a Cerebro UI on top of the cluster, so everything is fully visible and ready for use. There is also Grafana installed, for those who prefer using Grafana dashboards on top of Elasticsearch.
Those client nodes are also the ones your apps need to talk to (internally of course). They are password protected (the password is automatically generated, and can be retrieved using terraform output), though you might want to drop the password protection and rely on your vnet and private IPs instead, removing public IP access entirely. I discussed security concerns in this blog before.
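If you want to pick up the generated password from a script rather than by hand, something along these lines could work. It is a rough sketch that shells out to terraform output -json and flattens the result; the output name used in the sample (clients_password) is a placeholder I made up, so check the names your deployment actually prints.

```python
import json
import subprocess
from typing import Any, Dict


def parse_outputs(raw_json: str) -> Dict[str, Any]:
    """Flatten `terraform output -json` into a {name: value} mapping."""
    return {name: entry["value"] for name, entry in json.loads(raw_json).items()}


def read_outputs(workdir: str = ".") -> Dict[str, Any]:
    """Run `terraform output -json` in workdir and return the flattened outputs."""
    result = subprocess.run(
        ["terraform", "output", "-json"],
        cwd=workdir,
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_outputs(result.stdout)


# Sample of the JSON shape; "clients_password" is a hypothetical output name.
sample = '{"clients_password": {"sensitive": false, "type": "string", "value": "example"}}'
outputs = parse_outputs(sample)
```

Calling read_outputs() from the directory holding your Terraform state returns the same kind of mapping for the real deployment.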
Note: The first time Kibana is initialized, it takes about 10 minutes to become available while it does some one-time optimization and compression work.
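If you script anything after the deployment, it can help to wait for Kibana rather than fail on the first request. Below is a small polling sketch, assuming Kibana's /api/status endpoint is reachable from where the script runs; with the password protection in place the probe would also need to send credentials, which is left out here. The function names and retry numbers are illustrative only.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def kibana_is_up(base_url: str, timeout: float = 5.0) -> bool:
    """Return True if GET <base_url>/api/status answers with HTTP 200."""
    url = f"{base_url.rstrip('/')}/api/status"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeouts and HTTP errors all count as "not up yet".
        return False


def wait_until(
    probe: Callable[[], bool],
    attempts: int,
    delay: float,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call probe up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

For example, wait_until(lambda: kibana_is_up("http://10.0.0.4:5601"), attempts=40, delay=30) would poll for up to roughly 20 minutes; the address is a placeholder.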
Elastic Discovery on Azure
Unfortunately, the story of cluster discovery on Azure is quite bad. There is an Azure "Classic" discovery plugin that has been deprecated since circa 5.0, and Elastic has yet to release a properly working replacement. If you want to track it, there is a PR for an Azure RM discovery plugin which has been open for over a year now without any real progress.
A discovery plugin on a public cloud is important because it takes a lot of complexity off your hands, and manages the initial discovery of cluster nodes using the available cloud APIs.
Having none available, I defaulted to using vnet and naming conventions. Another viable option is file-based discovery, where a file listing your cluster's nodes is uploaded to the images and used as a seed.
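To illustrate how naming conventions and file-based discovery can fit together, here is a minimal sketch that derives host names from a prefix and renders them in the one-host-per-line format used by Elasticsearch's unicast_hosts.txt. The prefix and numbering scheme are assumptions for the example and not necessarily what the repository uses; where the file has to live depends on your Elasticsearch version, so check the discovery documentation for it.

```python
from typing import Iterable, List


def conventional_hosts(prefix: str, count: int) -> List[str]:
    """Build host names such as es-master-0, es-master-1, ... (hypothetical scheme)."""
    return [f"{prefix}-{i}" for i in range(count)]


def render_unicast_hosts(hosts: Iterable[str], port: int = 9300) -> str:
    """Render seed hosts as host:port lines; 9300 is the default transport port."""
    lines = ["# Seed hosts for file-based discovery, one per line"]
    lines += [f"{host}:{port}" for host in hosts]
    return "\n".join(lines) + "\n"


content = render_unicast_hosts(conventional_hosts("es-master", 3))
print(content)
```

Listing only the master-eligible nodes is usually enough for a seed file, since other nodes learn about the rest of the cluster once they have joined.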
The Azure repository plugin is installed on the cluster and ready to be used for index snapshots and, should you ever need it, a restore. Official documentation is available here: https://www.elastic.co/guide/en/elasticsearch/plugins/current/repository-azure-usage.html
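As a rough sketch of what using the plugin looks like, the snippet below builds the request that registers a snapshot repository of type azure. It assumes the storage account credentials are already configured on the nodes, and it only constructs the request without sending it; the URL, repository name and container name are placeholders, and a password-protected client node would additionally need an Authorization header.

```python
import json
import urllib.request
from typing import Optional


def azure_repository_request(
    es_url: str,
    repo_name: str,
    container: str,
    base_path: Optional[str] = None,
) -> urllib.request.Request:
    """Build a PUT /_snapshot/<repo_name> request for an Azure-backed repository."""
    settings = {"container": container}
    if base_path:
        settings["base_path"] = base_path
    body = json.dumps({"type": "azure", "settings": settings}).encode("utf-8")
    return urllib.request.Request(
        url=f"{es_url.rstrip('/')}/_snapshot/{repo_name}",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Placeholder values; sending it would be urllib.request.urlopen(request).
request = azure_repository_request(
    "http://localhost:9200", "backups", "es-snapshots", base_path="cluster1"
)
```

Once the repository is registered, snapshots are created and restored through the regular _snapshot API described in the documentation linked above.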