Elasticsearch One Tip a Day: Avoid Costly Scripts At All Costs

Elasticsearch supports scripting in several places, but the most notable and useful application is in queries. You can use Groovy scripts at query time to perform filtering, adjust scoring, or compute aggregations. Scripting is especially useful because scripts can access and work with fields in the indexed documents.

However, scripts are very slow to run. Furthermore, in recent Elasticsearch versions dynamic scripting with Groovy is disabled by default because of a security vulnerability. So while dynamic scripting is a very useful tool for prototyping, the security implications and the associated performance hit (running a script requires initializing a sandboxed scripting VM) keep it from being production-worthy.

Often there is a good alternative to scripts: using more sophisticated queries, pre-computing values at index time, or using the built-in decay functions. In general, Elasticsearch's Function Score Query is a very powerful tool that can do a lot without the cost of scripts (although it is not free either).
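To make this concrete, here is a minimal sketch of a script-free scoring request, written in Python only to build the JSON body. It assumes a hypothetical index whose documents have a `title` text field, a `published_at` date field, and a `popularity` number that was pre-computed at index time; none of these names come from a real mapping. It combines a `gauss` decay function with `field_value_factor` inside a Function Score Query, so no script is involved.

```python
import json


def build_decay_query(text, origin="now", scale="30d"):
    """Build a function_score request body that needs no scripts.

    Field names (title, published_at, popularity) are hypothetical.
    """
    return {
        "query": {
            "function_score": {
                "query": {"match": {"title": text}},
                "functions": [
                    # Built-in decay: newer documents score higher.
                    {
                        "gauss": {
                            "published_at": {
                                "origin": origin,
                                "scale": scale,
                                "decay": 0.5,
                            }
                        }
                    },
                    # Uses a value computed once at index time
                    # instead of a per-document script.
                    {
                        "field_value_factor": {
                            "field": "popularity",
                            "modifier": "log1p",
                        }
                    },
                ],
                "score_mode": "multiply",
                "boost_mode": "multiply",
            }
        }
    }


if __name__ == "__main__":
    # Print the body so it can be sent with any HTTP client.
    print(json.dumps(build_decay_query("elasticsearch"), indent=2))
```

The resulting body can be posted to the `_search` endpoint of the index. Whether a decay function or a pre-computed field fits better depends on how often the underlying value changes.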

For cases where you really need a computation that isn't supported otherwise, there is Lucene Expressions support. It is very limited, but considerably more performant because expressions are parsed and run natively and don't require a scripting VM.
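As a rough sketch of what an expression-based score might look like, the Python snippet below builds a request body that uses `"lang": "expression"`. The field names (`title`, `popularity`) and the `weight` parameter are hypothetical, and the layout of the `script` object (for example `source` versus older key names) differs between Elasticsearch versions, so treat the exact shape as an assumption to check against the documentation for your version.

```python
import json


def build_expression_score_query(text, weight=1.2):
    """Build a function_score body that scores with a Lucene expression.

    Assumption: the 'script' object layout below matches your
    Elasticsearch version; older releases use different key names.
    Field names and the 'weight' parameter are hypothetical.
    """
    return {
        "query": {
            "function_score": {
                "query": {"match": {"title": text}},
                "script_score": {
                    "script": {
                        "lang": "expression",
                        # Expressions are limited to numeric math on
                        # doc values, _score, and numeric params.
                        "source": "_score * sqrt(doc['popularity'].value) * weight",
                        "params": {"weight": weight},
                    }
                },
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_expression_score_query("elasticsearch"), indent=2))
```

Expressions only cover numeric computations, so anything involving strings or control flow still needs one of the other options.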

If no other option remains, consider using native scripts instead. Native scripts are in fact Elasticsearch plugins written in Java and compiled into a JAR file. They are more annoying to maintain, but they give you the most power at the highest speed you can get. A good example of a native script plugin is maintained here: https://github.com/imotov/elasticsearch-native-script-example.

