How to Control Impala Daemon’s Memory Limit

This article explains the Impala Daemon’s processes and how to control the maximum memory each one can use.

The Impala Daemon has two different processes running. One is written in C++ and is used by the backend, mainly for query processing. The other is written in Java and is used by the frontend for query compilation, storing metadata and so on. The Java process is embedded into the backend’s C++ process, so the two share the same process ID. As a result, the way to control how much memory each one can take is quite different.

Memory Limit for C++ Process:

To control the memory limit for the C++ backend process, so that each Impala Daemon does not overcommit itself when running queries, Cloudera Manager provides a native configuration setting. Go to Cloudera Manager Home Page > Impala > Configuration > Impala Daemon Memory Limit. Update the value, save, and then restart Impala.

To confirm that the change has taken effect, navigate to the Impala Daemon’s web UI at http://:25000/varz and search for “mem_limit”.

Memory Limit for Java Process:

By default, Impala uses a quarter of the host’s physical memory, or 32GB, whichever is smaller, for its frontend Java process, which is used mainly for query compilation and storing metadata. Normally you do not need to change this. However, if you think it uses too much memory, or not enough, you can change it with the following steps:

1. Go to Cloudera Manager Home Page > Impala > Configuration > Impala Daemon Environment Advanced Configuration Snippet (Safety Valve)
2. Enter the following into the text box:
JAVA_TOOL_OPTIONS=-Xmx?g
Where “?” is the amount of memory, in GB, that you choose for Impala.
3. Save and then restart Impala.

To confirm that the change has taken effect, run the following command on the Impala Daemon’s host:
sudo -u impala jcmd $(pgrep -f IMPALAD) VM.flags
If the “jcmd” command returns a “command not found” error, you might need to add the path to Java’s bin directory. Sample output looks like this:
12821:
-XX:CICompilerCount=2 -XX:InitialHeapSize=62914560 -XX:MaxHeapSize=994050048 \
-XX:MaxNewSize=331350016 -XX:MinHeapDeltaBytes=524288 -XX:NewSize=20971520 \
-XX:OldSize=41943040 -XX:+UseCompressedClassPointers -XX:+UseCompressedOops -XX:+UseParallelGC
You can compare the value of -XX:MaxHeapSize with the value you set in JAVA_TOOL_OPTIONS to make sure they match.
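
Because -Xmx is given in GB while VM.flags reports -XX:MaxHeapSize in bytes, a small conversion can make the comparison easier. The following is only a minimal sketch and not part of the original steps: the helper names max_heap_bytes and gb_to_bytes are hypothetical, and it assumes a bash shell with a grep that supports -o. It parses flags copied from the sample output above rather than a live process.

```shell
#!/usr/bin/env bash

# Hypothetical helper: read VM.flags text on stdin and print the
# numeric value (in bytes) of the first -XX:MaxHeapSize flag found.
max_heap_bytes() {
  grep -o -e '-XX:MaxHeapSize=[0-9]*' | head -n 1 | cut -d= -f2
}

# Hypothetical helper: convert the whole number N used in -Xmx<N>g
# to bytes. The JVM interprets the "g" suffix as 1024^3 bytes.
gb_to_bytes() {
  echo $(( $1 * 1024 * 1024 * 1024 ))
}

# Flags taken from the sample output above.
sample='-XX:InitialHeapSize=62914560 -XX:MaxHeapSize=994050048 -XX:MaxNewSize=331350016'

actual=$(printf '%s\n' "$sample" | max_heap_bytes)
echo "MaxHeapSize: ${actual} bytes ($(( actual / 1024 / 1024 )) MiB)"
# prints: MaxHeapSize: 994050048 bytes (948 MiB)

# What -Xmx4g would be expected to show, for comparison.
echo "-Xmx4g expected: $(gb_to_bytes 4) bytes"
# prints: -Xmx4g expected: 4294967296 bytes
```

To check a running daemon instead of the sample text, the output of the jcmd command shown earlier could be piped into max_heap_bytes and compared with gb_to_bytes for the number you put in JAVA_TOOL_OPTIONS.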
