Spark Job SASL Authentication Error

Spark has an internal mechanism that authenticates executors with the driver controlling a given application. This can be controlled by setting “spark.authenticate” to “true”, as part of spark-submit’s parameters, like below: This setting is required if you have “spark.authenticate.enableSaslEncryption” enabled, or called “Enable Network Encryption” in Cloudera Manager. So if …

Oozie Spark Action Not Loading Spark Configurations

Recently I was working on an issue that Oozie was not able to pick up Spark’s configuration and caused job to fail. The reason that I know it was not loading Spark’s configuration was because spark had “spark.authenticate=true” set in its configuration file under file /etc/spark/conf/spark-defaults.conf. And I confirmed Oozie …

Oozie Hive2 Action Failed with Error: “HiveSQLException: Failed to execute session hooks”

If you have an Oozie Hive2 job that fails with below error message randomly, which can be found in Oozie’s server log, located by default under /var/log/oozie: It is likely that you are hitting a possible issue with JDK. Please refer to HADOOP-13809 for details. There is no prove at …

Impala Query Failed with ERROR “AnalysisException: ORDER BY expression not produced by aggregation output”

Recently, I discovered a bug in Impala that when you are using Expression in the ORDER BY clause, the query will fail with below error message: Customer used a very complicated query, and I managed to simplify it to look something like below: This can be re-produced from CDH5.13.x onward. …