Impala nested inline view produces incorrect result when referencing the same column implicitly

Impala nested inline view produces incorrect result when referencing the same column implicitly

This article explains how to workaround the impala bug: IMPALA-2643. To see what the issue is, have a look at the test case below:
CREATE TABLE t (a int);
INSERT INTO TABLE t VALUES(NULL), (1);
SELECT COUNT(*) FROM t;
Query: SELECT COUNT(*) FROM t
+----------+
| COUNT(*) |
+----------+
| 2        |
+----------+

CREATE VIEW vv (aa, bb) AS SELECT a aa, a bb FROM t;

SELECT COUNT(*) FROM vv;
+-----+
| _c0 |
+-----+
| 1   |
+-----+
We are expecting the two returned numbers should be the same of value “2”. This happens when nested inline view referencing the same column name implicitly. In the above example, we referenced the same column “a” twice with different aliases, and it affects Impala older than 2.5.0. The workaround is to cast one of the column names. See test case below:
CREATE TABLE t (a int);
INSERT INTO TABLE t VALUES(NULL), (1);

SELECT COUNT(*) FROM t;
+----------+
| count(*) |
+----------+
| 2        |
+----------+

CREATE VIEW vv (aa, bb) AS SELECT a aa, CAST(a AS int) bb FROM t;

SELECT COUNT(*) FROM vv;
+-----+
| _c0 |
+-----+
| 2   |
+-----+
The difference is that we changed from “a bb” to “CAST(a AS int) bb”.

Loading

Leave a Reply

Your email address will not be published. Required fields are marked *

My new Snowflake Blog is now live. I will not be updating this blog anymore but will continue with new contents in the Snowflake world!