This article explains how to workaround the impala bug: IMPALA-2643. To see what the issue is, have a look at the test case below:
CREATE TABLE t (a int);
INSERT INTO TABLE t VALUES(NULL), (1);
SELECT COUNT(*) FROM t;
Query: SELECT COUNT(*) FROM t
+----------+
| COUNT(*) |
+----------+
| 2        |
+----------+

CREATE VIEW vv (aa, bb) AS SELECT a aa, a bb FROM t;

SELECT COUNT(*) FROM vv;
+-----+
| _c0 |
+-----+
| 1   |
+-----+
We are expecting the two returned numbers should be the same of value “2”. This happens when nested inline view referencing the same column name implicitly. In the above example, we referenced the same column “a” twice with different aliases, and it affects Impala older than 2.5.0. The workaround is to cast one of the column names. See test case below:
CREATE TABLE t (a int);
INSERT INTO TABLE t VALUES(NULL), (1);

SELECT COUNT(*) FROM t;
+----------+
| count(*) |
+----------+
| 2        |
+----------+

CREATE VIEW vv (aa, bb) AS SELECT a aa, CAST(a AS int) bb FROM t;

SELECT COUNT(*) FROM vv;
+-----+
| _c0 |
+-----+
| 2   |
+-----+
The difference is that we changed from “a bb” to “CAST(a AS int) bb”.

Leave a Reply

Your email address will not be published. Required fields are marked *