This article explains how to fix the posix_fallocate error that appears in the Impala Daemon log when running an Impala query.
Here is what happened:
1) An Impala query with a GROUP BY clause failed
2) Checked the query profile to find the host of the coordinator
3) Found the following in the coordinator log:
16:24:52.857059 31382 coordinator.cc:479] Query id=34447b48cbaa3649:6e785ead00e43d8c failed because fragment id=34447b48cbaa3649:6e785ead00e43d8f failed.
4) Checked the profile again to find the host that fragment id 34447b48cbaa3649:6e785ead00e43d8f was running on
5) Found the following in the Impala Daemon log on that host:
16:24:52.831714 26858 runtime-state.cc:230] Error from query 34447b48cbaa3649:6e785ead00e43d8c: posix_fallocate(350, 19864223744, 8388573) failed for file /tmp/impala-scratch/34447b48cbaa3649:6e785ead00e43d8c_45559564-1d20-4c53-bf29-8cd4f3e00a86 with returnval=28 description=
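The fields in this log line can be pulled apart and the return value decoded with a short script. This is only a sketch: the regular expression is written against the single log line shown above, and it assumes the three numbers inside posix_fallocate(...) are printed in call order (fd, offset, len). The function name parse_fallocate_error is made up for illustration.

```python
import errno
import os
import re

# Log line copied from the Impala Daemon log above (description is empty in the original).
LOG_LINE = (
    "16:24:52.831714 26858 runtime-state.cc:230] Error from query "
    "34447b48cbaa3649:6e785ead00e43d8c: "
    "posix_fallocate(350, 19864223744, 8388573) failed for file "
    "/tmp/impala-scratch/"
    "34447b48cbaa3649:6e785ead00e43d8c_45559564-1d20-4c53-bf29-8cd4f3e00a86 "
    "with returnval=28 description="
)

# Assumption: arguments are logged as (fd, offset, len), matching the call signature.
PATTERN = re.compile(
    r"posix_fallocate\((?P<fd>\d+), (?P<offset>\d+), (?P<length>\d+)\) "
    r"failed for file (?P<path>\S+) with returnval=(?P<rc>\d+)"
)


def parse_fallocate_error(line):
    """Return the parsed fields of a posix_fallocate error line, or None if no match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    rc = int(match.group("rc"))
    return {
        "fd": int(match.group("fd")),
        "offset": int(match.group("offset")),
        "length": int(match.group("length")),
        "path": match.group("path"),
        "rc": rc,
        # Map the numeric return value to its symbolic errno name and message.
        "errno_name": errno.errorcode.get(rc, "UNKNOWN"),
        "message": os.strerror(rc),
    }


info = parse_fallocate_error(LOG_LINE)
print(info["errno_name"], "-", info["message"])
print("offset GiB:", info["offset"] / 2**30)
```

On Linux, 28 maps to ENOSPC. The offset of 19864223744 bytes is 18.5 GiB, which suggests the scratch file had already grown to that size when Impala asked for roughly 8 MiB more.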
The function posix_fallocate() ensures that disk space is allocated for the file referred to by the descriptor fd for the bytes in the range starting at offset and continuing for len bytes. After a successful call to posix_fallocate(), subsequent writes to bytes in the specified range are guaranteed not to fail because of lack of disk space. More information can be found in the posix_fallocate(3) man page.
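To see the call in action outside of Impala, the same function is exposed in Python as os.posix_fallocate on Linux. The sketch below is not how Impala calls it internally; the helper name preallocate and the demo file name are made up for illustration. It reserves space for a file and reports an out-of-space condition instead of raising.

```python
import errno
import os
import tempfile


def preallocate(path, offset, length):
    """Reserve `length` bytes starting at `offset` in `path`.

    Returns True on success and False if the filesystem is out of space (ENOSPC).
    Any other error is re-raised.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        os.posix_fallocate(fd, offset, length)
        return True
    except OSError as exc:
        if exc.errno == errno.ENOSPC:
            return False
        raise
    finally:
        os.close(fd)


# Demo on a throwaway file: reserve 1 MiB, then check how large the file is.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "scratch-demo")
    ok = preallocate(demo_path, 0, 1024 * 1024)
    demo_size = os.path.getsize(demo_path)

print("allocated:", ok, "size:", demo_size)
```

When the filesystem cannot provide the requested range, the call fails up front with ENOSPC, which is the situation recorded in the Impala log.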
By default, Impala's scratch directory is under /tmp.
A return value of 28 is ENOSPC ("No space left on device"), and the file in question is under /tmp/impala-scratch, so we can see that /tmp did not have enough disk space. The scratch directory is used by Impala's disk spilling feature, which is used when there is not enough memory to run certain operations such as aggregations or hash joins. When one of the Impala daemons fails to create its temporary files there, the whole query is cancelled.
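A quick way to confirm this on the affected host is to compare the free space on the scratch filesystem with the size of the allocation that failed. The snippet below is a minimal sketch using Python's shutil.disk_usage; the helper name can_allocate is made up, and the /tmp path and the length come from the log line above.

```python
import shutil


def can_allocate(directory, length):
    """Return True if the filesystem holding `directory` has at least `length` bytes free."""
    return shutil.disk_usage(directory).free >= length


# Length taken from the failed posix_fallocate call in the log.
FAILED_LENGTH = 8388573

usage = shutil.disk_usage("/tmp")
print("free GiB on /tmp:", round(usage.free / 2**30, 2))
print("room for the failed allocation:", can_allocate("/tmp", FAILED_LENGTH))
```

Keep in mind that the check only reflects the moment it runs; space may have been freed again once the failed query's scratch files were cleaned up.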
The solution is to change Impala's scratch directory from the default /tmp to a location with more disk space; the disks used for HDFS data are good candidates.
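As a rough aid for choosing the new location, the sketch below ranks candidate directories by free space and formats them for the impalad scratch_dirs startup flag (exposed in Cloudera Manager as the Impala Daemon Scratch Directories setting). The candidate paths are hypothetical placeholders, and the exact flag syntax should be checked against the documentation for your Impala version.

```python
import shutil


def rank_scratch_candidates(paths):
    """Return (path, free_bytes) pairs for paths that exist, most free space first."""
    usable = []
    for path in paths:
        try:
            usable.append((path, shutil.disk_usage(path).free))
        except OSError:
            # Skip paths that do not exist or cannot be inspected on this host.
            continue
    return sorted(usable, key=lambda item: item[1], reverse=True)


def scratch_dirs_flag(paths):
    """Format directories as a comma-separated scratch_dirs flag value."""
    return "--scratch_dirs=" + ",".join(paths)


# Hypothetical candidates: replace with the data disk mount points on your hosts.
candidates = ["/data/1/impala", "/data/2/impala", "/tmp"]
ranked = rank_scratch_candidates(candidates)
for path, free in ranked:
    print(path, round(free / 2**30, 1), "GiB free")
print(scratch_dirs_flag([path for path, _ in ranked]))
```

The Impala daemons need to be restarted for a new scratch directory setting to take effect.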
Hope this helps.