Hive/Impala Timestamp Conversion with “AM/PM”

Hive/Impala Timestamp Conversion with “AM/PM”

It is common that we need to convert a string representation of date/time into timestamp value, and we know that there are lots of different formats to represent date/time using string. This blog post will focus on the time that contains “AM/PM” while converting to Timestamp value from Hive or Impala

Hive does support converting time string that contains “AM/PM” into Timestamp, see below example:

SELECT UNIX_TIMESTAMP('4/20/2018 09:26:35 pm', "MM/dd/yyyy hh:mm:ss a");
+-------------+--+
|     _c0     |
+-------------+--+
| 1524223595  |
+-------------+--+

There are a couple of things to note here:

  • The hour should be “hh”, not “HH”, as “HH” will return 24 hour format, which will not make sense when combined with AM/PM
  • Hive is written in Java, so it follows on what Java uses, which supports the below list of Date Time formats:
LetterDate or Time ComponentPresentationExamples
GEra designatorTextAD
yYearYear199696
YWeek yearYear200909
MMonth in year (context sensitive)MonthJulyJul07
LMonth in year (standalone form)MonthJulyJul07
wWeek in yearNumber27
WWeek in monthNumber2
DDay in yearNumber189
dDay in monthNumber10
FDay of week in monthNumber2
EDay name in weekTextTuesdayTue
uDay number of week (1 = Monday, …, 7 = Sunday)Number1
aAm/pm markerTextPM
HHour in day (0-23)Number0
kHour in day (1-24)Number24
KHour in am/pm (0-11)Number0
hHour in am/pm (1-12)Number12
mMinute in hourNumber30
sSecond in minuteNumber55
SMillisecondNumber978
zTime zoneGeneral time zonePacific Standard TimePSTGMT-08:00
ZTime zoneRFC 822 time zone-0800
XTime zoneISO 8601 time zone-08-0800-08:00

Source: https://docs.oracle.com/javase/8/docs/api/java/text/SimpleDateFormat.html

You can see that to represent “AM/PM”, you need to use pattern “a”.

However, if you want to CAST a Date/Time string to Timestamp type, having “AM/PM” in the Date/Time string won’t work, you need to use 24 hour format, see below example:

-- not working
SELECT CAST('2020-01-06 09:26:35 PM' AS timestamp);
+-------+--+
|  _c0  |
+-------+--+
| NULL  |
+-------+--+

-- working
SELECT CAST('2020-01-06 21:26:35' AS timestamp);
+------------------------+--+
|          _c0           |
+------------------------+--+
| 2020-01-06 21:26:35.0  |
+------------------------+--+

Impala, on the other hand, currently does not support to convert Date/Time string that contains “AM/PM” to Timestamp, it will return NULL instead. It is reported in upstream JIRA IMPALA-3381, which is duplicated by IMPALA-4018, and it is still not resolved at the time of writing.

The workaround is to use Hive for such transformations.

Loading

Leave a Reply

Your email address will not be published. Required fields are marked *

My new Snowflake Blog is now live. I will not be updating this blog anymore but will continue with new contents in the Snowflake world!