It is common that we need to convert a string representation of date/time into timestamp value, and we know that there are lots of different formats to represent date/time using string. This blog post will focus on the time that contains “AM/PM” while converting to Timestamp value from Hive or Impala
Hive does support converting time string that contains “AM/PM” into Timestamp, see below example:
SELECT UNIX_TIMESTAMP('4/20/2018 09:26:35 pm', "MM/dd/yyyy hh:mm:ss a"); +-------------+--+ | _c0 | +-------------+--+ | 1524223595 | +-------------+--+
There are a couple of things to note here:
- The hour should be “hh”, not “HH”, as “HH” will return 24 hour format, which will not make sense when combined with AM/PM
- Hive is written in Java, so it follows on what Java uses, which supports the below list of Date Time formats:
Letter | Date or Time Component | Presentation | Examples |
---|---|---|---|
G | Era designator | Text | AD |
y | Year | Year | 1996 ; 96 |
Y | Week year | Year | 2009 ; 09 |
M | Month in year (context sensitive) | Month | July ; Jul ; 07 |
L | Month in year (standalone form) | Month | July ; Jul ; 07 |
w | Week in year | Number | 27 |
W | Week in month | Number | 2 |
D | Day in year | Number | 189 |
d | Day in month | Number | 10 |
F | Day of week in month | Number | 2 |
E | Day name in week | Text | Tuesday ; Tue |
u | Day number of week (1 = Monday, …, 7 = Sunday) | Number | 1 |
a | Am/pm marker | Text | PM |
H | Hour in day (0-23) | Number | 0 |
k | Hour in day (1-24) | Number | 24 |
K | Hour in am/pm (0-11) | Number | 0 |
h | Hour in am/pm (1-12) | Number | 12 |
m | Minute in hour | Number | 30 |
s | Second in minute | Number | 55 |
S | Millisecond | Number | 978 |
z | Time zone | General time zone | Pacific Standard Time ; PST ; GMT-08:00 |
Z | Time zone | RFC 822 time zone | -0800 |
X | Time zone | ISO 8601 time zone | -08 ; -0800 ; -08:00 |
Source: https://docs.oracle.com/javase/8/docs/api/java/text/SimpleDateFormat.html
You can see that to represent “AM/PM”, you need to use pattern “a”.
However, if you want to CAST a Date/Time string to Timestamp type, having “AM/PM” in the Date/Time string won’t work, you need to use 24 hour format, see below example:
-- not working SELECT CAST('2020-01-06 09:26:35 PM' AS timestamp); +-------+--+ | _c0 | +-------+--+ | NULL | +-------+--+ -- working SELECT CAST('2020-01-06 21:26:35' AS timestamp); +------------------------+--+ | _c0 | +------------------------+--+ | 2020-01-06 21:26:35.0 | +------------------------+--+
Impala, on the other hand, currently does not support to convert Date/Time string that contains “AM/PM” to Timestamp, it will return NULL instead. It is reported in upstream JIRA IMPALA-3381, which is duplicated by IMPALA-4018, and it is still not resolved at the time of writing.
The workaround is to use Hive for such transformations.