A bug in time
Spoiler alert: this is about SystemTimeToTzSpecificLocalTime, but first I'm going to give a bit of background, so feel free to skim ahead if you're already familiar with the Windows API.
Windows time!
Windows generally represents time using a FILETIME:
struct FILETIME {
    uint32_t dwLowDateTime;
    uint32_t dwHighDateTime;
};
This is a 64-bit integer but with the alignment of a 32-bit integer[1]. It encodes the number of 100-nanosecond intervals since the 1st of January, 1601 UTC[2]. This is a number in the range 0 to 9,223,372,036,854,775,807 (aka INT64_MAX). Negative numbers are usually invalid but some APIs may overload them to mean something special. The value of 0 may also be special, depending on the context. E.g. passing a zeroed FILETIME to SetFileTime tells it to leave that timestamp unchanged.
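As a minimal sketch of what "a 64-bit integer in two halves" means in practice, the two fields can be stitched together with a shift and an OR. The `FileTimeParts` type and both function names are made up for this example (a stand-in for the real `FILETIME` so it compiles without the Windows headers).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the Win32 FILETIME: same two 32-bit halves. */
typedef struct {
    uint32_t dwLowDateTime;
    uint32_t dwHighDateTime;
} FileTimeParts;

/* Two 32-bit fields, so 8 bytes in total but only 4-byte alignment. */
static_assert(sizeof(FileTimeParts) == 8, "two 32-bit halves expected");

/* Combine the halves into a single count of 100-nanosecond intervals. */
static uint64_t filetime_to_u64(FileTimeParts ft) {
    return ((uint64_t)ft.dwHighDateTime << 32) | ft.dwLowDateTime;
}

/* Split a 64-bit count back into its low and high halves. */
static FileTimeParts u64_to_filetime(uint64_t ticks) {
    FileTimeParts ft;
    ft.dwLowDateTime = (uint32_t)(ticks & 0xFFFFFFFFu);
    ft.dwHighDateTime = (uint32_t)(ticks >> 32);
    return ft;
}
```

Going through the fields like this avoids casting a `FILETIME` pointer to a 64-bit integer pointer, which is where the alignment difference can bite.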
Ok, so we've got a 64-bit integer. Maybe we got the time from a file or maybe we asked the OS for the precise time. Now we want to show it to the user but showing them a large integer would be pretty meaningless. Fortunately there's a function to help with that, FileTimeToSystemTime:
BOOL FileTimeToSystemTime(
[in] const FILETIME *lpFileTime,
[out] LPSYSTEMTIME lpSystemTime
);
It converts the FILETIME we have into a SYSTEMTIME, which looks like this[3]:
struct SYSTEMTIME {
    // The year. The valid values for this member are 1601 through 30827.
    uint16_t wYear;
    uint16_t wMonth;
    uint16_t wDayOfWeek;
    uint16_t wDay;
    uint16_t wHour;
    uint16_t wMinute;
    uint16_t wSecond;
    uint16_t wMilliseconds;
};
It's simple enough to then format the fields however the user likes. Note the comment on wYear. Based on what I said earlier about FILETIME you can probably guess why valid values are restricted to 1601 through 30827. The start of 1601 is FILETIME zero whereas 30827 is the last full year that fits into a FILETIME (most of 30828 fits too, but not beyond the 14th of September). It's funny, people talk a lot about Y2K or the Y2038 problem but for some reason the Y30828 problem doesn't even have a dedicated Wikipedia page. Go figure.
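If you'd like to check the 30828 claim rather than take my word for it, here is a rough sketch of the arithmetic. It assumes the usual Gregorian leap year rules and leans on the calendar repeating every 400 years (146,097 days). `year_of_filetime` and `is_leap_year` are names invented for this example.

```c
#include <assert.h>
#include <stdint.h>

#define TICKS_PER_DAY      864000000000LL /* 86,400 s * 10,000,000 ticks per second */
#define DAYS_PER_400_YEARS 146097LL       /* length of one full Gregorian cycle */

static int is_leap_year(int64_t year) {
    return (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
}

/* Returns the year containing a non-negative tick count and stores the
   1-based day within that year in *day_of_year. */
static int64_t year_of_filetime(int64_t ticks, int *day_of_year) {
    assert(ticks >= 0);
    int64_t days = ticks / TICKS_PER_DAY;

    /* Skip whole 400-year cycles first; each one starts on a year like 1601. */
    int64_t year = 1601 + 400 * (days / DAYS_PER_400_YEARS);
    days %= DAYS_PER_400_YEARS;

    /* Then walk the remaining (at most 399) years one at a time. */
    for (;;) {
        int length = is_leap_year(year) ? 366 : 365;
        if (days < length) break;
        days -= length;
        year += 1;
    }
    *day_of_year = (int)days + 1;
    return year;
}
```

Feeding it `INT64_MAX` gives year 30828, day 258. In a leap year, day 258 is the 14th of September.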
Anyway, we've got our human readable time. Job done? Not quite. Recall that the FILETIME is in UTC. Converting it to SYSTEMTIME doesn't change that. Fortunately there's a function to help with that, SystemTimeToTzSpecificLocalTime:
BOOL SystemTimeToTzSpecificLocalTime(
[in, optional] const TIME_ZONE_INFORMATION *lpTimeZoneInformation,
[in] const SYSTEMTIME *lpUniversalTime,
[out] LPSYSTEMTIME lpLocalTime
);
This is pretty straightforward as APIs go. It takes in a UTC time (lpUniversalTime) and uses the given timezone (lpTimeZoneInformation) to convert it to a local time. If no timezone is given then it converts it to whatever the system's current timezone is set to. lpLocalTime reuses the SYSTEMTIME structure from before so keeping track of timezone information is your responsibility.
Anyway, now we're done right?
Where things go wrong
Ok, I'm finally ready to get to the point. What's wrong with SystemTimeToTzSpecificLocalTime?
Well for a start you probably don't want to omit the timezone unless it's for a one-off conversion. The user may change their timezone at any time so it's usually better to cache it at the start of an operation and use the same one consistently throughout. Update it only when you do something new. Or never, for short-lived processes (e.g. CLI utilities). But that's not a bug; it's a feature.
An actual issue is that Windows built-in timezones have pretty limited historical data. If the date is within the last decade or so it'll likely be good enough. But beyond that it falls back to just using the default timezone offsets (i.e. the same offsets as are used for a time one second ago). This is wrong™. However, in practice it often doesn't matter too much. I can't remember much about what I was doing in the 18th century so I'm not all that bothered if an old file's time is a few hours out. So this is unfortunate but it's not buggy per se. Just a bit wrong.
Where SystemTimeToTzSpecificLocalTime truly gets stuck is at the boundaries. Recall once again that Windows dates start at 1601 UTC. So what happens if you're unfortunate enough to live in a negative timezone? Say you're trying to convert 1601-01-01 02:28 to a local time with the offset of -03:00. If my calculations are correct then that should give us 1600-12-31 23:28 but SystemTimeToTzSpecificLocalTime gets confused and returns an error instead. Why? As far as I can tell it seems to run into problems calculating if the timezone's daylight saving times are in effect. So it ends up with a SYSTEMTIME with a wYear set to 1600, which it then sends to another function which sees the year and throws a tantrum. Admittedly, this is mostly speculation and I could be completely wrong about the cause.
In any case, the effect of all this is you can't display a perfectly valid FILETIME to the user using their own timezone. Or can you? There are a few ways round this.
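To make the boundary problem concrete, here is a small sketch of the arithmetic. It only shows that the local time lands below FILETIME zero. It says nothing about what SystemTimeToTzSpecificLocalTime actually does internally. `apply_utc_offset` is a name made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TICKS_PER_MINUTE 600000000LL /* 60 s * 10,000,000 ticks per second */

/* Adds a UTC offset (in minutes, negative for west of Greenwich) to a UTC
   tick count. Returns true only if the local time is still a valid,
   non-negative FILETIME value. If the sum would run past INT64_MAX it
   returns false and leaves *local_ticks untouched. */
static bool apply_utc_offset(int64_t utc_ticks, int32_t offset_minutes,
                             int64_t *local_ticks) {
    assert(local_ticks != NULL);
    assert(utc_ticks >= 0);
    int64_t delta = (int64_t)offset_minutes * TICKS_PER_MINUTE;
    if (delta > 0 && utc_ticks > INT64_MAX - delta) {
        return false; /* the other boundary: past the end of the range */
    }
    *local_ticks = utc_ticks + delta;
    return *local_ticks >= 0;
}
```

For 1601-01-01 02:28 UTC (148 minutes' worth of ticks) and an offset of -180 minutes, the result is 32 minutes before zero, so the function returns false. The local time exists. It just doesn't fit.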
You can grab the timezone data using functions like GetDynamicTimeZoneInformation or GetTimeZoneInformationForYear and calculate the correct offset yourself. You can even grab the timezone data directly from the registry where it's stored. Take a look at HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Time Zones. All the values are well documented in TIME_ZONE_INFORMATION and DYNAMIC_TIME_ZONE_INFORMATION.
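Once you have a bias in hand, doing the conversion yourself might look something like the sketch below. It assumes the TIME_ZONE_INFORMATION convention (UTC = local time + bias, in minutes), takes the already-combined bias as a plain number rather than calling any Windows API, and uses a proleptic Gregorian calendar so dates before 1601 still work. The `LocalTime` type and `utc_ticks_to_local` are invented for this example, and the day-to-date step is the well-known "days from civil" style of calculation rather than anything Windows-specific.

```c
#include <assert.h>
#include <stdint.h>

#define TICKS_PER_MINUTE 600000000LL    /* 60 s * 10,000,000 ticks per second */
#define TICKS_PER_DAY    864000000000LL /* 86,400 s * 10,000,000 ticks per second */

/* Hypothetical result type; the year is wide enough to hold 1600. */
typedef struct {
    int64_t year;
    int month, day, hour, minute;
} LocalTime;

/* Assumes utc_ticks is at least a day away from the int64_t limits. */
static LocalTime utc_ticks_to_local(int64_t utc_ticks, int32_t bias_minutes) {
    assert(bias_minutes > -1440 && bias_minutes < 1440);
    int64_t local = utc_ticks - (int64_t)bias_minutes * TICKS_PER_MINUTE;

    /* Floor division, so instants before 1601 land on the previous day. */
    int64_t days = local / TICKS_PER_DAY;
    int64_t rem = local % TICKS_PER_DAY;
    if (rem < 0) { rem += TICKS_PER_DAY; days -= 1; }

    /* Re-base onto 0000-03-01 (584,694 days before 1601-01-01) so that
       leap days fall at the end of each March-based year. */
    int64_t z = days + 584694;
    int64_t era = (z >= 0 ? z : z - 146096) / 146097;
    int64_t doe = z - era * 146097;                    /* day of era  [0, 146096] */
    int64_t yoe = (doe - doe / 1460 + doe / 36524 - doe / 146096) / 365;
    int64_t doy = doe - (365 * yoe + yoe / 4 - yoe / 100); /* from 1 March */
    int64_t mp = (5 * doy + 2) / 153;                  /* month, March = 0 */

    LocalTime t;
    t.day = (int)(doy - (153 * mp + 2) / 5 + 1);
    t.month = (int)(mp < 10 ? mp + 3 : mp - 9);
    t.year = yoe + era * 400 + (t.month <= 2 ? 1 : 0);
    t.hour = (int)(rem / (60 * TICKS_PER_MINUTE));
    t.minute = (int)(rem / TICKS_PER_MINUTE % 60);
    return t;
}
```

With 148 minutes' worth of ticks and a bias of 180 (i.e. an offset of -03:00) this produces 1600-12-31 23:28, matching the hand calculation from earlier.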
I see you
So right about now you might be thinking that sounds great but it's also like a lot of work. And also didn't you say earlier that the data was wrong™? So that's a lot of effort just to be wrong.
I hear you, imaginary person. Fortunately there's a solution and it's called icu.dll. This provides a C interface[4] for the International Components for Unicode APIs, which happily includes a whole API wrapping their wealth of timezone data. I will however be upfront about the downside: it's only available on recent versions of Windows. Windows 10 1703 first included icuin.dll, which contains the APIs we're interested in. In Windows 10 1903 this was rolled into icu.dll for all your internationalization needs.
A tutorial on using icu.dll is probably better left to another time but essentially you can use ucal_open to get a calendar for a particular timezone (or pass a null zone ID for the default; UCAL_DEFAULT is the calendar type, not the timezone). Then use ucal_setMillis
to set the time in milliseconds from the Unix epoch (aka 1970-01-01 00:00 UTC). This does require a bit of calculating to convert from a FILETIME. I could manually calculate how many 100-nanosecond intervals there are between 1601-01-01 00:00 and 1970-01-01 00:00 or I could just ask SystemTimeToFileTime and cache the result. Either way the answer is 116444736000000000 (spoiler, I guess). Erm, sorry, I said I wasn't writing a tutorial so I'll leave converting to Unix time (and to milliseconds) as an exercise for the reader.
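For anyone who'd rather skip the exercise, a minimal sketch follows. It assumes the FILETIME has already been combined into a non-negative 64-bit tick count, and it rounds towards negative infinity so times before 1970 don't end up a millisecond late. The function and macro names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

#define UNIX_EPOCH_AS_FILETIME 116444736000000000LL /* 1970-01-01 00:00 UTC in ticks */
#define TICKS_PER_MILLISECOND  10000LL              /* 10,000 * 100 ns = 1 ms */

/* Converts a FILETIME tick count to milliseconds relative to the Unix epoch. */
static int64_t filetime_to_unix_millis(int64_t filetime_ticks) {
    assert(filetime_ticks >= 0);
    int64_t delta = filetime_ticks - UNIX_EPOCH_AS_FILETIME;
    int64_t millis = delta / TICKS_PER_MILLISECOND;
    /* C division truncates towards zero; step down for negative remainders
       so the result is a true floor. */
    if (delta % TICKS_PER_MILLISECOND < 0) {
        millis -= 1;
    }
    return millis;
}
```

FILETIME zero comes out as -11,644,473,600,000 ms, which is the 369 years between the two epochs.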
Though one final thought is that in my testing I found the ucal interface to be a bit unwieldy and ucal_open relatively expensive to call. So it might be worth just using ucal_getTimeZoneTransitionDate to cache the transition data and then do our own thing with it. But anyway, I think I can save investigating this further for another time.
Conclusion
Why did I spend so many words on a bug that could have been summed up in a sentence or two[5]? Answers on a postcard please.
[1] For historical reasons. The docs contain appropriate warnings about converting to/from 64-bit integers. Be careful out there!
[2] 1601 was chosen because it was the year when Microsoft started selling computer software.
[3] Note that a Win32 SYSTEMTIME is not to be confused with what NT calls a SystemTime, which is essentially the same as a FILETIME but using an actual 64-bit integer.
[4] But not the C++ interface because the C++ ABI is unstable.
[5] It's fortunate that nobody reads footnotes because this one is completely unrelated to what is referencing it.