Dealing with time and time zones is hard work.

We've all been in a situation where we've had to decide how to process, store or transmit times without falling foul of time zones and complications such as daylight saving time.

I recently watched a recording of Jon Skeet's Working with Time is Easy conference talk, which highlights some of the challenges of dealing with time and provides some guidance on how to handle them. The talk has since motivated me to better understand those challenges and what we should consider when designing systems that have to cope with time zones.

Dealing with time is easy isn't it? Don't we just set the server to UTC and store everything in UTC? Kind of, but it's a bit more complicated than that.

Local Time can be Undefined and Ambiguous

Before considering the complexity of time zones, it's first worth considering the issues that can exist within local time.

Daylight Saving Time (DST) is observed in many different parts of the world. Some places have DST, some places don't. Some countries observe DST in certain regions but not in others.

In the UK, we observe DST between the end of March and the end of October: DST starts at 01:00 on the last Sunday in March, when clocks go forward by 1 hour, and ends at 02:00 on the last Sunday in October, when clocks go back by 1 hour.
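As a rough illustration of the "last Sunday" rule, the transition dates for a given year can be worked out with Python's standard library. The helper name last_sunday is my own, not a library function, and this is only a sketch of the UK rule described above.

```python
import calendar
from datetime import date, timedelta


def last_sunday(year: int, month: int) -> date:
    """Return the date of the last Sunday in the given month (hypothetical helper)."""
    last_day = date(year, month, calendar.monthrange(year, month)[1])
    # date.weekday() numbers Monday as 0 and Sunday as 6, so this is the
    # number of days to step back from the end of the month.
    days_back = (last_day.weekday() - 6) % 7
    return last_day - timedelta(days=days_back)


print(last_sunday(2021, 3))   # 2021-03-28
print(last_sunday(2021, 10))  # 2021-10-31
```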

The start and end of DST can be represented visually as follows:

[Figure: Transitioning to/from daylight saving time in the UK]

The left-hand graph shows the start of DST in March, where the clocks go forward by an hour, and the right-hand graph shows the end of DST in October, where the clocks go back by an hour.

There are a few implications of DST which can add to the edge cases when dealing with time:

  • At the point where DST starts, local time is undefined between 01:00 and 01:59.
  • At the point where DST ends, local time is ambiguous between 01:00 and 01:59.
    If someone asks you to meet them at 01:30 local time, which 01:30 do they mean? The first one or the second one?

In a system where local times can be entered by a user, some thought may need to be given to how these two edge cases will be handled. Here are a few options to consider:

Undefined times:

  • Raise an error and force the user to reconsider their input
  • Shift the time forward by the length of the undefined period, e.g. if the user enters 01:30, which is undefined because the clocks went forward an hour, assume they probably meant 02:30

Ambiguous times:

  • Raise an error and ask the user to refine their input
  • Make an assumption about which occurrence to use (e.g. first occurrence).

The correct answer in these cases is likely to depend on your use case and the experience you want your users to have.

Regardless of the solution, edge cases like this should be well tested.
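Before any of those options can be applied, the system has to notice that a local time is undefined or ambiguous in the first place. The following is a minimal sketch of one way to do that, assuming Python 3.9+ with the standard zoneinfo module and an installed time zone database. The function classify_local_time is a hypothetical helper of mine, not part of any library.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def classify_local_time(naive: datetime, zone_name: str) -> str:
    """Return "valid", "ambiguous" or "undefined" for a wall-clock time."""
    zone = ZoneInfo(zone_name)
    # fold=0 and fold=1 select the rules before and after a transition.
    first = naive.replace(tzinfo=zone, fold=0)
    second = naive.replace(tzinfo=zone, fold=1)
    if first.utcoffset() == second.utcoffset():
        return "valid"
    # The offsets differ, so the wall time sits in a transition. A time that
    # really exists survives a round trip through UTC unchanged.
    round_trip = first.astimezone(timezone.utc).astimezone(zone)
    if round_trip.replace(tzinfo=None) == naive:
        return "ambiguous"
    return "undefined"


print(classify_local_time(datetime(2021, 3, 28, 1, 30), "Europe/London"))   # undefined
print(classify_local_time(datetime(2021, 10, 31, 1, 30), "Europe/London"))  # ambiguous
print(classify_local_time(datetime(2021, 6, 1, 12, 0), "Europe/London"))    # valid
```

What to do with an "ambiguous" or "undefined" result (raise an error, shift the time, pick the first occurrence) is still the product decision discussed above.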

Time Zones are Geopolitical

It's easy to fall into the trap of thinking of time zones as being purely geographical, especially when people usually encounter time zones when travelling between different parts of the world.

Time zones are also influenced by politics, and political decisions have been responsible for many changes to time zones over the years. One recent example comes from 2019, when Brazil abolished daylight saving time. Until 2019, Brazil typically observed DST between the start of November and mid-February.

Offsets ≠ Time Zone

ISO 8601 is a fairly well adopted international standard for representing datetimes. According to ISO 8601 a datetime can be represented as a UTC time:

2021-01-17T17:18:46Z

Or as a local time with a UTC offset:

2021-01-17T19:18:46+02:00

For the most part ISO 8601 works well because a local date time can be converted to ISO 8601 format by calculating the UTC offset (e.g. +02:00) for the current time zone.

ISO 8601 can fall short for representing times in the future. Because ISO 8601 uses an offset rather than an actual time zone, you could run the risk that the rules for a time zone change between when a time is captured and when a time occurs.

Let's take the example of Brazil abolishing DST in 2019. Assume we live in Sao Paulo, Brazil, and the year is 2015. We schedule a meeting for 10:00 local time on 5 December 2020. Our hypothetical calendar system records the time in ISO 8601 format as a datetime with a UTC offset; because December falls within DST under the 2015 rules, it stores an offset of -02:00. Fast forward several years and Brazil abolishes daylight saving time. Our system has no knowledge that the recorded time is for Brazil because only the offset of -02:00 was saved, and an offset of -02:00 is shared by many time zones. As a result, our calendar system still reminds us at an offset of -02:00 rather than -03:00. Under the new rules that is 09:00 local time, so we turn up an hour early for a meeting that everyone else expects at 10:00.
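The Brazil scenario can be reproduced in a few lines. This is a sketch that assumes Python 3.9+ with zoneinfo and a time zone database recent enough to include the 2019 change; the variable names are purely illustrative.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

sao_paulo = ZoneInfo("America/Sao_Paulo")

# What the hypothetical calendar stored in 2015: the local time plus the
# offset that the rules of the day predicted (-02:00, i.e. DST).
stored = datetime(2020, 12, 5, 10, 0, tzinfo=timezone(timedelta(hours=-2)))

# What the user meant: 10:00 on the wall clock in Sao Paulo, resolved with
# whatever rules the installed time zone database holds today.
intended = datetime(2020, 12, 5, 10, 0, tzinfo=sao_paulo)

# Wall-clock time in Sao Paulo at the stored instant.
print(stored.astimezone(sao_paulo).strftime("%H:%M"))  # 09:00
# Gap between the instant the user meant and the instant that was stored.
print(intended - stored)  # 1:00:00
```

Because the zone name was kept for the second value, it follows the rule change automatically. The first value is pinned to an offset that no longer applies.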

API Designers don't always get it right

All programming languages that I've encountered ship with some kind of API for working with dates and times. As dates and times seem somewhat fundamental, you'd be forgiven for thinking that programming language designers could put together a decent date/time API.

Anyone who used Java prior to version 8 will remember the horrors of java.util.Date and java.util.Calendar, which had many painful quirks such as mutable objects and years defined relative to 1900.

Let's take the java.util.Date example:

new Date(121, 2, 24)

That describes 24 March 2021. Defining the year relative to 1900 and starting the month ordinal at 0 is not intuitive.

More worryingly, some APIs don't provide any protection against using dates that are ambiguous or undefined. In the Europe/London time zone, 01:30 on Sunday, 28 March 2021 is an undefined time as it falls during the transition where the clocks go forward an hour for DST.

In the following example Ruby code, an exception is thrown when I try to define this time:

$ irb
2.5.7 :001 > require 'tzinfo' 
2.5.7 :002 > tz = TZInfo::Timezone.get('Europe/London')
 => #<TZInfo::DataTimezone: Europe/London> 
2.5.7 :003 > tz.local_to_utc(Time.utc(2021, 3, 28, 1, 30, 0))
Traceback (most recent call last):
        6: from /home/james/.rvm/rubies/ruby-2.5.7/bin/irb:11:in `<main>'
        5: from (irb):4
        4: from /home/james/.rvm/gems/ruby-2.5.7/gems/tzinfo-1.2.7/lib/tzinfo/timezone.rb:464:in `local_to_utc'
        3: from /home/james/.rvm/gems/ruby-2.5.7/gems/tzinfo-1.2.7/lib/tzinfo/time_or_datetime.rb:324:in `wrap'
        2: from /home/james/.rvm/gems/ruby-2.5.7/gems/tzinfo-1.2.7/lib/tzinfo/timezone.rb:468:in `block in local_to_utc'
        1: from /home/james/.rvm/gems/ruby-2.5.7/gems/tzinfo-1.2.7/lib/tzinfo/timezone.rb:384:in `period_for_local'
TZInfo::PeriodNotFound (TZInfo::PeriodNotFound)

On the other hand, if I try the same thing in Python:

$ python3
Python 3.6.9 (default, Oct  8 2020, 12:12:24)
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.

>>> from pytz import timezone
>>> from datetime import datetime, timedelta
>>> london = timezone("Europe/London")
>>> # Method 1:
>>> london.localize(datetime(2021, 3, 28, 1, 30, 0))
datetime.datetime(2021, 3, 28, 1, 30, tzinfo=<DstTzInfo 'Europe/London' GMT0:00:00 STD>)
>>> # Method 2:
>>> datetime(2021, 3, 28, 1, 30, 0,tzinfo=london)
datetime.datetime(2021, 3, 28, 1, 30, tzinfo=<DstTzInfo 'Europe/London' LMT-1 day, 23:59:00 STD>)

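I've tried two different methods in Python for creating a localised time and both have allowed me to create an object that represents the time, despite it not being a valid time for the given time zone. Note that Method 2 is not how pytz is meant to be used, which is why it reports the LMT offset, and that localize only raises NonExistentTimeError for a time like this when it is called with is_dst=None.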

The key point here is to understand how, or even if, the programming language you're using deals with these complexities. Prior to Java 8, the standard APIs didn't provide a good way of working with datetimes – this led to the development of the hugely popular Joda-Time library, which is an excellent alternative for anyone working with an older version of Java. Fortunately, those involved in Joda-Time had a significant influence on the development of the java.time API that was introduced in Java 8.

If your language doesn't natively deal with time very well, there may well be a third-party library out there that can make up for any shortcomings. Another example is NodaTime for the .NET ecosystem.

Communicating Times

When building a system it is almost inevitable that at some point you will need to communicate time values over the wire. For example, you may have a calendar application that needs to request a user's appointments over an API. How should the start and end times of these appointments be represented?

ISO 8601 is a fairly well known standard for representing dates and times, and the goal of the standard (according to Wikipedia) is:

to provide an unambiguous and well-defined method of representing dates and times, so as to avoid misinterpretation of numeric representations of dates and times, particularly when data is transferred between countries with different conventions for writing numeric dates and times

ISO 8601 can be used to represent just a date:

2021-01-24

As well as representing a datetime:

2021-01-24T07:28:37-05:00
2021-01-24T12:28:37Z

A datetime will end with either Z for UTC or a UTC offset for the relevant time zone (e.g. -05:00).

ISO 8601 is great – for the most part. It is perfect for representing either the current time or a time in the past. It's less perfect at representing a time in the future. This comes back to the point that time zones are geopolitical: the rules for a time zone could change in the future, which could change the UTC offset. An offset is also not enough to determine which time zone it refers to, as several time zones can share the same offset. Whether or not this is an issue for you is likely to depend on your use case. Assuming that time zone rules won't change in the future may be acceptable for your use case, but that is down to you to decide.
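To make the "offset, not time zone" point concrete, here is a small sketch that parses the two example datetimes with Python's standard library. The parse_iso wrapper is a hypothetical helper; it exists only because datetime.fromisoformat() does not accept a trailing Z before Python 3.11.

```python
from datetime import datetime


def parse_iso(text: str) -> datetime:
    """Parse an ISO 8601 datetime that ends in Z or a numeric offset."""
    # Normalise a trailing "Z" to an explicit +00:00 offset for older Pythons.
    if text.endswith("Z"):
        text = text[:-1] + "+00:00"
    return datetime.fromisoformat(text)


local = parse_iso("2021-01-24T07:28:37-05:00")
utc = parse_iso("2021-01-24T12:28:37Z")

print(local == utc)    # True: both describe the same instant
print(local.tzname())  # UTC-05:00
```

The parsed value knows it is five hours behind UTC, but nothing in it says which of the zones at that offset the time was recorded in.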

It is also worth noting that RFC 3339 is another contender for representing datetimes. It is largely a subset of the ISO 8601 standard which, for the most part, removes ambiguity. For instance:

  • 2-digit year representations are not allowed; years must be specified in the 4-digit format
  • a full date and time representation is required (only fractional seconds are optional)
  • the T date/time separator may be replaced with a space for readability

Neither ISO 8601, RFC 3339, nor any other standard (at the time of writing) makes provision for explicitly defining the time zone of the datetime. Some have suggested using an extended ISO 8601 format to represent time zones:

2021-01-24T10:15:30-05:00[America/Cancun]

In this variation of ISO 8601, the time zone is appended in square brackets – this is an extension to ISO 8601, and isn't an official standard.

Alternatively, you could use a pure ISO 8601 representation and provide the time zone as another field. For instance, if you had a JSON payload:

{
  "title": "Some appointment",
  ...,
  "start" : {
    "timestamp": "2021-01-24T10:15:30-05:00",
    "timezone": "America/Cancun"
  },
  ...
}
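A consumer of a payload like this could rebuild a zone-aware value and, at the same time, check that the transmitted offset still agrees with the current rules for the named zone. The sketch below assumes Python 3.9+ with zoneinfo; read_start is a hypothetical helper, and the elided fields from the payload above are simply left out.

```python
import json
from datetime import datetime
from zoneinfo import ZoneInfo

payload = json.loads("""
{
  "title": "Some appointment",
  "start": {
    "timestamp": "2021-01-24T10:15:30-05:00",
    "timezone": "America/Cancun"
  }
}
""")


def read_start(start: dict) -> datetime:
    """Rebuild a zone-aware datetime and check the offset still matches."""
    with_offset = datetime.fromisoformat(start["timestamp"])
    # Keep the wall-clock time that was sent and attach the named zone.
    local = with_offset.replace(tzinfo=ZoneInfo(start["timezone"]))
    if local.utcoffset() != with_offset.utcoffset():
        raise ValueError("offset no longer matches the rules for " + start["timezone"])
    return local


start = read_start(payload["start"])
print(start.isoformat())  # 2021-01-24T10:15:30-05:00
```

Raising an error on a mismatch is just one possible policy. A real system might prefer to trust the local time and zone and recompute the offset.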

Storing Datetimes

When storing datetimes, there can often be a question of what to store.

Some people advocate for:

Convert a datetime to UTC as soon as possible and store the UTC timestamp in the database. Only convert back to a local datetime for presentation to the user.

As we've covered fairly extensively by now, this approach can work if you don't care about the time zone accuracy of future datetimes. 2021-01-24T10:15:30 could be the local time in Cancun, which we convert to UTC and store as 2021-01-24T15:15:30Z. With this approach we can convert the UTC time accurately into a user's local time, provided the rules for their time zone did not change between us capturing the datetime and the datetime occurring.

Using UTC datetimes isn't a completely defunct strategy – as with all of this, it depends on the use case. If you're managing a server-side part of your system, it is recommended that the server runs in UTC and that any internal functionality (e.g. running a task at a specific time, recording the time at which the system calculated a certain value, etc.) works off UTC time.

There are a few reasons that support this:

  • If your server-side infrastructure runs out of data centres in different time zones, they'll need to work off UTC to coordinate activities such as database replication.
  • Times in UTC are unambiguous, so you don't have to worry about time going backwards when exiting daylight saving time. In local time, that jump could cause problems such as ambiguous timestamps in log files, or scheduled tasks running twice when they should only have run once.
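The second point can be seen by converting two distinct UTC instants around the end of UK DST into local time. This is only an illustrative sketch, assuming Python 3.9+ with zoneinfo.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

london = ZoneInfo("Europe/London")

# Two different instants, one hour apart, either side of the clocks going back.
first = datetime(2021, 10, 31, 0, 30, tzinfo=timezone.utc)
second = first + timedelta(hours=1)

for instant in (first, second):
    local = instant.astimezone(london)
    print(instant.strftime("%H:%M UTC"), "->", local.strftime("%H:%M %Z"))
# 00:30 UTC -> 01:30 BST
# 01:30 UTC -> 01:30 GMT
```

A log line or cron-style schedule that only records "01:30" cannot tell these two instants apart, whereas the UTC values never collide.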

If we now return to the main challenge of determining how to store user-inputted datetime values, it's worth considering how most database systems are able to store datetime values. Typically a datetime will be stored as a specific data type made up of the following components:

  • the date
  • the time
  • the UTC offset (e.g. +02:00)

Again, this works perfectly fine for current or past datetimes, but can be limiting when storing times in the future. For instance, if we look at the PostgreSQL documentation on storing datetimes in this way they note the limitation:

For times in the future, the assumption is that the latest known rules for a given time zone will continue to be observed indefinitely far into the future.

In Jon Skeet's Working with Time is Easy conference talk he provides some clear guidance on what to store. His advice ultimately boils down to storing everything that you know at the point of user input. Don't omit any information and don't fabricate any either.

So if the user tells you they want to be reminded about an event at 09:00 on 8 February 2021 in Europe/London, then you should store the local datetime and the time zone. If you decide to store this information in any other format (e.g. as UTC or UTC with an offset), you're losing information (in this case, the time zone).

Having read both Jon Skeet's and Lau Taarnskov's respective blog posts on working with time zones, I conclude that for user-inputted datetimes, knowing and storing the local datetime and relevant time zone provides the richest representation.

How should you store a time zone? The most unambiguous way to store a time zone is to use the IANA canonical identifier (e.g. Europe/London). IANA maintains a time zone database for all of the historic and upcoming time zone rules.

Time zone abbreviations should be avoided because they are ambiguous, for example CST can stand for:

  • Central Standard Time
  • Australian Central Standard Time
  • China Standard Time
  • Cuba Standard Time
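The ambiguity is easy to demonstrate from the time zone database itself. The sketch below assumes Python 3.9+ with zoneinfo and relies on the abbreviations shipped in the installed IANA data, so the exact strings could vary between database versions.

```python
from datetime import datetime
from zoneinfo import ZoneInfo

# A winter date, so none of these zones is observing daylight saving time.
winter = datetime(2021, 1, 24, 12, 0)

abbreviations = {}
offsets = {}
for name in ("America/Chicago", "Asia/Shanghai", "America/Havana"):
    local = winter.replace(tzinfo=ZoneInfo(name))
    abbreviations[name] = local.tzname()
    offsets[name] = local.utcoffset()
    print(name, abbreviations[name], offsets[name])
# America/Chicago CST -1 day, 18:00:00
# Asia/Shanghai CST 8:00:00
# America/Havana CST -1 day, 19:00:00
```

Three different zones at three different offsets all report the same abbreviation, which is why the canonical identifier is the safer thing to store.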

But how will database queries work well if we store local times and time zones? If you opt to store local times and their respective time zones, it is perfectly valid to also store a computed field that converts the time to UTC, or UTC with an offset, for the purpose of efficiently running queries against dates. An example record may look like this:

Title:             "Some appointment"
StartTimestamp:    "2021-01-24T10:15:30-05:00"
StartTimezone:     "America/Cancun"
StartUTCTimestamp: "2021-01-24T15:15:30Z" (computed)

Any computed columns should be recomputed if IANA issues updated time zone rules. It is possible to subscribe to notifications of time zone changes from IANA.
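Recomputing such a column amounts to resolving the stored local datetime against the current rules for the stored zone. A minimal sketch follows, assuming Python 3.9+ with zoneinfo; compute_utc_timestamp is a hypothetical helper and the column names are the ones from the example record.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def compute_utc_timestamp(local_datetime: str, zone_name: str) -> str:
    """Derive the UTC value from a local datetime and an IANA zone name."""
    # If the input carries an offset, replace() discards it and keeps only the
    # wall-clock time, so the result always follows the current zone rules.
    local = datetime.fromisoformat(local_datetime).replace(tzinfo=ZoneInfo(zone_name))
    return local.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


# StartTimestamp and StartTimezone in, StartUTCTimestamp out.
print(compute_utc_timestamp("2021-01-24T10:15:30-05:00", "America/Cancun"))
# 2021-01-24T15:15:30Z
```

Running this again after a time zone database update is what keeps the computed column honest. The local datetime and zone name remain the source of truth.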

Summary

Working with time zones can be difficult and complicated, but with some careful consideration it's possible to distil the topic down to a few key points:

  • Local datetimes can be ambiguous and undefined
  • Time zones are geopolitical: just because the rules are what they are today doesn't guarantee that they will be the same in the future
  • Knowing an hour offset doesn't tell you the time zone
  • Understand how your programming language deals with ambiguous and undefined local times
  • Servers should run UTC and system generated times/events should be saved in UTC
  • Storing a user defined datetime as UTC and converting back at the presentation layer may not be enough
  • Beware of time zone abbreviations and hour offsets – these are ambiguous. Try and use the IANA canonical name if you need to know the timezone
  • For user-inputted/defined datetimes store all the information you have.