In this post, I lay out the reason that you should always transfer dates in a JSON/REST API as an integer epoch value: the number of milliseconds since Jan 1, 1970. If you think it should be anything else, please read on….
In the JSON representation, you must convert whatever internal format you are using to a string, and it must be parsed on the other side. So what is the best string representation for a date? See earlier post “Storing Date Values in Files” which is relevant to this discussion. There, I consider two potential formats:
- A) An integer epoch value
- B) An ISO-8601 standard date format
Comparison of Approaches
Compactness: ( A > B ) The ISO format is always twice as long or longer. We always think that disk space and bandwidth are free, but if the file is half the size, it makes a difference.
Precision: ( A = B ) Both formats represent time values covering all of history down to millisecond resolution. There is no particular advantage of one format over the other in terms of precision.
Human Readability: ( A ?? B ) The one “advantage” of ISO-8601 is that it display a meaningful date value with days, hours, minutes, etc. It is easier for a human to look at the value and know what it is. The reason I mark this as (??) questionable is that there is a dark side to human readability that encourages programmers to be MISLEAD about the value they think they are seeing. I will argue that in a separate section. In the end, I am convinced that you do NOT want the value readable. If you want to know the value, you can easily take it to an Epoch Value Converter page and find out what it means. So the value plainly there and visible for inspection if you need to … the question is: how often are humans going to read these files? This is an API protocol! This is for machines to talk to each other. For debugging, you can use an epoch converter, but there is no real need for humans to read these files on any kind of regular basis.
The Dark Side of Human Readibility
Human-readable date values are problematic for one reason: time zones. It is easy enough to think the “time” part will be a couple of hours off, but people usually forget that this effects the DATE portion. They see two values showing “4-26-1990″ and they think they are on the same day — but they may not be!
The fact that you can see something like a date, fools the viewer into thinking that the value represented will be on the day they experienced. It is actually misleading.
You have to convert at every point correctly! Every client and every server needs to support every time zone and any other one might generate. While this information about time zones is relatively stable, it does sometimes change, and that can introduce a source of error. If a particular environment uses a time zone that another environment does not know about, you get a failure to communicate.
Some of the problem with computational complexity of ISO-8601 format is reduced if you always use UTC time zone (Zebra Time — from the Z that designates the time zone.) However, this raises the human readability problems. It is NEVER in the local time, which reduces or eliminates the value of human readability.
For example, a timestamp value for 6pm on Nov 21 in California, will have a UTC date value of Nov 22, which will cause the human reader to think that it occurred on a different day than a value taken two hours earlier at 4pm. Having the date part change int he “middle of the day” is not human readable, and causes more problems than it solves.
Programmer often make mistakes when manipulating date/time values in human readable think that if you want to find the “beginning of the day” that they can set the hours & minutes to 00:00, but that only works if the representation is in the same time zone as the user they are working with! Several times recently I have had a server working in one time zone represent just the date part, truncating the time value to 00:00, but then when this is transported to a client in another time zone, it sees the previous day, and then proceeds to truncate the time to 00:00 again, yielding an error of more than one day in the date values.
Most programmers make the mistake that thinking that the time zone affects only the “hour” part, so it is incredibly common for a programmer to strip the time off of a date value. For instance (and I had to find this exact bug recently) a client and server were in different time zones, the client was sending a date (without time) to the server, which was storing it as a complete date (using midnight as the time) and then the full date/dime was sent back to the client which interpreted it as 11pm the previous day. So the day was appearing always one day less than in should have been — and this was a big problem for the users.
It is very easy for the programmer to compose a string for the date above: “1990-04-26T00:00:00Z” and think they have specified April 26, only to find that this is really a date time on April 25 for all of the western hemisphere!
It gets worse when you consider daylight savings time. Many programmers don’t seem to understand that PST (Pacific Standard Time) and PDT (Pacific Daylight Time) are actually different time zones according to the ISO 8601 format. Similarly what can happen is like above, but accidentally specifying “1990-04-26T00:00:00PST” but of course, April should be in daylight time. What happens is that same problem that the 1 hour shift causes the date to be interpreted as 11pm the previous day. And so on.
If you do track the time zone properly, you probably are aware that the daylight time change does not happen at midnight — because otherwise you would have two midnights on one day of the year that would mean it is hard to say which midnight caused the date change. So it is moved to 2am. That avoids the date problem, but you still end up with the largely ignored problem that 1:30am occurs twice on one day a year, and those different 1:30am times are distinguished by the time zone they are in. How much do you want to bet that setting a meeting for 1:30am is messed up on half the systems out there?
The point of using a text encoding that is “human readable” is that people can view the value and know what it means. My experience has been that this is actually a distraction that causes more bugs than it is worth.
Look at the advantages of the epoch value:
- Nobody expects it to be human readable, so it never gets approximated or distorted.
- Humans never mistake it for being on a different day than. The epoch converter will convert two epoch values to the same appropriate time zone and you can compare them appropriately.
- It is very fast on every platform
- It is compact by more than a factor of 2
- Every operating environment on the planet can deal with it
- The millisecond value on the server is always the exact same value on the client JS environment
- There are almost no failure modes to worry about.
The epoch value has none of the problems that the “human readable” values have. Remember that you create a JSON REST API as a data transfer format for machine to write and read. Focus on the approach that is BEST for machine to machine reliability.