How can we develop coding practices designed to protect against leap year bugs? [closed]

Microsoft has just announced that a software error in calculating dates (over a leap year) caused a major outage in Windows Azure last week.

Was it really a simple error in judgement working around DateTime.Now.AddYears(1) on a leap year?

What coding practices could have prevented this?

EDIT: As dcstraw pointed out, DateTime.Now.AddYears(1) on a leap year does in fact return the correct date in .NET. So it's not a framework bug, but evidently a bug in the date calculations.

asked Mar 10 '12 at 14:03

Since there's no 'one' answer, and this is more of a discussion-y question, I think it should be migrated to programmers. Also: Always Unit test your code. Always. -

DateTime issues are gnarly and complicated. One might say something like "do everything in UTC", but there will always be some point of translation... and when that happens, the number of permutations you would have to account for to avoid all bugs is staggering. Most of us will never work on a system that has to care. -

I think this question is too interesting to be closed :-) -

FOUR votes to close? I hope this bad-tempered, trigger-happy close thing is just a phase in the SO evolution. It's all just a bit too holier-than-thou. -

@IainMH have to agree with you especially since nearly all of those who closed it down have never asked a question with more than a few votes. -

3 Answers

Shameless plug:

Use a better date and time API

The built-in .NET date and time libraries are horribly hard to use properly. They do let you do everything you need, but you can't express yourself clearly through the type system. DateTime is a mess, DateTimeOffset may lull you into thinking you're actually preserving the time zone information when you're not, and TimeZoneInfo doesn't force you to think about everything you ought to be considering.

None of these provide a nice way of saying "just a time of day" or "just a date", nor do they make a clear distinction between "local time" and "time in a particular time zone". And if you want to use a calendar other than the Gregorian one, you need to go through the Calendar class the whole time.

All of this is why I'm building Noda Time - an alternative date and time library built on a port of the Joda Time "engine" but with a new (and leaner) API on top.

Some points you may want to think about, which are easy to miss if you're not aware of them:

  • Mapping a local date/time to one in a particular time zone isn't as simple as you might think. A specific local date/time might occur once, twice (ambiguity) or zero times (it's skipped) due to daylight saving transitions
  • Time zones vary historically - more than TimeZoneInfo is generally willing to reveal, frankly. (It doesn't support a time zone whose idea of "standard time" changes over time, or which goes into permanent daylight saving time.)
  • Even with the zoneinfo database, time zone IDs aren't necessarily stable. (CLDR addresses this; something I'm hoping to support in Noda Time eventually.)
  • Textual representations of dates and times are a nightmare, not just in terms of ordering, but date separators, time separators, and odd things like genitive month names
  • The start of the day isn't always midnight - in Brazil, for example, the spring daylight saving transition moves the wall clock from 11:59:59pm to 1am
  • In some cases (well, one that I know about) a time zone can force a whole day to be skipped - December 30th 2011 didn't occur in Samoa! I suspect most developers can probably ignore this one, but...
  • If you're going to use a calendar other than the Gregorian one, be careful and make sure you really know how you expect it to behave.
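
To make the first bullet concrete, here is a minimal sketch using only the built-in TimeZoneInfo (not Noda Time). The LocalTimeChecks class and its Classify method are hypothetical names invented for this illustration; IsInvalidTime and IsAmbiguousTime are the existing framework calls.

```csharp
using System;

// Hypothetical helper for illustration; not part of .NET or Noda Time.
public static class LocalTimeChecks
{
    // Classifies a wall-clock time in the given zone as "skipped",
    // "ambiguous" or "unique".
    public static string Classify(DateTime local, TimeZoneInfo zone)
    {
        // Kind = Unspecified makes TimeZoneInfo interpret the value
        // as a wall-clock time in the zone we ask about.
        var wallClock = DateTime.SpecifyKind(local, DateTimeKind.Unspecified);

        if (zone.IsInvalidTime(wallClock)) return "skipped";     // occurs zero times
        if (zone.IsAmbiguousTime(wallClock)) return "ambiguous"; // occurs twice
        return "unique";
    }
}
```

Assuming a zone obtained with TimeZoneInfo.FindSystemTimeZoneById (the ID is platform-dependent: typically "America/New_York" on Linux/macOS and "Eastern Standard Time" on Windows), 2:30am on March 11th 2012 should come back as skipped and 1:30am on November 4th 2012 as ambiguous.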

As far as specific development practices:

  • Think about what you're really trying to represent. I expect the core benefit of Noda Time to be forcing developers to choose between various different types to represent their data. Get that right, and everything else is simpler.
  • Unit test everything you can think of. That will depend on exactly what your system does, of course, but particularly consider different time zones, what happens across daylight saving transitions, and of course leap years.
  • I'd advise injecting a "clock-like interface" - a service for telling the current time - rather than explicitly calling DateTime.Now or DateTime.UtcNow; it makes it easier (feasible!) to unit test
  • If you're performing multiple operations with "now", obtain that date/time once and remember it, rather than repeatedly requesting "now" - otherwise the value could change in unfortunate ways between the calls.
  • "Do everything in UTC" isn't always the answer either - if I want to know "when exactly does 'two weeks from now' occur in my local time zone?" then I need to store the local date/time as well as the time zone.
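
A rough sketch of the "clock-like interface" idea, combined with reading "now" only once. Every type here (IClock, SystemClock, FixedClock, CertificateIssuer) is a hypothetical name made up for this example; none of them is a framework or Noda Time type.

```csharp
using System;

// Hypothetical abstraction over "what time is it?".
public interface IClock
{
    DateTime UtcNow { get; }
}

// Production implementation: delegates to the system clock.
public sealed class SystemClock : IClock
{
    public DateTime UtcNow => DateTime.UtcNow;
}

// Test implementation: always reports the instant it was given.
public sealed class FixedClock : IClock
{
    private readonly DateTime _utcNow;
    public FixedClock(DateTime utcNow) { _utcNow = utcNow; }
    public DateTime UtcNow => _utcNow;
}

// Hypothetical consumer that needs "now" and "one year from now".
public sealed class CertificateIssuer
{
    private readonly IClock _clock;
    public CertificateIssuer(IClock clock) { _clock = clock; }

    public (DateTime ValidFrom, DateTime ValidTo) NextValidity()
    {
        // Read the clock once so both values derive from the same instant.
        var now = _clock.UtcNow;
        return (now, now.AddYears(1));
    }
}
```

In a unit test you would construct the issuer with a FixedClock set to February 29th and assert on the result; production code would pass a SystemClock instead.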

answered Mar 10 '12 at 15:03

@flq: Again, you'd need to define exactly what you mean by "leap-year safe". I doubt that it was a framework bug that caused the problem in Azure - I expect it was poor use of the framework. - Jon Skeet

Umpteen bullet points of advert without a single one on the subject of leap years (i.e. the topic). And then puffed on Twitter. I think you've had better moments, Jon. - Will Dean

@WillDean: Also note that the question was edited by George Stocker. The original title was "Defensive programming against DateTime bugs" - in which case I think you'll agree my post is entirely relevant. (I only just noticed that the title had been changed. It said DateTime when I started answering...) - Jon Skeet

Jon, sorry, I hadn't seen the previous title! Maybe it should be incumbent on someone changing a title to edit all the answers too... - Will Dean

@WillDean: Yes... I'm tempted to either roll back the title change or make it end with "DateTime bugs (e.g. leap years)" - Jon Skeet

It's worth noting that the bug probably wasn't due to a line like the one you posted:

DateTime.Now.AddYears(1)

That doesn't create an invalid date. If you run:

(new DateTime(2012, 2, 29)).AddYears(1)

you get Feb 28, 2013. I don't know what Azure's guest agent is written in but it must have been a different call that failed. A bad way to have done this in .NET would have been:

new DateTime(today.Year + 1, today.Month, today.Day)

That throws an exception if today is leap day. However, the Microsoft blog about the Azure issue said that they created an invalid date of Feb 29, 2013, which I'm not sure is possible to do with DateTime in .NET.

I'm not saying that DateTime and DateTimeOffset aren't error-prone, just that I don't think they would have caused this particular issue.
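
For anyone who wants to check this themselves, here is a small side-by-side sketch of the two approaches above. The LeapDayDemo class and its method names are hypothetical; only the DateTime constructor and AddYears are real framework members.

```csharp
using System;

// Hypothetical demo class for illustration only.
public static class LeapDayDemo
{
    // Rebuilds the date field by field. On Feb 29 this asks for
    // Feb 29 of a non-leap year, and the DateTime constructor
    // throws ArgumentOutOfRangeException.
    public static DateTime NaiveOneYearLater(DateTime today) =>
        new DateTime(today.Year + 1, today.Month, today.Day);

    // Lets the framework do the arithmetic; AddYears moves Feb 29
    // to Feb 28 when the target year is not a leap year.
    public static DateTime OneYearLater(DateTime today) =>
        today.AddYears(1);
}
```

Either way, DateTime refuses to hold Feb 29, 2013: one path throws and the other clamps.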

answered Mar 10 '12 at 16:03

I'm guessing it was done in C++ - Phil Pursglove

@PhilPursglove: What makes you think that? The Windows OS was mostly written in C and C++ and handles leap days properly. It was probably more to do with a date-related failure in the transfer certificate creation process. - In silico

Could it be parsing a date from a string? - Fixer

And yes, you're right about the AddYears(1), as you pointed out. I'm not sure what went on internally. It was just an attempt to give the question a bit of perspective, without going too deep into the blog post. - Fixer

It kind of makes sense. I honestly don't know if this was the case, but it is credible. I've seen nasty code from big companies, MS included. But again, we really can't do much but speculate at this point. - Alpha

How can we develop coding practices designed to protect against leap year bugs? What coding practices could have prevented this?

Unit testing specific dates, as Jon mentioned, is one coding practice that will help; however, nothing beats what I'd call a 'manual integration test':

Change the clock on your development/testbed server and watch what happens when the time ticks over.

Don't get bogged down on whether this counts as a 'coding practice'. Obviously you can't do this for every date on the calendar, so pick the dates you are concerned with, be that the 29th of February, end-of-month dates or daylight saving changeover dates.
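
The same "pick the dates you care about" idea can also be applied in a unit test. This is only a sketch: EdgeDates, LastDaysOfMonths and FindFailures are hypothetical names, and the list of dates covers month ends only (which includes Feb 29 in a leap year), not daylight saving changeovers.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical test helper for illustration only.
public static class EdgeDates
{
    // Last day of every month in the given year; in a leap year
    // this includes February 29th.
    public static IEnumerable<DateTime> LastDaysOfMonths(int year)
    {
        for (int month = 1; month <= 12; month++)
        {
            yield return new DateTime(year, month, DateTime.DaysInMonth(year, month));
        }
    }

    // Runs the date operation on each edge date and returns the
    // dates on which it threw ArgumentOutOfRangeException.
    public static List<DateTime> FindFailures(int year, Func<DateTime, DateTime> operation)
    {
        var failures = new List<DateTime>();
        foreach (var date in LastDaysOfMonths(year))
        {
            try { operation(date); }
            catch (ArgumentOutOfRangeException) { failures.Add(date); }
        }
        return failures;
    }
}
```

Feeding it a "same day next year" calculation for 2012 should flag February 29th if the calculation rebuilds the date from its parts, and nothing if it uses AddYears.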

answered Mar 13 '12 at 23:03
