All,
I'm a bit confused as to why this is as big a deal as it seems to be. I
_think_ I understand the implication of different calendars, leap seconds,
etc, but:
CF encodes time as "some unit of time since an epoch". e.g. "seconds since
2015-05-08T00:00:00+00:00"
This encoding makes the whole calendar thing a LOT easier than it could be,
because the ONLY place the calendar matters is in the epoch specification.
So for the most part, it's up to the client code to figure out what
calendar to use, and how to use it (if there is a need to translate the
rest of the time series to "human date-times"). In fact, a while back there
was a discussion of allowing ISO8601 strings, rather than this "time since"
stuff, for the time axis -- darn good we didn't go with that!
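To make the "time since" encoding concrete, here is a minimal sketch of decoding it with nothing but the Python standard library. The function name decode_cf_time and the small unit table are made up for illustration. Python's datetime is proleptic Gregorian and knows nothing about leap seconds, so this shows only one possible interpretation of the values.

```python
from datetime import datetime, timedelta

# Seconds per unit, for the fixed-length units this sketch handles.
SECONDS_PER_UNIT = {"seconds": 1, "minutes": 60, "hours": 3600, "days": 86400}


def decode_cf_time(values, units):
    """Turn 'UNIT since EPOCH' offsets into datetimes.

    Hypothetical helper: assumes a proleptic Gregorian calendar with no
    leap seconds, because that is what Python's datetime implements.
    """
    unit, epoch_text = units.split(" since ")
    epoch = datetime.fromisoformat(epoch_text.strip())
    scale = SECONDS_PER_UNIT[unit.strip()]
    return [epoch + timedelta(seconds=v * scale) for v in values]


times = decode_cf_time([0, 3600, 86400], "seconds since 2015-05-08T00:00:00+00:00")
for t in times:
    print(t.isoformat())
```

The calendar only enters through the epoch and through the arithmetic that turns offsets back into date-times; the stored values themselves are plain numbers.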
So: which calendar the epoch is specified in should be clearly defined, but:
* Data creators know what they want -- and if they use an epoch near the time
of concern, it gets much harder for the users to get it wrong.
* If users interpret it wrong, it only really matters when you are comparing
the data in this file with data from some other source, and doing it
differently for those two -- pretty unlikely, actually.
* Leap seconds rarely matter in this context -- you process your data with or
without them, but the worst possible result is the entire data set being
off a couple of seconds from another whole data set -- and if the content
creators use an epoch near where the data actually are, then that won't even
be off.
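A rough sketch of the "epoch near the data" point: the size of the possible offset is the number of leap seconds that fall between the epoch and the data. The table below lists only the leap seconds inserted from 2005 through 2016, so it is complete only for epochs on or after 1999-01-01. The function name leap_seconds_between is made up for illustration.

```python
from datetime import datetime, timezone

# UTC instants immediately after each leap second inserted from 2005 to 2016.
# Earlier leap seconds are deliberately left out of this sketch.
AFTER_LEAP_SECOND = [
    datetime(2006, 1, 1, tzinfo=timezone.utc),
    datetime(2009, 1, 1, tzinfo=timezone.utc),
    datetime(2012, 7, 1, tzinfo=timezone.utc),
    datetime(2015, 7, 1, tzinfo=timezone.utc),
    datetime(2017, 1, 1, tzinfo=timezone.utc),
]


def leap_seconds_between(epoch, when):
    """Count the tabulated leap seconds inserted after epoch and up to when."""
    return sum(1 for t in AFTER_LEAP_SECOND if epoch < t <= when)


data_time = datetime(2015, 5, 11, 12, 0, tzinfo=timezone.utc)
far_epoch = datetime(2000, 1, 1, tzinfo=timezone.utc)
near_epoch = datetime(2015, 5, 8, tzinfo=timezone.utc)

far_offset = leap_seconds_between(far_epoch, data_time)    # epoch years before the data
near_offset = leap_seconds_between(near_epoch, data_time)  # epoch days before the data
print(far_offset, near_offset)
```

With the far epoch, two readers who disagree about leap seconds end up a few seconds apart; with the near epoch there is nothing to disagree about.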
In short -- yes, CF should be clear and precise, but we don't need to get
all worked up about older CF data sets not having a clearly defined default
calendar -- if it really matters (which it very rarely will), then
presumably new datasets will be created with well defined calendars.
Just my $0.02
-Chris
On Mon, May 11, 2015 at 7:35 AM, Jim Biard <jbiard at cicsnc.org> wrote:
> Jonathan,
>
> I still think there's
>
>
> On 5/10/15 3:29 AM, Jonathan Gregory wrote:
>
> Dear all
>
> A postscript to my last message. I am not myself convinced that the backward
> incompatibility (for data-writing) that I suggested below is really worth the
> pain it would cause! I think it might well be OK to retain the existing
> calendar name of "gregorian" from CF1.7, but give it a more precise definition,
> in order to eliminate the ambiguity about leap seconds. (Of course, there is
> nothing we can do to remedy this with existing data.)
>
> With all the arguments otherwise unaltered, that would make my proposal:
>
> * We redefine "gregorian", and introduce a new calendar "gregorian_utc", for
> the real-world calendar, without and with UTC leap seconds respectively.
>
> * We abolish the "standard" calendar (since one thing this discussion has
> shown is that there is not a single standard!) and we require the calendar
> always to be specified (no default).
>
> * We state that all the other calendars have fixed-length days with no leap
> seconds.
>
> My further suggestions, about dealing with the Julian/Gregorian transition
> and negative years in better ways, are unaffected (except to retain the name
> "gregorian"). Those are an optional extra, dealing with a different subject,
> but it would be opportune to do both at once.
>
> Cheers
>
> Jonathan
>
> ----- Forwarded message from Jonathan Gregory <j.m.gregory at reading.ac.uk> -----
>
> Dear Jim
>
> Yes, I'm glad we appear to be approaching an agreement!
>
>
> "In order to calculate a new date and time given a base date, base time and a
> time increment one must know what calendar to use."
> and I think that is the sense in which I am using "calendar".
>
> I agree that this is a CF-consistent usage of the word calendar, but
> it runs against natural usage, and I think it's worth keeping that
> in mind.
>
> OK. Perhaps we can clarify what we mean in the CF convention.
>
>
> I agree that leap seconds haven't been carefully considered before.
> I disagree that nearly all existing time values have been encoded
> without leap seconds. I'd say that nearly all existing time values
> that were derived from true UTC timestamps are at risk of having
> leap second discontinuities encoded into the set of values.
>
> All right. In that case I think we may have to take more serious steps to
> avoid future problems. I'll come back to that.
>
>
> There are three issues here, so let's not conflate them. They are:
>
> 1. What to call the time system that is like UTC in overall form
> (Greenwich meridian, etc) but doesn't include leap seconds.
> 2. How to indicate which actual time system is being used for the time
> part of the reference time in the units attribute.
> 3. How to indicate whether or not the elapsed times in the time
> variable are certain to be free of leap second induced discontinuities.
>
> This is a useful classification, thanks, but I don't think the situation is
> quite as complicated as that.
>
> I'm not sure (2) is something we need to be concerned about - but it may
> be I've missed a point. In my understanding, there is no calendar implied by
> the reference time, because it's given as a timestamp YYYY-MM-DD hh:mm:ss
> [+-hh:mm], where the [] is the optional time-zone. A timestamp is calendar-
> neutral; it can be interpreted in any calendar, except that some dates will be
> illegal in some calendars, and some times might be illegal if leap seconds are
> in effect. The reference date-time is used for encoding and decoding in the
> calendar specified, with or without leap seconds, but itself implies nothing
> about the time system. We could state it as a requirement of CF that the
> reference date and time must be legal in the encoding which applies to the time
> coordinate.
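The idea that one timestamp can be legal in some calendars and illegal in others can be sketched as a small check. This is only an illustration: the function names are hypothetical, only the date part is checked (not a 23:59:60 time), and year-zero questions are ignored.

```python
def is_leap(year, calendar):
    """Leap-year rule for the calendars handled in this sketch."""
    if calendar == "noleap":
        return False
    if calendar == "all_leap":
        return True
    if calendar == "julian":
        return year % 4 == 0
    if calendar == "proleptic_gregorian":
        return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)
    raise ValueError("calendar not handled in this sketch: " + calendar)


def is_legal_date(year, month, day, calendar):
    """Return True if year-month-day exists in the given calendar."""
    if not 1 <= month <= 12:
        return False
    if calendar == "360_day":
        return 1 <= day <= 30  # twelve months of thirty days each
    february = 29 if is_leap(year, calendar) else 28
    lengths = [31, february, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]
    return 1 <= day <= lengths[month - 1]


calendars = ["proleptic_gregorian", "julian", "noleap", "all_leap", "360_day"]
for stamp in [(1900, 2, 29), (2015, 2, 30), (2015, 1, 31)]:
    legal_in = [c for c in calendars if is_legal_date(*stamp, c)]
    print(stamp, "is legal in", legal_in)
```

A requirement that the reference date be legal in the calendar of the time coordinate would amount to a check of this kind at write time.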
>
> Regarding part of (1), we need a name for the time-zone which applies at the
> Greenwich meridian without summer/daylight-saving time. We shouldn't use "UTC",
> as the CF standard currently does (quoting from udunits(3)), because that's
> confusing. Maybe it should just be stated explicitly as I have done!
>
> I agree that it's important to write an explicit definition of the
> timestamp that doesn't reference UTC. I don't see timestamps as being time
> system neutral any more than datestamps are calendar neutral. They may be
> interpretable by more than one system, but that's not the same thing in my
> mind as neutrality.
>
> You can read a Gregorian date as a Julian date, but you are going to be
> off by a number of days. You can read a UTC time as a traditional time, but
> you are going to be off by a number of seconds. The new calendars you
> propose would address the question of what systems (calendar and time) were
> used for all parts of the time reference in the units attribute. That is
> what they accomplish, they address point (2). They don't really address
> point (3) at all.
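The "off by a number of days" remark can be checked with the standard integer formulas for the Julian Day Number. The sketch below reads the same year-month-day once as a Gregorian date and once as a Julian date. The function names are chosen for illustration only.

```python
def _shifted(year, month):
    """Shift the year to start in March, as the usual day-number formulas do."""
    a = (14 - month) // 12
    return year + 4800 - a, month + 12 * a - 3


def jdn_gregorian(year, month, day):
    """Julian Day Number of a date read in the (proleptic) Gregorian calendar."""
    y, m = _shifted(year, month)
    return day + (153 * m + 2) // 5 + 365 * y + y // 4 - y // 100 + y // 400 - 32045


def jdn_julian(year, month, day):
    """Julian Day Number of a date read in the Julian calendar."""
    y, m = _shifted(year, month)
    return day + (153 * m + 2) // 5 + 365 * y + y // 4 - 32083


# Same datestamp, two readings: the difference is the error in days.
offset_2015 = jdn_julian(2015, 5, 11) - jdn_gregorian(2015, 5, 11)
offset_1582 = jdn_julian(1582, 10, 15) - jdn_gregorian(1582, 10, 15)
print(offset_2015, offset_1582)
```

The offset is constant between the years where the two leap rules disagree, which is why a misreading can go unnoticed within a short record.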
>
> To address point (3), I continue to favour a subspecies of calendar name,
> rather than a modifier or separate attribute, because this distinction only
> applies to the real-world calendars. A decomposition of metadata is cumbersome
> if it isn't generally relevant. Suppose X can take values X1 or X2; if it's X1
> then Y can be Y1 or Y2, whereas Y is irrelevant if it's X2. In that situation I
> would have a single attribute with possible values X1_Y1 X1_Y2 and X2. A single
> attribute is easier for scanning a dataset, less work to write and read, and
> more likely to be correct because it's less likely that Y will not be coded if
> relevant or will be coded if irrelevant.
>
> I suggest that the leap seconds are needed only in the Gregorian calendar i.e.
> the real-world one. I think it's unlikely the proleptic Gregorian calendar will
> be used with leap-seconds; you only need this calendar if you're going back
> more than several centuries, in which case it's probably not a dataset which
> has UTC precision in time, and it's most often used with models, which do not
> have leap seconds. (However, a leap-second variety could be introduced if I am
> wrong and it is needed.) Leap seconds are not needed in the noleap, all_leap,
> 360_day and none calendars, which are all for model worlds. The julian calendar
> is used astronomically and might be used in models, but the web page you
> cited (http://www.ucolick.org/~sla/leapsecs/timescales.html) points out that it's
> not a good idea to try to use it with leap seconds since it's based on units
> of day (=86400 s).
>
> Point (3) relates to the question of which calculator (implementation of
> a calendar and/or time system as software for converting date and/or time
> stamps to elapsed times since an epoch) was used to create the elapsed time
> values found in a given time variable. This question is largely independent
> of the question of which systems are represented by the reference date and
> time in the units attribute. An example in terms of dates alone may help
> clarify the issue.
>
> I have a set of datestamps that are based on the Gregorian calendar. I
> calculate elapsed days from the reference date stored in my units attribute
> on my time variable and populate my time variable with the values. I use a
> calculator that is based on the rules of the Julian calendar to get my
> elapsed times.
>
> As long as the span of dates from the reference date to the last datestamp
> in my set doesn't cross any year where the two calendars differ on whether or
> not to add a leap day, the values stored in my time variable will be
> correct. But a 1-day discontinuity will be encoded into my elapsed day
> values each time I do cross a place where the Julian and Gregorian
> calendars differ.
>
> If I turn my elapsed days back into datestamps using the same Julian date
> calculator, no one will notice. If I instead use a Gregorian date
> calculator to recover datestamps and one or more discontinuities were
> encoded, I will find that I don't get back the same set of datestamps I
> started with. If I try to take differences between time variable values and
> one or more discontinuities were encoded, I will find that the results will
> contain an error if the difference is taken across the location of a
> discontinuity.
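The scenario above can be reproduced in a few lines. The sketch below encodes two Gregorian datestamps on either side of February 1900 (a leap year under Julian rules, not under Gregorian rules) with a Julian-rule calculator, then decodes the values with a Gregorian one. The reference date, datestamps and function names are chosen for illustration only.

```python
from datetime import date, timedelta


def julian_day_count(year, month, day):
    """Days since 0001-01-01 under Julian rules (every fourth year is leap)."""
    february = 29 if year % 4 == 0 else 28
    lengths = [31, february, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]
    days_before_year = 365 * (year - 1) + (year - 1) // 4
    return days_before_year + sum(lengths[:month - 1]) + (day - 1)


def encode_julian(stamp, ref):
    """Elapsed days since ref, computed with the Julian-rule calculator."""
    return julian_day_count(*stamp) - julian_day_count(*ref)


def encode_gregorian(stamp, ref):
    """Elapsed days since ref, computed with Python's proleptic Gregorian dates."""
    return (date(*stamp) - date(*ref)).days


def decode_gregorian(elapsed, ref):
    """Recover a datestamp from elapsed days using Gregorian rules."""
    d = date(*ref) + timedelta(days=elapsed)
    return (d.year, d.month, d.day)


ref = (1899, 1, 1)
stamps = [(1900, 2, 28), (1900, 3, 1)]

wrong = [encode_julian(s, ref) for s in stamps]        # calculator does not match the data
right = [encode_gregorian(s, ref) for s in stamps]     # calculator matches the data
recovered = [decode_gregorian(v, ref) for v in wrong]  # decoded with the other calculator

print(wrong, right, recovered)
```

The first value agrees in both encodings; the second is one day too large, so both the recovered datestamp and the difference across the end of February come out wrong.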
>
> This may sound a bit silly when speaking of Gregorian vs Julian calendars,
> but this is exactly what has been happening on the time system level when
> people have received UTC timestamps and naively used the *nix time
> functions to create elapsed time values to store into time variables. More
> explicitly defining the time system used for the time part of the reference
> date and time in the units attribute via an expanded calendar definition
> does not tell you how the elapsed times stored in the time variable were
> calculated.
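A small illustration of the *nix-style behaviour described above, using Python's calendar.timegm, which does plain POSIX arithmetic with 86400-second days. The 2012-06-30 leap second is used as the example, and the variable names are illustrative.

```python
import calendar

# UTC timestamps on either side of the leap second 2012-06-30T23:59:60Z.
before = calendar.timegm((2012, 6, 30, 23, 59, 59, 0, 0, 0))
leap = calendar.timegm((2012, 6, 30, 23, 59, 60, 0, 0, 0))
after = calendar.timegm((2012, 7, 1, 0, 0, 0, 0, 0, 0))

posix_elapsed = after - before    # what gets stored in the time variable
true_elapsed = posix_elapsed + 1  # one leap second really passed in between

print(posix_elapsed, true_elapsed, leap == after)
```

The leap second itself collapses onto the following midnight, and any difference taken across it is one second short. Nothing in the stored values records that this happened.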
>
> If this is the case, then I would propose that in the next version of CF:
>
> * We introduce two new calendars, "gregorian_noleaps" and "gregorian_utc", for
> the real-world calendar, without and with UTC leap seconds respectively. I
> suggest "noleaps" instead of "traditional", which you put forward, because
> "noleaps" is more self-explanatory, I think. I agree with you that "POSIX" is
> not so good because it implies a reference time.
>
> I agree with Ben Hetland that if we were to go with new calendars to
> address point (2) we should use a name like gregorian_noleapseconds or
> gregorian_noleapsec. I disagree with Ben about the question of how
> difficult it is to parse a fixed "calendar [time-system [encoding]]"
> sequence.
>
> I think that backward compatibility, among other things, argues in favor
> of adding trailing space-separated modifiers to existing calendar names.
> Even if we end up going with some version of your proposed new calendar
> names, it's important to understand that those new names and definitions
> don't solve the issue in point (3).
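As a rough indication of how much parsing a fixed "calendar [time-system [encoding]]" sequence would need, here is a minimal sketch. The modifier values in the examples ("utc", "noleapseconds") are hypothetical placeholders and not agreed CF vocabulary; the function and type names are likewise made up.

```python
from collections import namedtuple

# Hypothetical container for the three space-separated fields.
CalendarSpec = namedtuple("CalendarSpec", ["calendar", "time_system", "encoding"])


def parse_calendar_attribute(text):
    """Split 'calendar [time-system [encoding]]' into its parts.

    Missing trailing modifiers come back as None, so an attribute that
    holds only a calendar name still parses.
    """
    parts = text.split()
    if not 1 <= len(parts) <= 3:
        raise ValueError("expected 1 to 3 space-separated fields: %r" % text)
    calendar_name, time_system, encoding = (parts + [None, None])[:3]
    return CalendarSpec(calendar_name, time_system, encoding)


print(parse_calendar_attribute("gregorian"))
print(parse_calendar_attribute("gregorian utc"))
print(parse_calendar_attribute("gregorian utc noleapseconds"))
```

Existing files that carry only a calendar name would still parse, which is the backward-compatibility argument for trailing modifiers.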
>
> * We abolish the "gregorian" and "standard" calendars, and we require the
> calendar always to be specified (no default). This would be quite a radical
> step, but forcing the noleaps/utc property to be explicitly stated is the only way
> I can see to avoid in future the ambiguity about whether elapsed times have
> been encoded with or without leap seconds. This does not invalidate existing
> data that adheres to CF1.6 or earlier, but it would invalidate existing
> data-writing software that does not write the calendar attribute, and it would
> require data-reading software to recognise the new calendars (although if it
> assumed the existing default for them that would not be too bad).
>
> * We state that all the other calendars have fixed-length days with no leap
> seconds.
>
> What do you think? Is it worth the pain?
>
> Although it's going to complicate this discussion, if we decided to abolish the
> existing default, we could take this opportunity to deal with problems with the
> mixed Julian/Gregorian calendar, as in http://cf-trac.llnl.gov/trac/ticket/96,
> opened by Dave Allured, but never concluded. To do that I would further propose
> (copying some text from that ticket):
>
> * The gregorian_noleaps and gregorian_utc calendars can be used only to encode
> dates since 1582-10-15, and must not have reference dates earlier than that.
> This means they cannot cross the Julian/Gregorian transition. This is desirable
> because there are ambiguities introduced by assuming different dates for the
> change in calendar. By this rule, the common choice of 1-1-1 as a reference
> date would be disallowed in these calendars, for example.
>
> * We introduce a new calendar "mixed_gregorian_julian", which is the calendar
> of udunits, with no leap seconds. However we make it stricter than currently,
> in these ways: (1) The reference date is not allowed to be any of the dates in
> the transitional period 1582-10-5 to 1582-10-14 inclusive. (2) Neither the
> reference date nor any date which is encoded with this calendar is allowed to
> be a negative year. (3) Year 0 is interpreted as climatological time in this
> calendar, following COARDS, but this is deprecated in favour of the CF
> conventions of Sect 7.4.
>
> * Because of problems caused by the discontinuity, it is recommended that the
> mixed_gregorian_julian calendar be used only in datasets with real-world
> historical dates which span the change of calendar from Julian to Gregorian. In
> datasets with real-world historical dates that all precede the change of
> calendar, the julian calendar should be used. In datasets with real-world
> historical dates that all follow the change of calendar, and in simulated
> datasets in which there is no change of calendar, the proleptic_gregorian
> calendar should be used.
>
> * We disallow dates to be encoded or reference dates to be used in year zero or
> negative years for the julian calendar, because it's ambiguous whether this
> calendar has a year 0.
>
> * We state that year 0 is valid in the proleptic_gregorian, noleap, all_leap,
> 360_day and none calendars.
>
> If these further proposals complicate the previous discussion, we can defer
> them until we've reached agreement on leap seconds!
>
> Best wishes
>
> Jonathan
>
> ----- End forwarded message -----
> _______________________________________________
> CF-metadata mailing list
> CF-metadata at cgd.ucar.edu
> http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
>
> Grace and peace,
>
> Jim
> --
> *Jim Biard*
> *Research Scholar*
> Cooperative Institute for Climate and Satellites NC <http://cicsnc.org/>
> North Carolina State University <http://ncsu.edu/>
> NOAA National Centers for Environmental Information
> <http://ncdc.noaa.gov/>
> *formerly NOAA's National Climatic Data Center*
> 151 Patton Ave, Asheville, NC 28801
> e: jbiard at cicsnc.org
> o: +1 828 271 4900
>
>
>
--
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
Chris.Barker at noaa.gov
Received on Mon May 11 2015 - 12:18:00 BST