
[CF-metadata] How to define time coordinate in GPS?

From: Jonathan Gregory <j.m.gregory>
Date: Sun, 10 May 2015 08:29:23 +0100

Dear all

A postscript to my last message. I am not myself convinced that the backward
incompatibility (for data-writing) that I suggested below is really worth the
pain it would cause! I think it might well be OK to retain the existing
calendar name of "gregorian" from CF1.7, but give it a more precise definition,
in order to eliminate the ambiguity about leap seconds. (Of course, there is
nothing we can do to remedy this with existing data.)

With all the arguments otherwise unaltered, that would make my proposal:

* We redefine "gregorian", and introduce a new calendar "gregorian_utc", for
the real-world calendar, without and with UTC leap seconds respectively.

* We abolish the "standard" calendar (since one thing this discussion has
shown is that there is not a single standard!) and we require the calendar
always to be specified (no default).

* We state that all the other calendars have fixed-length days with no leap
seconds.

My further suggestions, about dealing with the Julian/Gregorian transition
and negative years in better ways, are unaffected (except to retain the name
"gregorian"). Those are an optional extra, dealing with a different subject,
but it would be opportune to do both at once.

Cheers

Jonathan

----- Forwarded message from Jonathan Gregory <j.m.gregory at reading.ac.uk> -----

Dear Jim

Yes, I'm glad we appear to be approaching an agreement!

> >"In order to calculate a new date and time given a base date, base time and a
> >time increment one must know what calendar to use."
> >and I think that is the sense in which I am using "calendar".
> I agree that this is a CF-consistent usage of the word calendar, but
> it runs against natural usage, and I think it's worth keeping that
> in mind.

OK. Perhaps we can clarify what we mean in the CF convention.

> I agree that leap seconds haven't been carefully considered before.
> I disagree that nearly all existing time values have been encoded
> without leap seconds. I'd say that nearly all existing time values
> that were derived from true UTC timestamps are at risk of having
> leap second discontinuities encoded into the set of values.

All right. In that case I think we may have to take more serious steps to
avoid future problems. I'll come back to that.

> There are three issues here, so let's not conflate them. They are:
>
> 1. What to call the time system that is like UTC in overall form
> (Greenwich meridian, etc) but doesn't include leap seconds.
> 2. How to indicate which actual time system is being used for the time
> part of the reference time in the units attribute.
> 3. How to indicate whether or not the elapsed times in the time
> variable are certain to be free of leap second induced discontinuities.

This is a useful classification, thanks, but I don't think the situation is
quite as complicated as that.

I'm not sure (2) is something we need to be concerned about - but it may
be I've missed a point. In my understanding, there is no calendar implied by
the reference time, because it's given as a timestamp YYYY-MM-DD hh:mm:ss
[+-hh:mm], where the [] is the optional time-zone. A timestamp is calendar-
neutral; it can be interpreted in any calendar, except that some dates will be
illegal in some calendars, and some times might be illegal if leap seconds are
in effect. The reference date-time is used for encoding and decoding in the
calendar specified, with or without leap seconds, but itself implies nothing
about the time system. We could state it as a requirement of CF that the
reference date and time must be legal in the encoding which applies to the time
coordinate.
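As a rough illustration of the suggested requirement that the reference date
and time be legal in the encoding of the time coordinate, here is a minimal
sketch. The function names and the leap_seconds flag are hypothetical and not
part of CF or udunits, and the check on second 60 is simplified: it does not
verify that the day in question really ended with a leap second.

```python
MONTH_LENGTHS = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]
KNOWN_CALENDARS = {"360_day", "noleap", "all_leap", "julian",
                   "proleptic_gregorian"}


def days_in_month(year, month, cal):
    """Number of days in a month for a few CF calendar names (sketch)."""
    if cal not in KNOWN_CALENDARS:
        raise ValueError("calendar not handled in this sketch: %r" % cal)
    if cal == "360_day":
        return 30
    if month != 2:
        return MONTH_LENGTHS[month - 1]
    if cal == "noleap":
        return 28
    if cal == "all_leap":
        return 29
    if cal == "julian":
        leap = year % 4 == 0
    else:  # proleptic_gregorian
        leap = year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)
    return 29 if leap else 28


def is_legal_timestamp(year, month, day, hour, minute, second, cal,
                       leap_seconds=False):
    """True if the timestamp can be interpreted in the given calendar."""
    if not 1 <= month <= 12:
        return False
    if not 1 <= day <= days_in_month(year, month, cal):
        return False
    if not (0 <= hour <= 23 and 0 <= minute <= 59):
        return False
    if second == 60:
        # Simplification: accept 23:59:60 whenever leap seconds are in effect.
        return leap_seconds and hour == 23 and minute == 59
    return 0 <= second <= 59
```

For example, 2000-02-30 is legal in 360_day but not in noleap, and a
timestamp ending in :60 is legal only if leap seconds are in effect.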

Regarding part of (1), we need a name for the time-zone which applies at the
Greenwich meridian without summer/daylight-saving time. We shouldn't use "UTC",
as the CF standard currently does (quoting from udunits(3)), because that's
confusing. Maybe it should just be stated explicitly as I have done!

To address point (3), I continue to favour a subspecies of calendar name,
rather than a modifier or separate attribute, because this distinction only
applies to the real-world calendars. A decomposition of metadata is cumbersome
if it isn't generally relevant. Suppose X can take values X1 or X2; if it's X1
then Y can be Y1 or Y2, whereas Y is irrelevant if it's X2. In that situation I
would have a single attribute with possible values X1_Y1 X1_Y2 and X2. A single
attribute is easier for scanning a dataset, less work to write and read, and
more likely to be correct because it's less likely that Y will not be coded if
relevant or will be coded if irrelevant.
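The single-attribute idea can be sketched as follows, using the calendar
names proposed in this thread. This is only an illustration; the table and
the decode_calendar function are hypothetical.

```python
# Real-world calendar: the leap-second property is folded into the name.
REAL_WORLD_VARIANTS = {
    "gregorian_noleaps": ("gregorian", False),
    "gregorian_utc": ("gregorian", True),
}

# Calendars for which the leap-second property is not relevant.
OTHER_CALENDARS = {"proleptic_gregorian", "noleap", "all_leap",
                   "360_day", "julian", "none"}


def decode_calendar(name):
    """Return (base_calendar, has_leap_seconds) for one attribute value."""
    if name in REAL_WORLD_VARIANTS:
        return REAL_WORLD_VARIANTS[name]
    if name in OTHER_CALENDARS:
        return (name, False)
    raise ValueError("unknown calendar: %r" % name)
```

With one attribute there is no way to write an inconsistent pair such as
calendar "360_day" together with a separate leap-seconds flag.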

I suggest that the leap seconds are needed only in the Gregorian calendar i.e.
the real-world one. I think it's unlikely the proleptic Gregorian calendar will
be used with leap-seconds; you only need this calendar if you're going back
more than several centuries, in which case it's probably not a dataset which
has UTC precision in time, and it's most often used with models, which do not
have leap seconds. (However, a leap-second variety could be introduced if I am
wrong and it is needed.) Leap seconds are not needed in the noleap, all_leap,
360_day and none calendars, which are all for model worlds. The julian calendar
is used astronomically and might be used in models, but the web page you
cited (http://www.ucolick.org/~sla/leapsecs/timescales.html) points out that it's
not a good idea to try to use it with leap seconds since it's based on units
of day (=86400 s).

If this is the case, then I would propose that in the next version of CF:

* We introduce two new calendars, "gregorian_noleaps" and "gregorian_utc", for
the real-world calendar, without and with UTC leap seconds respectively. I
suggest "noleaps" instead of "traditional", which you put forward, because
"noleaps" is more self-explanatory, I think. I agree with you that "POSIX" is
not so good because it implies a reference time.

* We abolish the "gregorian" and "standard" calendars, and we require the
calendar always to be specified (no default). This would be quite a radical
step, but forcing the noleaps/utc property to be explicitly stated is the only way
I can see to avoid in future the ambiguity about whether elapsed times have
been encoded with or without leap seconds. This does not invalidate existing
data that adheres to CF1.6 or earlier, but it would invalidate existing
data-writing software that does not write the calendar attribute, and it would
require data-reading software to recognise the new calendars (although if it
assumed the existing default for them that would not be too bad).

* We state that all the other calendars have fixed-length days with no leap
seconds.
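To make the ambiguity concrete, here is a minimal sketch of how the same pair
of UTC timestamps would encode to different elapsed times under the two
proposed calendars. The encode_seconds function is hypothetical, the
leap-second table is deliberately incomplete (two known insertions only), and
the sketch assumes whole seconds and a reference time not later than the
timestamp.

```python
from datetime import date, datetime

# Illustrative, incomplete table: days that ended with an inserted
# leap second (23:59:60 UTC).
LEAP_SECOND_DAYS = [date(2008, 12, 31), date(2012, 6, 30)]


def encode_seconds(ref, stamp, calendar):
    """Seconds from ref to stamp, both naive datetimes read as UTC."""
    # datetime arithmetic treats every day as exactly 86400 s.
    elapsed = int((stamp - ref).total_seconds())
    if calendar == "gregorian_noleaps":
        return elapsed
    if calendar == "gregorian_utc":
        # Add one second for each leap second between ref and stamp.
        extra = sum(1 for d in LEAP_SECOND_DAYS
                    if ref.date() <= d < stamp.date())
        return elapsed + extra
    raise ValueError("calendar not handled in this sketch: %r" % calendar)


ref = datetime(2012, 6, 30, 0, 0, 0)
stamp = datetime(2012, 7, 2, 0, 0, 0)
print(encode_seconds(ref, stamp, "gregorian_noleaps"))  # 172800
print(encode_seconds(ref, stamp, "gregorian_utc"))      # 172801
```

A reader who does not know which of the two was used cannot tell whether the
decoded timestamp is off by a second, which is why the property would have to
be stated explicitly.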

What do you think? Is it worth the pain?

Although it's going to complicate this discussion, if we decided to abolish the
existing default, we could take this opportunity to deal with problems with the
mixed Julian/Gregorian calendar, as in http://cf-trac.llnl.gov/trac/ticket/96,
opened by Dave Allured, but never concluded. To do that I would further propose
(copying some text from that ticket):

* The gregorian_noleaps and gregorian_utc calendars can be used only to encode
dates since 1582-10-15, and must not have reference dates earlier than that.
This means they cannot cross the Julian/Gregorian transition. This is desirable
because there are ambiguities introduced by assuming different dates for the
change in calendar. By this rule, the common choice of 1-1-1 as a reference
date would be disallowed in these calendars, for example.

* We introduce a new calendar "mixed_gregorian_julian", which is the calendar
of udunits, with no leap seconds. However we make it stricter than currently,
in these ways: (1) The reference date is not allowed to be any of the dates in
the transitional period 1582-10-5 to 1582-10-14 inclusive. (2) Neither the
reference date nor any date which is encoded with this calendar is allowed to
be a negative year. (3) Year 0 is interpreted as climatological time in this
calendar, following COARDS, but this is deprecated in favour of the CF
conventions of Sect 7.4.

* Because of problems caused by the discontinuity, it is recommended that the
mixed_gregorian_julian calendar be used only in datasets with real-world
historical dates which span the change of calendar from Julian to Gregorian. In
datasets with real-world historical dates that all precede the change of
calendar, the julian calendar should be used. In datasets with real-world
historical dates that all follow the change of calendar, and in simulated
datasets in which there is no change of calendar, the proleptic_gregorian
calendar should be used.

* We disallow dates to be encoded or reference dates to be used in year zero or
negative years for the julian calendar, because it's ambiguous whether this
calendar has a year 0.

* We state that year 0 is valid in the proleptic_gregorian, noleap, all_leap,
360_day and none calendars.
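The reference-date rules in the list above could be checked along these
lines. This is a sketch only: the check_reference_date function is
hypothetical, it covers reference dates and not encoded dates, and it does
not test whether the date itself exists in the calendar.

```python
YEAR_ZERO_OK = {"proleptic_gregorian", "noleap", "all_leap",
                "360_day", "none"}


def check_reference_date(year, month, day, calendar):
    """Return a list of rule violations (empty if the date is allowed)."""
    problems = []
    ymd = (year, month, day)
    if calendar in ("gregorian_noleaps", "gregorian_utc"):
        # Must not cross the Julian/Gregorian transition.
        if ymd < (1582, 10, 15):
            problems.append("reference date earlier than 1582-10-15")
    elif calendar == "mixed_gregorian_julian":
        if (1582, 10, 5) <= ymd <= (1582, 10, 14):
            problems.append("reference date in the transitional period")
        if year < 0:
            problems.append("negative year")
    elif calendar == "julian":
        if year <= 0:
            problems.append("year zero or negative year")
    elif calendar not in YEAR_ZERO_OK:
        raise ValueError("unknown calendar: %r" % calendar)
    return problems
```

For instance, check_reference_date(1, 1, 1, "gregorian_noleaps") reports a
violation, which is the 1-1-1 case mentioned above.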

If these further proposals complicate the previous discussion, we can defer
them until we've reached agreement on leap seconds!

Best wishes

Jonathan

----- End forwarded message -----
Received on Sun May 10 2015 - 01:29:23 BST

