Dear Jonathan,
I think we will make it more difficult for users to write and interpret 
CF data if the calendar attribute is too complicated and the meaning of 
"gregorian" changes from what it meant in the past. In the past we got 
away with a single "gregorian" option, and I suspect all the 
CF-compliant model output and nearly all the observational data stored 
under CF would be correctly interpreted if the definition of "gregorian" 
included the following sentences:
----------------------------------
Under the "gregorian" calendar the length of the solar day can be 
assumed to be exactly 86400 seconds long (i.e., there are no leap 
seconds).  This means that for models, where this assumption is almost 
invariably valid, conversion from elapsed time to clock time is 
straight-forward and exact, whereas for observations, conversion to 
clock time may introduce errors as large as 16 seconds because it is 
unknown whether the UTC or GPS time system has been used in specifying 
the reference times (appearing in the time units attribute), and it is 
also unknown whether leap seconds have been properly accounted for in 
converting UTC clock times to elapsed time.
----------------------------------
Interpreted as above, the "gregorian" calendar would make it possible 
for users to invariably decode *model* output and not encounter the 
problems you discussed in your first paragraph.  Of course with 
*observations*, they might encounter such problems, but that's because 
the observationalist storing the data is apparently o.k. with errors of 
up to 16 seconds (otherwise they would rewrite their data with one of 
the newly proposed calendars specifying UTC or GPS).
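
To illustrate the size of the effect, here is a minimal sketch in Python (not part of any CF tool). The function names, the reference date of 2000-01-01, and the short leap-second table are assumptions chosen for the example; the table lists only the leap seconds inserted from 2000 through 2016. The first function decodes elapsed seconds assuming every day is exactly 86400 seconds long; the second treats the same number as true elapsed seconds and converts it to a UTC clock time.

```python
from datetime import datetime, timedelta

# Illustrative table: UTC midnights immediately following an inserted leap
# second (2000 through 2016 only). A real decoder would need the full list.
LEAP_SECOND_DATES = [
    datetime(2006, 1, 1),
    datetime(2009, 1, 1),
    datetime(2012, 7, 1),
    datetime(2015, 7, 1),
    datetime(2017, 1, 1),
]

REFERENCE = datetime(2000, 1, 1)  # hypothetical "seconds since 2000-01-01"


def decode_no_leap_seconds(elapsed_seconds, reference=REFERENCE):
    """Decode assuming every day is exactly 86400 seconds long."""
    return reference + timedelta(seconds=elapsed_seconds)


def decode_utc_with_leap_seconds(elapsed_seconds, reference=REFERENCE):
    """Decode treating elapsed_seconds as true elapsed seconds on a UTC clock.

    Each leap second that has already passed moves the clock reading back by
    one second. datetime cannot represent 23:59:60, so an instant falling
    inside a leap second is reported as the following midnight.
    """
    passed = 0
    for leap in LEAP_SECOND_DATES:
        candidate = reference + timedelta(seconds=elapsed_seconds - passed)
        if reference < leap and candidate > leap:
            passed += 1
    return reference + timedelta(seconds=elapsed_seconds - passed)


# A model writes the start of 2016 as a whole number of 86400-second days.
days = (datetime(2016, 1, 1) - REFERENCE).days
model_time = days * 86400

print(decode_no_leap_seconds(model_time))        # midnight, as the model intended
print(decode_utc_with_leap_seconds(model_time))  # a few seconds before midnight
```

With this table, the second decoding lands four seconds before midnight on 2015-12-31, so a day (and here a month and a year) would be assigned wrongly. That is the kind of problem the 86400-second reading of "gregorian" avoids for model output.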
In the future, I think we should interpret "gregorian" the same as we 
have in the past, but we would also offer two new calendars 
(gregorian_utc and gregorian_gps) for those who need to indicate that 
their reference times are defined by a specific time system, and one 
more calendar (gregorian_utc_nls) for those who choose not to properly 
account for leap seconds in converting from UTC clock time to elapsed 
time.   These new calendars would mostly be used for observations, but 
conceivably there might be a model initialized from observations  (and 
subsequently compared against observations perhaps only a few seconds 
later), where one would want to precisely record whether the reference 
time (included in the units) follows the UTC or GPS time system, just as 
in the observational data set it is being compared with.  In these 
(rare) cases, the calendar would be indicated as being either 
gregorian_utc or gregorian_gps for the model output, just as in the 
observational data set.
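
As a rough sketch of how software reading the data might record these distinctions, the Python table below maps each calendar name to the time system of its reference time and to whether a decoder may treat every day as exactly 86400 seconds. The class, field, and function names are hypothetical, and the entries reflect only my reading of the proposal above, not agreed CF text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CalendarInfo:
    # Time system of the reference time in the units attribute:
    # "UTC", "GPS", or None when the calendar does not say.
    time_system: Optional[str]
    # True if a decoder may treat every day as exactly 86400 seconds
    # when converting between elapsed time and clock time.
    every_day_86400s: bool


# Hypothetical lookup table for the calendars discussed in this thread.
CALENDARS = {
    # Existing calendar: 86400-second days assumed, time system unspecified.
    "gregorian": CalendarInfo(time_system=None, every_day_86400s=True),
    # UTC reference time, leap seconds properly accounted for.
    "gregorian_utc": CalendarInfo(time_system="UTC", every_day_86400s=False),
    # GPS reference time; GPS time has no leap seconds.
    "gregorian_gps": CalendarInfo(time_system="GPS", every_day_86400s=True),
    # UTC clock times, but leap seconds ignored when computing elapsed time.
    "gregorian_utc_nls": CalendarInfo(time_system="UTC", every_day_86400s=True),
}


def needs_leap_second_table(calendar: str) -> bool:
    """Return True if decoding this calendar requires a leap-second table."""
    return not CALENDARS[calendar].every_day_86400s


for name, info in CALENDARS.items():
    print(name, info.time_system, needs_leap_second_table(name))
```

Under this reading only gregorian_utc requires a leap-second table to decode; the other three can be decoded with ordinary 86400-second-day arithmetic.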
You argue that interpretation of "gregorian" depends on whether it 
describes observations or model output.  That's true, and apparently 
that has always been the case.  We can't change that, and why should we 
change it going forward?
I don't see a case for including gregorian_nls (for models), unless we 
decide to redefine "gregorian" to mean:
"a calendar that: 1) might or might not account for leap seconds, 2) 
might or might not assume the length of the solar day is exactly 86400 
seconds long, and 3) might express the reference time according to 
either UTC or GPS"
This definition would also be consistent with past usage of "gregorian" 
but would make virtually all the model data stored already under CF with 
calendar="gregorian" seem to be imprecise in specifying the 
time-coordinates, even though the coordinates are in fact defined such 
that they can be converted to wall clock time assuming the solar day is 
exactly 86400 seconds long.  If you want to adopt this alternative 
definition (rather than the one I suggest in the 2nd paragraph above), 
then we should probably introduce "gregorian_nls" as "a calendar/time 
system for which the length of the solar day is exactly 86400 seconds 
long".  In the future gregorian_nls would probably be used (instead of 
"gregorian") in all but a few model-produced datasets.
best regards,
Karl
On 7/14/15 10:48 AM, Jonathan Gregory wrote:
> Dear Karl
>
> Thank you for your useful summary, which I think is quite right. That will
> provide some good text for the standard document.
>
> You suggest merging gregorian_nls (for models, exactly 86400-second days)
> into gregorian (imprecise about which calendar is used and how encoded),
> distinguishing them according to whether the data is model or observational.
>
> I'm not comfortable with that. I can't think of another case in CF where the
> metadata is designed to be interpreted differently for models and observations,
> and it would not be easy to do, because there's no metadata that is guaranteed
> to be present in a standard form to tell you if it's model or observational.
> Yet I think this distinction must be made. It would not be satisfactory if
> users interpreted the imprecision of "gregorian" to mean they could decode
> model data e.g. from CMIP using the UTC calendar, and found days that appear to
> start 16 seconds different from midnight. I am sure this would cause problems
> e.g. wrong months selected. That's why I think we need gregorian_nls as a model
> calendar, to be used instead of gregorian in future where applicable. We need
> to be able to assert that the 86400-second day definitely applies.
>
> I agree with Jim that there is a distinction between gregorian_utc_nls and
> gregorian too. Some people supplying observational data don't require the
> precision of specifying UTC (or GPS), so they don't want to choose
> gregorian_utc or gregorian_gps. Nan argued this case. Others however may wish to be
> precise about UTC timestamps, but choose to encode it without leap seconds.
> So I think we need the meaning of gregorian_utc_nls.
>
> However, on reflection I convinced myself (at least! - but not Jim) that the
> distinction between gregorian_nls (for models) and gregorian_utc_nls (for
> the real world) is too subtle to make reliably, so I suggested we should use
> gregorian_nls for both, and say that *if* it is observational data, it must
> be UTC. That's not quite the same as your suggestion, because the timestamps
> can be exactly recovered without knowing if it's model or observational, but
> you would need to know in order to tell whether the elapsed times are accurate
> (as they are for model data) or perhaps not accurate (for real world data).
> Whereas I regard timestamps as more important, Jim tends to regard elapsed
> times as more important, so I guess this second issue would count more for him.
> If it is crucial, then we need both gregorian_nls and gregorian_utc_nls. The
> distinction is whether it is model or real-world time. My concern is that
> when models are used to simulate events that happened in real-world time,
> data-producers may often find it hard to decide between these alternatives,
> and it's unclear whether it's useful to do so anyway.
>
> Best wishes
>
> Jonathan
Received on Tue Jul 14 2015 - 16:15:55 BST