[CF-metadata] New standard name: datetime_iso8601

From: John Caron <caron>
Date: Fri, 15 Mar 2013 14:39:48 -0600

Hi All:

OK, it's Friday afternoon, so I'll bite on this and wax philosophical even
before the beer.

An insidious mistake is to think that problems should be fixed in
software libraries. Actually, the deeper mistake is to confuse the
reference software with the file encoding. Why bother fixing the
encoding when a few lines in _your_ software can fix the problem
transparently? I've seen this occur in all three of the great western
religious systems of our day: the netCDF, HDF and OPeNDAP libraries.

Better is to do the encoding of information as cleanly as possible.
Post-apocalyptic software engineers who have lost all knowledge of what
netCDF and CF mean and are painstakingly uncovering climate archives
with their whisk brooms will thank us.

"35246 hours since 1970-01-01" isn't just unreadable; it uses a calendar
system that may be non-trivial. Calendars are hard; Java has already
gotten it wrong twice, and is now trying a third time (with JSR 310 in
Java 8, based on experience with Joda-Time).
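
To make the calendar point concrete, here is a minimal Python sketch that
decodes the same offset in the standard calendar and in a 365-day
("noleap") calendar. The helper decode_hours_noleap is hypothetical,
written only for this illustration, and is not part of any CF library.

```python
from datetime import datetime, timedelta

# Days per month in a 365-day ("noleap") calendar: February always has 28.
NOLEAP_MONTH_DAYS = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]


def decode_hours_noleap(hours, epoch_year=1970):
    """Decode whole hours since <epoch_year>-01-01T00:00:00, noleap calendar.

    Hypothetical helper for illustration; returns (year, month, day, hour).
    """
    days, hour = divmod(hours, 24)
    years, day_of_year = divmod(days, 365)
    month = 1
    for month_days in NOLEAP_MONTH_DAYS:
        if day_of_year < month_days:
            break
        day_of_year -= month_days
        month += 1
    return epoch_year + years, month, day_of_year + 1, hour


# Python's datetime uses the proleptic Gregorian calendar, which agrees
# with the standard calendar for modern dates such as these.
standard = datetime(1970, 1, 1) + timedelta(hours=35246)
noleap = decode_hours_noleap(35246)

print(standard.isoformat())  # 1974-01-08T14:00:00
print(noleap)                # (1974, 1, 9, 14)
```

The two results differ by one day because 1972 has a leap day in the
standard calendar but not in the noleap one.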

"1974-01-08T14:00:00Z" (== "35246 hours since 1970-01-01" in the
standard calendar) is a better representation of that date, because at
least you know what the user thought the damn date was.
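
As a rough check of that equivalence, the following sketch converts in
both directions with Python's standard library. The function names
hours_to_iso and iso_to_hours are made up for this example, and the code
assumes the standard calendar, UTC, and an epoch of 1970-01-01T00:00:00Z.

```python
from datetime import datetime, timedelta, timezone

# Assumed epoch for the "hours since 1970-01-01" units string.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
ISO_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def hours_to_iso(hours):
    """Offset in hours since the epoch -> ISO 8601 string ending in Z."""
    return (EPOCH + timedelta(hours=hours)).strftime(ISO_FORMAT)


def iso_to_hours(text):
    """ISO 8601 string ending in Z -> offset in hours since the epoch."""
    moment = datetime.strptime(text, ISO_FORMAT).replace(tzinfo=timezone.utc)
    return (moment - EPOCH) / timedelta(hours=1)


print(hours_to_iso(35246))                   # 1974-01-08T14:00:00Z
print(iso_to_hours("1974-01-08T14:00:00Z"))  # 35246.0
```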

The good argument for the "35246 hours since 1970-01-01" representation
is that, given two of them, at least you know what the user thought the
damn time interval between them was.
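
A small sketch of that trade-off, assuming the standard calendar for the
string case. The second offset (35270) is an invented value, one day
after the first, chosen only to have two points to compare.

```python
from datetime import datetime

# Offset encoding: the interval is plain subtraction, no calendar needed.
start_hours, end_hours = 35246, 35270
interval_from_offsets = end_hours - start_hours  # in hours

# Date-string encoding: parse first, then subtract. The result depends
# on the calendar the parser assumes (here, Python's proleptic Gregorian).
fmt = "%Y-%m-%dT%H:%M:%SZ"
start = datetime.strptime("1974-01-08T14:00:00Z", fmt)
end = datetime.strptime("1974-01-09T14:00:00Z", fmt)
interval_from_strings = (end - start).total_seconds() / 3600

print(interval_from_offsets)  # 24
print(interval_from_strings)  # 24.0
```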

Anyway, I think both are good, and should be allowed. Finish your beer
and I'll order another round.

John

PS: NEMO files use time units of "days since -4713-01-01T00:00:00Z".
I don't know why, and no disrespect intended, but I can't say I like it.


On 3/15/2013 4:30 AM, Hattersley, Richard wrote:
> Hi all,
> I think Steve has got to the heart of the issue with his comment, "IMHO
> it is the client libraries that hold the answer to this question."
> However you choose to peer into your netCDF files you are seeing them
> through the lens of a "client library". And it's worth noting that
> date/times aren't special in this regard; this is just as true for
> floating point numbers.
> With the Iris library(*), we are working towards improving the
> readability and usability of date/times because we recognise that simply
> displaying "35246 hours since 1970-01-01" is essentially meaningless to
> a human. From what I've seen, I think we can all agree on that! So we
> want our users to be able to see date/times in a human-readable form,
> but that certainly doesn't mean we will represent them that way internally.
> The only real benefit I can see for the string representation is that
> _some_ date/time values can be made more human-readable when viewed
> through client libraries which don't support the
> machine-readable-to-human-readable conversion. But if the library can't
> do machine-to-human then it probably can't do human-to-machine. In which
> case there's very little you can actually _do_ with the date/time values
> (e.g. determine ordering or compute intervals). If the library is that
> limited then adding the string representation to CF isn't really fixing
> the right problem. If you'll excuse the analogy, it's like taking the
> engine out of a car because the brakes don't work.
> In short, I am against the addition of a string representation for
> date/times.
> *) Iris - a candidate reference implementation of CF implemented in
> Python (http://scitools.org.uk/).
> Richard Hattersley Iris Benevolent Dictator
> *Met Office* FitzRoy Road Exeter Devon EX1 3PB United Kingdom
> Tel: +44 (0)1392 885702
> Email: richard.hattersley at metoffice.gov.uk
> <mailto:richard.hattersley at metoffice.gov.uk> Web: www.metoffice.gov.uk
> <http://www.metoffice.gov.uk/>
>
> ------------------------------------------------------------------------
> *From:* CF-metadata [mailto:cf-metadata-bounces at cgd.ucar.edu] *On Behalf
> Of *Steve Hankin
> *Sent:* 24 February 2013 19:07
> *To:* John Caron
> *Cc:* cf-metadata at cgd.ucar.edu
> *Subject:* Re: [CF-metadata] New standard name: datetime_iso8601
>
>
> On 2/23/2013 1:41 PM, John Caron wrote:
>> Hi Chris, and all:
>>
>> On 1/11/2013 2:37 PM, Chris Barker - NOAA Federal wrote:
>>> On Fri, Jan 11, 2013 at 9:00 AM, Aleksandar Jelenak - NOAA Affiliate
>>> <aleksandar.jelenak at noaa.gov> wrote:
>>>
>>>> Here's the modified proposal for the datetime_iso8601 standard name:
>>> ...
>>>> String representing date-time information according to the ISO
>>>> 8601:2004(E) standard.
>>>
>>> I think we should NOT adopt a string option for datetime variables.
>>>
>>> To quote Jonathan Gregory:
>>>
>>> """
>>> In CF we have always applied the
>>> principle that we only add to CF when there is a need to do so, i.e.
>>> there is
>>> a use-case for something which cannot already be represented in CF
>>> """
>>>
>>> We already have a way to encode datetimes in CF-netcdf.
>>
>> Yes, but <time since date> is not as good as <date> as an encoding.
>> The <time since date> is a result of cramming calendar handling into a
>> units package.
>>
>> I would advocate both should be allowed.
>
> Hi John,
>
> The bell is ringing, "round three" on the ISO dates issue.
>
> The arguments *for *supporting ISO dates are:
>
> 1. they are the clear standard for date/time interoperability and
> deserve support _of some kind_ in CF
> 2. they offer good human readability
> 3. there are widely available support libraries (though with problems
> as articulated below)
>
> The arguments *against *are:
>
> 1. introducing a new encoding for information that is already fully
> supported is a clear loss of interoperability that won't get ironed
> out of CF for years -- until older applications are updated to
> support it
> 2. introducing two encodings for the same information is a clear
> increase in complexity
> 3. ISO dates cannot handle all of the situations that CF commonly
> encounters -- climatologies, non-leap calendars, etc. This requires
> non-standard extensions which are much less well supported than the
> base ISO standard. Extra complexity.
> 4. The base ISO date standard is overly complex for CF needs. CF would
> need to profile it down. More complexity.
> 5. ISO dates are in fact *NOT* a good encoding for the needs of a
> coordinate axis. They are a good external representation and a good
> interchange format. They make nice metadata representations of
> dates. They muddy the simplicity of time as a measurable,
> computable quantity that monotonically increments like other
> coordinates. ISO dates are not, themselves, typically encoded as
> ISO date strings in their internal representation in code (nor
> should they be in CF).
>
> ==> All of the advantages of ISO dates can be built into CF _if we add
> just a couple of tools in the CF support libraries_
>
> * add easy ability for an application program to convert between ISO
> dates and CF representations ==> simple code
> * easy ability for humans to read units-since-T0 encodings in CF ==>
> already included in ncdump today
>
> IMHO it is the client libraries that hold the answer to this question.
> They give CF all of the advantages without increasing complexity or
> compromising interoperability.
>
> - Steve
>
>
>>
>>>
>>> I believe this proposal resulted from the discussion about adding a
>>> more flexible approach to datetimes in the CF Data Model. I think
>>> that's a good idea, but separate from what encoding is used in
>>> CF-netcdf. (See my recent note for more detail about the difference
>>> between an encoding and a data model.)
>>>
>>> 1) Having multiple ways to encode the same data in file format adds
>>> complication to all client code -- client code would need a way to
>>> process both ISO strings and "time_unit since datetime"
>>
>> Client code already has to parse the "date" in "time since date". So
>> there's no extra code involved.
>>
>>
>>>
>>> 2) Any client code that can process ISO strings is likely to need to
>>> convert them to a client-specific datetime representation anyway, in
>>> order to plot them, calculate with them, etc.
>>>
>>> 3) Any client library that can process ISO strings is very likely to
>>> be able to also work with "time_unit since datetime" encoded data
>>> anyway -- and it had better, as that encoding is part of the standard
>>> anyway.
>>>
>>> As a result, we would be complicating client code, and getting no new
>>> functionality.
>>
>> We get new functionality in that "date" is clearer than "time since
>> date", and more likely to be correctly understood by non CF specific
>> software and users of our data in 100 years when there's no more CF
>> discussion group to help people out.
>>
>> When you have non-standard calendars, the difficulty is compounded
>> many times over. How many seconds since 1970 is April 3, 2045 at 1:13
>> am in the no-leap calendar? Are you sure? What if you could just put
>> into your file "2045-04-03T01:13:00" ?? Even rocket scientists can do
>> that ;^)
>>
>>>
>>> The one advantage I can see at the moment is that simple, non-CF-aware
>>> clients, like ncdump, could easily present a nice human-readable
>>> format. But I don't think that is worth the additional complication.
>>
>> Ideally file encodings should be as independent as possible from
>> libraries and applications. We have historically had an unfortunate
>> dependence on the udunits reference library for date parsing. We are
>> slowly unwinding that dependence. I think in this case widening the
>> allowed encoding for datetimes is well worth the complication.
>>
>> Regards,
>>
>> John
>> _______________________________________________
>> CF-metadata mailing list
>> CF-metadata at cgd.ucar.edu
>> http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
>
Received on Fri Mar 15 2013 - 14:39:48 GMT