
[CF-metadata] time as ISO strings

From: Seth McGinnis <mcginnis>
Date: Fri, 22 Oct 2010 14:03:04 -0600

I agree completely. Using ISO date strings as an encoding for time coordinates
sounds terrible to me, doubly so if you consider that they would have to be
not-quite-ISO in order to accommodate non-Gregorian calendars.

That said, I also agree that it would be great if netcdf tools all knew how to
translate numeric time coordinates into ISO strings. (And translate from
string back to number, I suppose, but for me, human-readable output is the
important thing.)
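
A minimal sketch of that round trip, assuming a "<unit> since <reference>"
units string and the standard (proleptic Gregorian) calendar that Python's
datetime implements. The helper names num_to_iso and iso_to_num are made up
for illustration; non-Gregorian CF calendars such as noleap or 360_day are
not handled here.

```python
from datetime import datetime, timedelta

# Seconds per unit, for the few unit names this sketch understands.
_SECONDS = {"seconds": 1, "minutes": 60, "hours": 3600, "days": 86400}


def _parse_units(units):
    """Split e.g. 'days since 2000-01-01 00:00:00' into (seconds per unit, epoch)."""
    unit, since, ref = units.split(maxsplit=2)
    if since != "since":
        raise ValueError(f"not a '<unit> since <reference>' string: {units!r}")
    return _SECONDS[unit], datetime.fromisoformat(ref)


def num_to_iso(value, units):
    """Numeric time coordinate -> ISO 8601 string (hypothetical helper)."""
    scale, epoch = _parse_units(units)
    return (epoch + timedelta(seconds=value * scale)).isoformat()


def iso_to_num(text, units):
    """ISO 8601 string -> numeric time coordinate (hypothetical helper)."""
    scale, epoch = _parse_units(units)
    return (datetime.fromisoformat(text) - epoch).total_seconds() / scale


print(num_to_iso(1.5, "days since 2000-01-01 00:00:00"))  # 2000-01-02T12:00:00
print(iso_to_num("2000-01-02T12:00:00", "days since 2000-01-01 00:00:00"))  # 1.5
```

The file keeps the numeric coordinate; the string exists only at the point
of display.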

At the start of this thread, I would have argued that it would be nice to have
an official spec for including a corresponding set of date strings as
ancillary/convenience metadata, but at this point, I think that best practice
can be summarized as simply "match what 'ncdump -t' does". (With perhaps the
addendum that people are likely to infer precision based on which fields are
present, so format accordingly.)
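
To illustrate the addendum: a small sketch that emits only the fields the
data can justify, so a monthly value is not printed with a spurious
time-of-day. The precision labels and the function name format_to_precision
are assumptions for this example, and it is not meant to reproduce the exact
output of 'ncdump -t'.

```python
from datetime import datetime

# Precision label -> strftime pattern; labels are illustrative, not from CF.
_FORMATS = {
    "year": "%Y",
    "month": "%Y-%m",
    "day": "%Y-%m-%d",
    "minute": "%Y-%m-%dT%H:%M",
    "second": "%Y-%m-%dT%H:%M:%S",
}


def format_to_precision(when, precision):
    """Render a datetime with only the fields implied by `precision`."""
    return when.strftime(_FORMATS[precision])


stamp = datetime(2010, 9, 1)
print(format_to_precision(stamp, "month"))   # 2010-09
print(format_to_precision(stamp, "second"))  # 2010-09-01T00:00:00
```

The same instant prints two ways, and a reader would reasonably take the
second form to claim far more precision than the first.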

Cheers,

--Seth


On Thu, 21 Oct 2010 21:06:23 -0400
 Steve Hankin <Steven.C.Hankin at noaa.gov> wrote:
> Hi Jon, Benno, John and other pals,
>
>Since this email thread already contains an element of informal voting I'll
>cast my ballot: CF is a better standard *WITHOUT* admitting ISO date strings
>as an encoding for time coordinates. My opinion is based upon this
>outlook:
>
> * '/To create quality software, the ability to say "no" is usually
> far more important than the ability to say "yes."/'
> (http://queue.acm.org/detail.cfm?id=1142044)
>
>Bloat and runaway complexity are a continual threat to the quality of a
>standard as it evolves. I'd argue that the measure of whether a new feature
>deserves inclusion should be whether it adds useful new functionality and does
>so in a manner that is "clean" -- preserving the consistency and simplicity of
>the standard to the degree feasible.
>
> * Introducing ISO strings as coordinates adds no encoding power to
> CF. It raises issues of precision but then fails to address them
> adequately. (In fact it imports a host of ambiguities about
> interpretation of precision as Jon pointed out to us in the
> "metadata" versus "positioning" debate.)
> * It fails the test of consistency, since it is not applicable to
> virtually identical precision issues that exist for latitude,
> longitude and vertical coordinates.
> * The need to support two separate encodings (ISO and "days since
> xxx") for the same date/time coordinate information would force
> additional complexity into the standards documentation and into
> clients that would hope to be generic CF applications. It would
> degrade interoperability for clients that did not support both
> encodings.
> * It fails to address the multiple-calendar needs of CF (creating
> another inconsistency). (ISO 8601 does not standardize the
> interpretation of dates prior to the 1582 Julian-Gregorian
> discontinuity ... which likely means that library codes cannot be
> trusted to give consistent results.)
> * It does not even add a useful measure of convenience. The "-t"
> option to ncdump already provides the needed convenience for file
> readers. And for file creators, a new utility that would
> translate ISO strings into CF time step values could probably be
> written in less time than this email dialog has occupied. (If
> this statement appears exaggerated consider that the effort to
> develop this utility would be paid over and over in the clients
> that would have to interpret this encoding.)
>
>None of this is a comment on the utility of ISO date/time strings as metadata.
>There are appropriate uses of ISO date/time strings in CF as non-coordinate
>variables and attributes. The NO vote is in regard to their use as CF
>coordinates.
>
> - Steve
>
>
>===================================
>
>On 10/21/2010 11:57 AM, Jon Blower wrote:
>> Hi Benno,
>>
>> 2010-09 is not necessarily a precise specification of a month - time zones
>make it a little fuzzy for one thing. Separate to this, there are parallel
>conversations going on in the ISO/OGC community about what time strings
>actually mean. A metadata person might say that "2010-09" is simply a
>shorthand for the fuzzy concept of "September 2010" and does not represent a
>precise interval (i.e. a square-wave function that is 1 during September and 0
>outside). Apart from the time zone issue which blurs the boundaries, this
>square-wave is simply not what humans mean when, for example, they tag a
>report as having been written in September 2010. It just distinguishes it
>from version 2 of the report, which was written in November. In this context,
>it's just a label with some temporal meaning.
>>
>> These "metadata guys" are in discussion with the "positioning guys" who view
>date/times as precisely-defined positions within a temporal CRS. You may (or
>may not!) like to look at the GeoAPI mailing list, in which we are trying to
>figure out whether we can actually use the same Java types for both of these
>subtly-different views of date/times (we hope we can but haven't agreed). One
>might think that they are obviously the same thing, but I don't think so.
>>
>> You *could* modify CF so that to represent data that are "representative of
>September 2010", you specify a nominal date half-way through September and set
>the bounds to the first and last instants of September. And perhaps use a new
>cell_methods of "representative". But the half-way point and the bounds would
>be quite (very) tedious to compute in the general case (months and years are
>of variable length for example and depend on the calendar system).
>>
>>> Of course, how the data is actually related to that interval is where the
>>> notion of precision might come in
>> Actually, you've probably gathered that I also consider the notion of
>precision to apply to the interval itself, not just how the data relates to
>it.
>>
>> This discussion repeats a bit of the previous discussion on this list
>entitled "bounds/precision for time axis". I like Jonathan's distinction
>between the concepts of temporal resolution and representivity:
>http://www.mail-archive.com/cf-metadata at cgd.ucar.edu/msg01341.html.
>>
>> And just for completeness we should note that ISO8601 strings are not
>fixed-length, nor do they have a maximum length (in contrast to what I said
>before, sorry). So I can see some implementation challenges in NetCDF.
>>
>> Cheers, Jon
>>
>>
>> -----Original Message-----
>> From: bennoblumenthal at gmail.com [mailto:bennoblumenthal at gmail.com] On Behalf
>Of Benno Blumenthal
>> Sent: 21 October 2010 15:43
>> To: Steve Hankin
>> Cc: Jon Blower; cf-metadata at cgd.ucar.edu
>> Subject: Re: [CF-metadata] New standard names for satellite obs data (time
>as ISO strings)
>>
>> While expressing precision in CF is an interesting issue, in this case
>> the Wikipedia quote is using the term in a different sense than I
>> (hopefully we) usually mean -- ISO8601 lets one express time intervals
>> succinctly in a single string, e.g. 2010-09 to mean all of September
>> 2010, which is not an accuracy issue, it is a precise specification of
>> a larger interval. It lets you write 2010-09-01/10-05 as well, i.e.
>> it is not limited to intervals that involve special notational
>> boundaries. As Steve points out CF expresses this using a bounds
>> coordinate, i.e. giving the precise edges of each interval. Of
>> course, how the data is actually related to that interval is where the
>> notion of precision might come in, which cell methods/measures
>> addresses, perhaps inadequately for the purpose at hand.
>>
>> ISO8601 is quite neat in the sense that it forces one to always
>> specify an interval, and CF software reading time bounds data and
>> rendering ISO8601 strings would do us all a lot of good.
>>
>> Benno
>>
>> On Wed, Oct 20, 2010 at 6:34 PM, Steve Hankin<Steven.C.Hankin at noaa.gov>
> wrote:
>>> Hi Jon,
>>>
>>> Why do you see this as an issue of date-times as ISO strings in particular?
>>> The same issues of precision are found in longitudes expressed as a
>>> degrees-minutes-seconds string compared to a floating point. Or indeed to
>a
>>> depth expressed as a decimal string of known numbers of digits. ("100.00"
>>> communicates different precision than "100" though both are represented by
>the
>>> same binary value.)
>>>
>>> CF provides the bounds attribute and the cell methods/measures to clarify
>>> (somewhat) these points. What is your proposal for improved representation
>>> of precisions? And wouldn't a general improvement in how to specify
>>> coordinate precision be preferable to a solution that applies to time,
>only?
>>>
>>> - Steve
>>>
>>> =============================
>>>
>>>
>>> On 10/20/2010 9:41 AM, Jon Blower wrote:
>>>
>>> Hi all,
>>>
>>> I haven't followed this debate closely, but I've had cause to do a fair
>>> amount of thinking (outside the CF context) on the pros and cons of
>>> identifying date/times as strings or numbers. I could probably write a
>>> very boring essay on this but in summary, they are not exactly
>>> equivalent ways of representing the same information.
>>>
>>> One way in which they are different is precision. A value of "x seconds
>>> since y" has no implied precision - typically in programs we take the
>>> precision to be milliseconds, but there's nothing to suggest this in the
>>> actual metadata (anyone who tries to populate a GUI from CF metadata
>>> struggles with this). Semantically it's a time instant; i.e. an
>>> infinitesimal position in a temporal coordinate reference system.
>>> However, an ISO8601 string can have various precisions. (The string
>>> "2009-10" is not considered equivalent to "2009-10-01T00:00:00.000Z".)
>>>
>>> From Wikipedia (http://en.wikipedia.org/wiki/ISO_8601):
>>>
>>> "For reduced accuracy, any number of values may be dropped from any of
>>> the date and time representations, but in the order from the least to
>>> the most significant. For example, "2004-05" is a valid ISO 8601 date,
>>> which indicates May (the fifth month) 2004. This format will never
>>> represent the 5th day of an unspecified month in 2004, nor will it
>>> represent a time-span extending from 2004 into 2005."
>>>
>>> I've argued before in a previous thread on this list that it would be
>>> good to be able to specify the precision of time coordinates in terms of
>>> calendar date/time fields (which isn't the same thing as providing a
>>> tolerance value on the numeric coordinate value of a time axis).
>>>
>>> I'm not saying that we should definitely allow time strings in CF, just
>>> pointing out that they have some use cases we currently can't fulfil.
>>> I'm not sure they are definitively "bad practice" in all cases.
>>>
>>> (Regarding a technical point raised below, yes, it's a pain to represent
>>> variable length strings in NetCDF, but there is a maximum length for
>>> ISO8601 strings.)
>>>
>>> Hope this helps,
>>> Jon
>>>
>>> -----Original Message-----
>>> From: cf-metadata-bounces at cgd.ucar.edu
>>> [mailto:cf-metadata-bounces at cgd.ucar.edu] On Behalf Of Lowry, Roy K
>>> Sent: 20 October 2010 10:00
>>> To: Ben Hetland; cf-metadata at cgd.ucar.edu
>>> Subject: Re: [CF-metadata] New standard names for satellite obs data
>>>
>>> Dear All,
>>>
>>> As others have said, I think this debate is irrelevant as there should
>>> be no need for string timestamps in NetCDF. Providing a Standard Name
>>> only encourages what I consider to be bad practice.
>>>
>>> Cheers, Roy.
>>>
>>> -----Original Message-----
>>> From: cf-metadata-bounces at cgd.ucar.edu
>>> [mailto:cf-metadata-bounces at cgd.ucar.edu] On Behalf Of Ben Hetland
>>> Sent: 20 October 2010 09:14
>>> To: cf-metadata at cgd.ucar.edu
>>> Subject: Re: [CF-metadata] New standard names for satellite obs data
>>>
>>> On 19.10.2010 16:27, Seth McGinnis wrote:
>>>
>>> What about using 'date' for string-valued times? That was my homebrew
>>> solution when I was considering a similar problem.
>>>
>>> If I may butt in and contribute here, I usually prefer names like
>>> 'datetime' or 'timestamp' in cases like this, because 'date' is
>>> potentially confusing. It may not be immediately obvious to a future
>>> reader (or programmer) that a variable called 'date' supports points in
>>> time down to for example seconds of accuracy.
>>>
>>>
>>> (Note that string data is a big pain to deal with in NetCDF-3, because
>>> you're limited to fixed-length character arrays. You need to use
>>> NetCDF-4 / HDF5 to get Strings as a data type.)
>>>
>>> (It may not be such a practical issue with ISO 8601 strings, as a
>>> reasonable max. length can be determined, I presume.)
>>>
>>>
>>> _______________________________________________
>>> CF-metadata mailing list
>>> CF-metadata at cgd.ucar.edu
>>> http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
>>>
>>>
>>
>>
Received on Fri Oct 22 2010 - 14:03:04 BST

