
[CF-metadata] time as ISO strings

From: Jon Blower <j.d.blower>
Date: Mon, 25 Oct 2010 09:37:35 +0100

Hi Steve,

 

I agree with all of these, as it seems does the community in general.
For the record, I'd like to add another objection, that might also argue
against using ISO8601 for time *metadata*. I'm not sure it's been
explicitly called out yet that UDUNITS time specifications are
inconsistent with ISO8601 in more than one way. For example, "1900-1-1
0:0:0" is legal in UDUNITS, but not ISO8601 (fields must be zero-padded
and there must be a "T" in between the date and time). This is in
addition to the different defaults to local time or UTC (actually I
guess this is a property of CF, not UDUNITS?)
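
The mismatch can be illustrated with a minimal Python sketch. The regular
expression is a simplified stand-in for the ISO8601 extended date-time form,
not a full validator, and the strptime pattern only approximates what UDUNITS
accepts. The names ISO_EXTENDED, looks_like_iso8601 and parse_udunits_like
are hypothetical.

```python
import re
from datetime import datetime

# Assumed, simplified pattern for the ISO8601 extended form
# YYYY-MM-DDThh:mm:ss (zero-padded fields, "T" separator).
# This is NOT a complete ISO8601 validator.
ISO_EXTENDED = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}")


def looks_like_iso8601(text):
    """Return True if text matches the simplified extended pattern above."""
    return ISO_EXTENDED.fullmatch(text) is not None


def parse_udunits_like(text):
    """Parse a 'Y-M-D h:m:s' reference time; zero-padding is optional here."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S")


udunits_style = "1900-1-1 0:0:0"
parsed = parse_udunits_like(udunits_style)

print(looks_like_iso8601(udunits_style))       # the UDUNITS-style string
print(looks_like_iso8601(parsed.isoformat()))  # the re-serialised form
print(parsed.isoformat())
```

The same instant is recovered either way; only the accepted spelling differs,
which is the source of the potential confusion.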

 

Anyway, it seems that it might be confusing to introduce two string
representations of date/times into CF, even for metadata. Are there any
important cases in which ISO8601 is more expressive than UDUNITS syntax,
or could we stick to UDUNITS for everything?

 

Cheers, Jon

 

From: cf-metadata-bounces at cgd.ucar.edu
[mailto:cf-metadata-bounces at cgd.ucar.edu] On Behalf Of Steve Hankin
Sent: 22 October 2010 02:06
To: CF metadata
Subject: Re: [CF-metadata] time as ISO strings

 

Hi Jon, Benno, John and other pals,

Since this email thread already contains an element of informal voting
I'll cast my ballot: CF is a better standard WITHOUT admitting ISO date
strings as an encoding for time coordinates. My opinion is based
upon this outlook:

* 'To create quality software, the ability to say "no" is usually
far more important than the ability to say "yes."'
(http://queue.acm.org/detail.cfm?id=1142044)

Bloat and runaway complexity are a continual threat to the quality of a
standard as it evolves. I'd argue that the measure of whether a new
feature deserves inclusion should be whether it adds useful new
functionality and does so in a manner that is "clean" -- preserving the
consistency and simplicity of the standard to the degree feasible.

* Introducing ISO strings as coordinates adds no encoding power to
CF. It raises issues of precision but then fails to address them
adequately. (In fact it imports a host of ambiguities about
interpretation of precision as Jon pointed out to us in the "metadata"
versus "positioning" debate.)
* It fails the test of consistency, since it is not applicable to
virtually identical precision issues that exist for latitude, longitude
and vertical coordinates.
* The need to support two separate encodings (ISO and "days since
xxx") for the same date/time coordinate information would force
additional complexity into the standards documentation and into clients
that would hope to be generic CF applications. It would degrade
interoperability for clients that did not support both encodings.
* It fails to address the multiple-calendar needs of CF,
creating another inconsistency. (ISO 8601 does not standardize the
interpretation of dates prior to the 1582 Julian-Gregorian discontinuity
... which likely means that library codes cannot be trusted to give
consistent results.)
* It does not even add a useful measure of convenience. The "-t"
option to ncdump already provides the needed convenience for file
readers. And for file creators, a new utility that would translate ISO
strings into CF time step values could probably be written in less time
than this email dialog has occupied. (If this statement appears
exaggerated consider that the effort to develop this utility would be
paid over and over in the clients that would have to interpret this
encoding.)
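
A rough sketch of the kind of translation utility mentioned in the last
bullet, written in Python. The function name iso_to_cf_days and the choice of
reference epoch are assumptions made for illustration. The sketch handles
only the proleptic Gregorian calendar of Python's datetime, so it does not
address the multiple-calendar needs of CF, and it treats strings without a
UTC offset as UTC.

```python
from datetime import datetime, timezone


def iso_to_cf_days(iso_string, reference="1970-01-01T00:00:00+00:00"):
    """Convert an ISO8601 date-time string to 'days since <reference>'.

    Assumptions of this sketch: proleptic Gregorian calendar, and any
    string without an explicit offset is interpreted as UTC.
    """
    def to_utc(text):
        moment = datetime.fromisoformat(text)
        if moment.tzinfo is None:
            moment = moment.replace(tzinfo=timezone.utc)
        return moment

    delta = to_utc(iso_string) - to_utc(reference)
    return delta.total_seconds() / 86400.0


# Value that could be written to a coordinate variable whose units
# attribute is "days since 1970-01-01 00:00:00".
print(iso_to_cf_days("2010-10-25T12:00:00"))
```

A file creator would call this once per time step and store the resulting
numbers; clients then need only the existing numeric encoding.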

None of this is a comment on the utility of ISO date/time strings as
metadata. There are appropriate uses of ISO date/time strings in CF as
non-coordinate variables and attributes. The NO vote is in regard to
their use as CF coordinates.

    - Steve


===================================

On 10/21/2010 11:57 AM, Jon Blower wrote:

Hi Benno,
 
2010-09 is not necessarily a precise specification of a month - time
zones make it a little fuzzy for one thing. Separate to this, there are
parallel conversations going on in the ISO/OGC community about what time
strings actually mean. A metadata person might say that "2010-09" is
simply a shorthand for the fuzzy concept of "September 2010" and does
not represent a precise interval (i.e. a square-wave function that is 1
during September and 0 outside). Apart from the time zone issue which
blurs the boundaries, this square-wave is simply not what humans mean
when, for example, they tag a report as having been written in September
2010. It just distinguishes it from version 2 of the report, which was
written in November. In this context, it's just a label with some
temporal meaning.
 
These "metadata guys" are in discussion with the "positioning guys" who
view date/times as precisely-defined positions within a temporal CRS.
You may (or may not!) like to look at the GeoAPI mailing list, in which
we are trying to figure out whether we can actually use the same Java
types for both of these subtly-different views of date/times (we hope we
can but haven't agreed). One might think that they are obviously the
same thing, but I don't think so.
 
You *could* modify CF so that to represent data that are "representative
of September 2010", you specify a nominal date half-way through
September and set the bounds to the first and last instants of
September. And perhaps use a new cell_methods of "representative". But
the half-way point and the bounds would be quite (very) tedious to
compute in the general case (months and years are of variable length for
example and depend on the calendar system).
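
For the simplest case, the computation can be sketched in Python as follows.
This assumes the proleptic Gregorian calendar, ignores time zones, and takes
the upper bound to be the first instant of the following month rather than
the last instant of the month itself. The function name month_cell is
hypothetical. Other CF calendars (360_day, noleap and so on) would need
different month lengths, which is where the tedium comes in.

```python
from datetime import datetime


def month_cell(year, month):
    """Return (lower bound, nominal midpoint, upper bound) for a month.

    Assumes the proleptic Gregorian calendar and naive (zone-less) times.
    The upper bound is the first instant of the following month.
    """
    lower = datetime(year, month, 1)
    if month == 12:
        upper = datetime(year + 1, 1, 1)
    else:
        upper = datetime(year, month + 1, 1)
    midpoint = lower + (upper - lower) / 2
    return lower, midpoint, upper


lower, midpoint, upper = month_cell(2010, 9)
print(lower.isoformat(), midpoint.isoformat(), upper.isoformat())
```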
 

        Of course, how the data is actually related to that interval is where
        the notion of precision might come in

 
Actually, you've probably gathered that I also consider the notion of
precision to apply to the interval itself, not just how the data relates
to it.
 
This discussion repeats a bit of the previous discussion on this list
entitled "bounds/precision for time axis". I like Jonathan's
distinction between the concepts of temporal resolution and
representivity:
http://www.mail-archive.com/cf-metadata at cgd.ucar.edu/msg01341.html.
 
And just for completeness we should note that ISO8601 strings are not
fixed-length, nor do they have a maximum length (in contrast to what I
said before, sorry). So I can see some implementation challenges in
NetCDF.
 
Cheers, Jon
 
 
-----Original Message-----
From: bennoblumenthal at gmail.com [mailto:bennoblumenthal at gmail.com] On
Behalf Of Benno Blumenthal
Sent: 21 October 2010 15:43
To: Steve Hankin
Cc: Jon Blower; cf-metadata at cgd.ucar.edu
Subject: Re: [CF-metadata] New standard names for satellite obs data
(time as ISO strings)
 
While expressing precision in CF is an interesting issue, in this case
the Wikipedia quote is using the term in a different sense than I
(hopefully we) usually mean -- ISO8601 lets one express time intervals
succinctly in a single string, e.g. 2010-09 to mean all of September
2010, which is not an accuracy issue, it is a precise specification of
a larger interval. It lets you write 2010-09-01/10-05 as well, i.e.
it is not limited to intervals that involve special notational
boundaries. As Steve points out CF expresses this using a bounds
coordinate, i.e. giving the precise edges of each interval. Of
course, how the data is actually related to that interval is where the
notion of precision might come in, which cell methods/measures
addresses, perhaps inadequately for the purpose at hand.
 
ISO8601 is quite neat in the sense that it forces one to always
specify an interval, and CF software reading time bounds data and
rendering ISO8601 strings would do us all a lot of good.
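
A minimal Python sketch of that rendering step, assuming a time axis in "days
since 2010-01-01" on the proleptic Gregorian calendar. The function name
bounds_to_iso_interval and the reference date are illustrative assumptions,
not part of CF or of any existing library.

```python
from datetime import datetime, timedelta


def bounds_to_iso_interval(lower_days, upper_days,
                           reference=datetime(2010, 1, 1)):
    """Render one pair of CF time bounds as an ISO8601 'start/end' interval.

    Assumes bounds are given in days since `reference` on the proleptic
    Gregorian calendar.
    """
    start = reference + timedelta(days=lower_days)
    end = reference + timedelta(days=upper_days)
    return f"{start.isoformat()}/{end.isoformat()}"


# Bounds of a cell covering September 2010 on a "days since 2010-01-01" axis.
print(bounds_to_iso_interval(243, 273))
```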
 
Benno
 
On Wed, Oct 20, 2010 at 6:34 PM, Steve Hankin <Steven.C.Hankin at noaa.gov>
<mailto:Steven.C.Hankin at noaa.gov> wrote:

        Hi Jon,
         
        Why do you see this as an issue of date-times as ISO strings in particular?
        The same issues of precision are found in longitudes expressed as a
        degrees-minutes-seconds string compared to a floating point. Or indeed to a
        depth expressed as a decimal string of known numbers of digits. ("100.00"
        communicates different precision than "100" though both are represented by the
        same binary value.)
         
        CF provides the bounds attribute and the cell methods/measures to clarify
        (somewhat) these points. What is your proposal for improved representation
        of precisions? And wouldn't a general improvement in how to specify
        coordinate precision be preferable to a solution that applies to time, only?
         
            - Steve
         
        =============================
         
         
        On 10/20/2010 9:41 AM, Jon Blower wrote:
         
        Hi all,
         
        I haven't followed this debate closely, but I've had cause to do a fair
        amount of thinking (outside the CF context) on the pros and cons of
        identifying date/times as strings or numbers. I could probably write a
        very boring essay on this but in summary, they are not exactly
        equivalent ways of representing the same information.
         
        One way in which they are different is precision. A value of "x seconds
        since y" has no implied precision - typically in programs we take the
        precision to be milliseconds, but there's nothing to suggest this in the
        actual metadata (anyone who tries to populate a GUI from CF metadata
        struggles with this). Semantically it's a time instant; i.e. an
        infinitesimal position in a temporal coordinate reference system.
        However, an ISO8601 string can have various precisions. (The string
        "2009-10" is not considered equivalent to "2009-10-01T00:00:00.000Z".)
         
        From Wikipedia (http://en.wikipedia.org/wiki/ISO_8601):
         
        "For reduced accuracy, any number of values may be dropped from any of
        the date and time representations, but in the order from the least to
        the most significant. For example, "2004-05" is a valid ISO 8601 date,
        which indicates May (the fifth month) 2004. This format will never
        represent the 5th day of an unspecified month in 2004, nor will it
        represent a time-span extending from 2004 into 2005."
         
        I've argued before in a previous thread on this list that it would be
        good to be able to specify the precision of time coordinates in terms of
        calendar date/time fields (which isn't the same thing as providing a
        tolerance value on the numeric coordinate value of a time axis).
         
        I'm not saying that we should definitely allow time strings in CF, just
        pointing out that they have some use cases we currently can't fulfil.
        I'm not sure they are definitively "bad practice" in all cases.
         
        (Regarding a technical point raised below, yes, it's a pain to represent
        variable length strings in NetCDF, but there is a maximum length for
        ISO8601 strings.)
         
        Hope this helps,
        Jon
         
        -----Original Message-----
        From: cf-metadata-bounces at cgd.ucar.edu
        [mailto:cf-metadata-bounces at cgd.ucar.edu] On Behalf Of Lowry, Roy K
        Sent: 20 October 2010 10:00
        To: Ben Hetland; cf-metadata at cgd.ucar.edu
        Subject: Re: [CF-metadata] New standard names for satellite obs data
         
        Dear All,
         
        As others have said, I think this debate is irrelevant as there should
        be no need for string timestamps in NetCDF. Providing a Standard Name
        only encourages what I consider to be bad practice.
         
        Cheers, Roy.
         
        -----Original Message-----
        From: cf-metadata-bounces at cgd.ucar.edu
        [mailto:cf-metadata-bounces at cgd.ucar.edu] On Behalf Of Ben Hetland
        Sent: 20 October 2010 09:14
        To: cf-metadata at cgd.ucar.edu
        Subject: Re: [CF-metadata] New standard names for satellite obs data
         
        On 19.10.2010 16:27, Seth McGinnis wrote:
         
        What about using 'date' for string-valued times? That was my homebrew
        solution when I was considering a similar problem.
         
        If I may butt in and contribute here, I usually prefer names like
        'datetime' or 'timestamp' in cases like this, because 'date' is
        potentially confusing. It may not be immediately obvious to a future
        reader (or programmer) that a variable called 'date' supports points in
        time down to for example seconds of accuracy.
         
         
        (Note that string data is a big pain to deal with in NetCDF-3, because
        you're limited to fixed-length character arrays. You need to use
        NetCDF-4 / HDF5 to get Strings as a data type.)
         
        (It may not be such a practical issue with ISO 8601 strings, as a
        reasonable max. length can be determined, I presume.)
         
         
        _______________________________________________
        CF-metadata mailing list
        CF-metadata at cgd.ucar.edu
        http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
         
         

 
 
 
Received on Mon Oct 25 2010 - 02:37:35 BST

This archive was generated by hypermail 2.3.0 : Tue Sep 13 2022 - 23:02:41 BST
