
[CF-metadata] New standard names for satellite obs data

From: Jon Blower <j.d.blower>
Date: Thu, 21 Oct 2010 15:28:42 +0100

Hi all,

OK, I had four replies to my email concerning ISO8601 time strings and
precision so I'm going to respond briefly to them all simultaneously,
without the aid of a safety net. I hope I'm capturing the important
points:


1. Steve and Roy both asked (I think) why we can't have a more general
notion of precision that applies to all coordinate axes, not just time.
I argue that precision in time *can be* (but is not always) different
semantically. When I say "these data are representative of June 2009" I
don't mean "these data were collected on the 15th of June 2009, plus or
minus 15 days". The concept of "data representative of a time period"
is very different from "data collected at an uncertain time".
"Representative" data is commonly found in high-level analysed products,
not so much in instrument records I guess.

2. Also (still replying to Steve), there's nothing to stop me
interpreting "100.0" and "100.00" as exactly the same value - there's no
*explicit* precision, although a human *might* infer one. (Anyway, when
numbers are represented as binary there's no way of telling the
difference.) In ISO8601, the truncation is defined to imply precision
(I think).

3. Nan said: "...there's nothing to prevent us from indicating lower
resolution by using "minutes since" or "hours since" instead...".
There's nothing in the CF conventions that says that "minutes since"
implies a precision of minutes, *although perhaps it could* (a topic for
a proposal?). AFAIK, "60.0 seconds since X" is currently treated by the
conventions as exactly the same as "1.0 minute since X". But what if
the actual precision is nanoseconds? An axis defined as "y nanoseconds
since x" would have some pretty big numbers, although maybe that's not a
problem.
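
A minimal sketch of points 2 and 3, in plain Python rather than any CF or
UDUNITS library. The reference time X, the SECONDS_PER_UNIT table and the
decode() helper are hypothetical names chosen for illustration only. It
checks that a float keeps no record of written precision, that "60.0
seconds since X" and "1.0 minute since X" decode to the same instant, and
how large a nanosecond count since 1970 gets for a 2010 date.

```python
from datetime import datetime, timedelta

# Point 2: once parsed to binary, the written number of decimals is gone.
same_float = float("100.0") == float("100.00")

# Point 3: hypothetical reference time "X" and unit table (assumptions).
X = datetime(2010, 1, 1)
SECONDS_PER_UNIT = {"seconds": 1.0, "minutes": 60.0, "hours": 3600.0}

def decode(value, unit, reference=X):
    """Convert 'value <unit> since reference' to a datetime."""
    return reference + timedelta(seconds=value * SECONDS_PER_UNIT[unit])

# The unit name changes the numbers stored, not the instant described.
same_instant = decode(60.0, "seconds") == decode(1.0, "minutes")

# Nanoseconds since 1970-01-01 for 2010-10-21, using integer arithmetic.
days = (datetime(2010, 10, 21) - datetime(1970, 1, 1)).days
ns = days * 86400 * 10**9

fits_int64 = ns < 2**63        # representable as a 64-bit signed integer
exact_in_float64 = ns < 2**53  # float64 holds every integer below 2**53
```

Under these assumptions the count fits a 64-bit integer but exceeds the
range in which a double represents every integer exactly, so the storage
type of such an axis would matter.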


I can see another couple of disadvantages to using ISO8601 strings
though:

i. UDUNITS strings are not ISO8601 strings, although they are close.
Would it be confusing to have two different syntaxes for specifying
time?

ii. ISO8601 strings are defined to be in the Gregorian calendar
(although it could be sensible to relax this assumption and just take
the string to indicate year/month/day etc in a specified calendar, as
long as that calendar has the same notion of year/month/day fields).


Jon

-----Original Message-----
From: cf-metadata-bounces at cgd.ucar.edu
[mailto:cf-metadata-bounces at cgd.ucar.edu] On Behalf Of John Graybeal
Sent: 20 October 2010 20:00
To: John Caron
Cc: cf-metadata at cgd.ucar.edu
Subject: Re: [CF-metadata] New standard names for satellite obs data

Count me in the group of people who are sorry you lost your bid to
include ISO-8601 time strings, John. I have voted on the ISO 8601 side
myself (although as I recall, more in the spirit of representing
multiple times in a single file).

I understand it raises complexity considerably to allow ISO-8601
formatted time in place of the regular format of the udunits time. So I
can accept not going down that path. With John C's note that this would
merely permit the addition of ISO-8601 variables, not the replacement of
the standard coordinate, I fail to see how it would be a bad thing. It
really is a common data representation and content, for which there is
currently no acceptable standard name. Under these conditions, is there
a specific bad practice being violated?

Here are the advantages of this option as I see them:

1) Readability in native form without conversion. Understandably not a
high priority for a binary standard like netCDF, but for auxiliary
variables (not the time coordinate) this is a non-trivial benefit.

2) User chooses appropriate resolution, which is unambiguous. My ISO
8601 timestamp can be YYYYMMDD, YYYYMMDDThhmmss, or many other
variations of these, according to my own data set. If I have mixed data
in one netCDF file, I can even represent different resolutions within
the same file. I am not aware of any equivalent way to represent time
precision in netCDF.

3) It can represent date, time, or a combination thereof. It might be
argued this is a negative due to the lack of a priori certainty (which
kind of value is represented here?). If this is a problem, it can be
resolved via slightly finer standard_name selection (e.g.,
datetime_iso8601 vs timecode_iso8601)

4) It can include rich time zone information. Often this is relevant in
time data (that is, timestamps) collected from sensors or computer
systems.

5) It gives me a standard_name for storing a quite common encoding of
data values (considering time as a data value, which it often is)
without transformation. By allocating the max length, all smaller ISO
8601 strings can be accommodated. (Note: because "There is no limit on
the number of decimal places for the decimal fraction", I'm not sure
there is an a priori limit on all ISO 8601 strings -- this would have to
be set in the variable definition for the file.)
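
A rough sketch of points 2 and 5, assuming only the basic-format shapes
mentioned above (YYYYMMDD, optionally followed by Thh, Thhmm, Thhmmss and
a decimal fraction of seconds). The names ISO_BASIC, resolution() and
pad_to_max() are hypothetical, time zone designators are deliberately not
handled, and this is not a full ISO 8601 parser.

```python
import re

# Matches YYYYMMDD with an optional Thh[mm[ss[.fraction]]] part.
# Assumption: fractions are only accepted on the seconds field.
ISO_BASIC = re.compile(
    r"^(?P<date>\d{8})"
    r"(?:T(?P<hh>\d{2})"
    r"(?:(?P<mm>\d{2})"
    r"(?:(?P<ss>\d{2})(?P<frac>[.,]\d+)?)?)?)?$"
)

def resolution(stamp):
    """Return the resolution implied by how far the string is written out."""
    m = ISO_BASIC.match(stamp)
    if m is None:
        raise ValueError("unsupported timestamp shape: %r" % stamp)
    if m.group("frac"):
        return "fractional second"
    if m.group("ss"):
        return "second"
    if m.group("mm"):
        return "minute"
    if m.group("hh"):
        return "hour"
    return "day"

def pad_to_max(stamps):
    """Pad mixed-resolution strings to the longest one (fixed-width storage)."""
    width = max(len(s) for s in stamps)
    return width, [s.ljust(width) for s in stamps]

mixed = ["20101021", "20101021T1528", "20101021T152842"]
levels = [resolution(s) for s in mixed]
width, padded = pad_to_max(mixed)
```

The width here is taken from the data at hand. Because the decimal
fraction has no fixed length, a file would still have to declare its own
maximum, as noted in point 5.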

ISO 8601 is definitely a string format, so this mixes encoding format
and concept name in the same standard_name. In CF, perhaps that itself
is a bad practice. But it is such a commonly used standard for a data
value that I wonder if the practice is worth allowing in this case. Or,
if not, could the use of ISO 8601 be supported via some other,
automatically detectable alternative?

John





On Oct 20, 2010, at 05:35, John Caron wrote:

> On 10/19/2010 12:55 PM, Mike Grant wrote:
>> On 19/10/10 14:21, Aleksandar Jelenak wrote:
>>> Actually, I don't think it is possible to use 'time' standard name in
>>> such cases. If I correctly interpret CF rules for using standard names,
>>> 'time' data can be only in the physically-equivalent units to "seconds".
>>> Strings, being dimensionless, do not qualify.
>>
>> Out of curiosity, why do you want to store time as strings? It's easy
>> to create those strings from numerical values, and numerical values are
>> easier to handle in code (and in netcdf-3, as Seth said).
>>
>> Cheers,
>>
>> Mike.
>
> I made a proposal a few years ago to allow ISO-8601 time strings to be
> an allowable form of time coordinates, which was not accepted. I would
> be interested to hear what your reasons are to use this form vs udunits
> (eg "secs since reference")? ISO-8601 time strings are fixed length (21
> I think?) so handling in netcdf-3 is not so hard.
>
> Your proposal would amount to standardizing how to include ISO-8601
> time strings, but the standard udunits time coordinate would still be
> required.
>
> Clarification of your purpose might clarify the name. At first glance,
> I might prefer "time_iso8601" over "time_label_iso8601".
>
> John
> _______________________________________________
> CF-metadata mailing list
> CF-metadata at cgd.ucar.edu
> http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata



John Graybeal <mailto:jgraybeal at ucsd.edu>
phone: 858-534-2162
System Development Manager
Ocean Observatories Initiative Cyberinfrastructure Project:
http://ci.oceanobservatories.org
Marine Metadata Interoperability Project: http://marinemetadata.org

Received on Thu Oct 21 2010 - 08:28:42 BST

