
[CF-metadata] time as ISO strings

From: Ben Hetland <ben.a.hetland>
Date: Mon, 25 Oct 2010 21:27:53 +0200

On 25.10.2010 10:37, Jon Blower wrote:
> For the record, I'd like to add another objection, that might also
> argue against using ISO8601 for time *metadata*. I'm not sure it's
> been explicitly called out yet that UDUNITS time specifications are
> inconsistent with ISO8601 in more than one way. For example,
> "1900-1-1 0:0:0" is legal in UDUNITS, but not ISO8601 (fields must
> be zero-padded and there must be a "T" in between the date and time).

If UDUNITS generally is more permissive than ISO8601 in what it accepts,
I would say this might be a good case for deprecating one form over the
other. I.e. if there were no (critical) incompatibilities between them,
and strict ISO8601 is still accepted by UDUNITS [1], then one could
deprecate any form which is NOT strictly ISO8601, yet still generally
allow or accept it as input. In this way:

a) Existing CF data sets are still generally accessible.

b) It is possibly a bit easier to implement parsing of these things in
clients that for some reason do not make use of the UDUNITS library.
(This is at the expense of not being able to read some legacy data
files, which may not be critical to that application.)

c) We align with a formal standard such as ISO's, rather than the
de facto one that UDUNITS represents. (Where is that UDUNITS time
format documented anyway? I only found [2].)

d) Easier to integrate with other systems expecting or producing ISO8601.


If such a convention is accepted, then 1900-1-1 0:0:0 would be accepted
on input, but a validator tool would only accept a canonical form like
1900-01-01T00:00:00Z, which would also be the form that the library APIs
would produce.
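
As a rough illustration of the accept-leniently, emit-canonically idea, here is a minimal Python sketch. It does not use the UDUNITS library; the regular expression is only a guess at "permissive" input, the names canonicalize and is_canonical are hypothetical, and a missing zone is assumed to mean UTC.

```python
import datetime
import re

# Assumed lenient pattern: unpadded fields, "T" or blanks between date and time.
# This is an illustration only, not the grammar that UDUNITS actually implements.
_LENIENT = re.compile(
    r"^(\d{1,4})-(\d{1,2})-(\d{1,2})"
    r"(?:[T ]+(\d{1,2}):(\d{1,2})(?::(\d{1,2}))?)?"
    r"\s*(Z|UTC)?$"
)
_CANONICAL = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z")


def canonicalize(text):
    """Accept a lenient timestamp and return the canonical ISO8601 form."""
    match = _LENIENT.match(text.strip())
    if match is None:
        raise ValueError("unrecognised timestamp: %r" % text)
    year, month, day = (int(g) for g in match.group(1, 2, 3))
    # Missing time fields default to zero.
    hour, minute, second = (int(g) if g else 0 for g in match.group(4, 5, 6))
    # Let datetime reject impossible values such as month 13 (raises ValueError).
    datetime.datetime(year, month, day, hour, minute, second)
    # Assumption: a timestamp without a zone is taken to be UTC.
    return "%04d-%02d-%02dT%02d:%02d:%02dZ" % (
        year, month, day, hour, minute, second)


def is_canonical(text):
    """What a strict validator tool would check."""
    return _CANONICAL.fullmatch(text) is not None


print(canonicalize("1900-1-1 0:0:0"))        # 1900-01-01T00:00:00Z
print(is_canonical("1900-1-1 0:0:0"))        # False
print(is_canonical("1900-01-01T00:00:00Z"))  # True
```

A reader would call canonicalize on whatever it finds in a file, while a validator would flag anything for which is_canonical returns False.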

In the same manner one could also settle for only a _subset_ of ISO8601,
by reducing the number of alternatives and dropping support for features
that appear not to be applicable to CF. For example, always specify
as "days since 2010-10-25", never "days since 20101025" or "days since
2010W431". The rationale for such a decision could be that they aren't
_strictly_ necessary, and allowing them generally makes life more
complex for the ones who implement the reader software, thereby
generally acting as an impediment to data exchange.[3] In this case,
though, one should adopt the ISO-compliant variants if one finds that
any of them becomes applicable during a subsequent revision of CF.
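
To make the cost argument concrete, the following Python sketch shows that the three spellings name the same day, and how small a reader becomes when only the extended calendar form is allowed. The function name parse_reference_date is hypothetical, and date.fromisocalendar needs Python 3.8 or later.

```python
import datetime
import re

# The only form the restricted subset would allow: YYYY-MM-DD.
_EXTENDED_DATE = re.compile(r"(\d{4})-(\d{2})-(\d{2})")


def parse_reference_date(text):
    """Parse a date in extended calendar form only; reject everything else."""
    match = _EXTENDED_DATE.fullmatch(text)
    if match is None:
        raise ValueError("not in YYYY-MM-DD form: %r" % text)
    year, month, day = (int(g) for g in match.groups())
    return datetime.date(year, month, day)


# The three ISO8601 spellings from the text all denote the same day.
extended = parse_reference_date("2010-10-25")
basic = datetime.datetime.strptime("20101025", "%Y%m%d").date()
week = datetime.date.fromisocalendar(2010, 43, 1)  # 2010W431: week 43, day 1

print(extended == basic == week)  # True
```

A full ISO8601 reader would need all three code paths (plus ordinal dates); the restricted reader needs only the first.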



> This is in addition to the different defaults to local time or UTC
> (actually I guess this is a property of CF, not UDUNITS?)

The timezone default is perhaps the only major "incompatibility"
between the two alternatives? Is it "major" enough to warrant losing the
advantages I have indicated? Is the following a viable solution, then?

1) If formatted the "UDUNITS way", defaults to UTC.

2) If formatted the "ISO8601 way", defaults to local timezone according
to ISO8601. Also, the special meaning of "-00:00" could apply.

3) Recommended practice is to use the suffix "Z" with UTC times only.
Rationale: Facilitate data exchange and time-related processing across
as many software applications as possible.


An alternative to case 2 could be NOT to allow any "default semantic" at
all in that case. While being narrower than ISO8601 allows, this could
catch situations like "forgot to specify". Since few probably have used
ISO8601-formatting already, maybe it wouldn't even break backwards
compatibility with existing data files? Another bonus: The receiver
wouldn't need to scratch his or her head figuring out exactly which
local zone was meant... :-)
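
The defaulting rules discussed above could be sketched in Python roughly as follows. How a reader tells the "UDUNITS way" from the "ISO8601 way" is not settled in this thread; the sketch simply assumes that a "T" separator marks the ISO8601 form. The name resolve_zone and its strict flag are hypothetical, and only "Z" and "+hh:mm"/"-hh:mm" designators are recognised.

```python
import re

# Zone designator at the end of the string: "Z" or a two-digit hh:mm offset.
_ZONE = re.compile(r"(Z|[+-]\d{2}:\d{2})$")


def resolve_zone(timestamp, strict=True):
    """Return the zone that applies to a timestamp string.

    strict=True implements the alternative to case 2: an ISO8601-style
    timestamp without an explicit zone is rejected instead of defaulted.
    """
    match = _ZONE.search(timestamp)
    if match is not None:
        return match.group(1)  # An explicit zone always wins.
    if "T" in timestamp:
        # Assumption: the "T" separator marks the "ISO8601 way".
        if strict:
            raise ValueError("no time zone given: %r" % timestamp)
        return "local"  # Case 2: ISO8601 default is local time.
    return "Z"  # Case 1: the "UDUNITS way" defaults to UTC.


print(resolve_zone("1900-1-1 0:0:0"))                     # Z
print(resolve_zone("1900-01-01T00:00:00+02:00"))          # +02:00
print(resolve_zone("1900-01-01T00:00:00", strict=False))  # local
```

With strict=True, the "forgot to specify" case surfaces as an error at read time rather than as a silently wrong time axis.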



> Anyway, it seems that it might be confusing to introduce two string
> representations of date/times into CF, even for metadata. Are
> there any important cases in which ISO8601 is more expressive than
> UDUNITS syntax, or could we stick to UDUNITS for everything?

Well, besides the week-date "expressiveness" illustrated above, ISO8601
can indicate entire centuries (just "20" means the range of years from
2000 to 2099, I believe; "poor resolution" if you like, or maybe just
sufficient or more suitable for some cases).

There are time durations and such, but those are perhaps sufficiently
supported in UDUNITS too? (Or by UDUNITS formats in combination with
multiple attributes in netCDF.)

ISO8601 can express recurring time intervals; I don't think UDUNITS can.
But are they potentially needed in CF?

There is some disagreement regarding dates before the Gregorian era
started in 1582. UDUNITS switches to Julian calendar and seems to be
able to support years back to 9999 BCE ("-9999"), although it is somewhat
ambiguous (?) as to what the values "-0", "0" and "+0" would mean--if they
are legal at all. ISO8601 is clearer, but in its strictest sense allows for
years outside 1582..9999 only "by agreement of the partners in
information interchange". One might also argue that it is somewhat less
expressive (by not allowing a calendar change[4], for instance).
However, I must point out that the "CF Convention" might be just such an
excellent place to indicate the mutual agreement for extending the
supported dates to any desired range.[5]



Notes:

[1] I don't think it actually is, but maybe it ought to be in a future
revision of UDUNITS? It doesn't seem like a date like 2004-167, 20060101
or even 2010W431 would match, for instance.

[2]
<http://www.unidata.ucar.edu/software/udunits/udunits-2/udunits2lib.html#Time>

[3] Adapting the formatting is generally trivial for the writer software
compared to the demands of implementing a robust reader, so placing a
few more requirements on the writer's end doesn't imply the same degree
of impediment.

[4] On the other hand, what about expressing Julian dates after 1582 in
UDUNITS? The change didn't happen in 1582 in quite a lot of places, so
this is just a potential source of confusion especially for those that
need to represent a lot of historical dates.

[5] ISO8601 section 4.1.2.4 "Expanded representations". At least the
semantics are fairly clear, and with a fairly well-defined axis scale
for the time dimension common to all such cases.

-- 
Regards,
	-+-Ben-+-
Opinions expressed are my own, not necessarily those of my employer.
Received on Mon Oct 25 2010 - 13:27:53 BST
