
[CF-metadata] udunits handling of fuzzy time units

From: Christopher Barker <Chris.Barker>
Date: Thu, 17 Mar 2011 09:50:41 -0700

On 3/16/11 8:47 AM, John Caron wrote:

> 1. time instants vs time duration
> - one must distinguish between dimensional time ("time duration",
> units="secs"), and calendar time ("time instant", or "point on the time
> continuum") which is not dimensional.

yup -- key clarification in all this.

> - calendar time always references a calendar, default calendar is
> Gregorian, aka standard

> - udunits is a good reference library for dimensional time, but not for
> calendar time

I still think the library is not the issue here -- CF should be
independent of any particular library implementation.

Anyway, I think the real issue here is: what are appropriate units for
what John is calling "dimensional time"? -- "months" and "years" are
poor choices, period.

As an example, the Python datetime module has a "timedelta" object --
this is John's "dimensional time". timedeltas can be instantiated by
specifying: days, seconds, microseconds, milliseconds, minutes, hours,
or weeks. Not months or years, for exactly these reasons.
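
To make that concrete, here is a minimal sketch using only the standard
library. The variable names are mine, purely for illustration:

```python
from datetime import timedelta

# Fixed-length units are accepted and reduce to an exact number of seconds.
step = timedelta(weeks=1, hours=6)
step_seconds = step.total_seconds()  # 7 * 86400 + 6 * 3600

# "months" is not a keyword that timedelta accepts, so this raises TypeError.
try:
    timedelta(months=1)
    months_accepted = True
except TypeError:
    months_accepted = False
```

The constructor simply has no way to express a month, because a month
has no fixed length.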

> 2. calendar time

> - calendar time representation needs to be clarified
> - udunits should no longer be the reference library for calendar time. a
> new reference library is needed, which handles non-standard calendars.

again, the lib is not the point -- the point is that the calendar in use
needs to be clearly defined when specifying a calendar time. Personally,
I'd rather the "idealized Gregorian calendar" be the standard, but there
may be reasons to use others. So if people have a good use case for
calendars other than Gregorian, then there needs to be a way to specify
what calendar is used, and that needs to be a part of the CF standard --
and that is really an independent proposal.

> - udunit date representation ("n timeUnit since ISO_date") must be
> retained for backwards compatibility. "month" and "year" timeUnit should
> be redefined in CF version 1.x to refer to calendar fields, not fixed
> length time durations.

ouch! -- I think that is a bad idea.

1) it's a change, so backward compatibility is broken.

2) "month" and "year" in this context should be deprecated.

It seems abundantly clear to me that if you want:

January, February, March, ...

you are specifying something different than when you are specifying
something that happens every n seconds, and there should be a different
way to specify that. Why? Because the kinds of operations you can/should do
with the data are different. To work with "n timeUnit since ISO_date",
you can immediately work with that data as an axis, and compute stuff
from it, smoothing, integrals, all the stuff Steve mentioned. Often I
don't need to know, and don't use, the start_date part of the
specification at all, because all I care about is the interval.
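
A rough sketch of what I mean, assuming a hypothetical axis in
"seconds since <some date>" and made-up data values. The reference
date never enters the calculation:

```python
# Hypothetical time axis: values of "seconds since <reference date>".
t = [0.0, 3600.0, 7200.0, 10800.0]
# Made-up rate values (per second) sampled at those times.
rate = [2.0, 4.0, 4.0, 2.0]

# Trapezoidal integral of rate over time, using only the intervals.
total = sum(
    0.5 * (rate[i] + rate[i + 1]) * (t[i + 1] - t[i])
    for i in range(len(t) - 1)
)
```

Because the unit is a fixed duration, the differences between axis
values are real elapsed times and can be used as-is.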

"monthly" data, on the other hand, inherently has to be handled
differently. If the above proposal were enacted, I'd have to convert
the start date to a datetime, then use a library to convert the monthly
interval to a real time, etc. before I could do anything with it -- I
couldn't assume anything about the timestep (not accurately anyway).

Or, if I wanted to use the data in a categorical way, I'd still have to
use a library to convert it to named months.

So, no matter how you want to use it, you need to use a library to
manipulate the data first.
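
Here is a sketch of that extra step, assuming the Gregorian calendar and
a start date on the 1st of a month. add_calendar_months is a hypothetical
helper written for this example, not part of any library:

```python
from datetime import date

def add_calendar_months(start, n):
    """Hypothetical helper: advance the month field of `start` by n.

    Assumes start.day exists in the target month (e.g. day 1).
    """
    total = start.year * 12 + (start.month - 1) + n
    return start.replace(year=total // 12, month=total % 12 + 1)

start = date(2011, 1, 1)
# "0, 1, 2, 3 months since 2011-01-01" converted to real dates.
instants = [add_calendar_months(start, n) for n in range(4)]
# The resulting time step is not constant.
steps_in_days = [(b - a).days for a, b in zip(instants, instants[1:])]
```

The steps come out as 31, 28 and 31 days, so nothing can be assumed
about the time step until the conversion has been done.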


One other question:

One of the things that has come up here is information such as "monthly
average" i.e. the average temperature in January. How does that get
expressed? There is no start date to use.

On 3/16/11 10:13 AM, Jon Blower wrote:
> How about the following: if we want to add fixed durations to the temporal datum, we use the current syntax:
>
> "duration since datum"
>
> as we do currently. But if we want to add calendar fields to the datum we use:
>
> "field_name since datum by calendar field"
>
> or something similar?

+1

> i think the UTC_Calendar case that Tim brought up indicates that all
> units like month, year, day, hour, (not second), should mean "increment
> that field using calendar system x". if that's the case, maybe prepending
> "_calendar" is not needed?

Ouch! -- despite my general preference for purity, in this case I think
"practicality beats purity". Generally accepted definitions for the
length of minute, hour and day (and week) are available, so we may as
well use them.

> If that's the case, maybe prepending "_calendar" is not needed?

Being explicit is a good thing!


>
>>> - the grammar for udunit date representations should be defined,
>>> so that multiple libraries can implement it
>> It is not perfectly obvious what it should do. n months since 1st of a
>> month
>> makes sense, but what does "1 calendar_month since 31 Jan 2008" mean, for
>> instance.

Another good reason to avoid this whole concept! "30 days since 31 Jan
2008" is perfectly well defined.
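
A small sketch of the difference, using Python's datetime module. The
"naive" month increment below is just one possible reading of
"1 calendar_month since", shown as an assumption:

```python
from datetime import date, timedelta

start = date(2008, 1, 31)

# A fixed duration always has one answer (2008 is a leap year).
fixed = start + timedelta(days=30)

# One naive reading of "1 calendar_month since 31 Jan 2008": bump the
# month field. 31 Feb does not exist, so this raises ValueError.
try:
    start.replace(month=start.month + 1)
    naive_ok = True
except ValueError:
    naive_ok = False
```

The fixed-duration result is 1 March 2008, while the calendar-field
version has no answer without some extra rule that would have to be
spelled out in the standard.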

What does your "monthly" data mean? If it's categorical, use a
categorical representation; if it's a time step, use a unit appropriate
for time steps.

> perhaps forbid month/year in the udunits. removing fractions in the "by calendar field" would make life a lot easier.

+1

> could someone define what resolution means ?

I think in this context, "resolution" is the same as "precision", as
opposed to "accuracy". Accuracy expresses how correct a value is.
Precision expresses how specifically defined a value is. So to say that
a temperature is 31C is only precise to 1 degree, but could be perfectly
accurate, whereas a temperature of 31.13543C is far more precise, but if
measured by a badly calibrated thermometer, not accurate at all.
Ideally, the two are in sync when data are represented.

In this context, having a resolution of 1 day means that the data
apply to the day as a whole, perhaps an average, or simply the only
measurement taken that day.

Or does that just confuse things more?

-Chris



-- 
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception
Chris.Barker at noaa.gov
Received on Thu Mar 17 2011 - 10:50:41 GMT

