
[CF-metadata] udunits handling of fuzzy time units

From: John Caron <caron>
Date: Fri, 18 Mar 2011 13:21:29 -0600

On 3/17/2011 10:50 AM, Christopher Barker wrote:
>
>> 2. calendar time
>
>> - calendar time representation needs to be clarified
>> - udunits should no longer be the reference library for calendar time. a
>> new reference library is needed, which handles non-standard calendars.
>
> again, the lib is not the point -- the point is that the calendar in
> use needs to be clearly defined when specifying a calendar time.
> Personally, I'd rather the "idealized Gregorian calendar" be the
> standard, but there may be reasons to use others. So if people have a
> good use case for calendars other than Gregorian, then there needs to
> be a way to specify what calendar is used, and that needs to be a part
> of the CF standard -- and that is really an independent proposal.

CF already has the calendar attribute for this (see section 4.4.1).
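
To illustrate why the calendar has to be stated, here is a minimal sketch (standard-library Python only) that decodes the same "days since 2000-01-01" value under two calendars. The helper names decode_gregorian and decode_noleap are made up for this example and are not part of udunits or any CF library; the no-leap decoder is a simplified stand-in for a real calendar library.

```python
from datetime import date, timedelta

# Month lengths in a "noleap" (365_day) calendar: February always has 28 days.
NOLEAP_MONTH_LENGTHS = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]

def decode_gregorian(days, ref=date(2000, 1, 1)):
    """Decode 'days since 2000-01-01' with Python's (proleptic Gregorian) date."""
    return (ref + timedelta(days=days)).isoformat()

def decode_noleap(days, ref_year=2000):
    """Decode 'days since <ref_year>-01-01' assuming every year has 365 days."""
    year_offset, day_of_year = divmod(days, 365)
    month = 0
    # Walk through the months until the remaining day count fits.
    while day_of_year >= NOLEAP_MONTH_LENGTHS[month]:
        day_of_year -= NOLEAP_MONTH_LENGTHS[month]
        month += 1
    return f"{ref_year + year_offset:04d}-{month + 1:02d}-{day_of_year + 1:02d}"

# 2000 is a leap year in the Gregorian calendar, so the two decodings
# of the same coordinate value disagree from March onwards.
print(decode_gregorian(60))  # 2000-03-01
print(decode_noleap(60))     # 2000-03-02
```

The coordinate value is identical in both cases; only the calendar attribute tells a reader which date was meant.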

>
>> - udunit date representation ("n timeUnit since ISO_date") must be
>> retained for backwards compatibility. "month" and "year" timeUnit should
>> be redefined in CF version 1.x to refer to calendar fields, not fixed
>> length time durations.
>
> ouch! -- I think that is a bad idea.
>
> 1) it's a change, so backward compatibility is broken.
>
> 2) "month" and "year" in this context should be deprecated.
>
> It seems abundantly clear to me that if you want:
>
> January, February, March, ...
>
> you are specifying something different than when you are specifying
> something that happens every n seconds, there should be a different
> way to specify that. Why? because the kinds of operations you
> can/should do with the data are different. To work with "n timeUnit
> since ISO_date", you can immediately work with that data as an axis,
> and compute stuff from it, smoothing, integrals, all the stuff Steve
> mentioned. Often I don't need to know, and don't use, the start_date
> part of the specification at all, because all I care about is the
> interval.
>
> "monthly" data, on the other hand, inherently has to be handled
> differently. If the above proposal were enacted, I'd have to convert
> the start date to a datetime, then use a library to convert the
> monthly interval to a real time, etc. before I could do anything with
> it -- I couldn't assume anything about the timestep (not accurately
> anyway).

Well, plenty of algorithms just want, say, the monthly average. But if
you want to know the exact time interval, you need a library that deals
with calendars. The nice thing about the current udunits way of doing
things is that the data provider calculates the interval for you.
Unfortunately, for complicated cases, she may not do it correctly, and/or
the calendar date that you derive from the time unit may not be what was
intended. Both problems argue for creating a standard library that does
the calculation.
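
As a rough sketch of the difference between the two readings of "months since", the snippet below compares a fixed-length month with a calendar-field month. It assumes the udunits-style definition of a year as 3.15569259747e7 seconds and a month as one twelfth of that; please check those constants against your udunits version. The function names are hypothetical, and add_calendar_months only works when the start day exists in the target month.

```python
from datetime import datetime, timedelta

# Assumed udunits-style definitions: a year is a fixed number of seconds,
# and a month is one twelfth of a year (about 30.44 days).
UDUNITS_YEAR_SECONDS = 3.15569259747e7
UDUNITS_MONTH_SECONDS = UDUNITS_YEAR_SECONDS / 12

def add_fixed_months(start, n):
    """Interpret 'n months since start' as a fixed-length duration."""
    return start + timedelta(seconds=n * UDUNITS_MONTH_SECONDS)

def add_calendar_months(start, n):
    """Interpret 'n months since start' as a step in the calendar month field.

    Sketch only: raises ValueError if start.day does not exist in the
    target month (for example, day 31 stepping into February).
    """
    total = start.year * 12 + (start.month - 1) + n
    year, month0 = divmod(total, 12)
    return start.replace(year=year, month=month0 + 1)

start = datetime(2011, 1, 1)
for n in range(4):
    print(n, add_fixed_months(start, n), add_calendar_months(start, n))
```

With a start of 2011-01-01, one fixed-length month lands late on January 31, while one calendar month lands on February 1, so the two interpretations disagree already at the first step.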

>
> Or, if I wanted to use the data in a categorical way, I'd still have
> to use a library to convert it to named months.
>
> so, no matter how you want to use it, you need to use a library to
> manipulate the data first.
>
>
> One other question:
>
> One of the things that has come up here is information such as
> "monthly average" i.e. the average temperature in January. How does
> that get expressed? There is no start date to use.

CF has the "bounds" attribute to be precise about the start and end of
the time interval associated with each coordinate (section 7.1). These
should always be used for something like an average, along with
cell_methods (section 7.3).
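
For example, monthly means for 2011 could be described roughly as follows. This is only a sketch in plain Python: the variable names time_bnds, time_attrs and temp_attrs, and the choice of "days since 2011-01-01" as the unit, are assumptions for illustration, and nothing is written to a netCDF file here.

```python
from datetime import date

REFERENCE = date(2011, 1, 1)  # assumed reference date for the time units

def days_since(d):
    """Whole days elapsed between the reference date and d."""
    return (d - REFERENCE).days

def month_bounds(year):
    """Return (start, end) bounds in days for each calendar month of a year."""
    edges = [date(year, m, 1) for m in range(1, 13)] + [date(year + 1, 1, 1)]
    return [(days_since(a), days_since(b)) for a, b in zip(edges, edges[1:])]

time_bnds = month_bounds(2011)
# One possible choice of coordinate value: the midpoint of each interval.
time = [(lo + hi) / 2 for lo, hi in time_bnds]

# Attributes as they might appear on the time coordinate and the data variable.
time_attrs = {
    "units": "days since 2011-01-01 00:00:00",
    "calendar": "standard",
    "bounds": "time_bnds",
}
temp_attrs = {"cell_methods": "time: mean"}

print(time_bnds[:2])  # [(0, 31), (31, 59)]
```

The bounds carry the exact start and end of each month, so a reader does not need to guess the averaging interval from the coordinate value alone.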


>> could someone define what resolution means ?
>
> I think in this context, "resolution" is the same as "precision", as
> opposed to "accuracy". Accuracy expresses how correct a value is.
> Precision expresses how specifically defined a value is. So to say
> that a temperature is 31C is only precise to 1 degree, but could be
> perfectly accurate, whereas a temperature of 31.13543C is far more
> precise, but if measured by a badly calibrated thermometer, not
> accurate at all. Ideally, the two are in-sync when data are represented.
>
> In this context, having a resolution of 1 day means that the data
> applies to the day as a whole, perhaps an average, or simply the only
> measurement taken that day.
>
> Or does that just confuse things more?

I guess I'd just say that coordinate bounds are the way we handle this
currently, and it seems good enough. Or is there something missing?
Received on Fri Mar 18 2011 - 13:21:29 GMT

