
[CF-metadata] udunits time unit question

From: Christopher Barker <Chris.Barker>
Date: Tue, 29 Mar 2011 09:35:29 -0700

On 3/26/11 6:07 PM, Benno Blumenthal wrote:
> 1/365 years since 1960-01-01 calendar 365_days (equivalent to days in
> 365_day calendar)

Is it? What varies here, the length of the year or the length of the
day? If the length of the year is well defined (what is it here?), then
1/365 years is well defined, and it very well may not be 24 hours. If it
is, then why the heck do you not use "days since"?
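
To put a number on that, here is a minimal sketch. It assumes the
library plugs in a udunits-style year of 365.242198781 days; the
constants and the helper name are illustrative, not taken from any
particular data set.

```python
from datetime import timedelta

# Assumed year lengths, in days. The first is the udunits-style year;
# the second is what a 365_day (no-leap) calendar would imply.
UDUNITS_YEAR_DAYS = 365.242198781
NOLEAP_YEAR_DAYS = 365.0


def fraction_of_year(denominator, year_days):
    """Return the length of 1/denominator of a year as a timedelta."""
    return timedelta(days=year_days / denominator)


step_udunits = fraction_of_year(365, UDUNITS_YEAR_DAYS)
step_noleap = fraction_of_year(365, NOLEAP_YEAR_DAYS)

print("1/365 year, udunits-style year:", step_udunits)
print("1/365 year, 365_day calendar:  ", step_noleap)
print("difference from 24 hours:      ", step_udunits - timedelta(days=1))
```

Under the first assumption each step is roughly a minute longer than
24 hours; under the second it is exactly one day, in which case "days
since" says the same thing with less room for misreading.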

> 1/360 years since 1960-01-01 calendar 360_days (equivalent to days in
> 360_day calendar)
> 1/12 years since 1960-01-01 calendar 360_days (the much maligned months)

but which month? one with a defined number of hours, all the same, or
one that varies depending on where in the calendar year the data is?

As I read these, I'm quite confused -- I can certainly see why one might
want to collect or store data at such frequencies, but that's not what
the CF standard is about. It is about clearly defining the data when
stored, transmitted and shared -- ALL of the above examples are ripe for
confusion, and could just as easily be expressed as "hours since" or
"days since".

> months since 1960-01-01 calendar 360_days

what is a "calendar 360_day"?

> (just as a reminder -- it is
> important to us, no matter how many people write to say months are
> meaningless)

I don't think any of us think months are meaningless -- simply that they
are not a good choice for well-defined time durations.

It seems to me that there are two cases:

1) The data is defined on some kind of regular interval (or irregular,
but on clearly defined points in the time continuum) - in which case
use an appropriate unit: seconds, hours, days. (days is open to debate,
I suppose)

or

2) the data correspond to calendar months, i.e. monthly averaged data,
etc -- a way to specify "calendar unit" makes sense: "calendar months
since 1989-01"

But using all these peculiar combinations of units, so that the time
variable can be (1, 2, 3, 4) rather than, say, (0, 5, 10, 15), makes no
sense to me.

And is it used in any other context? If you have an instrument that is
gathering data at 1/2 Hz, would you do:

"2 seconds since date-time" then have your time variable be: (0,1,2,3)?

I think not, you'd do:

"seconds since date-time" and have your time variable be: (0,2,4,6)
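
As a small sketch of that second form: the sample rate, sample count,
and reference date below are made-up values for illustration only.

```python
# Hypothetical instrument sampling at 1/2 Hz, i.e. one sample every 2 seconds.
sample_rate_hz = 0.5
period_s = 1.0 / sample_rate_hz
n_samples = 4

# Store elapsed time in a plain, fixed-length unit.
units = "seconds since 2011-03-29 00:00:00"  # made-up reference date-time
time_values = [i * period_s for i in range(n_samples)]

print(units, time_values)
```

The sampling interval is still recoverable from the values themselves,
and a reader needs no special handling for a scaled unit.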


> However, then you lose the semantics of a 3 month interval. As Benno (sorry for spelling
> your name wrong last time) showed, the semantics of the qualifier for x since have real
> meaning in climate datasets.

>> I am looking at how hard that would be to support, but it does add
>> perhaps unneeded complexity.
>
> But it adds semantics of the time periods in question.

The question is "is the added information worth the added complexity?".
Also, one of the goals of the CF standard is to make it easier to have
client software that can easily work with data from a variety of data
sets -- this kind of thing makes it harder, not easier. If you want to
give the user some information about the data collection interval, put
it in optional meta-data.


There is a lot of talk about udunits as the "reference library", and
while I do see the value of a reference library, I also think that we
need to remember that:

1) we can define a standard without a specific reference lib actually
existing.

2) not everyone is going to use udunits -- which means that if we add
all this complexity to the standard, we need to not only add a bunch of
code to handle it in udunits, but also everyone else that uses other
libraries for units has to deal with it -- please don't make me write
that code!

I have no idea how anyone else handles time in their client code, but
what I do is:

- first, convert the time axis to date-time objects
- second, convert those to whatever I need to work with.

I do this so that my client code can all be the same: I don't have to
deal with multiple units, reference dates, etc. in most of my code.
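
A minimal sketch of that first step, assuming a units string of the
form "<unit> since <ISO date-time>" and only fixed-length units. The
function name decode_times and the unit table are illustrative; this is
not how udunits or any particular client does it.

```python
from datetime import datetime, timedelta

# Fixed-length units only; months and years are deliberately left out
# because their length is not well defined.
SECONDS_PER_UNIT = {
    "seconds": 1.0,
    "minutes": 60.0,
    "hours": 3600.0,
    "days": 86400.0,
}


def decode_times(values, units):
    """Convert numeric offsets plus a 'unit since reference' string to datetimes."""
    unit, sep, ref = units.partition(" since ")
    if not sep:
        raise ValueError(f"expected '<unit> since <date-time>', got {units!r}")
    unit = unit.strip().lower()
    if unit not in SECONDS_PER_UNIT:
        raise ValueError(f"unsupported or ambiguous time unit: {unit!r}")
    reference = datetime.fromisoformat(ref.strip())
    scale = SECONDS_PER_UNIT[unit]
    return [reference + timedelta(seconds=v * scale) for v in values]


times = decode_times([0, 6, 12], "hours since 2011-03-29 00:00:00")
for t in times:
    print(t.isoformat())
```

Once everything is a date-time object, the rest of the client code does
not need to know which unit or reference date the file used.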

On 3/28/11 9:31 AM, Steve Hankin wrote:
> 1. the *use of "months" as a unit of measure*, with the intention
> that this refer to calendar months of varying length ... not a
> "unit of measure" by the normal meaning of that word

It's not clear to me that that is the intention, actually -- I think it
means that in some uses, and means "1/12 of a year" in others (even
though year isn't clearly defined, either!)

In fact, since udunits is the official reference lib, and it does define
a month as being a specific length, I think current practice is the
latter use.

> The question before us is, does the fact that there are some existing
> "CF" files that utilize these encodings, require that those encodings be
> formalized into CF? It is hard to say "no" in such cases, because doing
> so creates inconvenience for our colleagues and their users.

Sure, but if a standard is defined as "everything in current use", there
isn't really much point in having it at all.

I think my point above is relevant here:

There are data sets that use "months since date-time" in existence.
However, anyone using those data sets with the udunits lib is
interpreting that data in a way that may not be what the data creator
intended -- this is simply broken. It cannot be enshrined as a standard.
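
To illustrate the mismatch, here is a rough sketch comparing the two
readings of "1 month since 1960-01-01". It assumes a fixed-length month
of 1/12 of a 365.242198781-day year for the udunits-style reading; the
helper names are made up for this example.

```python
from datetime import datetime, timedelta

# Assumed fixed-length month: 1/12 of a 365.242198781-day year.
ASSUMED_MONTH_DAYS = 365.242198781 / 12


def fixed_length_months(reference, n):
    """Treat a month as a fixed duration (the udunits-style reading)."""
    return reference + timedelta(days=n * ASSUMED_MONTH_DAYS)


def calendar_months(reference, n):
    """Step whole calendar months (the reading a data creator may intend).

    Only handles integer n and a reference day that exists in every month.
    """
    total = reference.year * 12 + (reference.month - 1) + n
    year, month0 = divmod(total, 12)
    return reference.replace(year=year, month=month0 + 1)


reference = datetime(1960, 1, 1)
as_fixed = fixed_length_months(reference, 1)
as_calendar = calendar_months(reference, 1)

print("fixed-length month:", as_fixed)
print("calendar month:    ", as_calendar)
```

With these assumptions the fixed-length reading lands partway through
31 January, while the calendar reading lands on 1 February, so the same
file yields different time stamps depending on which reading the
software picks.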

> We should look at each of these questions from the point of view of the
> complexity of the software *reading* the file.

+1

-Chris



-- 
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception
Chris.Barker at noaa.gov
Received on Tue Mar 29 2011 - 10:35:29 BST
