
[CF-metadata] New standard name: datetime_iso8601 (standard_name or units?)

From: Jim Biard <jim.biard>
Date: Thu, 28 Mar 2013 12:06:44 -0400

I agree wholeheartedly with Steve!

Jim Biard
Research Scholar
Cooperative Institute for Climate and Satellites
Remote Sensing and Applications Division
National Climatic Data Center
151 Patton Ave, Asheville, NC 28801-5001

jim.biard at noaa.gov
828-271-4900

On Mar 28, 2013, at 11:54 AM, Steve Hankin <steven.c.hankin at noaa.gov> wrote:

> ... and to remind us of the road we're considering taking
>
> netCDF files are in every sense "binary" files. They cannot be read except by custom-built utilities. (Or is there a constituency that wants to access CF using the unix "strings" command?) In all cases except the present discussion, it is the job of those custom-built utilities to generate formatted string representations of the information contained in the CF binary encoded variables.
>
> The entire current discussion would not be happening, if the custom-built utilities and standard code libraries supported the ability to get time information into and out of our binary files using formatted ISO 8601 strings. As the saying goes, if the only tool you have is a hammer, then everything looks to you like a nail. This email forum is for discussing changes to the CF standard (our hammer), so we are hammering away at the current need (interoperability with ISO 8601). There is no doubt in my mind that it is the wrong tool for the task.
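
As a rough illustration of the utility-layer conversion described above, here is a minimal Python sketch that keeps the stored value numeric and produces ISO 8601 strings only at the edges. It is not taken from any netCDF or CF library; the epoch, the implied units string ("seconds since 1970-01-01T00:00:00Z"), and the function names `numeric_to_iso` and `iso_to_numeric` are assumptions made for the example, and only the standard Gregorian calendar is handled.

```python
from datetime import datetime, timedelta, timezone

# Assumed reference epoch, matching a hypothetical units attribute of
# "seconds since 1970-01-01T00:00:00Z" in the standard Gregorian calendar.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def numeric_to_iso(seconds_since_epoch):
    """Format a stored numeric time value as an ISO 8601 string."""
    return (EPOCH + timedelta(seconds=seconds_since_epoch)).isoformat()


def iso_to_numeric(text):
    """Parse an ISO 8601 string (with UTC offset) back to seconds since the epoch."""
    return (datetime.fromisoformat(text) - EPOCH).total_seconds()


if __name__ == "__main__":
    stamp = numeric_to_iso(1364486804)
    print(stamp)
    print(iso_to_numeric(stamp))
```

With this split, the file holds only numbers plus a units string, and the choice of string format stays in the reading and writing tools.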
>
> - Steve
>
> =========================================
>
> On 3/28/2013 7:49 AM, Jim Biard wrote:
>> Hi.
>>
>> The format of the string is not what is being described. That can be described by the documentation (be it CF, ISO, or a combination). So what is it that we are trying to describe, apart from questions of format? (Expanding on Chris's previous mention, a user-defined type that was a structure with elements for year, month, day, hour, minute, and second could be used instead of a string.)
>>
>> It seems to me that we are trying to figure out how to denote that a variable contains a "non-arithmetic" expression of time, similar to "degree minute second hemisphere" representations of latitude and longitude. (Non-arithmetic may be a poor way of expressing what I mean. I'm trying to say that you can't just take two values and add or subtract them in an atomic operation.) You can represent such values in strings, but you can also represent them by packing them into long integers (to millisecond accuracy). The question of whether or not this is a wise thing to do is something else altogether.
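
To make the "non-arithmetic" point concrete, the following Python sketch packs a date and time into a single long integer with millisecond resolution and shows that subtracting two packed values does not give the elapsed time. The decimal layout YYYYMMDDhhmmssmmm and the helper names `pack` and `unpack` are assumptions chosen for illustration, not anything defined by CF or netCDF.

```python
from datetime import datetime


def pack(dt):
    """Pack a datetime into one integer laid out as YYYYMMDDhhmmssmmm."""
    ms = dt.microsecond // 1000
    value = dt.year
    for field in (dt.month, dt.day, dt.hour, dt.minute, dt.second):
        value = value * 100 + field
    return value * 1000 + ms


def unpack(value):
    """Recover the datetime from an integer produced by pack()."""
    value, ms = divmod(value, 1000)
    value, second = divmod(value, 100)
    value, minute = divmod(value, 100)
    value, hour = divmod(value, 100)
    value, day = divmod(value, 100)
    year, month = divmod(value, 100)
    return datetime(year, month, day, hour, minute, second, ms * 1000)


if __name__ == "__main__":
    a = datetime(2013, 2, 28, 23, 59, 59)
    b = datetime(2013, 3, 1, 0, 0, 0)
    # The two instants are one second apart, but the packed integers are not.
    print((b - a).total_seconds())
    print(pack(b) - pack(a))
```

The packed values sort correctly and fit in a 64-bit integer, but any difference or sum has to go through `unpack` first, which is exactly the extra conversion layer the numeric "time since epoch" encoding avoids.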
>>
>> I see no reason to exclude the use of the units attribute to denote that the values are expressions of time in which the time since the epoch has been diced up into years, months, days, hours, minutes, and seconds (with varying precision indicated by omission of finer resolution elements). Our current use of the units attribute for time does more than just specify the units (days vs. hours, etc.). What are the units for such a non-arithmetic time value? They are complex. We could specify something like "years months days" (in the case of a variable that contained dates only), or we could specify something like "datetime". When you went to the units table to find out what datetime means, you would find a description.
>>
>> As far as that goes, I can see a valid argument for declaring a new standard name to use for such variables. If we had a standard name "date" or "datetime", we could use this to differentiate between arithmetic and non-arithmetic time expressions. The units attribute could then express which elements were present in the representation, or such variables could be considered to have no units. We could also specify that variables with a standard name of "date" (for example) must be of string type. (This also has a side benefit - at least to some - of preventing such variables from being used as time axes.)
>>
>> In all these cases, the calendar attribute is critical to placing the values into a reference frame, and must be included.
>>
>> Regarding Roy's alternatives, I get serious heartburn when considering 1) and 2). The long name is not supposed to be a place where machines would go to get information about how to interpret the contents of a variable. Everybody seems to want to encroach on it lately. Similarly, the calendar attribute has a specific role, which is to identify the reference frame for the time information. Adding type/units information to this attribute just muddies the water even further.
>>
>> As far as alternative 3 goes, I have no problem with adding one or more attributes to such variables if it helps clarify something for posterity, but I think we must still resolve what to do with standard name and units for such variables.
>>
>> Having thought through all of that, I am leaning towards using a standard name of "date" or "datetime" (and use of units, etc as described above) if we are going to add non-arithmetic expressions of time to CF. I would prefer that we stick with the current restriction that the storage format for times be numeric (that is, in essence, what we currently have), and leave the question of representation formats up to other layers, but I understand the desire to have a way to store human-readable dates/times that would be consistent across files.
>>
>> I've had many headaches maintaining a proprietary legacy software base (not netCDF-related) that didn't separate storage and representation formats, because of the amount of code that was needed to handle all of the cross-conversions.
>>
>> Grace and peace,
>>
>> Jim
>>
>>
>> On Mar 28, 2013, at 5:48 AM, "Lowry, Roy K." <rkl at bodc.ac.uk> wrote:
>>
>>> Dear All,
>>>
>>> I think Chris has hit the nail on the head here. In my view neither the Standard Name nor the units of measure are the way to describe what is in essence the format of a string. So, what other options are there open to us? I can see three alternatives:
>>>
>>> 1) Use the long name to describe the string format (not just the standard used but the profile)
>>> 2) Use the existing calendar attribute
>>> 3) Specify a suitable extension to CF to do the job.
>>>
>>> These are roughly in my order of preference.
>>>
>>> Cheers, Roy.
>>>
>>> ________________________________________
>>> From: CF-metadata [cf-metadata-bounces at cgd.ucar.edu] On Behalf Of Chris Barker - NOAA Federal [chris.barker at noaa.gov]
>>> Sent: 27 March 2013 15:56
>>> Cc: cf-metadata at cgd.ucar.edu
>>> Subject: Re: [CF-metadata] New standard name: datetime_iso8601 (standard_name or units?)
>>>
>>> On Wed, Mar 27, 2013 at 8:05 AM, Steve Hankin <steven.c.hankin at noaa.gov> wrote:
>>>
>>>> ISO date-time strings are a way of encoding the physical quantity
>>>> that we know as TIME. So TIME is the "right" standard_name for ISO
>>>> date-time strings per the definition quoted above.
>>>>
>>>> Now, it may be that there is a compelling argument for violating the normal
>>>> definition of standard_name for the case of ISO date-time strings. Or on
>>>> the other hand is it preferable to use the units attribute to indicate the
>>>> use of an ISO date-time string?
>>>
>>> An ISO string for a datetime is not a name (it's still time), but it
>>> is not a unit either.
>>>
>>> What it is is a data type -- more akin to a float or integer -- i.e. a
>>> particular way to translate bytes to a value. The bytes are a char
>>> array, and the value is the datetime itself.
>>>
>>> I don't know if thinking about it this way is helpful, as we are
>>> building on netcdf, and I don't know that netcdf allows you to define
>>> new data types, but food for thought.
>>>
>>> Also, of course, all the other data types in netcdf (and CF) are
>>> direct translations to commonly used binary formats in computers, and
>>> this one is not.
>>>
>>> hmm -- a quick peek at the netcdf4 docs says:
>>>
>>> "The richer enhanced model supports user-defined types and data structures"
>>>
>>> So maybe this could be a user defined type?
>>>
>>> Having said that, I don't support using ISO strings to define
>>> datetimes in CF. I understand particular use-cases, like keeping the
>>> original time stamp from a data collection system and the like, but
>>> then maybe it's really just arbitrary auxiliary text information, in
>>> which case maybe we don't need a standard name or custom data types at
>>> all.
>>>
>>> -Chris
>>>
>>>
>>>
>>> --
>>>
>>> Christopher Barker, Ph.D.
>>> Oceanographer
>>>
>>> Emergency Response Division
>>> NOAA/NOS/OR&R (206) 526-6959 voice
>>> 7600 Sand Point Way NE (206) 526-6329 fax
>>> Seattle, WA 98115 (206) 526-6317 main reception
>>>
>>> Chris.Barker at noaa.gov
>>> _______________________________________________
>>> CF-metadata mailing list
>>> CF-metadata at cgd.ucar.edu
>>> http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
>>>
>>
>>
>>
>

Received on Thu Mar 28 2013 - 10:06:44 GMT

This archive was generated by hypermail 2.3.0 : Tue Sep 13 2022 - 23:02:41 BST
