Dear all,
I won't make any recommendations for udunits, but I will comment on the
CF-conventions.
In general, I think we should discourage usage of calendar month as a
unit of measurement because for the real world, these are only defined
for 12 special periods each year (the beginning to the end of each
calendar month) and the "unit" is not constant throughout the year.
Nevertheless there are some good arguments for considering adopting a
special calendar month unit by the CF conventions, but only for limited
and very specific purposes. (I'll refer to these new units as
"calendar_months" in the following, with the understanding that the unit
will depend on the calendar adopted and will in general vary for
different months of the year and depend on whether or not the year is a
leap year.) Here are reasons (already articulated by others) why we
might want to adopt "calendar_months" as a unit:
1) there are existing data sets with monthly-mean data and a
time-coordinate that is supposed to indicate how many calendar months
have passed since some base month/year. Such data sets are not easily
accommodated by CF.
2) for some judiciously selected reference times, coordinates expressed
in "calendar_months since" can be easily generated without the help of
calendar-aware time-translation algorithms. For example, given units of
"calendar_months since 2001-01-01" we can trivially generate the
coordinate values for the middle of each month: 0.5, 1.5, 2.5, etc. It
would be much more difficult to generate the coordinate values in units
of "days since ...".
3) monthly mean data sets with time coordinates based on different
calendars can more easily be compared because if all the data sets
adopted the same reference time, then comparable months would have the
same coordinate values, independent of the actual month length defined
by different calendars. [In contrast, if the time coordinate were given
in units of "days since ... ", the coordinate values would depend on the
calendar.]
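
To illustrate point 2, here is a minimal sketch, assuming the Gregorian
calendar as implemented by Python's standard library. The function names
are hypothetical and chosen only for this example. It contrasts the
calendar-aware work needed for mid-month values in "days since" with the
trivial sequence in "calendar_months since".

```python
import calendar
from datetime import date


def mid_month_days_since(ref_year, n_months):
    """Mid-month offsets in 'days since ref_year-01-01' (Gregorian calendar)."""
    ref = date(ref_year, 1, 1)
    values = []
    for k in range(n_months):
        year, month = ref_year + k // 12, k % 12 + 1
        start = (date(year, month, 1) - ref).days          # days to month start
        length = calendar.monthrange(year, month)[1]       # days in this month
        values.append(start + length / 2)
    return values


def mid_month_calendar_months(n_months):
    """Mid-month offsets in the proposed 'calendar_months since' units."""
    return [k + 0.5 for k in range(n_months)]


print(mid_month_days_since(2001, 3))      # [15.5, 45.0, 74.5]
print(mid_month_calendar_months(3))       # [0.5, 1.5, 2.5]
```

The second function needs no knowledge of month lengths or leap years,
which is the convenience being argued for.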
If we define a unit of "calendar_months since ..." we would need to
address Jonathan's point that it is not immediately obvious how to
handle fractions of months and reference times different from the
beginning of a month. One approach we should consider is to restrict
use of calendar_month units to datasets reporting monthly values only
(not daily, annual, hourly, etc.). Also, for simplicity, we could
restrict the reference time to be the beginning of a calendar year
(i.e., January 1 at 00:00:00 for some specified year). If we did this, it
would be relatively easy to define what we mean by fractions of months
and it would also be easy to generate the values needed to define the
time-coordinates and the cell_bounds or the bounds needed to define
climatologies. [It would be almost as easy if we allowed the reference
time to be the beginning of *any* month, but let's consider the more
restrictive "beginning of a year" option first.]
I note that using calendar_month units to report data at intervals other
than monthly intervals offers no advantages. For example, different
fractional month increments would have to be used to report data at an
invariant daily interval. This would seem to introduce complexity to a
simple time-coordinate and is why I would restrict use of calendar_month
units to monthly data.
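
A small sketch of why daily data fit poorly, assuming the Gregorian
calendar and a hypothetical helper name: the same one-day step
corresponds to a different fraction of a calendar month depending on the
month it falls in.

```python
import calendar


def one_day_in_calendar_months(year, month):
    """Size of a one-day step, expressed as a fraction of that calendar month."""
    return 1.0 / calendar.monthrange(year, month)[1]


# The increment changes from month to month (and with leap years).
print(one_day_in_calendar_months(2001, 1))   # 1/31
print(one_day_in_calendar_months(2001, 2))   # 1/28
print(one_day_in_calendar_months(2004, 2))   # 1/29 (leap year)
```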
Since months are not all equal in length, the interval of times
represented by the same fraction may differ between months. For a
Gregorian calendar, half-way through the month of January would be noon
on the 16th of January (15.5 days from the beginning of the month),
whereas half-way through the month of June would be the 16th of June at
00:00:00 (15.0 days from the beginning of the month). Thus 0.5 months
since 2001-01-01 would be 2001-01-16 12:00:00 and 5.5 months since
2001-01-01 would be 2001-06-16 00:00:00. Also note that the date
corresponding to the middle of February depends on whether the year is a
leap year or not.
Of course for a different calendar (e.g., 360_day), the dates
corresponding to 0.5 months since 2001-01-01 and 5.5 months since
2001-01-01 would be different from those for the Gregorian calendar.
The bottom line is that under the above restrictions, it is easy to
convert fractions of months to dates (and also to alternative units such
as "days since ...").
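
The conversion described above can be sketched as follows. This is only
an illustration under stated assumptions: the reference time is the
beginning of a calendar year, the calendar is the proleptic Gregorian
calendar of Python's datetime module, and both function names are
hypothetical. Other CF calendars (e.g., 360_day) would need a
calendar-aware library instead of the standard library.

```python
import calendar
import math
from datetime import datetime, timedelta


def calendar_months_to_datetime(value, ref_year):
    """Convert 'calendar_months since ref_year-01-01' to a datetime."""
    whole = math.floor(value)            # completed months since the reference
    frac = value - whole                 # fraction of the current month
    year = ref_year + whole // 12
    month = whole % 12 + 1
    days_in_month = calendar.monthrange(year, month)[1]
    # The fraction is scaled by the length of the month it falls in.
    return datetime(year, month, 1) + timedelta(days=frac * days_in_month)


def calendar_months_to_days_since(value, ref_year):
    """Express the same instant as 'days since ref_year-01-01'."""
    delta = calendar_months_to_datetime(value, ref_year) - datetime(ref_year, 1, 1)
    return delta.total_seconds() / 86400.0


print(calendar_months_to_datetime(0.5, 2001))     # 2001-01-16 12:00:00
print(calendar_months_to_datetime(5.5, 2001))     # 2001-06-16 00:00:00
print(calendar_months_to_days_since(5.5, 2001))   # 166.0
```

The first two results reproduce the January and June examples given
above; the mid-February result differs between 2001 and a leap year such
as 2004.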
Common types of "monthly" data sets are:
1) reports of monthly statistics computed from multiple samples within
each month (e.g., means, standard deviation of daily values, maximum
daily mean; requires cell_bounds)
2) reports of monthly accumulations (e.g. monthly precipitation amount;
requires cell_bounds)
3) monthly climatologies (requiring climatology attribute pointing to
climatology bounds)
For all of the above the bounds coincide with the beginning and end of
each month so with the reference time restricted to the beginning of a
year, the bounds will all be integer values of "months since". The
coordinate value for any of the above can be assigned any value in the
interval defined by the bounds. For monthly statistics one might choose
the middle of each month so coordinate values would be numbers like 0.5,
1.5, 2.5, etc. For monthly accumulations one might prefer to use the
time at the end of each interval to represent the coordinate value
(e.g., 1.0, 2.0, 3.0, etc.).
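
As a rough sketch of how such an axis could be built, assuming the
reference time is January 1 of some year (the function name and its
"position" parameter are hypothetical, for illustration only):

```python
def monthly_axis(n_months, position="middle"):
    """Return (coords, bounds) in 'calendar_months since <year>-01-01'.

    Bounds are whole numbers because every cell starts and ends on a
    month boundary; the coordinate may sit anywhere inside its cell.
    """
    bounds = [(float(k), float(k + 1)) for k in range(n_months)]
    if position == "middle":        # e.g. monthly means
        coords = [lo + 0.5 for lo, _ in bounds]
    elif position == "end":         # e.g. monthly accumulations
        coords = [hi for _, hi in bounds]
    else:
        raise ValueError("position must be 'middle' or 'end'")
    return coords, bounds


coords, bounds = monthly_axis(3)
print(coords)   # [0.5, 1.5, 2.5]
print(bounds)   # [(0.0, 1.0), (1.0, 2.0), (2.0, 3.0)]
```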
I understand that defining a unit of calendar_months is not compatible
with udunits, but I think we can rely on tools other than udunits to
handle this new unit more generally, together with the various CF
calendars, and to convert coordinate values to other time units like
"days since ...".
Perhaps the biggest argument in favor of introducing a calendar_month
unit is that it should make it much easier for data providers to
generate correct time coordinates for data reported at monthly
intervals. Regardless of calendar, I think it is easy to generate
monthly time coordinates under the current CF standard that are simply
wrong. In contrast, everyone should be able to trivially create a
coordinate axis with values like (1, 2, 3, ...) or (0.5, 1.5, 2.5,
...) without making a mistake.
best regards,
Karl
On 10/18/18 10:58 AM, Jonathan Gregory wrote:
> Dear Martin
>
> The definition of a calendar_month unit would also need rules about calendar
> arithmetic e.g. What does 1 calendar_month since 31 January mean? What does
> 7.23 calendar_months since 31 January mean?
>
> Best wishes
>
> Jonathan
>
>
> ----- Forwarded message from Martin Juckes - UKRI STFC <martin.juckes at stfc.ac.uk> -----
>
>> Date: Thu, 18 Oct 2018 16:33:28 +0000
>> From: Martin Juckes - UKRI STFC <martin.juckes at stfc.ac.uk>
>> To: Jonathan Gregory <j.m.gregory at reading.ac.uk>, "cf-metadata at cgd.ucar.edu"
>> <cf-metadata at cgd.ucar.edu>
>> Subject: Re: [CF-metadata] 'months since' and 'years since' time units
>>
>> Dear Jonathan,
>>
>>
>> I think you could go further and disallow the use of "month" or "year" as a time unit when the calendar is not standard.
>>
>>
>> If the "ncdump -t" option produces what the user expects when he has units "months since 1900-01-01" and a 360 day calendar, then it is going to be inconsistent with the current convention.
>>
>>
>> I still feel that there is an argument for enabling the storage of information in user months in the files. E.g. I wish to compare monthly mean data from 20 climate models which use a range of different calendars. The mean across the models is not on any specific calendar ... I could pretend it is and use units of "days since ...", but the mappings from input time coordinates to output time coordinates then become rather complex, when they should be trivial. Having a "date" standard name which allowed the input data to have a "calendar_month" coordinate would solve this (and I think Klaus's suggestion would also solve it),
>>
>>
>> regards,
>>
>> Martin
>>
>> ________________________________
>> From: CF-metadata <cf-metadata-bounces at cgd.ucar.edu> on behalf of Jonathan Gregory <j.m.gregory at reading.ac.uk>
>> Sent: 18 October 2018 17:10:46
>> To: cf-metadata at cgd.ucar.edu
>> Subject: Re: [CF-metadata] 'months since' and 'years since' time units
>>
>> Dear all
>>
>> This is an interesting discussion, and I agree that it's a tricky subject. If only
>> we could have a well-behaved Earth which orbited the sun in an integral and
>> easily factorisable number of days!
>>
>> So far I still think that we should not change the way we interpret the units
>> string. It's in udunits format, and should be interpreted according to the
>> calendar attribute. I would suggest that it's helpful to regard time coords as
>> *encoded* and not necessarily easy for humans to read. It's certainly nice to
>> see "time=1, 2, 3, ..." months since a reference date - that is easy to read -
>> but when you get to 747 or 4689 months since a reference date, you don't know
>> what it means any more (unless you're extremely good at mental arithmetic), and
>> you might as well encode it as days.
>>
>> The antidote to inconvenient encoding is convenient software. For example,
>> could cftime allow the user to construct a time coordinate variable with a
>> spacing of calendar months, but encode it in the netCDF file in days? Then it's
>> transparent. Similarly, time coordinate variables can be decoded into human-
>> readable strings by calendar-aware software. It seems to me that this isn't
>> different in principle from using ncdump to read a netCDF file, rather than
>> insisting it should be intelligible when read in hexadecimal. In fact, ncdump
>> itself has a -t option, which should help, according to the man page:
>>
>> "-t controls display of time data, if stored in a variable that uses a udunits
>> compliant time representation such as `days since 1970-01-01' or `seconds since
>> 2009-03-15 12:01:17' .... If this option is specified, time data values are
>> displayed as human-readable date-time strings rather than numerical values,
>> interpreted in terms of a `calendar' variable attribute, if specified. ...
>> Calendar attribute values interpreted with this option include the CF
>> Conventions values `gregorian' or `standard', `proleptic_gregorian', `noleap'
>> or `365_day', `all_leap' or `366_day', `360_day', and `julian'."
>>
>> I agree with comments that if we introduced new units such as calendar_month
>> or 30day_month, people might well not use them, and would still be disappointed
>> that "month" wasn't what they expected.
>>
>> The CF conformance document has a recommendation that "year" and "month" should
>> be used "with caution". I don't know what the CF checker currently does with this.
>> We could change it to a recommendation that they should *not* be used, in which
>> case the checker would give a warning if they were.
>>
>> Best wishes
>>
>> Jonathan
>>
>> ----- Forwarded message from Bärring Lars <Lars.Barring at smhi.se> -----
>>
>>> Date: Thu, 18 Oct 2018 13:31:10 +0000
>>> From: Bärring Lars <Lars.Barring at smhi.se>
>>> To: Martin Juckes - UKRI STFC <martin.juckes at stfc.ac.uk>, David Blodgett
>>> <dblodgett at usgs.gov>, Ryan Abernathey <ryan.abernathey at gmail.com>
>>> Cc: "cf-metadata at cgd.ucar.edu" <cf-metadata at cgd.ucar.edu>
>>> Subject: Re: [CF-metadata] 'months since' and 'years since' time units
>>>
>>> Dear Martin, David, all,
>>>
>>> As Klaus points out, the aim of my suggestion is to make software using CF aware of the fact that the unit "year" is different depending on which calendar the model is implementing. To give an example:
>>> If I want to know when the global average temperature has increased by 1.5C, or 4C, above pre-industrial time in the CMIP5 ensemble I will get answers as a timedelta in days. As this is not really helpful I might feel inclined to convert this to years, but now UDUNITS definition of year is not helpful for those models having a 360_day or 365_day calendar. However, with the calendar-aware definition of year such a calculation would be supported without having to deal with it manually.
>>>
>>> Now, on to the discussion about months. Before my previous post I quickly read through the extensive exchange on this list back in 2011, so I really appreciate David's comment that it is a complex subject. That is the reason why I suggested that a month is always a year / 12. So, here is an attempt to summarise the suggestion in a different way:
>>>
>>> * standard and proleptic_gregorian calendars (status quo):
>>> o the number of days in a month is not an integer
>>> o same issue with respect to ordinary (western) world months.
>>>
>>> * 365_day calendar:
>>> + the number of seconds in a month would change from being "ill-defined (?)" as 86400 * 365.242198781 / 12 = 2629743.83122, to more properly 86400 * 365 / 12 = 2628000
>>> o same issue with respect to ordinary (western) world months.
>>>
>>> * 360_day calendar:
>>> + the number of seconds in a month would change from being "very ill-defined (?)" as 86400 * 365.242198781 / 12 = 2629743.83122, to more properly 86400 * 360 / 12 = 2592000
>>> + the number of days in a month is an integer; 30 * 86400 = 2592000
>>> + the definition of a month is consistent with what is expected in the "360_day world"
>>> o same issue with respect to ordinary (western) world months.
>>>
>>> That is, even though the suggestion certainly does not solve everything (of course!), the only argument against it that I can see is the work to tease out the details and implement it in software packages. As was extensively discussed in the 2011 threads, the real problem is the varying length of the western world calendar months. But that is the topic for another thread.
>>>
>>>
>>> Kind regards,
>>> Lars
>>>
>>> ________________________________
>>> From: CF-metadata [cf-metadata-bounces at cgd.ucar.edu] on behalf of David Blodgett [dblodgett at usgs.gov]
>>> Sent: 18 October 2018 13:58
>>> To: Ryan Abernathey
>>> Cc: cf-metadata at cgd.ucar.edu
>>> Subject: Re: [CF-metadata] 'months since' and 'years since' time units
>>>
>>> Dear Ryan, All,
>>>
>>> I hesitate to chime in on this thread as I know just how fraught this topic can be, but then I think I know how fraught it can be so may have something to offer. My experience is at the intersection of climatological models and landscape models that are calibrated with "real" data. I've worked with daily and monthly time series model output and interpolated weather products that need to match up to observations but use a noleap or 360 calendar. It's an enormous pain and we as a community should do better. -- so the business case for taking this complexity head on is there!
>>>
>>> One resource I've found useful over the years is the [CDM implementation](https://www.unidata.ucar.edu/software/thredds/current/netcdf-java/CDM/CalendarDateTime.html)
>>>
>>> There are two factors at play.
>>>
>>> 1) Adding "calendar" to a udunits string avoids conversion to a number of shorter time increments for long time increments (e.g. seconds per month). It keeps things in the declared time units so you hit the precise date boundaries you would expect.
>>> 2) The "calendar" attribute gets you to how to interpret the datum of the time axis.
>>>
>>> Especially relevant to this thread is:
>>>
>>> * uniform30day or 360_day = All years are 360 days divided into 30 day months.
>>>
>>> With these two, I think the problems here are solved. However, inevitably, people will omit the additional attribute for calendar or fall back on normal "months since ..." when they actually mean "calendar months since ..." and tell us 'why would you interpret my data that way it makes no sense?!?' This is perfectly parallel to spatial coordinates where people don't declare a datum for their latitude/longitude coordinates. Without that information one cannot use the information with a level of precision that some use cases require.
>>>
>>> What I'm getting at is that CF should probably:
>>> 1) adopt enough attribute precision to fully describe what we are trying to convey
>>> 2) make said attributes required or declare sensible defaults that reduce ambiguity when users come knocking.
>>>
>>> That said, I've had no success pushing the community to accept that there should be a default lat/lon datum for software developers to go on and I would not doubt that the same will be true here as ambiguity and uncertainty is better than dead wrong in many cases. My stance is that we should all be dead wrong for the same reason rather than each implementor making an arbitrary decision so we all get different answers (more ambiguity) from our software du-jour.
>>>
>>> All the best,
>>>
>>> Dave
>>>
>>>
>>> On Oct 18, 2018, at 6:08 AM, Martin Juckes - UKRI STFC <martin.juckes at stfc.ac.uk> wrote:
>>>
>>> Hello All,
>>>
>>>
>>> I think the UNIDATA pull request referenced by Jeff (https://github.com/Unidata/cftime/pull/69) is mis-quoting the CF Convention. As far as I can see, Unidata says that a month is exactly one 12th of a year, and CF inherits this -- with the statement "For similar reasons the unit month, which is defined in udunits.dat<http://www.unidata.ucar.edu/software/udunits/> to be exactly year/12, should also be used with caution."
>>>
>>>
>>> I can't see the difference between Lars's suggestion and the status quo. In UNIDATA a day is clearly defined as "period of time equal to 24 hours", which gives 86400 seconds.
>>>
>>> regards,
>>> Martin
>>>
>>>
>>>
>>> ________________________________
>>> From: CF-metadata <cf-metadata-bounces at cgd.ucar.edu> on behalf of Bärring Lars <Lars.Barring at smhi.se>
>>> Sent: 18 October 2018 09:29:50
>>> To: Ryan Abernathey; whitaker.jeffrey at gmail.com; cf-metadata at cgd.ucar.edu
>>> Subject: Re: [CF-metadata] 'months since' and 'years since' time units
>>>
>>> Hi,
>>>
>>> I have come to think about this from a somewhat different perspective. For some analyses, as well as when calculating certain derived climatological statistics (aka climate indices), using datasets based on different calendars the problem becomes obvious.
>>>
>>> In the model world of a 365-day GCM one year _is_ 365 days, and in a 360_day GCM a year _is_ 360 days. In the case of a gregorian/standard calendar GCM I am not sure whether it is 365.25 or 365.242198781 but this is in practice less of a problem.
>>>
>>> For datasets based on non-standard calendars, imposing the current UDUNITS definition of a year leads to complications that require workarounds if one is interested in, for example, the time elapsed until something happens or the duration of some (long-lasting) events. One way to partly mitigate these issues would be to use the time unit of years_since_START or months_since_START, but this is warned against in the CF Conventions and many software tools do not support it.
>>>
>>> The fundamental issue is the inconsistency between the GCM year and the UDUNITS year. So I would like to call on the wisdom of this list to see whether the CF Convention could include a modification to the definition of a year and month:
>>>
>>> * standard calendar (no change)
>>> 1 day = 86400 seconds
>>> 1 year = 365.242198781 days
>>> 1 month = 365.242198781 / 12 days
>>>
>>> * 365_day calendar
>>> 1 day = 86400 seconds
>>> 1 year = 365 days
>>> 1 month = 365 / 12 days
>>>
>>> * 360_day calendar
>>> 1 day = 86400 seconds
>>> 1 year = 360 days
>>> 1 month = 360 / 12 days
>>>
>>> That is, the seconds per day ratio is not changed thus maintaining the consistency to other SI units. And, for the 360_day calendar month follows the suggestion by Ryan and Jeffrey.
>>>
>>>
>>> Kind regards,
>>> Lars
>>>
>>> --
>>> Lars Bärring
>>>
>>> FDr, Forskare
>>> PhD, Research Scientist
>>>
>>> SMHI / Swedish Meteorological and Hydrological Institute
>>> Rossby Centre
>>> SE - 601 76 NORRKÖPING
>>> Tel / Phone: +46 (0)11 495 8604
>>> Fax: +46 (0)11 495 8001
>>> Besöksadress / Visiting address: Folkborgsvägen 17
>>> ________________________________
>>> From: CF-metadata [cf-metadata-bounces at cgd.ucar.edu] on behalf of Ryan Abernathey [ryan.abernathey at gmail.com]
>>> Sent: 17 October 2018 21:22
>>> To: whitaker.jeffrey at gmail.com
>>> Cc: cf-metadata at cgd.ucar.edu
>>> Subject: Re: [CF-metadata] 'months since' and 'years since' time units
>>>
>>> Hi everyone,
>>>
>>> I am that user, and I'm new to this mailing list. Thank you all for your work on CF conventions. It's such a valuable effort!
>>>
>>> I want to note that this was inspired by the proliferation of datasets in the wild that use "month" as their units. For example, nearly all of the IRI Data Library does this, in conjunction with a "360_day" calendar (example: https://iridl.ldeo.columbia.edu/SOURCES/.NOAA/.NCEP-NCAR/.CDAS-1/.MONTHLY/.Diagnostic/.surface/.temp/).
>>>
>>> My impression from talking to data providers is that no one is using "360_day" calendar and "months" as units, and then expecting "months" to be interpreted as 365.242198781/12 days. They all expect it to be interpreted as 30 days. While there are various workarounds that can be used at different levels of the software stack, the best solution, IMHO, is to explicitly allow in CF conventions what Jeff proposed: "months and years be interpreted as calendar months and years for those calendars where they have a fixed length". I don't think this will break existing applications.
>>>
>>> Thanks,
>>> Ryan
>>>
>>> On Wed, Oct 17, 2018 at 3:06 PM Jeffrey Whitaker <whitaker.jeffrey at gmail.com> wrote:
>>> Hi: I'm a developer of the 'cftime' python package (https://github.com/Unidata/cftime). A user submitted a pull request (https://github.com/Unidata/cftime/pull/69) that implements support for a 30-day calendar month time unit for the '360_day' CF calendar. Although using a 'month' time unit is a tricky proposition in general, for this calendar it seems straightforward since every month has the same length. However, in the discussion of the pull request it was pointed out that CF expects that "the value of the units attribute is a string that can be recognized by UNIDATA's Udunits package", and that UDUNITS defines a month as 365.242198781/12 days. My question is this - is it reasonable for our python package to make an exception to this rule for the 360_day calendar? More generally, can months and years be interpreted as calendar months and years for those calendars where they have a fixed length, or will this deviate from the existing CF conventions and break existing applications?
>>>
>>> Regards, Jeff
>>>
>>> --
>>> Jeffrey S. Whitaker
>>> NOAA/OAR/PSD R/PSD1
>>> 325 Broadway, Boulder, CO, 80305-3328
>>> Phone: (303)497-6313
>>> FAX: (303)497-6449
>>>
>>> _______________________________________________
>>> CF-metadata mailing list
>>> CF-metadata at cgd.ucar.edu
>>> http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
>>
>> ----- End forwarded message -----
> ----- End forwarded message -----
Received on Thu Oct 18 2018 - 22:14:03 BST