Opened 8 years ago

Last modified 8 years ago

#96 new enhancement

Julian/Gregorian calendar name and constraints

Reported by: Dave.Allured
Owned by: cf-conventions@…
Priority: medium
Milestone:
Component: cf-conventions
Version:
Keywords: calendar time julian gregorian udunits
Cc:

Description

1. Title

Julian/Gregorian calendar name and constraints

2. Moderator

*

3. Requirement

An accurate name for the mixed Julian/Gregorian calendar system is needed as an alternative to the current naming scheme, which hides the complexity and pitfalls of this calendar. Time range constraints enable error checking.

4. Initial Statement of Technical Proposal

A new calendar name "Julian/Gregorian" is added to the existing CF synonyms for the mixed Julian/Gregorian calendar as defined by UDUNITS.

However, this is proposed as an independent calendar definition, because it is also desirable to add time range constraints which are not part of the original definitions. Negative years, year zero, and illegal transition dates are all excluded. I am excluding Julian negative years because there is no unambiguous definition in the broader context of general history.

This new definition is fully compatible with existing translation routines in UDUNITS and other calendar software, provided that any name checking procedures are updated. Constraint checking is optional.

This proposal does not change existing usage or create incompatibility with existing data files.

5. Benefits

  • Provide an accurate and distinct alternate name for this calendar system, for those who wish to be more specific than the default or "standard".
  • Facilitate migration away from inadequate calendar names, for both existing and new data sets.
  • Increase user awareness of complicated calendar usage and computational issues.
  • Thereby reduce undetected computational errors resulting from misunderstanding or misapplication of the mixed Julian/Gregorian calendar.
  • Preserve the mixed Julian/Gregorian calendar definition within CF, for legitimate purposes.
  • Constraint specifications support optional range checking to catch accidental misapplication and computational errors in time coordinates.

6. Status Quo

If no alternative name is provided, the ongoing hidden usage of a complicated calendar will continue to engender misunderstanding, undetected errors, and controversy.

7. Text of Proposed Changes

Julian/Gregorian

Mixed Julian/Gregorian calendar, with constraints. All usage of negative years, year zero, and transitional dates 1582 October 5 through 1582 October 14 inclusive, is prohibited.
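As a rough illustration of the constraints above, the optional range check could look like the following Python sketch. The function name `constraint_violation` is hypothetical and not part of CF or UDUNITS, and dates are assumed to arrive as plain year, month, day integers.

```python
def constraint_violation(year, month, day):
    """Return a short reason if the date is prohibited, otherwise None.

    Hypothetical helper: it only encodes the three exclusions named in the
    proposed text (negative years, year zero, and the transitional dates).
    """
    if year <= 0:
        return "year zero and negative years are prohibited"
    if (year, month) == (1582, 10) and 5 <= day <= 14:
        return "1582 October 5 through 14 does not exist in this calendar"
    return None


# The last Julian date and the first Gregorian date are both accepted.
print(constraint_violation(1582, 10, 4))   # None
print(constraint_violation(1582, 10, 15))  # None
# A date inside the gap is rejected.
print(constraint_violation(1582, 10, 10))
```

A check like this validates only calendar dates that have already been decoded; it says nothing about how time coordinate values are converted to dates.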

  • For consistency, change the term "Gregorian/Julian" to "Julian/Gregorian" in three places in section 4.4.1. Do not change the formal excerpt from the UDUNITS man page.
  • Add a new paragraph in 4.4.1, just above the third from last paragraph which now begins "The mixed Julian/Gregorian calendar used by Udunits is explained ...":

The calendar name "Julian/Gregorian" should be used only (a) in genuine historical data sets having a proper need to span the Julian/Gregorian discontinuity; and (b) to facilitate backward compatibility in existing data sets.

8. References

There have been several discussions and proposals over the years about problems with the mixed Julian/Gregorian calendar. Shown here is the most recent thread. Please consult the CF archives and related mailing lists for more.

  • Gregorian calendar, Wikipedia article with history of the transition between Julian and Gregorian calendars

Change History (7)

comment:1 follow-up: Changed 8 years ago by mcginnis

I recommend changing the proposed name to "mixed Julian/Gregorian" instead of "Julian/Gregorian".

Without some indicator like the word 'mixed' in there, I think that users unfamiliar with the complications of the historical calendaring system will be prone to reading the slash as meaning 'or', rather than 'and', which will mislead them into thinking it means something like "Julian or Gregorian, whatever, they're equivalent in this context."

(If avoiding spaces in the name is desirable, we could abbreviate it to "mixed". That would also avoid non-alphanumeric / in the string...)

comment:2 in reply to: ↑ 1 Changed 8 years ago by Dave.Allured

Replying to mcginnis:

I recommend changing the proposed name to "mixed Julian/Gregorian" instead of "Julian/Gregorian".

Agreed. I was hoping that the slightly shorter version would be sufficient, but you have added a depth of consideration for unfamiliar users. Thanks for this perspective. There is no need to avoid the space in the name, and "mixed" is too obscure for me.

--Dave

comment:3 follow-up: Changed 8 years ago by jonathan

Dear Dave

Thanks for opening this ticket.

For the name you propose for the calendar, I would suggest mixed_julian_gregorian, to conform with the general CF practice that the possible values of attributes defined by the convention use only letters and underscore.
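A minimal sketch of the naming practice described above, assuming it can be expressed as "letters and underscores only". The helper name is hypothetical. Note that some existing CF calendar names, such as 360_day, also contain digits, so a real check would probably need to allow them.

```python
import re

# Assumed pattern for the practice as stated in the comment:
# one or more ASCII letters or underscores, nothing else.
_NAME_PATTERN = re.compile(r"[A-Za-z_]+")


def follows_naming_practice(name):
    """Return True if the name uses only letters and underscores."""
    return _NAME_PATTERN.fullmatch(name) is not None


print(follows_naming_practice("mixed_julian_gregorian"))  # True
print(follows_naming_practice("mixed Julian/Gregorian"))  # False
```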

To be clear, it appears you are proposing two things: (1) an alternative name for the default calendar, in addition to its names of gregorian and standard, (2) a redefinition of that calendar, to disallow years less than one and dates not supported by udunits in the transitional period between Julian and Gregorian. Is that right?

I think (1) is fine, but we should be aware that (2) would make it impossible to specify dates BC, even though it is currently possible in CF. But I agree that really we need a clear convention for doing this if it is to be allowed, and since this is not strictly a backward incompatibility because it doesn't affect existing data, I think the benefit outweighs the drawbacks.

If dates BC are needed for the real world, the conventions that might be considered are proleptic_gregorian (already in CF), proleptic_julian (which I read in wikipedia is defined by propagating backward from AD 4, which was a leap year, and has no year zero) and astronomical (which is the same as proleptic_julian but uses year 0 for 1 BC, -1 for 2 BC, etc.). I guess NASA's solar eclipse website, which gives dates of eclipses back to 2000 BC, uses the proleptic Julian calendar.
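The difference between the historical and astronomical year numbering mentioned above can be sketched as follows. Both function names are hypothetical; the only rule encoded is the one stated in the paragraph, that 1 BC corresponds to year 0, 2 BC to year -1, and so on.

```python
def bc_to_astronomical(bc_year):
    """Convert a year BC (1, 2, 3, ...) to astronomical year numbering."""
    if bc_year < 1:
        raise ValueError("years BC are counted from 1; there is no 0 BC")
    return 1 - bc_year


def astronomical_to_label(year):
    """Convert an astronomical year number to an AD/BC label."""
    if year >= 1:
        return f"AD {year}"
    return f"{1 - year} BC"


print(bc_to_astronomical(1))      # 0
print(bc_to_astronomical(2))      # -1
print(astronomical_to_label(0))   # 1 BC
print(astronomical_to_label(-1))  # 2 BC
```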

Another problem is that year 0 is allowed in CF, following COARDS, to indicate climatological time, although CF has a better convention for this (section 7.4). Your change would mean we would have to remove the COARDS convention from CF. Is that acceptable to everyone?

In the email discussion, I made a proposal which goes further than yours, namely: (1) define a new calendar strict_gregorian, which does not permit dates or reference dates (in the time units string) before 1582 Oct 15; (2) change the default calendar to strict_gregorian. This would mean that dates which precede the introduction of the Gregorian calendar would require an explicit calendar attribute, to make it clear what convention was being used. In particular, it would disallow dates "since 1-1-1" with the default calendar; these are quite common and particularly problematic.
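One way the ambiguity of a reference date such as "since 1-1-1" can show up is sketched below. The code uses a standard integer formula for Julian Day Numbers. The function names are hypothetical, and the mixed calendar is assumed to switch from Julian to Gregorian rules at 1582 October 15, as described in this ticket.

```python
def _jdn(year, month, day, gregorian):
    """Julian Day Number of a calendar date under Julian or Gregorian rules."""
    a = (14 - month) // 12
    y = year + 4800 - a
    m = month + 12 * a - 3
    jdn = day + (153 * m + 2) // 5 + 365 * y + y // 4
    if gregorian:
        return jdn - y // 100 + y // 400 - 32045
    return jdn - 32083


def mixed_jdn(year, month, day):
    """Day number in the mixed calendar: Julian before the switch, Gregorian after."""
    if year < 1:
        raise ValueError("years before year 1 are not handled in this sketch")
    if (year, month) == (1582, 10) and 5 <= day <= 14:
        raise ValueError("date falls in the transitional gap")
    return _jdn(year, month, day, gregorian=(year, month, day) >= (1582, 10, 15))


def proleptic_gregorian_jdn(year, month, day):
    """Day number with Gregorian rules extended backward indefinitely."""
    return _jdn(year, month, day, gregorian=True)


# 1582 October 4 is followed directly by 1582 October 15.
print(mixed_jdn(1582, 10, 15) - mixed_jdn(1582, 10, 4))        # 1
# The same reference date lands on different days in the two calendars.
print(proleptic_gregorian_jdn(1, 1, 1) - mixed_jdn(1, 1, 1))   # 2
```

Under these assumptions, software that reads "since 1-1-1" with proleptic Gregorian rules would be two days off from software that uses the mixed calendar, even when every time coordinate is a modern date.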

I could put this proposal in a separate ticket, but it's closely related and complementary to yours, so I wonder what you think about it.

Best wishes

Jonathan

comment:4 in reply to: ↑ 3 Changed 8 years ago by Dave.Allured

Replying to jonathan:

Hi Jonathan. Please let me reply to your name suggestion later, in a separate thread.

To be clear, it appears you are proposing two things: (1) an alternative name for the default calendar, in addition to its names of gregorian and standard, (2) a redefinition of that calendar, to disallow years less than one and dates not supported by udunits in the transitional period between Julian and Gregorian. Is that right?

Almost right. You say "an alternative name for the default calendar", when I actually mean a new definition similar to the current default, but distinct. This current ticket #96 does not propose to change or redefine the default or any existing named calendar. It simply establishes a new calendar system with more carefully designed rules, one that could be used to replace or retrofit some of the other calendars, in many use cases. In this ticket, I would like to focus simply on the careful naming and definition of this possible new calendar.

I think (1) is fine, but we should be aware that (2) would make it impossible to specify dates BC, even though it is currently possible in CF. But I agree that really we need a clear convention for doing this if it is to be allowed, and since this is not strictly a backward incompatibility because it doesn't affect existing data, I think the benefit outweighs the drawbacks.

Yes, deliberately exclude dates BC from this new definition. There seems to be no current CF-related usage of dates BC. Recent discussions suggest that the mixed Julian/Gregorian calendar is quite a poor choice for scientific encoding of dates BC, and there are better alternatives.

If dates BC are needed for the real world, the conventions that might be considered are proleptic_gregorian (already in CF), proleptic_julian (which I read in wikipedia is defined by propagating backward from AD 4, which was a leap year, and has no year zero) and astronomical (which is the same as proleptic_julian but uses year 0 for 1 BC, -1 for 2 BC, etc.). I guess NASA's solar eclipse website, which gives dates of eclipses back to 2000 BC, uses the proleptic Julian calendar.

Solutions for dates BC are not part of the current proposal. I think it would be best for CF to defer this issue until an actual use case arises, hopefully with some related expertise in the right kind of chronology.

Another problem is that year 0 is allowed in CF, following COARDS, to indicate climatological time, although CF has a better convention for this (section 7.4). Your change would mean we would have to remove the COARDS convention from CF. Is that acceptable to everyone?

I had forgotten that year 0 is an explicit part of COARDS, so thank you for pointing this out. The current proposal does not attempt to incorporate all time functions of COARDS, nor does it hold a solution for year 0. On the other hand, it also does not propose removal of COARDS or the COARDS calendar from CF. So, can you agree to save the COARDS discussion for an appropriate proposal?

In the email discussion, I made a proposal which goes further than yours, namely: (1) define a new calendar strict_gregorian, which does not permit dates or reference dates (in the time units string) before 1582 Oct 15; (2) change the default calendar to strict_gregorian. This would mean that dates which precede the introduction of the Gregorian calendar would require an explicit calendar attribute, to make it clear what convention was being used. In particular, it would disallow dates "since 1-1-1" with the default calendar; these are quite common and particularly problematic.

I could put this proposal in a separate ticket, but it's closely related and complementary to yours, so I wonder what you think about it.

I think the calendar discussion needs to be carefully broken up into focused topics for easier digestion and community participation. This ticket is one part of that, an attempt to focus *only* on a new calendar definition that may prove to be useful.

I think that the future specification of the Gregorian calendar is critical, and deserves its own focused discussion. Changes to the default (unspecified) calendar may deserve a separate focused discussion. Please start a separate ticket if you would like.

However, please recall (my December 6 e-mail) that I want to retract the current loose usage of the calendar name "gregorian", so that the name is reserved only for the actual Gregorian calendar. So I hope I can convince you to move in that direction, as well.

Thank you for your extremely thoughtful comments, here and in previous calendar discussions.

--Dave

comment:5 follow-up: Changed 8 years ago by jonathan

Dear Dave

Thanks for the clarification. At the start of the ticket you describe the new calendar name as a synonym and an alternative name. I understand now that the proposal is actually for a new calendar, not a synonym for an existing calendar.

It would be fine to have a more informative and correct name for the default calendar. However, I wonder what would be achieved by introducing a new, more restrictive, version of that calendar. We would like to disallow use of years before year 1 (except for the COARDS climatological time convention, which we could deprecate), and we would like to recommend against using a reference date which is before (or within) the transitional period if the time coordinates are all after the transitional period, because both of these are problematic with the udunits calendar. But a person who is aware that these give problems would not use them with the default calendar, I guess, and therefore does not need a new calendar to prevent/deprecate their use. On the other hand, a person who is unaware of the problems will also be unaware of the new restrictive calendar, and so will not use it. Hence I am not convinced this would really help in the end. Have I missed the point?

It seems to me that to prevent the problems we actually need to change the default calendar, as previously discussed. I think it would be tolerable to do that so long as the new default has the same interpretation as the old default for any time units and coordinates that remain legal i.e. the effect of the changed definition is only to make some things illegal or deprecated.

Best wishes

Jonathan

comment:6 in reply to: ↑ 5 Changed 8 years ago by Dave.Allured

Replying to jonathan:

Yes, this is technically a proposal for a "new calendar". I should have said this more clearly in the introduction, sorry.

My general rationale for the new calendar name is two-fold. The first purpose is simply to improve the quality of self-documentation of data sets. CF should have a simple way to declare explicitly and deliberately that the mixed Julian/Gregorian calendar is being used. Currently there is no standardized way to do this.

The second purpose is to provide a transitional alternative, a mechanism to make it easier to deprecate previous calendar names and usage, where there can be a consensus that previous usage is bad practice or bad design. This new calendar name could be easily substituted for previous "inadequate" calendar names in many existing data sets, as needed, with no recomputation of time coordinates or other adjustments. With a reasonable alternative and the ability to retrofit existing data sets, some people may be able to accept more significant changes, such as removing the default calendar.

I tried to explain this rationale in other words above, under "Requirement" and "Benefits".

--Dave

comment:7 Changed 8 years ago by jonathan

Dear Dave

You did indeed explain your rationale before. I agree with your concerns over the default calendar and I am sympathetic to the motivation. My concern is whether introducing a new calendar will actually bring about an improvement in practice if there is no incentive to adopt it. Unless we are convinced that it will help, this would be an unnecessary complication in the convention.

One extra step we could take would be to recommend against relying on the default. The consequence would be that the CF checker would give a warning whenever the calendar attribute was not present for a time coordinate (from the next CF release, so it would not warn about datasets written with CF 1.6 or earlier). This warning might encourage the authors of new datasets to include the attribute.

A further step, more serious, would be to deprecate the existing default calendar, even if explicitly called standard or gregorian. Again, deprecation (i.e. recommendation not to use it) would produce a warning. The warning could at the same time suggest using the new calendar instead for the real world.
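The two warnings suggested above could be sketched roughly as follows. This is not the real CF checker: the function names, the message wording, and the representation of a time variable as a plain dict of attributes are all assumptions made for illustration. The sketch also assumes that both warnings apply only to files declaring a convention later than CF 1.6.

```python
# Calendar names that the comment suggests deprecating (assumption: exact match).
DEPRECATED_CALENDARS = {"standard", "gregorian"}
# Datasets written with this version or earlier are not warned about.
LAST_UNWARNED_VERSION = (1, 6)


def parse_cf_version(conventions):
    """Turn a string such as 'CF-1.7' into a tuple such as (1, 7)."""
    number = conventions.strip().split("-", 1)[1]
    return tuple(int(part) for part in number.split("."))


def calendar_warnings(attrs, conventions):
    """Return a list of warning strings for one time coordinate variable."""
    if parse_cf_version(conventions) <= LAST_UNWARNED_VERSION:
        return []
    calendar = attrs.get("calendar")
    if calendar is None:
        return ["no calendar attribute: relying on the default is not recommended"]
    if calendar in DEPRECATED_CALENDARS:
        return [f"calendar '{calendar}' is deprecated: consider the new calendar"]
    return []


time_attrs = {"units": "days since 2000-01-01"}
print(calendar_warnings(time_attrs, "CF-1.6"))  # []
print(calendar_warnings(time_attrs, "CF-1.7"))
```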

What do you and others think of that? Would it be beneficial? Would it be too annoying? If we agree to do this, it would be done by inserting text in the definition of this existing calendar.

Since udunits refers to the current default as "mixed Gregorian/Julian", it would be consistent to call your new calendar mixed_gregorian_julian. That would also involve less change to the convention than switching all the occurrences in the text to "Julian/Gregorian". I would further suggest that the name should say more explicitly what the calendar is, to distinguish it from gregorian. To do that, perhaps the new calendar could instead be called strict_gregorian_julian.

One aspect of the strictness is to exclude negative years and year zero. Another that you suggest is that transitional dates be excluded, but since CF is defined using udunits for real-world time coordinates, transitional dates are in any case impossible to encode, so they do not need to be excluded. However, we should explicitly exclude 'reference dates' (i.e. 'since' in the units) which are transitional dates. How about this:

strict_gregorian_julian: Mixed Gregorian/Julian calendar, with constraints. The reference date (following since in the time units) is not allowed to be any of the dates in the transitional period 1582 October 5 through 1582 October 14 inclusive. Neither the reference date nor any date which is encoded with this calendar is allowed to be in year zero or a negative year.
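A minimal sketch of how the reference date rule in this definition might be checked. The function name and the simplified parsing of the units string are assumptions: only the "unit since Y-M-D" prefix is examined, and any time of day or time zone that follows is ignored. Checking the encoded dates themselves would additionally require decoding the time coordinates, which is not shown.

```python
import re

# Simplified pattern: a unit word, "since", then year-month-day.
_UNITS_PATTERN = re.compile(r"\s*\w+\s+since\s+(-?\d+)-(\d{1,2})-(\d{1,2})")


def reference_date_problems(units):
    """List the ways the reference date in a units string breaks the rules."""
    match = _UNITS_PATTERN.match(units)
    if match is None:
        raise ValueError(f"cannot find a reference date in {units!r}")
    year, month, day = (int(group) for group in match.groups())
    problems = []
    if year <= 0:
        problems.append("reference date is in year zero or a negative year")
    if (year, month) == (1582, 10) and 5 <= day <= 14:
        problems.append("reference date is in the transitional period")
    return problems


print(reference_date_problems("days since 1970-01-01 00:00:00"))  # []
print(reference_date_problems("days since 1582-10-10"))
print(reference_date_problems("days since 0000-01-01"))
```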

Finally, you have suggested some useful text describing when the calendar should be used. There is also some existing text about this, however, at the end of 4.4.1. Combining your text and the existing text, I'd propose:

Because of problems caused by the discontinuity, it is recommended that the strict_gregorian_julian calendar be used only in datasets with real-world historical dates which span the change of calendar from Julian to Gregorian. In datasets with real-world historical dates that all precede the change of calendar, the julian calendar should be used. In datasets with real-world historical dates that all follow the change of calendar, and in simulated datasets in which there is no change of calendar, the proleptic_gregorian calendar should be used.
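The recommendation above amounts to a small decision rule, sketched here under the assumption that the first and last dates of a dataset are known as (year, month, day) tuples. The function name is hypothetical, and strict_gregorian_julian is the name proposed in this comment, not an existing CF calendar.

```python
LAST_JULIAN_DATE = (1582, 10, 4)
FIRST_GREGORIAN_DATE = (1582, 10, 15)


def recommended_calendar(first_date, last_date, simulated=False):
    """Pick a calendar name following the proposed recommendation text."""
    if simulated:
        # Simulated datasets have no change of calendar.
        return "proleptic_gregorian"
    if last_date <= LAST_JULIAN_DATE:
        # All dates precede the change of calendar.
        return "julian"
    if first_date >= FIRST_GREGORIAN_DATE:
        # All dates follow the change of calendar.
        return "proleptic_gregorian"
    # The dataset spans the discontinuity.
    return "strict_gregorian_julian"


print(recommended_calendar((1400, 1, 1), (1500, 12, 31)))  # julian
print(recommended_calendar((1850, 1, 1), (2005, 12, 31)))  # proleptic_gregorian
print(recommended_calendar((1500, 1, 1), (1700, 12, 31)))  # strict_gregorian_julian
```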

How would that be?

Cheers

Jonathan
