
[CF-metadata] Cell bounds associated with coordinate variable rather than data variable

From: John Caron <caron>
Date: Fri, 13 Nov 2009 07:11:10 -0700

Hi Steve:

Sorry, I see that my thinking isn't very clear. Thanks for reformulating the question as "the case where the boundaries of cell values is clearly understood, but it is unclear what coordinate value best to use for the grid point."

I think Seth's argument to keep the time coordinates the same for model output is correct. So my statement that "time coordinates want to use the end-point" is misleading; the KISS principle is that variables on the same time grid should have the same time coordinate. In model output, the time grid is "forecast time", not "midpoint of calculation period". The CDM's Forecast Model Run Collection (FMRC) processing, for example, depends on this. This may be much ado about nothing - we would probably agree on what to do about any specific example.

(It was probably a mistake to bring in other data types like point obs - they just use the time of measurement, end of story.)

"If one has a mixture of model variables to output and the interpretations of their time coordinates needs to be different, then placing two different time axes into the file is the only way to eliminate the confusion."

I think another possibility is to put the bounds information on the "continuous integral" data variables, rather than on the time coordinate. Otherwise we have this proliferation of time coordinates, which confuses things.
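
A minimal sketch of that idea, in Python rather than netCDF: both variables share one forecast-time coordinate, and only the accumulated variable carries its own interval information. The class, the field name `time_intervals`, and all values are hypothetical, chosen for illustration; none of them are CF or CDM names.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class DataVariable:
    """A data variable on the shared time grid (hypothetical structure)."""
    name: str
    values: List[float]
    # Optional per-variable intervals, one (start, end) pair per time step.
    # None means the values are treated as instantaneous at the coordinate.
    time_intervals: Optional[List[Tuple[float, float]]] = None


# One shared forecast-time coordinate, in hours since the model run
# (made-up values).
forecast_time = [6.0, 12.0, 18.0]

# Instantaneous variable: no interval information is needed.
temperature = DataVariable("temperature", [280.1, 281.4, 279.9])

# Accumulated variable: the intervals live on the data variable,
# not on the time coordinate.
precip = DataVariable(
    "accumulated_precip",
    [1.2, 3.5, 4.0],
    time_intervals=[(0.0, 6.0), (0.0, 12.0), (0.0, 18.0)],
)


def interval_for(var: DataVariable, step: int) -> Tuple[float, float]:
    """Return the time interval covered by var's value at the given step."""
    if var.time_intervals is None:
        t = forecast_time[step]
        return (t, t)
    return var.time_intervals[step]
```

With this layout, `interval_for(temperature, 1)` and `interval_for(precip, 1)` refer to the same forecast time (12 hours) but report different intervals, and no second time axis is needed.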

Steve Hankin wrote:
>
>
> John Caron wrote:
>> 1. The CDM library uses the bounds if they are present. If only the
>> coordinate values are present, the CDM generates bounds. These grid
>> bounds are used by ncWMS and other visualization software to draw
>> color filled images. The IDV (I think) uses a contouring algorithm
>> with just the coordinate values.
>>
>> 2. Spatial coordinates probably want to use midpoint values.
>>
>> 3. I think there's a good argument that time coordinates want to use
>> the end-point. Seth makes the argument for numerical models. In this
>> case, all the output variables should have the same time coordinate.
>> I'm trying to think of a case where that's not true (point observations,
>> radar data etc), and I'm not thinking of any.
> Hi John,
>
> I'm not understanding the logic that suggests using midpoints for
> spatial coordinates, but endpoints for times. Whenever an application
> sees a particular reason to place the grid point at something other than
> the midpoint (on whatever axis) of course it should do so. That may
> lead to placing the grid point at the start, middle or end of the
> interval. But the question that is before us is to say what the default
> should be for the case where the boundaries of cell values is clearly
> understood, but it is unclear what coordinate value best to use for the
> grid point.
>
> All other things being equal using a consistent strategy for space and
> time is the simpler, "KISS", approach. Both instantaneous and
> time-interval-averaged values are most naturally encoded using midpoint
> representation (disagreeing with both Seth's conclusion and your
> speculation. Are we using terms differently?). The compelling case for
> a time endpoint may be continuous integrals (e.g. accumulated
> precipitation). If one has a mixture of model variables to output and
> the interpretations of their time coordinates needs to be different,
> then placing two different time axes into the file is the only way to
> eliminate the confusion. Arbitrarily shifting the grid point locations
> by 1/2 time cell will not eliminate confusion, will it? It seems like
> it would merely hide the confusion and increase the chances of
> misinterpretation.
>
> (To be frank, although I have seen many CF datasets using both midpoint
> and start-point times, I have never encountered one previously that uses
> the end point of the time interval. It seems possible that as a
> practical matter this choice may introduce confusion rather than reduce
> it.)
>
> - Steve
>
> =====================
>
> Seth's argument about confusion remains the same if one
>>
>> 4. Perhaps "interval of accumulation" is different enough that one
>> should just encode it in a separate attribute or auxiliary coordinate
>> on the data variable. Numerical models can have different variables
>> with different intervals, possibly overlapping. This is perhaps not
>> really the same as the bounds on the coordinate, they just share the
>> same codomain (time). An advantage of this approach is that you don't
>> have to create new coordinate variables for each data variable, which
>> seems like more trouble than it's worth.
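
Item 1 in the quoted message mentions generating bounds when only coordinate values are present. The sketch below shows one plausible way to do that, assuming the coordinate values are cell midpoints: interior edges fall halfway between neighbouring values, and the two outer edges are extrapolated by half a cell. The function name is hypothetical, and this is not necessarily the algorithm the CDM uses.

```python
from typing import List, Sequence, Tuple


def bounds_from_coords(coords: Sequence[float]) -> List[Tuple[float, float]]:
    """Derive (lower, upper) cell bounds from 1-D coordinate values.

    Assumes the values are cell midpoints and are strictly monotonic.
    """
    if len(coords) < 2:
        raise ValueError("need at least two coordinate values")

    # First edge: extrapolate half a cell below the first value.
    edges = [coords[0] - (coords[1] - coords[0]) / 2]

    # Interior edges: halfway between neighbouring coordinate values.
    for a, b in zip(coords, coords[1:]):
        edges.append((a + b) / 2)

    # Last edge: extrapolate half a cell above the last value.
    edges.append(coords[-1] + (coords[-1] - coords[-2]) / 2)

    # Pair consecutive edges into one (lower, upper) tuple per cell.
    return list(zip(edges[:-1], edges[1:]))
```

For an unevenly spaced axis such as `[0.0, 1.0, 3.0]`, the generated cells are contiguous but the coordinate value is no longer at the centre of every cell, which is one reason explicit bounds are preferable when they are known.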
Received on Fri Nov 13 2009 - 07:11:10 GMT

