[CF-metadata] Cell bounds associated with coordinate variable rather than data variable

From: Steve Hankin <Steven.C.Hankin>
Date: Thu, 12 Nov 2009 12:47:40 -0800

John Caron wrote:
> 1. The CDM library uses the bounds if they are present. If only the
> coordinate values are present, the CDM generates bounds. These grid
> bounds are used by ncWMS and other visualization software to draw
> color filled images. The IDV (I think) uses a contouring algorithm
> with just the coordinate values.
>
> 2. Spatial coordinates probably want to use midpoint values.
>
> 3. I think there's a good argument that time coordinates want to use
> the end-point. Seth makes the argument for numerical models. In this
> case, all the output variables should have the same time coordinate.
> I'm trying to think of a case where that's not true (point observations,
> radar data etc), and I'm not thinking of any.
Hi John,

I'm not understanding the logic that suggests using midpoints for
spatial coordinates, but endpoints for times. Whenever an application
sees a particular reason to place the grid point at something other than
the midpoint (on whatever axis), of course it should do so. That may
lead to placing the grid point at the start, middle or end of the
interval. But the question before us is what the default should be for
the case where the cell boundaries are clearly understood, but it is
unclear which coordinate value is best to use for the grid point.
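
To make the three candidate placements concrete, here is a minimal
sketch in Python. The function name grid_point and the six-hourly cell
values are hypothetical, chosen only for illustration; nothing here is
CF or CDM API.

```python
def grid_point(lower, upper, placement="mid"):
    """Return a coordinate value for the cell [lower, upper].

    placement is "start", "mid" or "end". All three choices lie
    inside the cell, so none of them falls outside the bounds.
    """
    if placement == "start":
        return lower
    if placement == "mid":
        return (lower + upper) / 2
    if placement == "end":
        return upper
    raise ValueError(f"unknown placement: {placement!r}")


# Made-up six-hourly cells, in hours since some reference time.
bounds = [(0, 6), (6, 12), (12, 18)]

start = [grid_point(lo, hi, "start") for lo, hi in bounds]
mid = [grid_point(lo, hi) for lo, hi in bounds]
end = [grid_point(lo, hi, "end") for lo, hi in bounds]

print(start)  # [0, 6, 12]
print(mid)    # [3.0, 9.0, 15.0]
print(end)    # [6, 12, 18]
```

The cell boundaries are identical in all three cases; only the value
reported as the grid point changes.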

All other things being equal, using a consistent strategy for space and
time is the simpler, "KISS", approach. Both instantaneous and
time-interval-averaged values are most naturally encoded using a midpoint
representation. (Here I disagree with both Seth's conclusion and your
speculation. Are we using terms differently?) The compelling case for
a time endpoint may be continuous integrals (e.g. accumulated
precipitation). If one has a mixture of model variables to output and
the interpretations of their time coordinates need to differ,
then placing two different time axes into the file is the only way to
eliminate the confusion. Arbitrarily shifting the grid point locations
by half a time cell will not eliminate confusion, will it? It seems like
it would merely hide the confusion and increase the chances of
misinterpretation.
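
As a rough sketch of the two-axis layout, assuming hourly cells and
made-up variable names (air_temperature_mean, precipitation_accum),
with plain Python dictionaries standing in for a real netCDF file:

```python
# Made-up hourly cells, in hours since the start of the run.
time_bnds = [(0.0, 1.0), (1.0, 2.0), (2.0, 3.0)]

# Two time axes over the same cells. The bounds belong to each
# coordinate variable, not to the data variables that use it.
axes = {
    # Midpoint axis, e.g. for time-averaged variables.
    "time_mid": {
        "values": [(lo + hi) / 2 for lo, hi in time_bnds],
        "bounds": time_bnds,
    },
    # Endpoint axis, e.g. for accumulated variables.
    "time_end": {
        "values": [hi for lo, hi in time_bnds],
        "bounds": time_bnds,
    },
}

# Hypothetical data variables; each one names the axis it lives on.
variables = {
    "air_temperature_mean": "time_mid",
    "precipitation_accum": "time_end",
}

for name, axis_name in variables.items():
    print(name, axis_name, axes[axis_name]["values"])
```

Each variable then states its time interpretation explicitly through
the axis it references, rather than through an unstated half-cell
shift.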

(To be frank, although I have seen many CF datasets using both midpoint
and start-point times, I have never encountered one previously that uses
the end point of the time interval. It seems possible that as a
practical matter this choice may introduce confusion rather than reduce it.)

    - Steve

=====================

Seth's argument about confusion remains the same if one
>
> 4. Perhaps "interval of accumulation" is different enough that one
> should just encode it in a separate attribute or auxiliary coordinate
> on the data variable. Numerical models can have different variables
> with different intervals, possibly overlapping. This is perhaps not
> really the same as the bounds on the coordinate; they just share the
> same codomain (time). An advantage of this approach is that you don't
> have to create new coordinate variables for each data variable, which
> seems like more trouble than it's worth.
>
>
> Seth McGinnis wrote:
>> In the case of 'raw' output from numerical models, it probably makes
>> sense to use the end-point of the time interval rather than the
>> mid-point. That's the moment for which the model stores the data,
>> whether they're instantaneous values (intensive variables) or
>> time-averages over the previous timestep (extensive variables).
>>
>> If you used the mid-point of the interval for extensive variables, they
>> wouldn't have the same time coordinates as the intensive variables,
>> which would be very confusing. Using the end-point keeps everything
>> aligned.
>>
>> --Seth
>>
>>
>> On Thu, 12 Nov 2009 14:41:26 +0000 (UTC)
>> Thomas Lavergne <thomasl at met.no> wrote:
>>
>>> Dear Jonathan,
>>>
>>> ----- "Jonathan Gregory" <j.m.gregory at reading.ac.uk> wrote:
>>>
>>>> Dear Thomas
>>>>
>>>> I'm not saying the coordinate *must* be the mid-point. If there's a
>>>> good reason for it being something else, then you could choose it to
>>>> be so. I was suggesting that we could recommend it should be the
>>>> mid-point if there is no strong basis for making another choice. We
>>>> could also say that it must not be outside the bounds.
>>>>
>>> I agree with your recommendation.
>>> But I was also trying to gain support on "which axis value should I
>>> choose for my variable" and your answer does not help :-).
>>> I have rather little basis for making the choice of the end time for
>>> representing an accumulated quantity but, at least, CF does not
>>> forbid it. I guess I have to seek agreement inside my scientific
>>> community and that it is not CF's role to decide upon that.
>>> Are there people interested in taking the discussion further? We
>>> seek the answer to the question: "In which cases would another choice
>>> (other than mid-point) be relevant?".
>>>
>>> Thomas
>>>
>>>
>>>
>>>> You are right, it cannot be missing data. That would break some
>>>> applications,
>>>> anyway.
>>>>
>>>> Cheers
>>>>
>>>> Jonathan
>>>> _______________________________________________
>>>> CF-metadata mailing list
>>>> CF-metadata at cgd.ucar.edu
>>>> http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
>>>>
Received on Thu Nov 12 2009 - 13:47:40 GMT