
[CF-metadata] scalar coordinates

From: Hattersley, Richard <richard.hattersley>
Date: Fri, 10 May 2013 08:56:40 +0000

Perhaps it might be helpful to add some context, i.e. "Why do I care?"

My understanding is that Jonathan Gregory and Mark Hedley intend to
resolve this ambiguity in a subsequent revision of CF, and that the
resolution will have an impact on both data producers and data
consumers.

As a data producer you might care because you're producing data which
will become invalid. As a data consumer you might find that software
tools interpret data differently, and hence you might have to change
your code.


> The question is this: "Does a Scalar Coordinate Variable....":
>
> Option A: Represent either a Coordinate Variable or an Auxiliary
> Coordinate? The presence of a scalar does not mandate the existence of
> a new dimension; it can imply an undeclared dimension of size one
> that is not explicitly defined in the file but it does not have to.
>
> Or
>
> Option B: Always represent a Coordinate variable which explicitly
> declares a dimension of size one, where this dimension is not stated
> in the file? An exception is provided for string scalar coordinate
> variables only, which are defined as Auxiliary Coordinates but also
> mandate a new dimension of size one.

It seems the difference hinges on the concept of "degrees of
freedom". In those terms...

Option A lets the data producer say, "Here are some scalar pieces of
metadata - data consumers can choose what to do with them."

Whereas option B implies, "These are the degrees of freedom - no more,
no less."


One impact of this is in the overdetermined case of time,
forecast_reference_time, and forecast_period, where any two of the
three determine the third (time = forecast_reference_time +
forecast_period). Even when a data variable contains data for a single
point in time, option B would require the *producer* to decide which
two variables describe the two degrees of freedom, and which variable
is the dependent variable.

But as a consumer I might choose to aggregate a collection of these
single-time-point data variables which are best parameterised by a
*different* pair of time, forecast_reference_time, or forecast_period.
In general, it's not possible for the data producer to know in advance
which two variables best parameterise the collection I'm interested in.
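
To make that concrete, here is a rough Python sketch. It is only an
illustration: the dictionary layout and the helper names (make_field,
varying_coords) are my own assumptions for this example, not anything
defined by CF or by a particular library. Each "field" stands for one
single-time-point data variable carrying all three scalar values, and
the helper reports which of them actually vary across a collection.

```python
from datetime import datetime, timedelta

NAMES = ["time", "forecast_reference_time", "forecast_period"]


def make_field(reference_time, period_hours):
    """Hypothetical single-time-point field with all three scalar values."""
    period = timedelta(hours=period_hours)
    return {
        "forecast_reference_time": reference_time,
        "forecast_period": period,
        # The dependent relationship between the three quantities.
        "time": reference_time + period,
    }


def varying_coords(fields):
    """Return the names of the scalar values that differ across fields."""
    return [n for n in NAMES if len({f[n] for f in fields}) > 1]


run = datetime(2013, 5, 10, 0)

# Collection 1: a single model run at several lead times.
one_run = [make_field(run, h) for h in (6, 12, 18)]

# Collection 2: successive runs that are all valid at the same time.
same_validity = [
    make_field(run + timedelta(hours=h), 24 - h) for h in (0, 6, 12)
]

print(varying_coords(one_run))
# ['time', 'forecast_period']
print(varying_coords(same_validity))
# ['forecast_reference_time', 'forecast_period']
```

The same kind of field ends up in two collections with different
constant and varying quantities, so the natural choice of independent
variables depends on the collection and not on any one field.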


For this, and other related reasons involving ensembles, I'm in favour
of option A.


Richard Hattersley
Iris Benevolent Dictator
Met Office
Received on Fri May 10 2013 - 02:56:40 BST
