
[CF-metadata] Re: projections in CF

From: Jonathan Gregory <jonathan.gregory>
Date: Fri, 21 Feb 2003 17:03:31 +0000

Dear All

Three schemes are being discussed:

* CF-beta. A per-data-variable attribute which contains the names of the
mapping, the parameters and their values.

* Caron-Rew. A string variable naming the mapping, with special attributes to
define its parameters, referred to by name from a per-data-variable attribute.

* Gregory. A per-data-variable attribute which contains the names of the
mapping, the parameters and the variables containing the parameter values. This
avoids the awkwardness that Bob raised of having to get numerical values from
strings.

Readability and telling differences by inspection: It is clear from the
discussion that these issues are matters of personal preference, since we
disagree on what's convenient for a human (I believe we are all human). Hence
we can't use these factors as a basis for decision.

Simplicity: My concern about lots of new attributes is the same as Karl's -
it makes the standard more complicated. For the same reason I do not like
introducing a new "kind" of variable for the sake of something to attach these
attributes to.

The main area of disagreement is the desirability of factoring out duplicated
information. There are several aspects to this.

(a) Testing whether variables have the same grid mapping. In the Caron-Rew
scheme you can conclude they do have the same mapping if they have the same
mapping name. But there might be two mappings with identical definitions. This
could happen if you assemble a file from several files. It could happen for
example if a model unconditionally creates separate mapping variables for the
velocity and mass grids. To prevent this happening you would have to place a
requirement on the data-writer always to check whether mappings are the same
and eliminate duplicates whenever a file is created. If you cannot guarantee
that mappings are unique, you have to test whether they are the same by
comparing parameters. If you *ever* have to be able to do this, you need
software to do it, and I argue that this advantage of factoring out the mapping
has then essentially been lost. You may as well always compare the mappings
parameter by parameter. It's more reliable that way.
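
As a rough illustration of the parameter-by-parameter comparison argued for in (a), here is a minimal Python sketch. The dictionary layout and the names `same_mapping` and `tol` are assumptions made for this example; they are not part of any of the three schemes.

```python
import math

def same_mapping(a, b, tol=1e-9):
    """Return True if two grid-mapping descriptions are equivalent.

    Each mapping is a hypothetical dict of the form
    {"name": str, "params": {parameter_name: float}}.
    """
    # Different mapping names can never describe the same mapping.
    if a["name"] != b["name"]:
        return False
    # Both mappings must define exactly the same set of parameters.
    if a["params"].keys() != b["params"].keys():
        return False
    # Compare values one by one, allowing for a small absolute tolerance.
    return all(
        math.isclose(a["params"][k], b["params"][k], rel_tol=0.0, abs_tol=tol)
        for k in a["params"]
    )

# Two separately created but identical mappings, e.g. for the velocity
# and mass grids of a model (values are illustrative only).
velocity_grid = {
    "name": "rotated_latitude_longitude",
    "params": {"grid_north_pole_latitude": 32.5,
               "grid_north_pole_longitude": 170.0},
}
mass_grid = {
    "name": "rotated_latitude_longitude",
    "params": {"grid_north_pole_longitude": 170.0,
               "grid_north_pole_latitude": 32.5},
}

identical = same_mapping(velocity_grid, mass_grid)
```

The comparison does not depend on what the mappings are called in the file or on the order in which their parameters were written, which is why it still works when a file contains duplicate definitions.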

(b) Order of parameters in a single attribute. We disagree about how easy this
is for a human to deal with, but for a program it is surely not an issue. A
program can scan the attribute repeatedly to find the parameters, regardless of
their order. The parser need not be bothered by the introduction of new
parameters it doesn't recognise - it can just ignore them by skipping forward
to the next keyword.
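
The point in (b) can be sketched in a few lines of Python. This assumes an attribute of the form "mapping_name keyword: value keyword: value ...", with one value per keyword; the function name `parse_grid_mapping` and the set `KNOWN` are made up for the illustration.

```python
# Parameters this hypothetical reader understands.
KNOWN = {"grid_north_pole_latitude", "grid_north_pole_longitude"}

def parse_grid_mapping(attr, known=KNOWN):
    """Split an attribute string into (mapping_name, {keyword: value}).

    Keywords are tokens ending in ':'. Keywords not in `known` are
    ignored, together with whatever follows them up to the next keyword.
    """
    tokens = attr.split()
    if not tokens:
        raise ValueError("empty grid_mapping attribute")
    name, rest = tokens[0], tokens[1:]
    params = {}
    keyword = None
    for tok in rest:
        if tok.endswith(":"):
            # Start of a new keyword, recognised or not.
            keyword = tok[:-1]
        elif keyword in known:
            # Value belonging to a recognised keyword.
            params[keyword] = tok
        # Otherwise: value of an unrecognised keyword, skipped.
    return name, params

# The order of parameters differs from the reader's expectations and an
# unknown parameter ("new_param") has been added; neither matters.
name, params = parse_grid_mapping(
    "rotated_latitude_longitude grid_north_pole_longitude: nplon "
    "new_param: foo grid_north_pole_latitude: nplat"
)
```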

(c) Potential inconsistency if each variable has its own mapping definition.
In my scheme, this is partly avoided, because the different data variables can
share the parameter variables. However, I really would appreciate an example
which illustrates the concern with potential inconsistency. I don't think that
a per-data-variable attribute *is* truly duplicating information. Each variable
has its *own* grid mapping, though they may all happen to be the same. In the
Caron-Rew scheme and in my scheme you can change the value of a mapping
parameter for all data variables which share a mapping by altering a single
number. In the Caron-Rew scheme you can change the mapping itself (e.g. from
rotated pole to polar stereographic). But in what situation would you actually
want to do this by a global operation (except to correct a mistake)? Changing
the mapping means entirely recomputing the variable; it must have a new grid
and new data values. You have to do this on each variable separately. It is not
a global operation.
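
To make the sharing of parameter variables in (c) concrete, here is a small Python sketch that uses plain dictionaries as a stand-in for a file. The data variable names `temperature` and `u_wind`, and the helper `resolve`, are hypothetical.

```python
# Scalar parameter variables, stored once in the (imaginary) file.
param_vars = {"nplat": 32.5, "nplon": 170.0}

# Each data variable carries its own mapping description, but the
# description refers to the parameter variables by name.
data_vars = {
    "temperature": {
        "mapping": "rotated_latitude_longitude",
        "params": {"grid_north_pole_latitude": "nplat",
                   "grid_north_pole_longitude": "nplon"},
    },
    "u_wind": {
        "mapping": "rotated_latitude_longitude",
        "params": {"grid_north_pole_latitude": "nplat",
                   "grid_north_pole_longitude": "nplon"},
    },
}

def resolve(var_name):
    """Look up the numerical parameter values for one data variable."""
    desc = data_vars[var_name]
    return {key: param_vars[ref] for key, ref in desc["params"].items()}

before = resolve("temperature")

# Correcting a single number changes it for every variable that shares it.
param_vars["nplat"] = 35.0
after_temperature = resolve("temperature")
after_u_wind = resolve("u_wind")
```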

Because of (a-c), I don't think that factoring out the information gives any
great advantage, but I do think that to do it this way involves significantly
more work for the data-writer and is more complicated for the user of the
standard. So I still prefer CF-beta or my modified form of it.

Apart from these, there is a final, completely different issue which Russ
raises under "deletion anomalies", that the mapping can't exist if no variables
use it, in the case of a per-data-variable approach. Brian has raised a similar
issue in the past regarding auxiliary coordinate variables, which look just
like data variables if they are by themselves in a file. I feel this is a
different thing altogether. Do we want to be able to have "grids" (whatever we
might mean by that :-)) which exist independently of data variables? This is a
philosophical design point. CF has not tried to accommodate that idea up to
now. The aim of the standard is to provide metadata for data which exists, not
to provide metadata which could describe some data not yet provided.

I would say that this is a more general issue than grid mappings. If we want to do
this, I would propose we introduce something like the Caron-Rew idea, but for
all grid information. I'd call this an "abstract variable".

  dimensions:
    x=73;
    y=96;
    pressure=10;
    p_len=50;
    nb=2;
    nv=4;
  variables:
    char abstractvariable(p_len);
      abstractvariable:grid_mapping="rotated_latitude_longitude ",
        "grid_north_pole_latitude: nplat grid_north_pole_longitude: nplon";
      abstractvariable:coordinates="lat lon";
    float x(x);
      x:bounds="x_bounds";
    float x_bounds(x,nb);
    float y(y);
    float lat(y,x);
      lat:bounds="lat_bounds";
    float lat_bounds(y,x,nv);
    float lon(y,x);
    float nplat;
    float nplon;
  data:
    abstractvariable="x y pressure";
    nplon=170.0;
    nplat=32.5;
    lat=...;
    lon=...;
    x=...;
    y=...;

This is an entirely data-free variable. Its string value tells us its
dimensions, and it provides a home for the mapping, coordinates, auxiliary
coordinates and their boundaries. Its name could be taken as a name for this
grid as a whole. However, I'm not advocating this as a solution to the
practical problem of how to define the grid_mapping, but as a proposal to
consider if we want to introduce this more abstract kind of object as an
additional feature.
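
For what it is worth, reading the dimensions back from such a variable would be simple. The following Python sketch assumes the string value has already been read from the file; the function name `abstract_shape` is invented for the example, and the dimension sizes are copied from the example above.

```python
# Dimension sizes as declared in the example file.
dimensions = {"x": 73, "y": 96, "pressure": 10, "p_len": 50, "nb": 2, "nv": 4}

def abstract_shape(string_value, dims=dimensions):
    """Return [(dimension_name, size), ...] named by the string value.

    The char array may be padded out to p_len characters, so trailing
    nulls and blanks are discarded before splitting.
    """
    names = string_value.rstrip("\x00").split()
    return [(name, dims[name]) for name in names]

shape = abstract_shape("x y pressure")
```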

Have a good weekend. Cheers

Jonathan
Received on Fri Feb 21 2003 - 10:03:31 GMT
