
[CF-metadata] CF-1.0-beta5: curvilinear bounds "contiguous" attribute

From: jonathan.gregory at metoffice.com <jonathan.gregory>
Date: Mon, 17 Mar 2003 17:49:50 +0000

Dear Steve and John

Apart from the point concerning the record dimension, which is specific to
netCDF, we have also made the argument that it is simpler to support only one
convention. The (n+1) convention, although it is the usual case, cannot deal
with non-contiguous or overlapping cells, both of which occur in practice (as
mentioned by Steve and Bob), so we have to support the (n,2) convention. This
convention is not difficult to write or use, just redundant in the contiguous
case, but that isn't a very serious issue, as Bob said.

Steve remarked:

> Note that N*M*nvertices is ambiguous with both respect to orientation
> (clockwise or no) and to starting "corner". The n*2m*2 encoding has no
> ambiguities -- analogous to N*2.

I don't think this is quite true, actually, because it cannot be guaranteed
that the data has been written so that the bounds are ordered in the same
sense as the coordinates. In one dimension, for instance, say

x=1, 2, 3, 4;
bounds_x=1.5, 0.5, 2.5, 1.5, 3.5, 2.5, 4.5, 3.5;

Even if we legislated against this, I don't think it could be depended upon not
to happen. To be robust, an application should assume nothing about the
ordering of the bounds, except that it is the same for all cells. But it's not
too hard to deal with in practice - just "fussy", in Steve's word. In 2D, I
suppose you can compare two sets of vertices with unknown starting point and
ordering by doing something like sorting them in order of both coordinates
and then comparing the corresponding original indices.
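The comparison described above might be sketched in Python as follows. This is
only an illustration under stated assumptions: the function names
(normalise_1d_bounds, vertex_permutation) are hypothetical, vertices are (x, y)
tuples, and matching vertices are assumed to be exactly equal, which real
floating-point data may not satisfy without a tolerance.

```python
def normalise_1d_bounds(flat_bounds):
    """Pair up a flat (n,2) bounds list and return (lower, upper) per cell,
    whichever order the two bounds were written in."""
    pairs = zip(flat_bounds[0::2], flat_bounds[1::2])
    return [(min(a, b), max(a, b)) for a, b in pairs]


def vertex_permutation(vertices_a, vertices_b):
    """Compare two vertex lists with unknown starting corner and sense.

    Both lists are sorted by (x, y); if the sorted coordinates agree, the
    original indices are paired up. Returns a list m such that
    vertices_a[i] == vertices_b[m[i]], or None if the cells differ.
    Assumption: exact equality of matching vertices.
    """
    if len(vertices_a) != len(vertices_b):
        return None
    order_a = sorted(range(len(vertices_a)), key=lambda i: vertices_a[i])
    order_b = sorted(range(len(vertices_b)), key=lambda j: vertices_b[j])
    if [vertices_a[i] for i in order_a] != [vertices_b[j] for j in order_b]:
        return None
    mapping = [None] * len(vertices_a)
    for i, j in zip(order_a, order_b):
        mapping[i] = j
    return mapping


# The 1D example from the text: bounds written upper-first.
bounds_x = [1.5, 0.5, 2.5, 1.5, 3.5, 2.5, 4.5, 3.5]
cells_1d = normalise_1d_bounds(bounds_x)

# The same unit square, written anticlockwise from the lower-left corner
# and clockwise from the upper-right corner.
square_a = [(0, 0), (1, 0), (1, 1), (0, 1)]
square_b = [(1, 1), (1, 0), (0, 0), (0, 1)]
mapping = vertex_permutation(square_a, square_b)
```

Since the ordering is assumed to be the same for all cells, the mapping found
for one cell could presumably be reused for the rest of the variable.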

I think it is more sensible to put the burden of interpreting things on
the user of the data than the writer of it. That approach makes the convention
generally lightweight, more attractive and more likely to be adhered to. A
complicated standard can be broken more easily by accident. A standard which
is likely to be broken often is not useful to the user of the data.

Bob said:

> I think the standard should prescribe one direction only, preferably
> counterclockwise

Of course, that would also make life easier for the user of the data, but I
fear that it is open to mistakes. The data-writer might have a different
interpretation of what counter-clockwise means (like left- and right-handed
rules for vectors), or simply get confused, and it's the kind of thing that
is terribly easy to get wrong (so I find, anyway). Again, I would argue that
it is safer not to depend on it.

We may choose to put the burden on the data-writer instead of on the user in
cases where the logic is potentially time-consuming to work out, which is the
main issue Steve raised with respect to contiguousness. If I understand
correctly, the information to be recorded is, for each cell, which cells are
adjacent.
Perhaps you also need to know which is their common edge. This kind of
information doesn't depend on any particular kind of grid. It can be supplied
even for unstructured grids dimensioned (cell). I agree with Bob that it
should be optional, as it *can* be worked out from the other information.
However, we might recommend including it for non-trivial grids (to be defined).
But as with all redundant information, should we worry about the possibility of
inconsistency?
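To illustrate how adjacency *can* be worked out from the bounds alone, here is
a minimal Python sketch. It is not a proposed algorithm, just one possible
approach under assumptions: the name derive_adjacency is hypothetical, each
cell is a list of (x, y) vertex tuples, shared vertices are exactly equal in
both cells, and an edge is identified by the index of its first vertex.

```python
from collections import defaultdict


def derive_adjacency(cells):
    """Find, for each cell, its neighbours and the edge shared with each.

    cells: list of vertex lists, one per cell, each vertex an (x, y) tuple.
    Returns {cell_index: [(neighbour_index, edge_index), ...]}, where
    edge_index is the index of the edge's first vertex in that cell
    (an assumed numbering convention).
    """
    # Map each undirected edge to the (cell, edge_index) pairs that own it.
    edge_owners = defaultdict(list)
    for c, verts in enumerate(cells):
        n = len(verts)
        for k in range(n):
            edge = frozenset((verts[k], verts[(k + 1) % n]))
            if len(edge) == 2:  # skip degenerate edges of zero length
                edge_owners[edge].append((c, k))

    # Two cells are adjacent when they own the same edge.
    adjacency = defaultdict(list)
    for owners in edge_owners.values():
        for c, k in owners:
            for other, _ in owners:
                if other != c:
                    adjacency[c].append((other, k))
    return dict(adjacency)


# Two unit squares side by side, both written anticlockwise.
cells = [
    [(0, 0), (1, 0), (1, 1), (0, 1)],
    [(1, 0), (2, 0), (2, 1), (1, 1)],
]
adjacency = derive_adjacency(cells)
```

Every edge of every cell has to be visited, which gives some idea of why this
could be time-consuming for a large grid.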

I would propose that if we want to record such information, it should be put
in data variables with specified standard names. For example:

  dimensions:
    cell=7008;
    maxneighbour=4;
  variables:
    float temperature(cell);
      temperature:units="K";
      temperature:standard_name="air_temperature";
    int neighbour(cell,maxneighbour);
      neighbour:standard_name="adjacent_cell_index";
      neighbour:_FillValue=-1;
    int commonedge(cell,maxneighbour);
      commonedge:standard_name="common_boundary_segment_index";
      commonedge:_FillValue=-1;
  data:
    neighbour=1, 96, -1, -1, 0, 2, 97, -1, 1, ...;
    commonedge=1, 2, -1, -1, 3, 1, 2, -1, 3, ...;

This would mean that cell 0 adjoins cells 1 and 96. Cell 1 adjoins cells 0, 2
and 97. The common edge between cells 0 and 1 is the one from vertex 1 to 2
of cell 0, and from vertex 3 to 0 of cell 1. (That is, I have assumed that an
edge is numbered by its starting vertex, so the closing edge from vertex 3
back to vertex 0 is edge 3.)
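A reader of such a file might unpack these two variables along the following
lines. This Python sketch is only illustrative: the function names are
hypothetical, the arrays hold just the first two complete rows of the listing
above, and -1 is taken as the fill value as in the example.

```python
FILL = -1  # the _FillValue used in the example above


def adjacency_of(cell, neighbour, commonedge, fill=FILL):
    """Return [(neighbour_cell, edge_index), ...] for one cell,
    skipping the padding entries."""
    return [
        (n, e)
        for n, e in zip(neighbour[cell], commonedge[cell])
        if n != fill
    ]


def asymmetric_pairs(neighbour, fill=FILL):
    """List (i, j) where cell i names j as a neighbour but j does not
    name i. Neighbours whose rows are not available are skipped."""
    bad = []
    for i, row in enumerate(neighbour):
        for j in row:
            if j == fill or j >= len(neighbour):
                continue
            if i not in neighbour[j]:
                bad.append((i, j))
    return bad


# First two rows of neighbour(cell,maxneighbour) and
# commonedge(cell,maxneighbour) from the listing.
neighbour = [[1, 96, -1, -1], [0, 2, 97, -1]]
commonedge = [[1, 2, -1, -1], [3, 1, 2, -1]]

cell0 = adjacency_of(0, neighbour, commonedge)
cell1 = adjacency_of(1, neighbour, commonedge)
problems = asymmetric_pairs(neighbour)
```

A symmetry check of this kind is one cheap way an application could detect
some inconsistencies in the redundant information.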

If we had had two dimensions (lat,lon) with lat=73 and lon=96 instead of a
single (cell) dimension, the information would be exactly the same. We have
already adopted a convention for numbering cells from 0, in section 8.2 on
compression by gathering.

Information of this kind is almost within the description of section 7.2 on
cell measures:

> For some calculations, information is needed about the size, shape or
> location of the cells that cannot be deduced from the coordinates and bounds
> without special knowledge that a generic application cannot be expected to
> have.

Strictly, connectivity falls outside this description, because it *can* be
deduced from the coordinates and bounds, although doing so is complicated.
Nonetheless, we could consider pointing to the neighbour and commonedge
variables through the cell_measures attribute.

This scheme has to allow a maxneighbour dimension as big as necessary, and hence
wastes space. But the convention of 8.2 could also be used to compress these
variables, eliminating the wasted space:

  dimensions:
    cell=7008;
    maxneighbour=4;
    adjacencies=27694;
  variables:
    int adjacencies(adjacencies);
      adjacencies:compress="cell maxneighbour";
    int neighbour(adjacencies);
      neighbour:standard_name="adjacent_cell_index";
      neighbour:_FillValue=-1;
    int commonedge(adjacencies);
      commonedge:standard_name="common_boundary_segment_index";
      commonedge:_FillValue=-1;
  data:
    adjacencies=0, 1, 4, 5, 6, 8, ...;
    neighbour=1, 96, 0, 2, 97, 1, ...;
    commonedge=1, 2, 3, 1, 2, 3, ...;

This means that neighbour(adjacencies) should be scattered into a 2D variable
(cell,maxneighbour), with the values of adjacencies(adjacencies) specifying
which elements of the 2D variable are populated, the rest being missing data.
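The scattering step could look like this in Python. It is a sketch only: the
function name scatter is hypothetical, the input uses just the first six
values shown in the listing, and the flat index is assumed to run over
(cell,maxneighbour) with maxneighbour varying fastest, as in section 8.2.

```python
def scatter(compressed, indices, shape, fill=-1):
    """Uncompress a gathered 1D list into a 2D list of lists.

    compressed: the stored values, one per entry of `indices`.
    indices:    flat indices into the (cell, maxneighbour) array,
                last dimension varying fastest.
    shape:      (ncell, maxneighbour) of the uncompressed array.
    fill:       value for the elements that were not stored.
    """
    ncell, nmax = shape
    full = [[fill] * nmax for _ in range(ncell)]
    for value, flat in zip(compressed, indices):
        full[flat // nmax][flat % nmax] = value
    return full


# First six values from the listing; they cover cells 0, 1 and part of 2.
adjacencies = [0, 1, 4, 5, 6, 8]
neighbour = [1, 96, 0, 2, 97, 1]
commonedge = [1, 2, 3, 1, 2, 3]

neighbour_2d = scatter(neighbour, adjacencies, (3, 4))
commonedge_2d = scatter(commonedge, adjacencies, (3, 4))
```

The result reproduces the leading values of the uncompressed listing, with -1
wherever no adjacency was stored.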

Would such a scheme be suitable?

Cheers

Jonathan
Received on Mon Mar 17 2003 - 10:49:50 GMT
