[CF-metadata] CF-1.0-beta5: curvilinear bounds "contiguous" attribute

From: Steve Hankin <hankin>
Date: Thu, 20 Mar 2003 20:57:17 -0800

Hi Jonathan,

You have done quite a thorough job of summarizing the issues. Thank you.

Just a few short remarks embedded below.

    - steve

============================

Jonathan Gregory wrote:

> Dear All
>
> Here are some more points about boundary coordinates.
>
> I accept the argument that it is nice to be able to tell from (n+1) boundaries
> that the cells are contiguous. This is of course an advantage, and another
> advantage is that this method has been commonly used. However, COARDS didn't
> standardise any method for boundaries so whatever we do will present problems
> for some existing data.
>
> Generic applications: As I've said, the reason I'm unhappy about supporting
> both (n+1) and (n,2) boundaries is that it makes the standard more
> complicated. I am worried in particular that writers of generic applications or
> data-users writing their own ad-hoc analysis code may not implement (n,2) if
> (n+1) is used more commonly. That would mean we would have to wait longer for
> full CF support, and that some programs would have problems with some
> data. Bounds are a pretty basic requirement so this would be a nuisance.

Given the difficulties of accommodating the netCDF record axis I will confine my
remarks to the nx2 approach for the immediate discussion. My assumption is that
the n+1 encoding is not on the table as a serious proposition at this moment.

> Writing data: Even if we require (n+1) to be used for contiguous cells, writers
> of data might use (n,2) when it is more convenient. For example, if you are
> making a 3D variable (time,lat,lon) by concatenating separate (lat,lon)
> variables, each with its own 2-element time bounds, it is convenient to produce
> a (n,2) time boundary variable. To require (n+1) to be used means the
> data-writer has to do a contiguousness check and write data in two different
> formats according to the results. If we want to encourage people to write
> CF-compliant data, we don't want to make it more difficult for them to do
> so. On the other hand, if contiguous cells may be written in (n,2) format,
> applications reading data cannot make any assumptions and have to be able to
> check for themselves, just as if we did not support (n+1), so nothing has been
> gained by complicating the standard.
>
> Record dimension: This problem has already been raised. Doesn't it mean that
> (n+1) can't be used for the unlimited dimension? Wouldn't that imply, again,
> that applications need to be able to check (n,2) for contiguousness?
>
> Wrap-round axis: A longitude axis spanning the whole world has only (n)
> boundaries. Even if written in (n+1) form it must contain extreme boundaries
> which are equal under modulo 360; it is not possible to avoid having to check
> for contiguousness in order to determine whether it is a wrap-round axis.

The "modulo" attribute has been in fairly widespread use in COARDS data sets for
some time (I am off-line in a hotel room and cannot check the details) and I
thought it was part of CF, too. It is vital. Using the modulo attribute it is
clear when the N+1 boundary point is or is not redundant.
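
As a rough illustration of that point, the sketch below tests whether the last
of n+1 boundary values repeats the first one under a given modulo. The function
name, the plain-list input and the tolerance are assumptions made for this
example; they are not taken from COARDS or CF.

```python
def last_edge_is_redundant(edges, modulo=None, rel_tol=1e-5):
    """Return True if the first and last of the n+1 edges coincide under `modulo`.

    `edges` is a sequence of n+1 boundary values along one axis.
    `modulo` stands in for the value of the axis "modulo" attribute
    (e.g. 360 for longitude); None means the axis has no such attribute.
    `rel_tol` is an assumed tolerance, relative to the first cell width.
    """
    if modulo is None or len(edges) < 2:
        return False
    span = abs(edges[-1] - edges[0])
    first_width = abs(edges[1] - edges[0])
    return abs(span - modulo) <= rel_tol * first_width


global_lon_edges = [0.0, 90.0, 180.0, 270.0, 360.0]
regional_lon_edges = [0.0, 90.0, 180.0, 270.0]

print(last_edge_is_redundant(global_lon_edges, modulo=360.0))    # True
print(last_edge_is_redundant(regional_lon_edges, modulo=360.0))  # False
```

Without the modulo value the function cannot tell a global axis from a regional
one, which is the situation described in the quoted wrap-round paragraph.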

For arbitrary curvilinear coordinate systems the modulo concepts are subtler but
still vitally important. Many curvilinear coordinate systems are not global -- for
example the SPEM model. In order to do graphics with arbitrary projections and an
arbitrary longitude center point the wrap-around properties must be well
understood. It seems to me that a more detailed analysis of the general
"wrap-around" properties of curvilinear global grids is needed. Is there
additional information on the connectivity that we should be encoding in our CF
files?

> 1D case: In an earlier posting I said that you couldn't depend on the order of
> the boundaries in (n,2). Obviously we could adopt one. For instance, we could
> say that the sequence bound(i,0), point(i), bound(i,1), bound(i+1,0), ... must
> be monotonic. In that case the extrema are bound(0,0) and bound(n-1,1), and
> contiguousness is tested by comparing bound(i,1) with bound(i+1,0) e.g. if
> (abs(bound(i+1,0)-bound(i,1)).lt.1e-5*abs(point(i+1)-point(i))) i.e. the
> boundaries between adjacent cells are much closer together - even if not
> exactly coincident because of numerical imprecision - than the grid
> points. This convention is easy to implement and check for correctness because
> it depends on numerical comparison.
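
The 1D test quoted above can be sketched in Python as follows. The comparison
and the 1e-5 factor follow the quoted expression; the function name and the use
of plain lists of (lower, upper) pairs for the (n,2) bounds are assumptions for
illustration.

```python
def is_contiguous(points, bounds, rel_tol=1e-5):
    """Check contiguity of 1D cells stored as (n,2) bounds.

    `points` holds the n grid-point values and `bounds` holds n
    (lower, upper) pairs, so bounds[i][0] is bound(i,0) and bounds[i][1]
    is bound(i,1). Adjacent cells count as contiguous when the gap between
    them is much smaller than the spacing of their grid points.
    """
    for i in range(len(points) - 1):
        gap = abs(bounds[i + 1][0] - bounds[i][1])
        spacing = abs(points[i + 1] - points[i])
        if not gap < rel_tol * spacing:
            return False
    return True


points = [0.5, 1.5, 2.5]
touching = [(0.0, 1.0), (1.0, 2.0), (2.0, 3.0)]
gappy = [(0.0, 1.0), (1.1, 2.0), (2.0, 3.0)]

print(is_contiguous(points, touching))  # True
print(is_contiguous(points, gappy))     # False
```
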
>
> 2D case: I think we ought to distinguish between the polygon case and the
> quadrilateral case. The bounds for the former cannot be decomposed into two
> separate dimensions. Any number of vertices may be needed.

> ==> The cells themselves may not be in a 2D array.

This is an absolutely key point. For the polygon case (most commonly found in
finite element triangular meshes) the 2D representation of grid bounds breaks down
altogether. M*N*Nv doesn't map to the problem. Indeed, the entire duality of "grid
points" and "grid bounds" breaks down. For finite elements conceptually there is a
1D list of vertices and a (very large) sparse N*N connectivity matrix. A number of
non-standardized methods of cramming this information into files seem to be in
use, but none that I have seen resembles M*N*Nv. (Not to imply a comprehensive
knowledge of this area.)
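
To make the contrast with M*N*Nv concrete, here is a minimal sketch of a
vertex-list-plus-connectivity layout for a two-triangle mesh. It is an
illustration only, not any of the file encodings alluded to above; the variable
names and the dict-of-sets stand-in for the sparse matrix are assumptions.

```python
# A 1D list of vertices: one (x, y) pair per mesh node.
vertices = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]

# Each triangle is a triple of indices into `vertices`.
triangles = [(0, 1, 2), (0, 2, 3)]


def connectivity(triangles):
    """Build a sparse vertex-to-vertex connectivity map.

    The result maps each vertex index to the set of vertex indices it
    shares a triangle with, i.e. the non-zero entries of one row of the
    N*N connectivity matrix.
    """
    neighbours = {}
    for tri in triangles:
        for a in tri:
            for b in tri:
                if a != b:
                    neighbours.setdefault(a, set()).add(b)
    return neighbours


for vertex, linked in sorted(connectivity(triangles).items()):
    print(vertex, sorted(linked))
```

Nothing in this layout has the shape of a 2D array of cells, which is why a
per-cell bounds array indexed by (M, N) has no natural counterpart here.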

I believe that a separate encoding (in a future version of CF) is justified to
handle non-quadrilateral grids. Serious thought has to be given to how to do this
right. At this point the requirement for M*N*Nv seems to be lacking. Can we agree
to abandon M*N*Nv and to adopt some variant on the simple M*N*2*2 encoding? ...
applicable to quadrilateral grids, only.

    - steve

> Contiguousness is complicated to test. In the
> quadrilateral case, however, the bounds arrays can be dimensioned
> bounds(n,m,2,2), as Steve says. If we adopt the above 1D convention for
> ordering, we then know which 2D bounds to compare for contiguousness of any
> pair of adjacent cells. This has the same purpose as saying the four points
> must be ordered clockwise or anticlockwise. (The reason I am worried about
> (anti)clockwise ordering is that I fear people might sometimes get confused
> about whether they were considering index space or real space to decide the
> sign convention.) Two notes on this:
>
> * bounds(n,m,2,2) can be equivalenced/reformed to bounds(n,m,4) if users prefer
> to view them that way. The points would be arranged as a Z rather than as going
> round the perimeter.
>
> * In the quadrilateral case, the 2D bounds are on auxiliary coordinate
> variables. If the 1D coordinate variables have bounds as well, these can be
> checked for contiguousness, which is easier.
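
The Z arrangement mentioned in the first note can be sketched as below. Which
axis of the 2x2 block corresponds to which grid direction is an assumption
here, as are the function names; the flattening is written in Python's
row-major order, and a Fortran EQUIVALENCE would give the transposed Z.

```python
def flatten_z(cell):
    """Flatten one cell's 2x2 corner block into 4 values in Z order.

    `cell[j][i]` is the corner at position j along one grid direction and
    i along the other (an assumed layout for this example).
    """
    return [cell[0][0], cell[0][1], cell[1][0], cell[1][1]]


def z_to_perimeter(flat):
    """Reorder 4 Z-ordered corners so that they go round the perimeter."""
    return [flat[0], flat[1], flat[3], flat[2]]


# Corners of a unit cell given as (x, y) pairs.
cell = [[(0, 0), (1, 0)],
        [(0, 1), (1, 1)]]

z_order = flatten_z(cell)
print(z_order)                  # [(0, 0), (1, 0), (0, 1), (1, 1)]
print(z_to_perimeter(z_order))  # [(0, 0), (1, 0), (1, 1), (0, 1)]
```
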
>
> I am sorry this is rather long. I hope it makes sense.
>
> Jonathan
Received on Thu Mar 20 2003 - 21:57:17 GMT
