John Caron wrote:
> Bob Drach wrote:
>
> >Hi Steve,
> >
> >Steve Hankin wrote:
> >
> >
> >
> >>Hi All,
> >>
> >>A message from Jonathan yesterday mentioned the impending
> >>release of the first non-BETA CF standard. (I second the
> >>plan, of course!) I would like to return to our previous
> >>discussions of Cell Bounds (just a year ago -- one sample
> >>message attached to help you locate the dialog). I'm not
> >>sure that the outcomes of that discussion are incorporated
> >>into the current CF beta. In particular we discussed an
> >>attribute to indicate that the cells were contiguous. I am
> >>perhaps overlooking it, but I did not find it in
> >>http://www.cgd.ucar.edu/cms/eaton/cf-metadata/CF-working.html#bnds
> >>
> >>1. The concern: Consider the pressing need to create model
> >>access and inter-comparison frameworks on the Web.
> >>Curvilinear grids will be commonplace. A Web server for
> >>this purpose (such as LAS) will need to open a CF file and
> >>rapidly (in interactive time) extract coordinates and data
> >>and create a plot. The most common and informative style of
> >>plot is one in which the individual cells are colored --
> >>producing a filled contour plot that also communicates grid
> >>resolution. The "contiguous" attribute is important here.
> >>If the application has no way of knowing a priori that the
> >>cells are contiguous, then it must either resort to the
> >>extraordinarily inefficient approach of plotting every
> >>polygon individually, or it must examine all relevant
> >>vertices to confirm that the data are contiguous. In a
> >>stateless Web server it must repeat this for every plot.
> >>
> >>
> >
> >I recently spent a fair amount of time adding CF-style curvilinear and general grids
> >into CDAT, so have an appreciation for the importance of this point. In some cases it
> >was necessary to make assumptions about the connectivity of the grid which need not
> >always hold. This is not meant as criticism of the current standard, but to reinforce
> >your point.
> >
> >This raises a question about the meaning of the contiguous attribute for curvilinear.
> >I'm assuming that contiguous means something like: cell(i,j) is adjacent to
> >cell(i+1,j) and cell(i,j+1). But mapping a horizontal, spherical grid to a
> >rectangular index space necessitates a cut or discontinuity on some line. For
> >rectilinear grids this is usually coincident with the endpoints of the longitude
> >axis, but not so for curvilinear grids. So the question is what precisely does it
> >mean to say that the grid is contiguous? And does it have any meaning for general
> >grids (with representation lon(ncell), lat(ncell))?
> >
> >BTW, if we add the contiguous attribute, it should be optional, and assumed true by
> >default.
> >
> >
> >
> >>====
> >>
> >>The "contiguous" attribute is the main point and one that I
> >>believe we agreed upon. What follows should be considered as
> >>a separate discussion.
> >>
> >>2. CF-1.0-beta5 encodes curvilinear coordinate bounds using
> >>n*m*nv instead of n*m*2*2. This is nasty for the common,
> >>simple graphics of contiguous cells. The application has to
> >>read all of the (redundant) vertices; then knowing neither
> >>the ordering of the vertices nor the starting "corner" it
> >>must figure out which edges the adjacent cells share. This
> >>is a fussy problem rather than hard, but it definitely makes
> >>reading CF files more complex and specialized code will be
> >>required. The problem gets even more fun in a model
> >>inter-comparison framework. Do file A and file B contain
> >>identical grids? The algorithm needed in order to know this
> >>involves a "blind grope" over the bounds of the first cells,
> >>knowing neither the relative orientation, nor the relative
> >>starting vertex ("corner") on the two grids. These problems
> >>are not shared by the n*m*2*2 encoding.
> >>
> >>Standards such as CF are built upon trade-offs -- commonly,
> >>generality vs. simplicity. In weighing the choice we
> >>consider the level of generality gained vs the level of
> >>complexity added. In our last discussions of the "nv vertex
> >>encoding" the stated generality goal was to accommodate a
> >>particular hexagonal grid. At that time our discussions
> >>revealed that i) no one had yet attempted to encode that
> >>grid in a CF file; and ii) the "hexagonal" grid actually had
> >>varying numbers of vertices (i.e. it was not a good fit to
> >>the standard, anyway). Before CF-1.0-beta5 becomes official
> >>are we comfortable that we have adequately tested cases
> >>where the generality of this encoding is required and we
> >>have demonstrated that the trade-off is positive on balance?
> >>
> >>
> >
> >I'll try to condense previous discussions on this point:
> >
> >The advantages of the current standard for bounds representation are:
> >- It's a consistent representation for all grid types.
> >- It's general, allowing for the possibility of missing (noncontiguous?) cells, or
> >even overlapping cells.
> >
> >The main disadvantage is that there will usually be some redundancy of information.
> >
> >I don't see redundancy as a major drawback - typically the grid description is a
> >small fraction of the data volume.
> >
> >Also, I don't see any real difference between the n*m*4 and n*m*2*2 representations.
> >In the latter case collapsing the last two dimensions results in n*m*4 where the last
> >dimension is 'z-ordered', while CF stipulates the ordering to be consistently
> >clockwise or counterclockwise (BTW, I think the standard should prescribe one
> >direction only, preferably counterclockwise).
> >
> >Since the earlier discussion I've written some sample CF files, including one with
> >the 'geodesic (icosahedral) grid'. It seems reasonable to assume that we'll see
> >more such grids appear in climate models. I've also run across some ocean model grids
> >where some cells overlap others.
> >
> >One last data point: the SCRIP regridding package from LANL uses the n*m*nv
> >representation for bounds, and stipulates the ordering to be counterclockwise.
> >
> >Best regards,
> >
> >Bob
> >
> >
>
> In our NcML encoding, we have tentatively agreed upon 2 ways to specify
> coordinate bounds, the first from CF, the second the simple case of a
> contiguous grid:
>
> CoordinateAxisBoundary element is a Variable which must have the shape:
>
> 1) (dim1,dim2..., nvertices), where dim1,dim2... are from the
> CoordinateAxis and nvertices is the number of vertices of the boundary. [CF
> definition]
>
> 2) (dim1+1, dim2+1,...) where (dim1, dim2...) are from the
> CoordinateAxis [contiguous case]
Hi John,
I think the spirit of this is an excellent suggestion -- better in principle than an
optional "contiguous" attribute because the "contiguousness" becomes a self-describing
property of the (dim1+1, dim2+1,...) encoding. Experience has shown that the majority of
file outputs would be fully described by this encoding and very efficiently processed.
The awkwardness of this approach lies in the limitations of the netCDF record dimension:
there is no way to have simultaneously a record axis of length N and a bounds
variable of length N+1 on the record axis. Therefore one cannot readily append to a file
that represents axis bounds this way (typically the time axis). That (and other factors) led
to the N*2 encoding of 1D axes. The N*2 encoding allows for non-contiguous points --
sometimes useful (especially on time axes), often redundant. The n*m*2*2 case is
the obvious 2-dimensional analog of the N*2 encoding. (It could alternatively be
(n*2)*(m*2), with minor trade-offs.)
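
To make the 1D trade-off concrete, here is a minimal sketch in plain Python (no
netCDF library involved). The function names edges_to_bounds and bounds_to_edges
are hypothetical, chosen only for illustration. It expands N+1 contiguous edges
into the N*2 form, and collapses N*2 bounds back to N+1 edges only when every
pair of neighboring cells shares an edge:

```python
def edges_to_bounds(edges):
    """Expand N+1 contiguous cell edges into N (lower, upper) pairs."""
    return [(edges[i], edges[i + 1]) for i in range(len(edges) - 1)]


def bounds_to_edges(bounds):
    """Collapse N (lower, upper) pairs into N+1 edges.

    Returns None when the cells are not contiguous, i.e. when some cell's
    upper bound differs from the next cell's lower bound (gap or overlap).
    """
    if not bounds:
        return None
    for (_, upper), (lower, _) in zip(bounds, bounds[1:]):
        if upper != lower:
            return None
    return [bounds[0][0]] + [upper for _, upper in bounds]


edges = [0.0, 1.0, 2.5, 4.0]            # N+1 = 4 edges for N = 3 cells
bounds = edges_to_bounds(edges)          # N*2 form, redundant but appendable
gappy = [(0.0, 1.0), (1.5, 2.5)]         # only expressible in the N*2 form
```

The check in bounds_to_edges is the kind of scan a reader has to perform when
the file itself does not say whether the cells are contiguous.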
Note that N*M*nvertices is ambiguous with respect to both orientation (clockwise or not)
and starting "corner". The n*m*2*2 encoding has no such ambiguities -- analogous to N*2.
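
As an illustration of the orientation question, the sketch below uses the
shoelace formula to recover the winding direction of a vertex list. It assumes
planar (x, y) coordinates rather than positions on a sphere, and it assumes the
i and j index directions are right-handed relative to x and y; the helper names
signed_area and corners_to_ccw are hypothetical. With a 2x2 corner block the
position of each corner is fixed by its indices, so no such test is needed:

```python
def signed_area(vertices):
    """Shoelace formula: positive for counterclockwise vertex order,
    negative for clockwise (planar coordinates assumed)."""
    total = 0.0
    n = len(vertices)
    for k in range(n):
        x0, y0 = vertices[k]
        x1, y1 = vertices[(k + 1) % n]
        total += x0 * y1 - x1 * y0
    return total / 2.0


def corners_to_ccw(corners):
    """Turn a 2x2 corner block, indexed [j_offset][i_offset], into a
    counterclockwise ring starting at the (0, 0) corner."""
    return [corners[0][0], corners[0][1], corners[1][1], corners[1][0]]


# Unit cell: corners[j_offset][i_offset] = (x, y)
cell = [[(0.0, 0.0), (1.0, 0.0)],
        [(0.0, 1.0), (1.0, 1.0)]]
ring = corners_to_ccw(cell)
area_ccw = signed_area(ring)
area_cw = signed_area(list(reversed(ring)))
```

Even after the orientation is known, the starting corner of an nvertices list
still has to be matched between files by comparing coordinates.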
All of this is by way of a neutral analysis -- not arguing for one solution over another.
- steve
>
> I am wondering if you considered this simple and common case? Perhaps
> this is not needed because it is assumed if bounds are not specified?
>
> I assume 1) is the n*m*nv case. It seems 2) is (n+1)*(m+1). Could
> someone explain what the n*m*2*2 case is?
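
One possible reading of the n*m*2*2 case, sketched in plain Python: each cell
(j, i) carries its own 2x2 block of corners, indexed by (j offset, i offset),
in the same way that each cell of a 1D axis carries its own (lower, upper)
pair in the N*2 form. The function name corners_to_cell_bounds and the toy
grid are assumptions made for illustration, not part of any CF or NcML
definition:

```python
def corners_to_cell_bounds(corners):
    """Expand an (n+1) x (m+1) corner array into n x m x 2 x 2 cell bounds.

    bounds[j][i][dj][di] is the corner of cell (j, i) at offsets (dj, di).
    """
    n = len(corners) - 1
    m = len(corners[0]) - 1
    return [
        [
            [[corners[j + dj][i + di] for di in (0, 1)] for dj in (0, 1)]
            for i in range(m)
        ]
        for j in range(n)
    ]


n, m = 2, 3
# Toy contiguous grid: the corner at row j, column i sits at (x, y) = (i, j).
corners = [[(i, j) for i in range(m + 1)] for j in range(n + 1)]
bounds = corners_to_cell_bounds(corners)

stored_contiguous = (n + 1) * (m + 1)    # corner points in the (n+1)*(m+1) form
stored_per_cell = n * m * 2 * 2          # corner points in the n*m*2*2 form
```

For a contiguous grid the per-cell form repeats every interior corner, which is
the redundancy discussed earlier in the thread.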
--
Steve Hankin, NOAA/PMEL -- Steven.C.Hankin at noaa.gov
7600 Sand Point Way NE, Seattle, WA 98115-0070
ph. (206) 526-6080, FAX (206) 526-6744
Received on Fri Mar 14 2003 - 15:46:38 GMT