[CF-metadata] Feedback requested on proposed CF Simple Geometries from David Blodgett on 2016-09-26 (Archive of CF discussions from 2002 to 2019 on the cf-metadata mailing list)

From: David Blodgett <dblodgett>
Date: Mon, 26 Sep 2016 10:37:55 -0500

Dear Jonathan, Chris, Bert and others,

Thanks for the responses. This is all very thoughtful and good discussion.

Ben, Tim and I are preparing a more thorough response than any one of us could prepare on our own and will share it with the list soon.

Regards,

- Dave Blodgett

> On Sep 26, 2016, at 6:44 AM, Bert Jagers <Bert.Jagers at deltares.nl> wrote:
>
> Dear Jonathan, Chris, and everybody,
>
> Concerning the reference to UGRID:
>
> Yes, there are similarities between UGRID and the Simple Geometries, but there are differences as well.
> We discussed merging or diverging these two proposals before Ben sent his mail to this group, but it's good to hear the opinion of the broader community on this topic.
>
> 1) As Chris indicates, UGRID is about defining a mesh or grid, i.e. a bunch of polygons that all share vertices. These polygons tend to be of similar order (usually 3 or 4 nodes per polygon, but 5 or 6 or even more might also occur). Since the variation in the number of nodes is usually limited, we have so far accepted that the full matrix is stored, with missing values for entries that aren't needed for defining a polygon with fewer than the maximum number of nodes. I know that there are numerical models that use mostly triangles but accept complex elements with over 1000 nodes, and for such cases the face_node_connectivity matrix becomes awfully sparse.
>
> 2) UGRID uses an indirection layer because nodes are shared and the fact that nodes are shared is an important element of the mesh topology/connectivity. So, UGRID defines primarily node coordinates and defines edges (i.e. lines) and faces (i.e. polygons) by referring to the node indices. The current Simple Geometries proposal inherits the indirection layer from UGRID, but it is not immediately clear that this indirection is useful. If we think of an arbitrary set of polygons, then nodes will rarely be shared; however, in the case of a coverage of the continent by complex hydrological catchments, one may expect that almost all interior nodes will be shared by two or more catchment polygons.
>
> 3) If the polygons form the basis of a hydrological model and connectivity between these complex faces is important, then users may prefer to store the polygons using UGRID, and then UGRID may have to support faces with holes and multi-part faces as well. It's a useful extension which we have considered for the results of our water quality model, which tends to run on an aggregated version of the hydrodynamic mesh. In this case the hydrodynamic mesh is composed of simple polygons (typically 3-6 nodes), while the water quality data is defined on complex aggregates thereof (20 node polygons), possibly with holes; multiple parts are rare (although the shared edges between such complex polygons may be composed of multiple (disconnected) edges). By keeping Simple Geometries aligned with UGRID one could make the full list of features available to numerical modellers. Since the indirection is an important part of UGRID, this would for consistency be included in Simple Geometries as well, although it adds little value and looks like overhead as indicated by Jonathan.
>
> 4) The other alternative, as given by Jonathan, would be to link this to the bounds definition. I would like to remark that the current Cell Boundaries section doesn't say anything about storing cells with a varying number of nodes. In UGRID we have assumed that the missing value can be used to fill up the arrays of small polygons, but this isn't listed in the CF conventions as an option. Anyway, this option looks more natural if the sharing of nodes is rare or not important. In this case one loses the option to mark the outer and inner boundaries using flags, but there are alternative methods as already proposed (in that case we would need to check how the inside/outside flag variable is referenced from the bounds). However, this would not solve the need for such features in UGRID, and by diverging away from UGRID's indirection the solution couldn't be reused inside UGRID as needed.
>
> 5) Besides inventing our own storage format (either in line with UGRID or CF), a third way was discussed, namely to store the simple geometry shapes as ascii or binary blobs in an extended format NetCDF 4 file. Since there are good starting points within UGRID and CF for storing polygons, we haven't really considered this third option yet, since it would be less easily readable without using GIS libraries like GDAL. However, the main strength of this approach would be that other standard simple features such as circles, donuts, and circular arcs would be automatically covered consistently by this method.
>
> 6) Actually, I have a related use case for storing a network of polylines (rather than straight edges) in UGRID compatible format for 1D hydraulic models. In this case I need to store polylines representing river branches: their connectivity at bifurcations and confluences is important, but so is their overall length - river chainage - and hence I can't split them into the base edges. This network defines basically the 1D coordinate system to be used by the actual 1D (UGRID) simulation mesh which will be defined on top of this channel network. Because of the link to 1D numerical modelling, this will be discussed in a separate thread in the UGRID community first.
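
Points 1) and 2) above can be made concrete with a minimal sketch, which is not taken from the UGRID specification. The coordinates, the fill value of -1 and the helper names `face_polygon` and `padding_fraction` are assumptions chosen for illustration; only the variable name face_node_connectivity comes from the message. It shows faces referring to shared nodes by index, and how much of a padded connectivity matrix is fill when one face is much larger than the rest.

```python
FILL = -1  # assumed fill value marking unused slots in a padded row

# Node coordinates are stored once; the two faces below share nodes 1 and 2.
node_x = [0.0, 1.0, 1.0, 0.0, 2.0]
node_y = [0.0, 0.0, 1.0, 1.0, 0.5]

# One row per face, padded to the largest face (a quad here); zero-based indices.
face_node_connectivity = [
    [0, 1, 2, 3],     # quadrilateral
    [1, 4, 2, FILL],  # triangle, last slot unused
]


def face_polygon(face):
    """Resolve one face to (x, y) vertices through the node-index indirection."""
    return [(node_x[n], node_y[n])
            for n in face_node_connectivity[face] if n != FILL]


def padding_fraction(nodes_per_face):
    """Fraction of a padded connectivity matrix that holds only fill values."""
    total = len(nodes_per_face) * max(nodes_per_face)
    return 1.0 - sum(nodes_per_face) / total


# Nodes referenced by both faces: this is the topology the indirection preserves.
shared_nodes = (set(face_node_connectivity[0])
                & set(face_node_connectivity[1])) - {FILL}

# Hypothetical mesh: 999 triangles plus one 1000-node element.
sparse_case = padding_fraction([3] * 999 + [1000])
```

In the hypothetical mesh of 999 triangles and one 1000-node element, the padded matrix has a million entries of which fewer than 4000 are used, which is the sparsity concern raised in point 1).
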
>
> Best regards,
>
> Bert
>
> -----Original Message-----
> From: CF-metadata [mailto:cf-metadata-bounces at cgd.ucar.edu] On Behalf Of Jonathan Gregory
> Sent: 22 September 2016 18:26
> To: cf-metadata at cgd.ucar.edu
> Subject: Re: [CF-metadata] Feedback requested on proposed CF Simple Geometries
>
> Dear Chris
>
>>> If the regions were irregular
>>> polygons in latitude and longitude, nv would be the number of
>>> vertices and the lat and lon bounds would trace the outline of the
>>> polygon e.g. nv=3,
>>> lat=0,90,0
>>> and lon=0,0,90 describes the eighth of the sphere which is bounded
>>> by the meridians at 0E and 90E and the Equator. I think, therefore,
>>> we do not need an additional convention for points or polygonal
>>> regions.
>>
>> this seems fine for this simple example, but burying a bunch of
>> coordinates of a complex polygon in a text string in an attribute is
>> really not a good idea -- the coordinates of a polygon should be in
>> the array data one way or another, rather than having to parse out attribute strings.
>
> To avoid confusion:
>
> I didn't suggest parsing attribute strings. The same numbers that Ben would put in his x and y auxiliary coordinate variables for a single polygon can appear in coordinate bounds variables according to the existing convention.
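
As a rough illustration of the point above, the octant example (nv=3, lat=0,90,0 and lon=0,0,90) can be held as plain numeric arrays with one row per cell. The names `lat_bnds`, `lon_bnds` and `cell_outline` are assumptions for this sketch, not names required by the convention.

```python
# Bounds for a single cell with nv = 3 vertices, stored as numeric arrays
# of shape (cell, nv) rather than as text in an attribute.
lat_bnds = [[0.0, 90.0, 0.0]]
lon_bnds = [[0.0, 0.0, 90.0]]


def cell_outline(cell):
    """Pair up the bounds of one cell into (lon, lat) vertices, in stored order."""
    return list(zip(lon_bnds[cell], lat_bnds[cell]))


outline = cell_outline(0)
```

The vertex order is simply the order in which the bounds are stored, so the outline of the polygon is traced without any string parsing.
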
>
>> * I suspect that geometries of this kind can be described by the ugrid
>>> convention http://ugrid-conventions.github.io/ugrid-conventions,
>>> which is compliant with CF. Their purpose is to describe a set of
>>> connected points, edges or faces at which values are given,
>>
>> I'm not so sure -- UGRID is about defining a bunch of polygons that
>> all share vertices, and are all of the same order (usually all
>> triangles, or quads, or maybe hexes). if they are a mixture, you still
>> store the full set (say, six vertices), while marking some as unused.
>> But it's not that well set up for a bunch of polygons of different order.
>>
>> Not too bad if there are only one or two complex polygons, but it
>> would be a bit weird -- you'd have vertices and boundaries, but no
>> faces. And you'd lose the order of the vertices (though that could
>> probably be added to the UGRID standard)
>
> OK. I didn't investigate this, but it would be good to know about it. If ugrid can do something like this, but not all of it, maybe ugrid could be extended. If ugrid seems too complicated for these cases, maybe a "light" version of ugrid could be proposed for them. I think we should avoid having two partially overlapping conventions.
>
>> * So far CF does not say anything about the use of netCDF-4 features (i.e. not
>>> the classic model). We have often discussed allowing them but the
>>> general argument is also made that there has to be a compelling case
>>> for providing a new way to do something which can already be done.
>>> (Steve Hankin often made this argument, but since he's mostly
>>> retired I'll make it now in his name
>>> :-)
>>>
>>
>> Maybe it's time to embrace netcdf4? It's been a while! Though maybe
>> for CF 2.* -- any movement on that?
>
> I think, as we generally do, that we should adopt netCDF-4 features if there is a definite need to do so. I mean something you can't do with an existing mechanism, or which is done so much more easily with a new mechanism that it justifies the extra effort of requiring alternatives to be programmed in software. I'm not arguing against it in general, but I think it has to be argued for each specific need within the convention.
>
> CF2 is not well-defined. I have to admit to being nervous about that. I am very much opposed to an idea of "starting all over again" and maintaining two conventions in parallel (since old data would continue to exist for a long time and so the old CF would have to be supported), and I also think backwards-incompatibility has to be strongly justified. So I favour step-by-step evolution.
> Another idea we've discussed, which I'm comfortable with, is of defining "strict" compliance to the convention, which a data-writer could optionally adhere to. This could exclude older features we wanted to deprecate. However this is really not the subject of the discussion - it's another thread.
>
>> I think the ragged array option is fine -- though I haven't looked at
>> vlen arrays enough to know if they offer a compelling alternative. One
>> issue is that the programming environments that we use to work with
>> the data may not have an equivalent of vlen arrays.
>
> That's a good point, and a reason why we have to be cautious in general about adopting netCDF-4 features.
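
For comparison with vlen arrays, here is a minimal sketch of the ragged array option using only classic-model structures: flat coordinate arrays plus a count of nodes per polygon. The names `node_count` and `split_ragged` and the coordinates are assumptions for illustration.

```python
from itertools import accumulate

# All polygon vertices concatenated into flat arrays (no padding needed).
x = [0.0, 1.0, 0.0, 2.0, 3.0, 3.0, 2.0, 1.5]
y = [0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 1.0, 0.5]

# Number of vertices belonging to each polygon, in storage order.
node_count = [3, 5]


def split_ragged(values, counts):
    """Cut a flat array into consecutive pieces whose lengths are given by counts."""
    ends = list(accumulate(counts))
    starts = [0] + ends[:-1]
    return [values[s:e] for s, e in zip(starts, ends)]


# Rebuild each polygon as a list of (x, y) vertices.
polygons = [
    list(zip(px, py))
    for px, py in zip(split_ragged(x, node_count), split_ragged(y, node_count))
]
```

Reading this layout needs only ordinary arrays and a running sum of the counts, so it should work in programming environments that have no equivalent of vlen arrays.
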
>
> Best wishes
>
> Jonathan
> _______________________________________________
> CF-metadata mailing list
> CF-metadata at cgd.ucar.edu
> http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
>
Received on Mon Sep 26 2016 - 09:37:55 BST