Opened 4 years ago

Last modified 3 years ago

#164 new enhancement

Simple Geometries in CF

Reported by: whiteaker Owned by: cf-conventions@…
Priority: medium Milestone:
Component: cf-conventions Version:
Keywords: geometry Cc:

Description

Dave Blodgett, others, and I have developed a contribution which augments Chapter 7 of the CF conventions with the ability to describe simple geometries such as lines and polygons. This is useful when encoding feature geometries for data variables like average precipitation over a watershed or velocity in a river segment.

We proposed this contribution on GitHub? via a pull request, in part to help test the GitHub? workflow for making changes to CF. This has been a bit of a struggle as we iron out the GitHub? kinks. In the meantime, Jonathan Gregory suggested that to move the process along, we open a trac ticket for this as a proposed change to 1.7.

GitHub? pull request and discussion: https://github.com/cf-convention/cf-conventions/pull/115

Rendered proposal (the Geometries section is at the end): https://github.com/dblodgett-usgs/cf-conventions/blob/7768e33e7edff459482e8ef8057ea6b8e015c9eb/ch07.adoc

I appreciate your consideration of this proposal. If more discussion of this proposal is required, I ask that you continue the discussion in GitHub?. Thanks!

Tim Whiteaker

Change History (24)

comment:1 Changed 4 years ago by jonathan

Thank you for your hard work on this proposal, which I support in its current form.

Jonathan

comment:2 Changed 4 years ago by davidhassell

Thanks indeed, Tim et al.

I'd like to volunteer to moderate this discussion, and shall shortly post a summary of what I think are the outstanding issues.

All the best, David

comment:3 Changed 4 years ago by davidhassell

Hello,

As far as I can tell, the only conceptual outstanding issue with this ticket is whether multi- and single-part geometry types should be combined into a single type - i.e. all geometries could be "multi", with the current "single" instances being a special case of multi. This is discussed here: https://github.com/cf-convention/cf-conventions/pull/115/files/5f8c0859b7158fd3ebe94a898bbd2837233a22a3#r122266537

I think that the arguments presented against a sole type are reduced interoperability and increased software complexity. The argument for is reduced conventions complexity (although Chris Little might want to add to this?).

Related to this single/multi issue is that the proposal currently allows different cells to have different geometry types. Is this actively desired? or required? Is there a case for insisting that all cells have the same type, for simplicity? I would have thought that we do not need to constrain the proposal in this way, but it's worth mentioning to see what others think.

A minor point: A table defining each of the geometry types would be useful - it seems that currently they are introduced with the assumption that you arlready know what they are.

All the best,

David

comment:4 Changed 3 years ago by davidhassell

Hello - forwarding a comment from Chris Little that didn't make it here:

As David mentioned me, I though it worth adding an argument for 'Multi' types - they clarify some inherent potential parallelism in the structures, and the structures, and their processing, are ones that map to the parallel hardware found in GPUs. GPUs have evolved over the last 30 years to process exactly these kinds of structures.

As a provocative aside, this strikes me like an "Assembler is faster than Fortran so we stick with Assembler" kind of debate.

Chris, running for cover.

comment:5 Changed 3 years ago by davidhassell

Hello,

It seems from comments at ​https://github.com/cf-convention/cf-conventions/pull/115/files/5f8c0859b7158fd3ebe94a898bbd2837233a22a3#r122266537] and on this ticket that all geometries being "multi" is acceptable to everyone - does anyone disagree?

Assuming this is the case, do the "single" lines need to be removed from the table that was added: https://github.com/dblodgett-usgs/cf-conventions/blob/7768e33e7edff459482e8ef8057ea6b8e015c9eb/ch07.adoc

If that is so, we ought to be close to wrapping this ticket up (subject to modifications to the conformance document).

Many thanks,

David

comment:6 Changed 3 years ago by whiteaker

In October I consolidated the items in the table to simply: point, line, and polygon. I also edited the rest of the text to reflect this consolidation. You can view the latest version of the files here: https://github.com/cf-convention/cf-conventions/pull/115/files

I created a branch for conformance updates with most of the new text here: https://github.com/twhiteaker/Conformance/blob/Geometry_Conformance/conformance.adoc#75-geometries

...plus a mention of geometry node coordinates here: https://github.com/twhiteaker/Conformance/blob/Geometry_Conformance/conformance.adoc#4-coordinate-types

Does that address all outstanding issues on this proposal?

comment:7 Changed 3 years ago by davidhassell

Hello,

It would be great if any interested parties could review Tim's latest changes on this ticket. In particular the many conformance requirements.

I have a minor suggestion:

In paragraph 2, a polygon has that it should be assumed that the first and last node are connected. Note that CF does not require the first and last node to be identical but allows them to be coincident if desired. However, in table 7.1 it says a polygon is <snip> a closed line (i.e., the first point in the line is coincident with the last point).

Also the conformance recommends (not insists) that the coordinates of the first and last node in a given ring of a polygon geometry should be coincident.

This is a bit confusing to me. I think that table 7.1 should be made consistent with paragraph 2, and the recommendation should be removed.

All the best,

David

comment:8 Changed 3 years ago by jonathan

Dear Tim

Thanks for doing such a thorough and careful job of these documents. I have a few comments, which are all small.

  • In appendix A, it would be helpful to introduce a new possible value in the Use column, to indicate geometry container variable, so that these don't have to be -.
  • It would be good to add some words in the preamble of Sect 7 pointing to the simple geometries Sect 7.5 as an alternative to bounds.
  • In the text of 7.5, "but in that case every cell must have the same number of vertices and must be single polygon rings" - this sentence is confused about singular/plural.
  • To "Lines and polygons are also nearly the same except that it should be assumed that the first and last node are connected", append "for a polygon".
  • You write, "The only legal values of geometry_type are point, line, and polygon (case sensitive)". Must it be case-sensitive? CF attributes are not usually case-sensitive. Requiring case-sensitivity means more possibility for mistakes, and it doesn't seem helpful, because you wouldn't define a convention that depended on case to distinguish possibilities - that would be a recipe for confusion!
  • In the conformance doc, you recommend, "The coordinates of the first and last node in a given ring of a polygon geometry should be coincident". Why is this recommended? It's redundant and it doesn't seem preferable to me. I think David is also suggesting omitting this.

Best wishes and thanks again

Jonathan

comment:9 Changed 3 years ago by whiteaker

Those suggestions make sense to me. I've implemented them in my forks. For the CF conventions, I've got a pull request in to Dave Blodgett's repo, which when accepted would then be ready to absorb in the official CF conventions. For conformance, I've got a geometry_conformance branch that I can create a pull request for once this ticket is closed (or would you prefer I do it now?).

Conformance: https://github.com/twhiteaker/Conformance/blob/Geometry_Conformance/conformance.adoc

Conventions (for now, until Dave accepts my pull request): https://github.com/dblodgett-usgs/cf-conventions/pull/6

comment:10 Changed 3 years ago by davidhassell

Hello,

I am wondering about how node_coordinates and coordinates can be correctly paired. For example in this case:

float geometry_container ;
    geometry_container:geometry_type = "polygon" ;
    geometry_container:node_count = "node_count" ;
    geometry_container:node_coordinates = "x y" ;
    geometry_container:coordinates = "lat lon"

how do we know which of x and y goes with lat? The answer is, of course, that the netCDF variables x, y, lat and lon must have sufficient metadata, but I'm can't see that this is specified in the text.

The proposed text already says The geometry node coordinate variables must each have an axis attribute whose allowable values are X, Y, and Z.

Can we also say that if coordinates are provided then those which correspond to node_coordinates must also have the axis attribute? In this way, there will be no ambiguity.

All the best,

David

comment:11 Changed 3 years ago by whiteaker

Hi David. What's the motivation for having y paired with lat? In other words, how might the ambiguity hurt us? node_coordinates and coordinates are two distinct representations of the feature's geometry, and they would be interpreted independently of each other. For a polygon, coordinates would represent a point while node_coordinates would represent a polygon, so you'd be pairing all the y-coordinates from a polygon with a single latitude value. Thanks, Tim

comment:12 Changed 3 years ago by davidhassell

Hi Tim,

My motivation came from considering how the simple geometries could fit into the CF data model (CFDM).

In the CFDM, coordinates and their bounds are bundled up into a single construct (the "auxiliary coordinate construct" in this case). In the usual case, matching coordinates with their bounds is unambiguous, via the bounds attribute. The node coordinates are essentially bounds (albeit complex ones) to the coordinates, so I was looking to extend the definition of coordinate bounds to allow for these geometries. For this to work, I was looking for a similarly robust way of pairing up coordinates with their node coordinates (i.e. "bounds"). We can't rely on standard names, for example, because they are optional; nor units, for example, because they could be the same for both X and Y coordinates.

I hope that I'm making sense,

David

comment:13 Changed 3 years ago by whiteaker

Hi David,

I've been trying to specify simple geometries while minimizing impact on existing CF standards such as how coordinate variables are represented; however I do recognize that in some cases (CF 1.8 Section 3.1, second paragraph) latitude and longitude units could both be degrees rather than degrees_east and degrees_north, potentially resulting in ambiguity. I'm OK with the addition to the geometry section that you mentioned. If there is no dissent in the next couple of days, I'll make the change in the pull request.

comment:14 Changed 3 years ago by whiteaker

Hi David,

I've made your requested change. Conformance doc is updated in my fork: https://github.com/twhiteaker/Conformance/blob/master/conformance.adoc

CH7 and history are updated in conventions in my fork, and should show up in pull request 115 to the conventions on GitHub? as soon as Dave Blodgett accepts my changes. You can also view my fork directly here: https://github.com/twhiteaker/cf-conventions

Does anyone have any more comments on the geometries contribution?

comment:15 Changed 3 years ago by jonathan

No more comments from me, thanks. I think this proposal is fine. Jonathan

comment:16 Changed 3 years ago by david.arctur

Glad to see Jonathan’s comment. Does anyone else have any issues with this? Can we bring it to completion / adoption?

dka

comment:17 Changed 3 years ago by jonathan

The rule is that if no more comments are made within three weeks following the last response (which was 24th Jan, comment 14 above) the ticket is accepted. Jonathan

comment:18 Changed 3 years ago by jonathan

That'll be tomorrow, I see.

comment:19 Changed 3 years ago by whiteaker

Is this now accepted? If so, I'll start updating the text in the Python reference implementation [1] to reflect the accepted conventions. I'll also submit a pull request to the conformance rules on GitHub?.

[1] https://github.com/twhiteaker/netCDF-CF-simple-geometry

comment:20 Changed 3 years ago by jonathan

Yes, the ticket is now accepted and should be included in CF 1.8. Thanks very much for your work on it! We will keep this ticket open until 1.8 is released. Jonathan

comment:21 Changed 3 years ago by jonathan

When this ticket is implemented in the CF standard document, Dave Blodgett and Tim Whiteaker should be added as additional authors of the convention. Jonathan

comment:22 Changed 3 years ago by whiteaker

There's a requirement in the proposal such that if a coordinates attribute is carried by the geometry container variable, then those coordinate variables which correspond to node coordinate variables must also have the axis attribute. I think this was originally put in place because David Hassell requested it for conformance or data model reasons. Since then, David has requested to me that instead of requiring the axis attribute on the coordinate variables, we instead include a bounds attribute pointing to the node coordinate variables. The node coordinates serve as the bounds for coordinate variables. For example, if features are represented as polygons with node coordinates x and y, and the centroids of those polygons with coordinates lon and lat, then the lon coordinate variable would have a bounds attribute naming the x node coordinate variable.

Notwithstanding any objections, I'll update the Pull Request to the conventions in GitHub? to reflect this change. I will also update the conformance pull request to reflect this and some other non-breaking changes requested by David.

comment:23 Changed 3 years ago by whiteaker

Though the three week rule should have applied after the 24th Jan comment, the pull request was never merged in GitHub?. Since then, we've addressed typos and other cleaning up of the revision, and pull request 115 has been accepted by three reviewers. So here again we potentially start the 3-week process, and hopefully by the end of that time, we will have worked out how to go about merging changes in GitHub? (see separate issue https://github.com/cf-convention/cf-conventions/issues/130).

comment:24 Changed 3 years ago by davidhassell

Sounds good to me. As moderator of github 130 I'll try and keep that discussion going...

Note: See TracTickets for help on using tickets.