[CF-metadata] Feedback requested on proposed CF Simple Geometries from Jonathan Gregory on 2016-10-26 (Archive of CF discussions from 2002 to 2019 on the cf-metadata mailing list)

From: Jonathan Gregory <j.m.gregory>
Date: Wed, 26 Oct 2016 14:33:37 +0100

Dear Ben and Bert

Thanks for your emails, which help me to understand the simple geometry
proposals better. Just to be clear, I'd like to repeat my first question.

> You explain that the need is to specify spatial coordinates with a simple
> geometry for a timeSeries variable. For example, this could be for the
> discharge as a function of time across some line in a river (your example),
> or I suppose it could be an average temperature as a function of time for
> the Atlantic Ocean, where you wanted to supply the polygon which drew the
> outline of the basin. Have I got the idea?

to which you replied

> Yes, you have this mostly right. It's common to have a collection of points
> (weather stations), lines (stream reaches), or polygons (hydrologic
> catchments) with an associated time series

I was asking whether this means that for each *collection* (of points, lines or
polygons) there is a *single* timeseries. For instance, in your example of a
single geometry composed of several polygons, there is a single number for each
time. But that is not the case for weather stations; for each weather station
there is a timeseries, and at each time there is a different number (value of
temperature, precipitation or whatever) for each weather station. You also
write, "The US National Weather Service's National Water Model (NWM) ...
forecasts streamflow rates in about 2.7 million stream segments averaging 2km."
The stream network is a MultiLineString geometry, but I don't think there is
just one value of streamflow applying to the entire network at any given time;
I guess there is a different timeseries for each stream segment. But in my
example above, the Atlantic Ocean is a single polygon with a single timeseries
for its average temperature, not a different timeseries for each node. Thus I
am unclear about the dimensions of the data. In terms of your original example,
does the data have dimensions (time,geometry, where geometry=1) or (time,node)?

This seems to me to be a crucial difference. In the former case the simple
geometry can be regarded as a more complex alternative to cell bounds - the
cell has a complicated geometry of nodes and lines, but it's still a single
cell. In the latter case you're providing many timeseries in an unstructured
geometry, which is what ugrid describes. Which do you have in mind?

Nonetheless, in both cases the geometries have to be described. I think the
difference is how we attach this description to the data or coordinates, rather
than how the description is constructed.

You propose the index variable in order for the convention to be like ugrid.
However, this still seems to me to be an unnecessary complexity and use of space
if you aren't going to have many shared nodes. I think the case for having
another convention, distinct from ugrid, is stronger if it is *unlike* ugrid
in this respect, and therefore simpler as well.
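
To make the space argument concrete, here is a minimal Python sketch that
counts how many numbers each layout would store. It is only an illustration:
the function name storage_counts and both test shapes are hypothetical, it
counts values rather than bytes, and it ignores any extra count or
connectivity variables that a real convention would need.

```python
def storage_counts(polygons):
    """Count stored numbers for two hypothetical layouts of 2-D polygons.

    direct:  x and y are stored for every node of every polygon.
    indexed: x and y are stored once per unique node, plus one integer
             index per node reference (the ugrid-like approach).
    """
    node_refs = sum(len(poly) for poly in polygons)
    unique_nodes = len({pt for poly in polygons for pt in poly})
    direct = 2 * node_refs
    indexed = 2 * unique_nodes + node_refs
    return direct, indexed


# Two triangles with no nodes in common.
separate = [[(0, 0), (1, 0), (0, 1)], [(5, 5), (6, 5), (5, 6)]]

# A 3 x 3 mesh of unit squares, where interior nodes are shared by four cells.
mesh = [[(i, j), (i + 1, j), (i + 1, j + 1), (i, j + 1)]
        for i in range(3) for j in range(3)]

print(storage_counts(separate))  # (12, 18)
print(storage_counts(mesh))      # (72, 68)
```

With no shared nodes the index only adds to the total; under these
assumptions it starts to pay off only when nodes are heavily shared, as in
a mesh.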

I agree that repeating the inside/outside flag many times is wasteful. That,
coupled with your clarification that you may have several geometries, each
consisting of several elements (points, lines, polygons), means that you need,
in effect, a ragged array of ragged arrays (geometry,element,node). This is
more complicated than DSGs, but it seems to me it would be reasonably easy to
understand if your multi-geometry example
https://github.com/bekozi/netCDF-CF-simple-geometry/wiki/VLEN-Arrays-in-NetCDF-3#multipolygon-example
was stored something like this:

  geom=3;
  part=11;
  node=36;
  int number_of_parts(geom);
    number_of_parts:parts="number_of_nodes";
  int number_of_nodes(part);
    number_of_nodes:inout="inout";
  char inout(part);
  float x(node);
  float y(node);
  number_of_parts=6, 3, 2;
  number_of_nodes=4, 3, 3, 3, 3, 3, 3, 5, 3, 3, 3;
  inout="OIIIOOOOIOO";
  x=0, 20, 20, 0, 1, 10, 19, 5, 7, 9, 11, 13, 15, 5, 9, 7, 11, 15, 13, -40,
  -20, -45, -20, -10, -10, -30, -45, -30, -20, -20, 30, 45, 10, 25, 50, 30;
  y = 0, 0, 20, 20, 1, 5, 1, 15, 19, 15, 15, 19, 15, 25, 25, 29, 25, 25, 29,
  -40, -45, -30, -35, -30, -10, -5, -20, -20, -15, -25, 20, 40, 40, 5, 10, 15;

where I assume that all polygons are closed.
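
As an illustration of how a reader might unpack this layout, here is a
minimal Python sketch that rebuilds the geometry -> part -> node nesting from
the flat arrays. The helper names split_by_counts and decode_geometries are
hypothetical and not part of any proposal or library. The sketch assumes that
inout carries exactly one flag per part; the 11-character value used below is
inferred from the coordinates and should be treated as an assumption.

```python
from itertools import accumulate


def split_by_counts(items, counts):
    """Split a flat sequence into consecutive chunks of the given sizes."""
    if sum(counts) != len(items):
        raise ValueError("counts do not add up to the number of items")
    ends = list(accumulate(counts))
    starts = [0] + ends[:-1]
    return [items[s:e] for s, e in zip(starts, ends)]


def decode_geometries(number_of_parts, number_of_nodes, inout, x, y):
    """Return a list of geometries, each a list of (flag, [(x, y), ...])."""
    if len(inout) != len(number_of_nodes):
        raise ValueError("inout needs one flag per part")
    nodes = list(zip(x, y))
    # First level of raggedness: nodes grouped into parts.
    parts = list(zip(inout, split_by_counts(nodes, number_of_nodes)))
    # Second level of raggedness: parts grouped into geometries.
    return split_by_counts(parts, number_of_parts)


number_of_parts = [6, 3, 2]
number_of_nodes = [4, 3, 3, 3, 3, 3, 3, 5, 3, 3, 3]
inout = "OIIIOOOOIOO"  # assumed: one flag per part, inferred from the data
x = [0, 20, 20, 0, 1, 10, 19, 5, 7, 9, 11, 13, 15, 5, 9, 7, 11, 15, 13, -40,
     -20, -45, -20, -10, -10, -30, -45, -30, -20, -20, 30, 45, 10, 25, 50, 30]
y = [0, 0, 20, 20, 1, 5, 1, 15, 19, 15, 15, 19, 15, 25, 25, 29, 25, 25, 29,
     -40, -45, -30, -35, -30, -10, -5, -20, -20, -15, -25, 20, 40, 40, 5, 10,
     15]

geometries = decode_geometries(number_of_parts, number_of_nodes, inout, x, y)
for g, geometry in enumerate(geometries):
    for flag, ring in geometry:
        print(g, flag, ring)
```

Each ring is listed without repeating its first node, in line with the
assumption that all polygons are closed.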

What do you think?

Best wishes

Jonathan
Received on Wed Oct 26 2016 - 07:33:37 BST
