Hello Ben
I think this is fascinating and fantastic work which is likely to prove very useful for a range of domains.
I am afraid that, just now, I don't have any specific insights with regard to:
> Questions for the CF Community
>
> 1. Are our VLEN netCDF-3 and netCDF-4 approaches acceptable? What changes would you recommend?
> 2. Are the geometry types point, line, polygon, and their multipart equivalents sufficient for the community?
but I do think these are really valuable areas to get feedback on.
all the best
mark
From: CF-metadata [cf-metadata-bounces at cgd.ucar.edu] on behalf of Ben Koziol - NOAA Affiliate [ben.koziol at noaa.gov]
Sent: 07 September 2016 19:13
To: CF metadata
Cc: Bob Simons - NOAA Federal; Whiteaker, Timothy L
Subject: [CF-metadata] Feedback requested on proposed CF Simple Geometries
Greetings,
As part of an EarthCube project for advancing netCDF-CF [1], we are developing an approach to represent simple geometries in enhanced netCDF-4 with a variable length array backport for netCDF-3. Simple geometries, for example, may be used to associate stream discharge with river lines or surface runoff with watershed polygons. We've drafted an initial approach and reference implementation on the GitHub netCDF-CF-simple-geometry project [2] and would greatly appreciate feedback from the CF community. We'd like to make sure our scope is appropriate and our approach is acceptable.
Scope
The result of this effort will be a standard that the CF timeSeries feature type could use to specify spatial coordinates (define a simple geometry) for a timeSeries variable.
For those familiar with the OGC WKT standard geometry types [3], we will include Point, LineString, Polygon, MultiPoint, MultiLineString, and MultiPolygon (WKT primitives and multipart geometries).
We anticipate that the six chosen geometry types will cover the needs of most people generating netCDF data. These types also align with other geospatial data formats such as GeoJSON and ESRI Shapefile. If our approach is well received by the CF community, we may later adapt it to include parametric shapes such as circles and ellipses.
Simple Geometry Encoding Method
Because different features will require different numbers of coordinates to describe their geometries, our approach uses variable length (VLEN) arrays in enhanced netCDF-4 and contiguous ragged arrays (CRAs) in netCDF-3. We describe the VLEN netCDF-4 approach first. The netCDF-3 CRA description follows.
In our approach, a VLEN coordinate_index variable identifies the indices of geometry coordinates in separate coordinate arrays. The coordinate_index variable includes a coordinates attribute, which stores the names of the coordinate variables, and a geom_type attribute, which indicates the geometry type.
For multipart geometries, the coordinate index variable may include negative integer flags, each marking the start of a new geometry "part" for the current feature. The first geometry part is not preceded by the flag. The variable shall include an attribute named multipart_break_value identifying the flag's value.
For polygon geometries with holes (also called "interiors"), the coordinate index values shall include a negative integer flagging the start of each hole. In this case, the variable shall include a hole_break_value attribute to indicate the flag value.
Other attributes on the coordinate index variable describe clockwise or anticlockwise node order for polygons and polygon closure convention. For additional details, see the wiki [4]. With these concepts defined, an example for multipolygons with holes is shown below. You can copy the WKT description below into Wicket [5] if you'd like to see what the geometry in this example looks like.
Well-Known Text (WKT):
MULTIPOLYGON(((0 0, 20 0, 20 20, 0 20, 0 0), (1 1, 10 5, 19 1, 1 1), (5 15, 7 19, 9 15, 5 15),
(11 15, 13 19, 15 15, 11 15)), ((5 25, 9 25, 7 29, 5 25)), ((11 25, 15 25, 13 29, 11 25)))
Common Data Language (CDL) for netCDF-4 VLEN Arrays:
netcdf multipolygon_example {
types:
  int64(*) geom_VLType ;
dimensions:
  node = 25 ;
  geom = 1 ;
variables:
  geom_VLType coordinate_index(geom) ;
    string coordinate_index:geom_type = "multipolygon" ;
    string coordinate_index:coordinates = "x y" ;
    coordinate_index:multipart_break_value = -1 ;
    coordinate_index:hole_break_value = -2 ;
    string coordinate_index:outer_ring_order = "anticlockwise" ;
    string coordinate_index:closure_convention = "last_node_equals_first" ;
  double x(node) ;
  double y(node) ;
data:
  coordinate_index = {0, 1, 2, 3, 4, -2, 5, 6, 7, 8, -2, 9, 10, 11, 12, -2, 13, 14, 15, 16, -1, 17, 18, 19, 20, -1, 21, 22, 23, 24} ;
  x = 0, 20, 20, 0, 0, 1, 10, 19, 1, 5, 7, 9, 5, 11, 13, 15, 11, 5, 9, 7, 5, 11, 15, 13, 11 ;
  y = 0, 0, 20, 20, 0, 1, 5, 1, 1, 15, 19, 15, 15, 15, 19, 15, 15, 25, 25, 29, 25, 25, 25, 29, 25 ;
}
You'll find additional examples for VLEN geometry storage on our wiki [6].
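To make the break-value rules concrete, here is a minimal Python sketch of how a reader might decode one feature's coordinate_index values into polygons and rings. The function name decode_feature and the nested-list return layout are hypothetical choices for illustration and are not taken from the reference implementation. The arrays are copied from the CDL example above.

```python
# Hypothetical helper for illustration; not part of the reference implementation.
def decode_feature(index, x, y, multipart_break=-1, hole_break=-2):
    """Split one feature's coordinate_index values into polygons.

    Returns a list of polygons. Each polygon is a list of rings with the
    exterior ring first and any holes after it. Each ring is a list of
    (x, y) tuples.
    """
    polygons = [[[]]]  # start with one polygon holding one empty exterior ring
    for value in index:
        if value == multipart_break:
            polygons.append([[]])      # new part, starting with its exterior ring
        elif value == hole_break:
            polygons[-1].append([])    # new hole in the current part
        else:
            polygons[-1][-1].append((x[value], y[value]))
    return polygons


# Data copied from the CDL example above.
coordinate_index = [0, 1, 2, 3, 4, -2, 5, 6, 7, 8, -2, 9, 10, 11, 12, -2,
                    13, 14, 15, 16, -1, 17, 18, 19, 20, -1, 21, 22, 23, 24]
x = [0, 20, 20, 0, 0, 1, 10, 19, 1, 5, 7, 9, 5, 11, 13, 15, 11,
     5, 9, 7, 5, 11, 15, 13, 11]
y = [0, 0, 20, 20, 0, 1, 5, 1, 1, 15, 19, 15, 15, 15, 19, 15, 15,
     25, 25, 29, 25, 25, 25, 29, 25]

polygons = decode_feature(coordinate_index, x, y)
for number, rings in enumerate(polygons):
    print(f"polygon {number}: {len(rings) - 1} hole(s), exterior {rings[0]}")
```

The sketch relies on the rule that the first part is not preceded by a flag, which is why the result list starts with one empty polygon already in place.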
Variable Length (VLEN) Arrays in NetCDF-3
To support netCDF-3, we created a VLEN-style approach [7]. Inspired by CF contiguous ragged arrays (CRAs), our approach drops the CRA count variable in favor of a stop variable that stores the stop index for each geometry within an array of geometry coordinates. This improves random access to the CRA "elements" by avoiding the need to sum the counts preceding the target element index. The stop variable includes a contiguous_ragged_dimension attribute whose value is the name of the dimension to which the stop indices apply (similar to the CRA sample_dimension attribute). An example showing how strings can be stored with this approach is shown below.
Common Data Language (CDL) for netCDF-3 CRAs:
netcdf dwarf_planets {
dimensions:
  dwarf_planet = 5 ; // number of dwarf planets described in this file
  dwarf_planet_chars = 28 ; // total number of characters for all planet names
variables:
  char dwarf_planet_name(dwarf_planet_chars) ;
  int dwarf_planet_name_stop(dwarf_planet) ;
    dwarf_planet_name_stop:contiguous_ragged_dimension = "dwarf_planet_chars" ;
data:
  dwarf_planet_name = "PlutoCeresErisHaumeaMakemake" ;
  dwarf_planet_name_stop = 5, 10, 14, 20, 28 ;
}
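The random-access benefit of stop indices can be sketched in a few lines of Python. This assumes, based on the values in the example above, that each stop is a zero-based exclusive end index, so that element i spans stop[i - 1] up to but not including stop[i]. The helper names element and stops_from_counts are hypothetical.

```python
from itertools import accumulate

names = "PlutoCeresErisHaumeaMakemake"   # dwarf_planet_name
stops = [5, 10, 14, 20, 28]              # dwarf_planet_name_stop


def element(data, stops, i):
    """Return ragged element i by reading only stops[i - 1] and stops[i]."""
    start = stops[i - 1] if i > 0 else 0
    return data[start:stops[i]]


def stops_from_counts(counts):
    """Convert CRA-style counts into stop indices (a running total)."""
    return list(accumulate(counts))


planets = [element(names, stops, i) for i in range(len(stops))]
print(planets)
```

With a count variable, locating element i means summing the i counts that precede it. With stops, the same lookup reads at most two values, and an existing count variable can be converted once with a running total.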
For the above geometry example, the VLEN coordinate_index netCDF-4 array is replaced by a netCDF-3 CRA.
netcdf multipolygon_example {
dimensions:
  node = 25 ;
  indices = 30 ;
  geom = 1 ;
variables:
  int coordinate_index(indices) ;
    coordinate_index:geom_type = "multipolygon" ;
    coordinate_index:coordinates = "x y" ;
    coordinate_index:multipart_break_value = -1 ;
    coordinate_index:hole_break_value = -2 ;
    coordinate_index:outer_ring_order = "anticlockwise" ;
    coordinate_index:closure_convention = "last_node_equals_first" ;
  int coordinate_index_stop(geom) ;
    coordinate_index_stop:contiguous_ragged_dimension = "indices" ;
  double x(node) ;
  double y(node) ;
data:
  coordinate_index = 0, 1, 2, 3, 4, -2, 5, 6, 7, 8, -2, 9, 10, 11, 12, -2, 13, 14, 15, 16, -1, 17, 18, 19, 20, -1, 21, 22, 23, 24 ;
  coordinate_index_stop = 30 ;
  x = 0, 20, 20, 0, 0, 1, 10, 19, 1, 5, 7, 9, 5, 11, 13, 15, 11, 5, 9, 7, 5, 11, 15, 13, 11 ;
  y = 0, 0, 20, 20, 0, 1, 5, 1, 1, 15, 19, 15, 15, 15, 19, 15, 15, 25, 25, 29, 25, 25, 25, 29, 25 ;
}
The CRA method could of course be used in place of VLEN in netCDF-4. See our wiki page on GitHub [7] for more details and examples.
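Going the other way, a writer needs to flatten geometries into the x, y, coordinate_index, and coordinate_index_stop arrays. The following Python sketch shows one way this might be done for polygon features. The function name encode_features and the nested-list input layout are assumptions made for illustration, and the sketch writes every node once in order without attempting to share repeated nodes.

```python
# Hypothetical helper for illustration; not part of the reference implementation.
def encode_features(features, multipart_break=-1, hole_break=-2):
    """Flatten polygon features into x, y, index and stop lists.

    features: list of features. Each feature is a list of polygons, each
    polygon a list of rings (exterior first), each ring a list of (x, y).
    """
    x, y, index, stop = [], [], [], []
    for polygons in features:
        for part_number, rings in enumerate(polygons):
            if part_number > 0:
                index.append(multipart_break)   # first part gets no flag
            for ring_number, ring in enumerate(rings):
                if ring_number > 0:
                    index.append(hole_break)    # every ring after the exterior is a hole
                for px, py in ring:
                    index.append(len(x))        # position of the node about to be stored
                    x.append(px)
                    y.append(py)
        stop.append(len(index))                 # exclusive end of this feature's indices
    return x, y, index, stop


# The multipolygon from the WKT example, as one feature.
feature = [
    [[(0, 0), (20, 0), (20, 20), (0, 20), (0, 0)],
     [(1, 1), (10, 5), (19, 1), (1, 1)],
     [(5, 15), (7, 19), (9, 15), (5, 15)],
     [(11, 15), (13, 19), (15, 15), (11, 15)]],
    [[(5, 25), (9, 25), (7, 29), (5, 25)]],
    [[(11, 25), (15, 25), (13, 29), (11, 25)]],
]

x, y, index, stop = encode_features([feature])
print(len(x), len(index), stop)
```

For this input the lengths correspond to the node and indices dimensions in the CDL, and the single stop value corresponds to coordinate_index_stop.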
Questions for the CF Community
1. Are our VLEN netCDF-3 and netCDF-4 approaches acceptable? What changes would you recommend?
2. Are the geometry types point, line, polygon, and their multipart equivalents sufficient for the community?
Thank you very much for considering our ideas and helping us with your valuable feedback!
[1] http://earthcube.org/group/advancing-netcdf-cf
[2] https://github.com/bekozi/netCDF-CF-simple-geometry
[3] https://en.wikipedia.org/wiki/Well-known_text
[4] https://github.com/bekozi/netCDF-CF-simple-geometry/wiki
[5] https://arthur-e.github.io/Wicket/sandbox-gmaps3.html
[6] https://github.com/bekozi/netCDF-CF-simple-geometry/wiki/Examples---VLen-Ragged-Arrays
[7] https://github.com/bekozi/netCDF-CF-simple-geometry/wiki/VLEN-Arrays-in-NetCDF-3
--
Ben Koziol
NESII/CIRES/NOAA Earth System Research Laboratory
ben.koziol at noaa.gov
802.392.4522
http://www.esrl.noaa.gov/nesii/
Received on Mon Sep 26 2016 - 04:03:36 BST