Rather than just be a naysayer, let me suggest a very different alternative:

There are several projects in the CF realm (e.g., this Simple Features project, Discrete Sampling Geometry (DSG), true variable-length Strings, ugrid(?)) which share a common underlying problem: how to deal with variable-length multidimensional arrays: a[b][c], where the length of the c dimension may be different for different b indices. DSG solved this (5 different ways!), but only for DSG. The Simple Features proposal seeks to solve the problem for Simple Features. We still have no support for Unicode variable-length Strings.

Instead of continuing to solve the variable-length problem a different way every time we confront it, shouldn't we solve it once, with one small addition to the standard, and then use that solution repeatedly?

The solution could be a simple variant of one of the DSG solutions, but generalized so that it could be used in different situations. An encoding standard and built-in support for variable-length data arrays in netcdf-java/c would solve a lot of problems, now and in the future. Some work on this is already done: I think the netcdf-java API already supports variable-length arrays when reading netcdf-4 files.

For Simple Features, the problem would reduce to: store the feature (using some specified existing standard like WKT or WKB) in a variable-length array.

On Fri, Feb 3, 2017 at 9:07 AM, <cf-metadata-request at cgd.ucar.edu> wrote:

> Date: Fri, 3 Feb 2017 11:07:00 -0600
> From: David Blodgett <dblodgett at usgs.gov>
> To: Bob Simons - NOAA Federal <bob.simons at noaa.gov>
> Cc: CF Metadata <cf-metadata at cgd.ucar.edu>
> Subject: Re: [CF-metadata] Extension of Discrete Sampling Geometries for Simple Features
> Message-ID: <8EE85E65-2815-4720-90FC-13C72D3C7952 at usgs.gov>
> Content-Type: text/plain; charset="utf-8"
>
> Dear Bob,
>
> I'll just take these in line.
>
> 1) noted. We have been trying to figure out what to do with the point featureType and I think leaving it more or less alone is a viable path forward.
>
> 2) This is not an exact replica of WKT, but rather a similar approach to WKT. As I stated, we have followed the ISO simple features data model and well known text feature types in concept, but have not used the same standardization formalisms. We aren't advocating for supporting "all of" any standard but are rather attempting to support the use cases that have a compelling need today while aligning this with as many other encoding standards in this space as is practical. Hopefully that answers your question, sorry if it's vague.
>
> 3) The google doc linked in my response contains the encoding we are proposing as a starting point for conversation: <http://goo.gl/Kq9ASq> I want to stress, as a starting point for discussion. I expect that this proposal will change drastically before we're done.
>
> 4) Absolutely envision tools doing what you say, convert to/from standard spatial formats and NetCDF-CF geometries. We intend to introduce an R and a Python implementation that does exactly as you say along with whatever form this standard takes in the end. R and Python were chosen as the team that brought this together are familiar with those two languages, additional implementations would be more than welcome.
>
> 5) We do include a "geometry" featureType similar to the "point" featureType. Thus our difficulty with what to do with the "point" featureType. You are correct, there are lots of non timeSeries applications to be solved and this proposal does intend to support them (within the existing DSG constructs).
>
> Thanks for your questions, hopefully my answers close some gaps for you.
>
> - Dave
>
> > On Feb 3, 2017, at 10:47 AM, Bob Simons - NOAA Federal <bob.simons at noaa.gov> wrote:
> >
> > 1) There is a vague comment in the proposal about possibly changing the point featureType. Please don't, unless the changes don't affect current uses of Point. There are already 1000's of files that use it. If this new system offers an alternative, then fine, it's an alternative. One of the most important and useful features of a good standard is backwards compatibility.
> >
> > 2) You advocate "Implement the WKT approach using a NetCDF binary array." Is this system then an exact encoding of WKT, neither a subset nor a superset? "Simple Features" are often not simple.
> > If it is WKT (or something else), what is the standard you are following to describe the Simple Features (e.g., ISO/IEC 13249-3:2016 and ISO 19162:2015)?
> > Does your proposal deviate in any way from the standard's capabilities?
> > Do you advocate following the entire WKT standard, e.g., supporting all the feature types that WKT supports?
> >
> > 3) Since you are not using the WKT encoding, but creating your own, where is the definition of the encoding system you are using?
> >
> > 4) This is a little out of CF scope, but:
> > Do you envision tools, notably, netcdf-c/java, having a writer function that takes in WKT and encodes the information in a file, and having a reader function that reads the file and returns WKT? Or is it your plan that the encoding/decoding is left to the user?
> >
> > 5) This proposal is for "Simple Features plus Time Series" (my phrase not yours). But aren't there lots of other uses of Simple Features? Will there be other proposals in the future for "Simple Features plus X" and "Simple Features plus Y"? If so, will CF eventually become a massive document where Simple Features are defined over and over again, but in different contexts? If so, wouldn't a better solution be to deal with Simple Features separately (as Postgres does by making a geometric data type?), and then add "Simple Features plus Time Series" as the first use of it?
> >
> > Thanks for answering these questions.
> > Please forgive me if I missed parts of your proposal that answer these questions.
> >
> > On Thu, Feb 2, 2017 at 5:57 AM, <cf-metadata-request at cgd.ucar.edu> wrote:
> > Date: Thu, 2 Feb 2017 07:57:36 -0600
> > From: David Blodgett <dblodgett at usgs.gov>
> > To: <cf-metadata at cgd.ucar.edu>
> > Subject: [CF-metadata] Extension of Discrete Sampling Geometries for Simple Features
> > Message-ID: <224C2828-7212-449F-8C2C-97D903F6BE1E at usgs.gov>
> > Content-Type: text/plain; charset="utf-8"
> >
> > Dear CF Community,
> >
> > We are pleased to submit this proposal for your consideration and review. The cover letter we've prepared below provides some background and explanation for the proposed approach. The google doc here <http://goo.gl/Kq9ASq> is an excerpt of the CF specification with track changes turned on. Permissions for the document allow any google user to comment, so feel free to comment and ask questions in line.
> >
> > Note that I'm sharing this with you with one issue unresolved. What to do with the point featureType? Our draft suggests that it is part of a new geometry featureType, but it could be that we leave it alone and introduce a geometry featureType. This may be a minor point of discussion, but we need to be clear that this is an issue that still needs to be resolved in the proposal.
> >
> > Thank you for your time and consideration.
> >
> > Best Regards,
> >
> > David Blodgett, Tim Whiteaker, and Ben Koziol
> >
> > Proposed Extension to NetCDF-CF for Simple Geometries
> >
> > Preface
> >
> > The proposed addition to NetCDF-CF introduced below is inspired by a pre-existing data model governed by OGC and ISO as ISO 19125-1. More information on Simple Features may be found here: <https://en.wikipedia.org/wiki/Simple_Features> To the knowledge of the authors, it is consistent with ISO 19125-1 but has not been specified using the formalisms of OGC or ISO. Language used attempts to hold true to NetCDF-CF semantics while not conflicting with the existing standards baseline. While this proposal does not support the entire scope of the simple features ecosystem, it does support the core data types in most common use around the community.
> >
> > The other existing standard to mention is the UGRID convention <http://ugrid-conventions.github.io/ugrid-conventions/>. The authors have experience reading and writing UGRID and have designed the proposed structure in a way that is inspired by and consistent with it.
> >
> > Terms and Definitions
> >
> > (Taken from OGC 06-103r4 OpenGIS Implementation Specification for Geographic information - Simple feature access - Part 1: Common architecture <http://www.opengeospatial.org/standards/sfa>.)
> >
> > Feature: Abstraction of real world phenomena - typically a geospatial abstraction with associated descriptive attributes.
> > Simple Feature: A feature with all geometric attributes described piecewise by straight line or planar interpolation between point sets.
> > Geometry (geometric complex): A set of disjoint geometric primitives - one or more points, lines, or polygons that form the spatial representation of a feature.
> >
> > Introduction
> >
> > Discrete Sampling Geometries (DSGs) handle data from one (or a collection of) timeSeries (point), Trajectory, Profile, TrajectoryProfile or timeSeriesProfile geometries. Measurements are from a point (timeSeries and Profile) or points along a trajectory. In this proposal, we reuse the core DSG timeSeries type which provides support for basic time series use cases, e.g., a timeSeries which is measured (or modeled) at a given point.
> >
> > Changes to Existing CF Specification
> >
> > In NetCDF-CF 1.7, Discrete Sampling Geometries separate dimensions and variables into two types, instance and element <http://cfconventions.org/cf-conventions/cf-conventions.html#_collections_instances_and_elements>. Instance refers to individual points, trajectories, profiles, etc. These would sometimes be referred to as features given that they are identified entities that can have associated attributes and be related to other entities. Element dimensions describe temporal or other dimensions to describe data on a per-instance basis. This proposal extends the DSG timeSeries featureType <http://cfconventions.org/cf-conventions/cf-conventions.html#_features_and_feature_types> such that the geospatial coordinates of the instances can be point, multi-point, line, multi-line, polygon, or multi-polygon geometries. Rather than overload the DSG contiguous ragged array encoding, designed with timeseries in mind, a geometry ragged array encoding is introduced in a new section 9.3.5. See this google doc for specific proposed changes. <http://goo.gl/Kq9ASq>
> >
> > Motivation
> >
> > DSGs have no system to define a geometry (polyline, polygon, etc., other than point) and an association with a time series that applies over that entire geometry, e.g., the expected rainfall in this watershed polygon for some period of time is 10 mm. As suggested in the last paragraph of section 9.1, current practice is to assign a representative point or just use an ID and forgo spatial information within a NetCDF-CF file. In order to satisfy a number of environmental modeling use cases, we need a way to encode a geometry (point, line, polygon, multi-point, multi-line, or multi-polygon) that is the static spatial feature representation to which one or more timeSeries can be associated. In this proposal, we provide an encoding to define collections of simple feature geometries. It interfaces cleanly with the existing DSG specification, enabling DSGs and Simple Geometries to be used concurrently.
> >
> > Looking Forward
> >
> > This proposal is a compromise solution that attempts to stay consistent with CF ideals and fit within the structure of the existing specification with minimal disruption. Line and polygon data types often require variable length arrays. Development of this proposal has brought to light the need for a general abstraction for variable length arrays in NetCDF-CF. Such a general abstraction would necessarily be reusable for character arrays, ragged arrays of time series, and ragged arrays of geometry nodes, as well as any other ragged data structures that may come up in the future. This proposal does not introduce such a general ragged array abstraction but does not preclude such a development in the future.
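To make the idea of a general ragged-array abstraction concrete, here is a minimal Python sketch of an a[b][c] container in which each row b may have a different length. It is only an illustration under stated assumptions: the class name `RaggedArray`, its `values`/`offsets` fields, and the count-per-row layout are hypothetical and are not defined by CF, the proposal, or any netCDF library. The same container is used for character data and for time series to show the reuse argued for above.

```python
from itertools import accumulate
from typing import List, Sequence


class RaggedArray:
    """Hypothetical a[b][c] container: one flat value array plus per-row counts."""

    def __init__(self, values: Sequence, counts: Sequence[int]) -> None:
        if any(n < 0 for n in counts) or sum(counts) != len(values):
            raise ValueError("counts must be non-negative and sum to len(values)")
        self.values = list(values)
        # offsets[b] is the 0-based start of row b; the last entry is the total length.
        self.offsets: List[int] = [0] + list(accumulate(counts))

    def __len__(self) -> int:
        return len(self.offsets) - 1

    def __getitem__(self, b: int) -> list:
        if not 0 <= b < len(self):
            raise IndexError(b)
        return self.values[self.offsets[b]:self.offsets[b + 1]]


# Variable-length strings stored as one flat character array.
names = RaggedArray("flashbangpow", [5, 4, 3])
# Time series of different lengths (the middle one is empty).
series = RaggedArray([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], [2, 0, 4])

print("".join(names[1]))
print(series[1], series[2])
```

Storing counts rather than start indices is one of several equivalent choices; start indices can be derived from counts by a running sum, as the constructor does.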
> >
> > Three Alternative Approaches
> >
> > Respecting the human readability ideal of NetCDF-CF, the development of this proposal started from a human readable format for geometries known as Well Known Text <https://en.wikipedia.org/wiki/Well-known_text>. We considered three high level design approaches while developing this proposal.
> >
> > 1) Direct use of Well-Known Text (WKT). In this approach, well known text strings would be encoded using character arrays following a contiguous ragged array approach to index the character array by geometry (or instance in DSG parlance).
> > 2) Implement the WKT approach using a NetCDF binary array. In this approach, well known text separators (brackets, commas and spaces) for multipoint, multiline, multipolygon, and polygon holes, would be encoded as break type separator values like -1 for multiparts and -2 for holes.
> > 3) Implement the fundamental dimensions of geometry data in NetCDF. In this approach, additional dimensions and variables along those dimensions would be introduced to represent geometries, geometry parts, geometry nodes, and unique (potentially shared) coordinate locations for nodes to reference.
> >
> > Selected Approach
> >
> > The first approach was seen as too opaque to stay true to the CF ideal of complete self-description. The third approach seemed needlessly verbose and difficult to implement. The second approach was selected for the following reasons:
> >
> > - The second approach is just as or more human-readable than the third.
> > - Use of break values keeps geometries relatively atomic.
> > - Will be familiar to developers who are familiar with the WKT geometry format.
> > - Character arrays, which are needed for options one and three, are cumbersome to use in some programming languages in common use with NetCDF.
> > - Break values replace the need for extraneous variables related to multi-part and polygon holes (interiors). Multi-part geometries are generally an exception and excessive instrumentation to support them should be discounted.
> >
> > Example: Representation of WKT-Style Polygons in a NetCDF-3 timeSeries featureType
> >
> > Below is sample CDL demonstrating how polygons are encoded in NetCDF-3 using a contiguous ragged array-like encoding. There are three details to note in the CDL example below.
> >
> > - The attribute contiguous_ragged_dimension with value of a dimension in the file.
> > - The geom_coordinates attribute with a value containing a space separated string of variable names.
> > - The cf_role geometry_x_node and geometry_y_node.
> >
> > These three attributes form a system to fully describe collections of multi-polygon feature geometries. Any variable that has the contiguous_ragged_dimension attribute contains integers that indicate the 0-indexed starting position of each geometry along the instance dimension. Any variable that uses the dimension referenced in the contiguous_ragged_dimension attribute can be interpreted using the values in the variable containing the contiguous_ragged_dimension attribute. The variables referenced in the geom_coordinates attribute describe spatial coordinates of geometries. These variables can also be identified by the cf_roles geometry_x_node and geometry_y_node. Note that the CDL example below also includes a mechanism to handle multi-polygon features that also contain holes.
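Separately from the CDL listing, the start-index and break-value mechanism just described can be sketched in Python. This is a minimal sketch, not the proposal's reference implementation: the function names `split_instances` and `parse_multipolygon` and the small index array are hypothetical, and the break values -1 (new part) and -2 (new hole) are assumed to match the values quoted in the second approach above.

```python
from typing import List, Sequence

MULTIPART_BREAK = -1  # assumed: starts a new polygon part within one instance
HOLE_BREAK = -2       # assumed: starts a new interior ring (hole) in the current part


def split_instances(index: Sequence[int], starts: Sequence[int]) -> List[List[int]]:
    """Slice the flat index array into one run per instance using 0-based starts."""
    bounds = list(starts) + [len(index)]
    return [list(index[bounds[i]:bounds[i + 1]]) for i in range(len(starts))]


def parse_multipolygon(run: Sequence[int]) -> List[List[List[int]]]:
    """Return polygons -> rings -> node indices; the first ring of a polygon is its exterior."""
    polygons: List[List[List[int]]] = [[[]]]
    for value in run:
        if value == MULTIPART_BREAK:
            polygons.append([[]])      # begin a new polygon part
        elif value == HOLE_BREAK:
            polygons[-1].append([])    # begin a new hole in the current polygon
        else:
            polygons[-1][-1].append(value)
    return polygons


# Made-up data: instance 0 is a polygon with one hole plus a second part,
# instance 1 is a single polygon without holes.
index = [0, 1, 2, 3, -2, 4, 5, 6, -1, 7, 8, 9, 10, 11, 12]
starts = [0, 12]

for run in split_instances(index, starts):
    print(parse_multipolygon(run))
```

Note that no break value is needed between instances in this layout; the start variable alone marks where one instance ends and the next begins.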
> >
> > netcdf multipolygon_example {
> > dimensions:
> >   node = 47 ;
> >   indices = 55 ;
> >   instance = 3 ;
> >   time = 5 ;
> >   strlen = 5 ;
> > variables:
> >   char instance_name(instance, strlen) ;
> >     instance_name:cf_role = "timeseries_id" ;
> >   int coordinate_index(indices) ;
> >     coordinate_index:geom_type = "multipolygon" ;
> >     coordinate_index:geom_coordinates = "x y" ;
> >     coordinate_index:multipart_break_value = -1 ;
> >     coordinate_index:hole_break_value = -2 ;
> >     coordinate_index:outer_ring_order = "anticlockwise" ;
> >     coordinate_index:closure_convention = "last_node_equals_first" ;
> >   int coordinate_index_start(instance) ;
> >     coordinate_index_start:long_name = "index of first coordinate in each instance geometry" ;
> >     coordinate_index_start:contiguous_ragged_dimension = "indices" ;
> >   double x(node) ;
> >     x:units = "degrees_east" ;
> >     x:standard_name = "longitude" ; // or projection_x_coordinate
> >     x:cf_role = "geometry_x_node" ;
> >   double y(node) ;
> >     y:units = "degrees_north" ;
> >     y:standard_name = "latitude" ; // or projection_y_coordinate
> >     y:cf_role = "geometry_y_node" ;
> >   double someVariable(instance) ;
> >     someVariable:long_name = "a variable describing a single-valued attribute of a polygon" ;
> >   int time(time) ;
> >     time:units = "days since 2000-01-01" ;
> >   double someData(instance, time) ;
> >     someData:coordinates = "time x y" ;
> >     someData:featureType = "timeSeries" ;
> > // global attributes:
> >   :Conventions = "CF-1.8" ;
> >
> > data:
> >
> >  instance_name =
> >   "flash",
> >   "bang",
> >   "pow" ;
> >
> >  coordinate_index = 0, 1, 2, 3, 4, -2, 5, 6, 7, 8, -2, 9, 10, 11, 12, -2, 13, 14, 15, 16,
> >   -1, 17, 18, 19, 20, -1, 21, 22, 23, 24, 25, 26, 27, 28, -1, 29, 30, 31, 32, 33,
> >   34, -2, 35, 36, 37, 38, 39, 40, 41, 42, -1, 43, 44, 45, 46 ;
> >
> >  coordinate_index_start = 0, 30, 46 ;
> >
> >  x = 0, 20, 20, 0, 0, 1, 10, 19, 1, 5, 7, 9, 5, 11, 13, 15, 11, 5, 9, 7,
> >   5, 11, 15, 13, 11, -40, -20, -45, -40, -20, -10, -10, -30, -45, -20, -30, -20, -20, -30, 30,
> >   45, 10, 30, 25, 50, 30, 25 ;
> >
> >  y = 0, 0, 20, 20, 0, 1, 5, 1, 1, 15, 19, 15, 15, 15, 19, 15, 15, 25, 25, 29,
> >   25, 25, 25, 29, 25, -40, -45, -30, -40, -35, -30, -10, -5, -20, -35, -20, -15, -25, -20, 20,
> >   40, 40, 20, 5, 10, 15, 5 ;
> >
> >  someVariable = 1, 2, 3 ;
> >
> >  time = 1, 2, 3, 4, 5 ;
> >
> >  someData =
> >   1, 2, 3, 4, 5,
> >   1, 2, 3, 4, 5,
> >   1, 2, 3, 4, 5 ;
> > }
> >
> > How To Interpret
> >
> > Starting from the timeSeries variables:
> >
> > 1) See CF-1.8 conventions.
> > 2) See the timeSeries featureType.
> > 3) Find the timeseries_id cf_role.
> > 4) Find the coordinates attribute of data variables.
> > 5) See that the variables indicated by the coordinates attribute have a cf_role geometry_x_node and geometry_y_node to determine that these are geometries according to this new specification.
> > 6) Find the coordinate index variable with geom_coordinates that point to the nodes.
> > 7) Find the variable with contiguous_ragged_dimension pointing to the dimension of the coordinate index variable to determine how to index into the coordinate index.
> > 8) Iterate over polygons, parsing out geometries using the contiguous ragged start variable and coordinate index variable to interpret the coordinate data variables.
> >
> > Or, without reference to timeSeries:
> >
> > 1) See CF-1.8 conventions.
> > 2) See the geom_type of multipolygon.
> > 3) Find the variable with a contiguous_ragged_dimension matching the coordinate index variable's dimension.
> > 4) See the geom_coordinates of x y.
> > 5) Using the contiguous ragged start variable found in 3 and the coordinate index variable found in 2, geometries can be parsed out of the coordinate index variable and parsed using the hole and break values in it.
> >
> > ------------------------------
> >
> > Subject: Digest Footer
> >
> > _______________________________________________
> > CF-metadata mailing list
> > CF-metadata at cgd.ucar.edu
> > http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
> >
> > ------------------------------
> >
> > End of CF-metadata Digest, Vol 166, Issue 3
> > *******************************************
> >
> > --
> > Sincerely,
> >
> > Bob Simons
> > IT Specialist
> > Environmental Research Division
> > NOAA Southwest Fisheries Science Center
> > 99 Pacific St., Suite 255A (New!)
> > Monterey, CA 93940 (New!)
> > Phone: (831)333-9878 (New!)
> > Fax: (831)648-8440
> > Email: bob.simons at noaa.gov
> >
> > The contents of this message are mine personally and
> > do not necessarily reflect any position of the
> > Government or the National Oceanic and Atmospheric Administration.
> > <>< <>< <>< <>< <>< <>< <>< <>< <><
>
> ------------------------------
>
> End of CF-metadata Digest, Vol 166, Issue 5
> *******************************************

--
Sincerely,

Bob Simons

Received on Fri Feb 03 2017 - 12:41:09 GMT