
[CF-metadata] Towards recognizing and exploiting hierarchical groups (Charlie Zender - Steve Hankin - Richard Signell)

From: stephen.pascoe at stfc.ac.uk <stephen.pascoe>
Date: Tue, 17 Sep 2013 11:56:41 +0000

Bryan has beaten me to the points I would have made. I think hierarchies are overrated at the interface level. Examples abound of where they have been abandoned: hierarchical vs relational DBs, XML databases and tools (save us from XQuery for netCDF!).

Under the hood, hierarchies are often necessary for scalability, and we all use them as a crutch when no better tools exist.

I would advocate keeping support for groups very simple. CF could treat any netCDF file containing groups as if it were a directory of netCDF files with attached metadata. IMO complex rules about inter-group relationships should be avoided. I guess attribute inheritance must be an exception here, but I would urge caution. One of the CF data model tickets has a detailed debate on how to interpret the current standard regarding variable attributes overriding global attributes. Lessons from that should be learned.
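
As a rough illustration of the "directory of netCDF files" view, here is a minimal Python sketch. It assumes a hypothetical in-memory stand-in for a grouped file (nested dicts with "attrs", "vars" and "groups" keys); none of these names, nor the sample attribute values, come from CF or the netCDF-4 API. Each group is listed with only its own attributes, so no inter-group rules are needed.

```python
def flatten_groups(group, path="/"):
    """Yield (path, attrs, variable_names) for every group, depth first.

    Each group is reported with only its own attributes, as if it were a
    standalone file in a directory; nothing is inherited between groups.
    """
    yield path, dict(group.get("attrs", {})), sorted(group.get("vars", {}))
    for name, child in sorted(group.get("groups", {}).items()):
        yield from flatten_groups(child, path + name + "/")


# Hypothetical file layout, made up for illustration only.
root = {
    "attrs": {"Conventions": "CF-1.6"},
    "vars": {},
    "groups": {
        "model_a": {"attrs": {"institution": "A"}, "vars": {"tas": {}}, "groups": {}},
        "model_b": {"attrs": {"institution": "B"}, "vars": {"tas": {}, "pr": {}}, "groups": {}},
    },
}

listing = list(flatten_groups(root))
for path, attrs, names in listing:
    print(path, attrs, names)
```

The listing behaves like a directory walk: one entry per group, each carrying its own metadata and nothing else.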

Stephen.

--
Stephen Pascoe from iPhone
On 17 Sep 2013, at 10:10, "Bryan Lawrence" <bryan.lawrence at ncas.ac.uk> wrote:
Hi Folks
CMIP5 is illuminating in a number of ways ... not least because it is impossible to come up with a *natural* hierarchy for consumers of the data (as opposed to the producers). But even the producers have different ways of organising their material (running members of different ensembles all at once, or all members of one ensemble at once), then the data has to be published and versioned ... and all of a sudden there is no natural hierarchy for CMIP5 (although everyone will have their own idea of what it could be ... )
The advantage of a flat system of objects, which can be linked into multiple hierarchies by a layer of metadata/indirection (call it what you like) becomes obvious in that context ... you can do faceted browse (and faceted assemblage of groups). So it's not so obvious to me that Charlie's examples are so compelling ... (indeed, even the NASA examples aren't so compelling when you consider some of the data use, which immediately requires us to extract and replicate the data into smaller granules in some cases ...)
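
To make the "flat objects, many hierarchies" idea concrete, here is a small Python sketch. The records, facet names and values are invented for illustration and are not taken from the CMIP5 archive. The same flat list is nested in two different facet orders, and neither result is privileged.

```python
from collections import defaultdict

# Hypothetical flat catalogue: every record is an independent object.
records = [
    {"id": "r1", "model": "M1", "experiment": "historical", "ensemble": "r1i1p1"},
    {"id": "r2", "model": "M1", "experiment": "rcp85", "ensemble": "r1i1p1"},
    {"id": "r3", "model": "M2", "experiment": "historical", "ensemble": "r2i1p1"},
]


def group_by_facets(items, facets):
    """Build one possible hierarchy over a flat list by nesting on `facets` in order."""
    if not facets:
        return [item["id"] for item in items]
    buckets = defaultdict(list)
    for item in items:
        buckets[item[facets[0]]].append(item)
    return {value: group_by_facets(members, facets[1:])
            for value, members in buckets.items()}


# Two different hierarchies over the same flat objects.
by_model = group_by_facets(records, ["model", "experiment"])
by_experiment = group_by_facets(records, ["experiment", "model"])
print(by_model)
print(by_experiment)
```

The hierarchy lives entirely in the facet order passed to the function, not in how the objects are stored.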
Which leads me naturally onto CF. I think there *is* a case for thinking about how we use hierarchical attributes in CF (indeed, we've just been arguing about it in another context with the concept of file attributes and variable attributes). We could resolve this once and for all by establishing a convention for CF which says how we *will* do group attributes as they become necessary. (I still think we will eventually want vector concepts more naturally represented in files, even though I think files should not be our one view of the world.)
However, the argument about file and field attributes applies here. What (I think) we're talking about (thus far) for groups is metadata aggregation and is simply a *file based convention* for simplifying storage, so that when the file gets unpacked, the data model says the attributes are owned by each individual group member.  If it's just that on the table, then I'm OK with this.
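
A minimal Python sketch of that "unpacking" step follows. It assumes a hypothetical nested-dict stand-in for a grouped file and one possible precedence rule (variable over enclosing group over ancestors); CF does not define this rule, and the attribute names are only examples.

```python
def unpack_attributes(group, inherited=None):
    """Return {variable_path: attrs} with group attributes copied onto each variable.

    Assumed precedence (not defined by CF): variable > enclosing group > ancestors.
    """
    inherited = dict(inherited or {})
    inherited.update(group.get("attrs", {}))   # nearer group overrides ancestors
    result = {}
    for name, var_attrs in group.get("vars", {}).items():
        merged = dict(inherited)
        merged.update(var_attrs)               # variable overrides everything above it
        result[name] = merged
    for child_name, child in group.get("groups", {}).items():
        for var_path, attrs in unpack_attributes(child, inherited).items():
            result[child_name + "/" + var_path] = attrs
    return result


# Hypothetical packed file: attributes stored once per group to save space.
packed = {
    "attrs": {"institution": "X", "source": "model"},
    "vars": {"time": {}},
    "groups": {
        "run1": {
            "attrs": {"source": "model run 1"},
            "vars": {"tas": {"units": "K"}, "pr": {"source": "override"}},
            "groups": {},
        }
    },
}

unpacked = unpack_attributes(packed)
for path, attrs in sorted(unpacked.items()):
    print(path, attrs)
```

After unpacking, every variable owns a complete set of attributes and the group structure is no longer needed to interpret it.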
The scope issue, on the other hand, opens a can of worms, and I hope I've demonstrated with the CMIP5 preamble that it won't be that obvious to resolve.
Bryan
On 17 September 2013 06:26, <zender at uci.edu> wrote:
Hi Russ,
Thanks for your input and link to an earlier presentation of yours.
Agree that the proposal only applies to group hierarchies, i.e., to
groups representable by the Common Data Model 2/extended/enhanced
which for practical purposes means groups exposed by the netCDF4 API.
Your way of putting it is better because it's more generic: we only
seek to define metadata inheritance for hierarchical groups, no matter
the external representation of the group.
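
The restriction to hierarchies (no cycles, and every group other than the root has exactly one parent) can be checked mechanically. The Python sketch below assumes the group structure has already been read into a plain parent-to-children mapping; the function name and the mapping format are hypothetical and not part of the netCDF-4 or HDF5 APIs.

```python
def is_hierarchy(children, root):
    """Return True if `children` (parent -> list of child names) is a tree rooted at `root`.

    A tree here means: every group is reachable from the root, every group other
    than the root has exactly one parent, and the root has no parent.
    """
    parents = {}
    for parent, kids in children.items():
        for kid in kids:
            if kid in parents or kid == root:
                return False        # second parent, or an edge back to the root
            parents[kid] = parent
    # Walk down from the root; with unique parents, full reachability rules out cycles.
    seen, stack = {root}, [root]
    while stack:
        for kid in children.get(stack.pop(), []):
            if kid not in seen:
                seen.add(kid)
                stack.append(kid)
    all_groups = set(children) | set(parents) | {root}
    return seen == all_groups


tree = {"/": ["A", "B"], "A": ["C"]}
cycle = {"/": [], "A": ["B"], "B": ["A"]}          # A and B are each other's parent
shared = {"/": ["A", "B"], "A": ["C"], "B": ["C"]}  # C has two parents

print(is_hierarchy(tree, "/"), is_hierarchy(cycle, "/"), is_hierarchy(shared, "/"))
```

Only structures that pass such a check give each group a single, unambiguous chain of ancestors to inherit from.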
Cheers,
cz
On 16/09/2013 12:06, Russ Rew wrote:
>> Dear all,
>
> I'm also glad to see this discussion surface.  Since I first presented
> "Developing Conventions for netCDF-4" at the 2007 GO-ESSP meeting:
>
>   http://www.unidata.ucar.edu/presentations/Rew/nc4-conventions.pdf
>
> I've been hoping that netCDF-4 feature adoption would begin to gain
> traction in the community (see slides 19 and 20 of this 2010
> presentation for my "chicken-and-egg logjam" illustration):
>
>    http://www.unidata.ucar.edu/presentations/Rew/agu_2010_nc4_Rew.pdf
>
> I like the Zender-Habermann-Leonard (ZHL?) proposal for Group
> Attributes, but would like to point out a potential problem for its use
> with HDF Groups: they aren't actually hierarchical.  In HDF5, Group A
> can be a parent of Group B, which in turn can be a parent of Group A,
> forming a cycle instead of a hierarchy.  The graph of the Group-subGroup
> relation in HDF5 can form an arbitrary directed cyclic graph, though
> this is not permitted in netCDF-4, in which only Group *hierarchies* can
> be created through the netCDF-4 API.
>
> Without a restriction to hierarchies, attribute inheritance is not
> useful, which is why we required group hierarchies for dimension
> inheritance in netCDF-4.  So I think the proposal should include a
> restriction to only hierarchical Group structures, which also has the
> desirable property that each Group, except for the root, has a unique
> parent Group.
>
> --Russ
> _______________________________________________
> CF-metadata mailing list
> CF-metadata at cgd.ucar.edu
> http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
>
--
Charlie Zender, Earth System Sci. & Computer Sci.
University of California, Irvine 949-891-2429 )'(
--
Scanned by iCritical.
--
Bryan Lawrence
University of Reading: Professor of Weather and Climate Computing.
National Centre for Atmospheric Science: Director of Models and Data.
STFC: Director of the Centre for Environmental Data Archival.
Ph: +44 118 3786507 or 1235 445012; Web: http://home.badc.rl.ac.uk/lawrence
Received on Tue Sep 17 2013 - 05:56:41 BST

This archive was generated by hypermail 2.3.0 : Tue Sep 13 2022 - 23:02:41 BST
