[CF-metadata] CF Conventions and netCDF-4 enhanced model
I spent an instructive evening reading through the previous discussions (thanks for the links, Corey) and the arguments for and against using hierarchical structures. I also re-read the CF conventions documents (1.6 and the 1.7 draft), and it seems the standard currently ignores groups rather than explicitly forbidding their use. It seems to me that a netCDF dataset with groups could still conform to the CF conventions as they are currently written, even with all the other restrictions that the standard imposes. I'd be interested in seeing, and possibly helping with, CF conventions that support the enhanced model.
After reading the previous discussions, I thought the list might be interested in how we use groups in our netCDF products, as it is somewhat different from the other cases that were discussed.
Our netCDF datasets have to cope with a number of different needs from various parties: archive, end-users, higher-level processing, reprocessing, monitoring, etc. To keep things simple, we wanted a single format per instrument/processing level that is flexible enough to contain all the data or a subset of the data, depending upon the consumer's needs. To do this, we created a hierarchical data structure that encapsulates data in related but independent groups. These groups can be present in or missing from the dataset as required by the needs of the consumer. So a level 2 processing function might receive a product containing 20 instrument channels at 2 different resolutions, whereas the dissemination function might receive a product with just 5 of these channels at the lowest resolution. Both of these products are described by a single format specification.
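To make the "one specification, many products" idea concrete, here is a minimal sketch in plain Python. It models the format specification as the set of group paths a product is allowed to contain and treats any subset of those groups as a conforming product. This is an illustration only, not our actual specification: the group names (`channel_01/high_res` and so on), `SPEC_GROUPS`, and the `conforms` function are all hypothetical, and no netCDF library is involved.

```python
# Hypothetical format specification: every group a product MAY contain.
# 20 channels at 2 resolutions; the names are invented for illustration.
SPEC_GROUPS = {
    f"channel_{n:02d}/{res}"
    for n in range(1, 21)
    for res in ("high_res", "low_res")
}


def conforms(product_groups, spec_groups=SPEC_GROUPS):
    """Return True if every group in the product is allowed by the spec.

    Groups may be missing from the product, but a group the spec does
    not know about makes the product non-conforming.
    """
    return set(product_groups) <= set(spec_groups)


# Level 2 processing receives all 20 channels at both resolutions.
level2_product = set(SPEC_GROUPS)

# Dissemination receives only 5 channels at the lowest resolution.
dissemination_product = {f"channel_{n:02d}/low_res" for n in range(1, 6)}

print(conforms(level2_product))         # both products satisfy the
print(conforms(dissemination_product))  # same specification
```

The design choice this illustrates is that conformance is a subset test on groups, so removing a whole group never invalidates a product.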
This model of including or omitting independent groups also supports other needs, for example adding data that is produced at irregular intervals but needs to be in the product whenever it is available. Also, by tagging groups with a specific attribute, we should be able to offer end-users a single, generic method for subsetting data retrieved from the archive, without requiring specific knowledge of each netCDF product. They should be able to select only the tagged groups (which might correspond to instrument channels, for instance) that they want in their retrieved datasets.
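A generic subsetter of this kind could look roughly like the sketch below. It is an assumption-laden illustration in plain Python rather than a description of any existing tool: the group tree is modelled as nested dictionaries instead of being read through a netCDF library, the attribute name `subset_tag`, the example group names, and the function `subset` are hypothetical, and the policy of always keeping untagged groups is also just an assumption.

```python
TAG_ATTR = "subset_tag"  # hypothetical name for the tagging attribute


def subset(group, wanted):
    """Return a copy of `group` without the tagged groups not in `wanted`.

    Assumed policy: untagged child groups are always kept (and searched
    recursively); tagged child groups are kept only when requested.
    """
    kept = {}
    for name, child in group.get("groups", {}).items():
        tag = child.get("attrs", {}).get(TAG_ATTR)
        if tag is not None and tag not in wanted:
            continue  # tagged but not requested: drop the whole group
        kept[name] = subset(child, wanted)
    return {
        "attrs": dict(group.get("attrs", {})),
        "variables": dict(group.get("variables", {})),
        "groups": kept,
    }


# A toy product: one untagged group plus two tagged channel groups.
product = {
    "attrs": {"title": "example product"},
    "variables": {},
    "groups": {
        "geolocation": {
            "attrs": {},
            "variables": {"lat": [], "lon": []},
            "groups": {},
        },
        "channel_01": {
            "attrs": {TAG_ATTR: "ch01"},
            "variables": {"radiance": []},
            "groups": {},
        },
        "channel_02": {
            "attrs": {TAG_ATTR: "ch02"},
            "variables": {"radiance": []},
            "groups": {},
        },
    },
}

# The user asks for channel 1 only; no product-specific knowledge is
# needed beyond the tag values themselves.
retrieved = subset(product, wanted={"ch01"})
print(sorted(retrieved["groups"]))
```

The function never looks at group names or variable contents, only at the tag attribute, which is what would let one method work across different products.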
This gives us a single, easily understood format definition that encompasses a wide variety of possible variations.
Any feedback on the idiocy or genius of this (ab)use of the netCDF format is welcome.
Thanks,
Tim
---------------------
Dr. Timothy Patterson
Instrument Data Simulation
Product Format Specification
EUMETSAT
Eumetsat-Allee 1
64295 Darmstadt
Germany
Received on Thu Sep 11 2014 - 04:27:53 BST