[CF-metadata] some concerns about the "ensemble axis" proposal

From: Jennifer Adams <jma>
Date: Wed, 7 Mar 2007 14:55:32 -0500

This discussion is getting juicy!

I am the GrADS and GDS developer working on an interface for 5-
dimensional data sets. Ensembles are one example of how the 5th
dimension might be used, but there are others (e.g. EOFs), so we are
trying to make it as general as possible while still being practical
and usable. GrADS is written in C and handles data in a variety of
formats. Data file aggregation over time, and now over the "e"
dimension, is possible but not required.

Currently, we are not building an interface for multi-model ensembles
on different grids. The elephant in Steve's living room will not be
allowed to play in our yard. Fast and easy interpolation between data
sets on different grids was omitted from GrADS by design and that is
not likely to change with the addition of a new grid dimension. If
users want to lump data sets on different grids together, they must
handle the interpolation explicitly in a way that is best suited to
their needs and in a way that they know will best preserve the
information in the data they wish to extract.

Ensembles that are on the same grid will be handled by GrADS. For
metadata, we are taking a minimalist approach -- the ensemble axis is
linear, and members have a unique name (<16 characters) and are
numbered from 1 to n. We don't require that all members have the same
start time or length, so those pieces of metadata are also required.
This information is generally provided in a data descriptor file, an
external metadata source written by the user after poring over the
output from ncdump, wgrib, or a similar routine.
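
To make the minimalist metadata above concrete, here is a rough sketch in Python (not GrADS code, and not the descriptor file syntax) of the per-member information just described: a unique name shorter than 16 characters, a number from 1 to n, a length, and a start time. The names `EnsembleMember` and `build_ensemble_axis` are hypothetical, and treating the start time as a 1-based index is an assumption.

```python
from dataclasses import dataclass

MAX_NAME_LEN = 15  # "<16 characters", as stated above


@dataclass(frozen=True)
class EnsembleMember:
    name: str    # unique member name
    length: int  # number of time steps in this member
    tinit: int   # index of the member's first time step (assumed 1-based)


def build_ensemble_axis(members):
    """Check the members and number them 1..n along a linear ensemble axis."""
    seen = set()
    for m in members:
        if not 0 < len(m.name) <= MAX_NAME_LEN:
            raise ValueError(f"name must be 1-{MAX_NAME_LEN} characters: {m.name!r}")
        if m.name in seen:
            raise ValueError(f"duplicate member name: {m.name!r}")
        if m.length < 1 or m.tinit < 1:
            raise ValueError(f"length and tinit must be >= 1 for {m.name!r}")
        seen.add(m.name)
    # Members are numbered in the order given, starting from 1.
    return {i: m for i, m in enumerate(members, start=1)}
```

For example, `build_ensemble_axis([EnsembleMember("ctl", 12, 1), EnsembleMember("pert1", 9, 4)])` returns a mapping with keys 1 and 2, while a repeated name raises `ValueError`.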

If I am handed a single netCDF file with multi-model-different-grid
ensembles in it from ECMWF or GFDL, I'm going to write a set of
descriptor files, each one describing the subset of variables on a
common 5D grid. I'll have one descriptor file per grid, all pointing
at the same data file. Now I'm set to do my analysis in GrADS,
beginning with careful interpolation between the different grids.

When I put my 5D data sets behind a GDS and serve them to the world
of OPeNDAP clients (including GrADS), it becomes a special case: a
5D netCDF file that doesn't require a descriptor file, a file that
has all the metadata GrADS needs packaged in just the right way. For
the time being, my approach works because I'm writing the code for
the client and the server, I'm not worrying about any other client
trying to read my 5D GDS data set, and I'm not trying to be CF-
compliant. Here's what it's going to look like:

dimensions:
         lon = 9 ;
         lat = 9 ;
         lev = 9 ;
         time = 9 ;
         ens = 9 ;
         string16 = 16 ;
variables:
         float lon(lon) ;
                 lon:units = "degrees_east" ;
         float lat(lat) ;
                 lat:units = "degrees_north" ;
         float lev(lev) ;
                 lev:units = "level" ;
         float time(time) ;
                 time:units = "days since 0001-01-01 00:00:00" ;
         float ens(ens) ;
                 ens:grads_dim = "e" ;
         char ens_name(ens, string16) ;
                 ens_name:long_name = "ensemble name" ;
         int ens_length(ens) ;
                 ens_length:long_name = "ensemble length" ;
         int ens_tinit(ens) ;
                 ens_tinit:long_name = "ensemble initial time index" ;
         float var(ens, time, lev, lat, lon) ;
                 var:long_name = "test variable" ;
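
As an illustration of how a client might use the per-member variables in the header above, here is a small Python sketch (not GrADS or GDS code). It assumes the values have already been read into plain Python lists, that `ens_name` rows are padded with NULs or spaces out to `string16`, and that `ens_tinit` is a 1-based index into the shared time axis. None of these details are stated in the header itself, and the function names are hypothetical.

```python
def decode_names(char_rows):
    """Turn fixed-width char rows (ens, string16) into Python strings."""
    # Strip trailing NUL/space padding; the padding convention is an assumption.
    return [bytes(row).rstrip(b"\x00 ").decode("ascii") for row in char_rows]


def valid_time_mask(ens_tinit, ens_length, ntime):
    """For each member, flag which points on the shared time axis hold data.

    ens_tinit is assumed to be 1-based; ens_length counts time steps.
    """
    mask = []
    for tinit, length in zip(ens_tinit, ens_length):
        start = tinit - 1                  # convert to a 0-based offset
        stop = min(start + length, ntime)  # never run past the time axis
        mask.append([start <= t < stop for t in range(ntime)])
    return mask
```

With `ens_tinit = [1, 3]`, `ens_length = [4, 2]` and five time steps, the first member covers steps 1 through 4 and the second covers steps 3 and 4; everything outside those ranges would be treated as missing for that member.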

When more metadata is required to bring my GDS data set into CF
compliance, or to make it readable by other open source clients, I'll
add it. As long as GrADS users have the means to keep up with the
data sets being generated by CFS, IPCC, TIGGE, or whatever, then I'm
not concerned.

It took me a long time to write out this email -- I lost most of an
afternoon trying to phrase everything properly. I have been reading
this thread with interest, but I just can't keep up this kind of
lengthy correspondence on a regular basis. Please keep me in mind as
one of the silent listeners who still cares about the outcome.

Jennifer

--
Jennifer M. Adams
IGES/COLA
4041 Powder Mill Road, Suite 302
Calverton, MD 20705
jma at cola.iges.org
Received on Wed Mar 07 2007 - 12:55:32 GMT