[CF-metadata] CF and multi-forecast system ensemble data from Bryan Lawrence on 2006-11-01 (Archive of CF discussions from 2002 to 2019 on the cf-metadata mailing list)

From: Bryan Lawrence <b.n.lawrence>
Date: Wed, 01 Nov 2006 08:33:43 +0000

On Tue, 2006-10-31 at 09:32 -0800, Steve Hankin wrote:

> Like others I've watched this email thread grow like Topsy, wondering
> if I would find the time to read it ... But the discussion topic seems
> big enough to warrant a fair amount of rough and tumble.

Absolutely. Bring it on :-) :-)

With regard to the stability of CF and long term evolution, that's
what's behind my wanting to separate maintaining descriptions of the
quantities measured/predicted from the descriptions of how/why it was
done. It's also why I want just one way of doing it, not one that is
specially optimized for numerical models per se, and not of wider
applicability.

> OK. Enuf preamble. In a nutshell, the new proposed structures seem
> to capture the semantics of the various collections of model outputs
> -- ensembles and forecast collections -- through the addition of new
> dimensions. In the most extreme case this dimension list might become
> (realization,forecast_reference_time,forecast_period,lev,lat,lon).
>
> I'd pose two questions:
> 1. Will this approach break existing CF applications? If yes, is
> that a red flag to consider other options?

I don't believe there is any restriction on number of dimensions in CF,
so while it may not be pretty, it seems ok to me.
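
For concreteness, here is a minimal Python sketch of how the two forecast time dimensions in the quoted list relate to a single valid time. It assumes the usual reading that valid time equals forecast_reference_time plus forecast_period. The coordinate values and the helper name valid_times are made up for illustration and are not taken from the proposal.

```python
from datetime import datetime, timedelta

# Hypothetical coordinate values, chosen only for illustration.
reference_times = [datetime(2006, 11, 1, 0), datetime(2006, 11, 1, 12)]
forecast_periods = [timedelta(hours=h) for h in (0, 6, 12)]


def valid_times(ref_times, periods):
    """Return a 2-D table where valid[i][j] = ref_times[i] + periods[j]."""
    return [[ref + period for period in periods] for ref in ref_times]


# One row per forecast_reference_time, one column per forecast_period.
valid = valid_times(reference_times, forecast_periods)
```

Different (reference time, period) pairs can land on the same valid time, which is one reason the two-dimensional form carries information that a single t axis would lose.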

> 2. Is this the same approach that we would take if we already had
> netCDF 4? If no, is that a red flag that we should give more
> thought to the long-term stability of the standard?

Let's not think about netCDF4 immediately :-) One of the other things we
discussed was trying to divorce the content standard from the
implementation standard ...

But your substantive point is fair enough: is there a cleaner way to
think about this ...?

Particularly from the point of view that a set of simulations comprising
a forecast IS THE forecast (singular), then I think at least
(realization,t,z,y,x) is inescapable. I don't like the other example
(realization,ref_time,period,z,y,x), but accept that it may well be
the natural output of a set of aggregations. Is it not what Thredds
would deliver anyway from an aggregation? Which brings me to my last
point:

As far as Thredds as a stop-gap solution goes, I think we need to
divorce how we interact with the data from how we manage it
(particularly for posterity). There is no way that I'm going to rely on
ANY *interface* to preserve information content, so what you're really
saying is that you want to rely on metadata held externally from the
files (whether Thredds or not). To some extent that's inescapable (which
was one of my points in an earlier email), but we want to stick to the
requirement that CF can differentiate data (if not fully describe all
the ancillary information) ... and for forecast ensembles (and station
data) there is a need for some extra information over and above the
index value in a "special" dimension. It's that information we need to
get into the CF content standard.
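
As a rough illustration of "extra information over and above the index value", the Python sketch below models a file as a plain dictionary and attaches per-member descriptions to the ensemble dimension. This is only one possible layout under stated assumptions, not something agreed in this thread: the variable names source_model and perturbation, and the helper describe_member, are hypothetical.

```python
# Hypothetical in-memory stand-in for a file with a "realization" dimension.
# Variable names other than "realization" are invented for this sketch.
ensemble = {
    "dimensions": {"realization": 3},
    "variables": {
        "realization": {"dims": ("realization",), "data": [0, 1, 2]},
        "source_model": {
            "dims": ("realization",),
            "data": ["model_a", "model_b", "model_b"],
        },
        "perturbation": {
            "dims": ("realization",),
            "data": ["control", "control", "perturbed"],
        },
        # Data variable shown only for context; its values are omitted.
        "air_temperature": {"dims": ("realization", "t"), "data": None},
    },
}


def describe_member(ds, index):
    """Collect the value at `index` from every 1-D variable on 'realization'."""
    return {
        name: var["data"][index]
        for name, var in ds["variables"].items()
        if var["dims"] == ("realization",)
    }


member = describe_member(ensemble, 2)
```

The index value alone (2) says nothing about which model or perturbation produced the member; the per-member variables are what let a reader differentiate the data without consulting an external catalogue.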

Also, from an operational perspective, ok, so maybe I can use Thredds or
any interface to get some data, but then I've got it. What then? The
content standard has to tell me what's in it, so we're back to CF and
possibly pointers to external metadata.

(Wrt netcdf4: I see groups as being more useful for aggregating things
that don't share common dimensionality)

Cheers
Bryan
Received on Wed Nov 01 2006 - 01:33:43 GMT
