[CF-metadata] ensemble dimension from Francisco Doblas-Reyes on 2010-03-17 (Archive of CF discussions from 2002 to 2019 on the cf-metadata mailing list)

From: Francisco Doblas-Reyes <f.doblas-reyes>
Date: Wed, 17 Mar 2010 11:07:42 -0000 (UTC)

Hi all,

I'd like to say that I eventually created the ENSEMBLES dataset, handling
the ensemble dimension in my own way. Doug happened to be at ECMWF at the
time, learned about our ENSEMBLES solution and adapted it to TIGGE, which
was very helpful as he singled out some problems we had.

It was clear from the previous round of discussions that it would be very
difficult to aggregate ensemble members produced on different grids
using NetCDF3. Hence, for the ENSEMBLES multi-forecast system dataset
(which comprises multi-model, perturbed-parameter and stochastic-physics
initial-condition ensembles) I first had to interpolate the data
onto a common grid. I then tried to define the ensemble-member attributes
using auxiliary coordinates that borrowed some of the global attributes
used in the CMIP3 dataset.
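
A minimal sketch of the kind of layout described above, assuming a
"realization" dimension with one string-valued auxiliary coordinate per
member attribute. The attribute names (institution, source, experiment_id),
the variable name "tas" and all values are illustrative assumptions, not
the actual contents of the ENSEMBLES files. The function only prints an
abridged CDL-like description; it does not write NetCDF.

```python
# Hypothetical per-member metadata. The keys echo CMIP3-style global
# attributes and are assumptions, not the real ENSEMBLES attribute set.
members = [
    {"institution": "centre_A", "source": "model_A", "experiment_id": "seasonal"},
    {"institution": "centre_B", "source": "model_B", "experiment_id": "seasonal"},
]


def describe_layout(members, var_name="tas"):
    """Return an abridged CDL-like description of an ensemble file layout."""
    aux_names = sorted(members[0])
    # Longest string across all auxiliary coordinates sets the char length.
    strlen = max(len(m[name]) for m in members for name in aux_names)
    lines = [
        "dimensions:",
        f"  realization = {len(members)} ;",
        f"  strlen = {strlen} ;",
        "  // time, lat and lon dimensions omitted for brevity",
        "variables:",
        "  int realization(realization) ;",
        '    realization:standard_name = "realization" ;',
    ]
    # One char variable per member attribute, indexed by realization.
    for name in aux_names:
        lines.append(f"  char {name}(realization, strlen) ;")
    # The data variable points at the auxiliary coordinates by name.
    lines.append(f"  float {var_name}(realization, time, lat, lon) ;")
    lines.append(f'    {var_name}:coordinates = "{" ".join(aux_names)}" ;')
    return "\n".join(lines)


print(describe_layout(members))
```

The point of the design is that metadata which would be a single global
attribute in a one-model file becomes a variable along the realization
dimension, so each member keeps its own value.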

Although all the ensemble members in the ENSEMBLES dataset share the same
time dimension, this is not strictly necessary: the time coordinates
(giving the start date and the lead time) can also be defined per ensemble
member. This might mean introducing some missing fields in the NetCDF
files, though. Maybe NetCDF4 can deal with all this in a much more
efficient way.
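
To illustrate where the missing fields could come from, here is a small
sketch under one reading of the paragraph above: members with different
lead times are stored on a shared axis, and the gaps are padded with a
fill value. The fill value, function name and numbers are all made up for
the example.

```python
FILL = -9999.0  # hypothetical fill value; a real file would declare _FillValue


def pad_to_common_axis(fields_by_member):
    """Align per-member {lead_time: value} mappings on the union of lead times.

    Returns the common lead-time axis and one row per member, with FILL
    wherever a member has no field for that lead time.
    """
    common = sorted({t for fields in fields_by_member for t in fields})
    table = [[fields.get(t, FILL) for t in common] for fields in fields_by_member]
    return common, table


# Two hypothetical members: the second stops one lead time earlier.
member_a = {1: 280.1, 2: 280.4, 3: 280.9}
member_b = {1: 279.8, 2: 280.0}

lead_times, values = pad_to_common_axis([member_a, member_b])
print(lead_times)
print(values)
```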

On Wed, March 17, 2010 10:14, Kettleborough, Jamie wrote:
> Hello John,
>
>
>> I'm good with either/both axis = "ensemble" and/or
>> standard_name = "ensemble_member_identifier".
>>
>> for backwards compatibility, we could consider recognizing standard_name
>> = "realization".
>> do we need anything new - or is standard_name = "realization" enough?
>
> I prefer "realization" to "ensemble" as I think it's a bit more neutral -
> for instance you can produce a set of realizations of past and future
> climates using detection and attribution techniques, it's not clear (to me)
> that "ensemble" is the most natural term for these. I think the term
> ensemble is a bit loaded to imply a production technique, whereas
> realization is more descriptive of the intent. (But I'm happy to be wrong
> on this.)

I seem to remember that I finally used "realization" following a long
exchange with John, Jamie and others for the reasons given above.

> From what I remember a lot of the issues that caused the previous
> discussion on ensembles to stall were around the aggregation of different
> files into a single ensemble file, and what you do in this case to
> maintain traceability back to the original model experiments. So a
> scenario something like:
>
> 1. There is a repository of CMIP5 integrations (a collection of
> mega-ensembles): a number of models, with a number of initial conditions
> and a number of forcing scenarios. The output from each model integration
> is stored in its own set of NetCDF files with global attributes like
> source, forcing, experiment, model, institute, realization used to
> identify this data in the CMIP5 mega-ensemble. Each model can be on a
> different grid.
>
> 2. A data user takes all (or a selection of) the files for a
> mega-ensemble and puts them on the *same* space-time sampling for analysis.
> e.g. decadal mean global means or continent means
>
> 3. The result of the analysis may be much smaller than the original data
> and so comfortably fit into one file. If the user wants to share this data
> with others and maintain links to the original model integrations - how do
> they do this?
>
> I don't know if/when we want to return to this analysis and aggregation
> case (is it a CF problem?). To me at least, it feels logically related to
> some of the discussion around station data - though I didn't follow this
> discussion that closely -
> https://cf-pcmdi.llnl.gov/trac/ticket/37#comment:34. As I understood it
> the problem there was 'aggregating' over instruments to form a depth
> coordinate, whereas here we are aggregating over model integrations to
> give an ensemble coordinate.
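
The three-step scenario quoted above can be sketched roughly as follows.
This is only an illustration of one possible bookkeeping approach, not an
agreed CF convention: each run's global attributes (the names are taken
from the list in step 1) are turned into per-member columns that travel
with the reduced data. The run contents and values are invented.

```python
# Global attributes assumed to identify a run, as listed in the scenario.
TRACE_KEYS = ("institute", "model", "experiment", "forcing", "realization")


def aggregate(runs):
    """Stack per-run reduced series and keep per-member provenance columns.

    Each run is a dict with "attrs" (global attributes of the source files)
    and "values" (a series already on the common space-time sampling).
    """
    if len({len(run["values"]) for run in runs}) != 1:
        raise ValueError("runs must share the same time sampling")
    data = [list(run["values"]) for run in runs]
    # One list per attribute, ordered like the rows of `data`.
    provenance = {key: [run["attrs"][key] for run in runs] for key in TRACE_KEYS}
    return data, provenance


# Hypothetical decadal global means from two model integrations.
runs = [
    {"attrs": {"institute": "centre_A", "model": "model_A",
               "experiment": "decadal", "forcing": "GHG", "realization": 1},
     "values": [14.1, 14.3]},
    {"attrs": {"institute": "centre_B", "model": "model_B",
               "experiment": "decadal", "forcing": "GHG", "realization": 1},
     "values": [13.9, 14.2]},
]

data, provenance = aggregate(runs)
print(provenance["model"])
```

Row i of the data and entry i of every provenance list refer to the same
original integration, which is what keeps the aggregated file traceable.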

I'd just like to point out that the problem Jamie mentions above will be
faced by a large part of the community. At least, I see that happening
quickly with the decadal prediction part of CMIP5. If this is not a CF
problem, I wonder whether GO-ESSP, Metafor, ESG-CET or Curator are doing
something about it. If someone has an idea of who might be dealing with
the issue, I'd be really interested in hearing about it.

Cheers,
Paco

-- 
____________________________________________
Francisco J. Doblas-Reyes
Catalan Institute for Climate Sciences (IC3)
Doctor Trueta 203 - 08005 Barcelona, Spain
+34 93 5679977 Ext 316
____________________________________________

Received on Wed Mar 17 2010 - 05:07:42 GMT