Dear Steve and Balaji,
Thanks for bringing back the issue of CF and the ensemble axis.
> 1. The ensemble axis proposal does not solve the general multi-model
> ensemble problem.
> * In earlier discussions it was stated that the ensemble axis
> is intended to address multi-model ensembles. But the model
> runs in a multi-model ensemble will in general be on a
> number of differing grids. The proposal addresses only the
> limited sub-case in which all models have been re-gridded to
> the same grid. This leaves unanswered how CF can support
> ensembles on multiple grids. We should explore the answer
> to the general question before committing to the specialized
> solution.
The original proposal concerned multi-forecast system ensembles. This
includes initial-condition ensembles, perturbed-parameter ensembles and
multi-model ensembles. It is likely that the first two have the same grid
in all the forecasts, because they would be generated by the same model
version. I wouldn't call these examples a limited sub-case.
Solving the question of how to handle multiple grids in the same file
before introducing the ensemble dimension would be ideal, but in the
meantime the dissemination of standard NetCDF files with all sorts of
ensemble forecasts is limited.
> 2. A netCDF-style ensemble axis is a marginal model for the
> underlying problem.
> * The "ensemble axis" is not an ordered axis. So when clients
> are working with models from an ensemble they will often not
> be accessing contiguous ranges of indices on the axis.
> NetCDF dimensions are ordered and can only provide direct
> API support for contiguous ranges on a dimension. So the
> ensemble axis proposal will not provide the usual and
> expected benefits of a netCDF dimension.
I agree that the treatment of the ensemble dimension is far from
perfect. However, I don't understand why dimensions can only provide
direct support for contiguous ranges. In a NetCDF file with various
(deterministic) forecasts, the variables "forecast_reference_time" and
"forecast_period" can be used in CF to determine the verifying time of
the forecast. Although these variables are referenced (at least one of
them with respect to a calendar), they don't need to be continuous in
range as forecasts can also be unequally spaced in time.
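As a rough illustration of this point, the verifying time is simply the
reference time plus the forecast period, and nothing requires either set
of values to be evenly spaced. The sketch below uses plain Python lists
with invented values standing in for the decoded contents of
"forecast_reference_time" and "forecast_period"; the function name
verifying_times is hypothetical and not part of CF or any library.

```python
from datetime import datetime, timedelta

# Invented values, standing in for what might be decoded from the CF
# variables "forecast_reference_time" and "forecast_period" (in hours).
forecast_reference_time = [
    datetime(2007, 2, 1, 0),
    datetime(2007, 2, 1, 0),
    datetime(2007, 2, 15, 12),
]
forecast_period_hours = [24, 240, 6]  # deliberately unequally spaced


def verifying_times(reference_times, periods_hours):
    """Return reference time + forecast period for each forecast."""
    return [
        ref + timedelta(hours=hours)
        for ref, hours in zip(reference_times, periods_hours)
    ]


valid = verifying_times(forecast_reference_time, forecast_period_hours)
for ref, hours, v in zip(forecast_reference_time, forecast_period_hours, valid):
    print(f"{ref:%Y-%m-%d %H:%M} + {hours:>3} h -> {v:%Y-%m-%d %H:%M}")
```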
> 4. In realistic data management scenarios the ensemble axis will not
> be a sufficient solution to the problem; "aggregation servers"
> will be needed as well. (And when aggregation servers are
> introduced into the problem space, there are alternative
> approaches that should be considered, too.)
We have created an example of an aggregation server that contains
multi-model ensemble data:
http://ensembles.ecmwf.int/thredds/catalog.html
This is an effort to satisfy the need of the community to access
forecast data in an efficient way (of course, we can rewrite the files
once a consensus has been reached, be it a modified version of the
"ensembles" proposal or the use of NetCDF4). Although the files are big,
the system seems to cope fine with them. Forecasts from additional
models might be added in the near future, but as they will be identified
with the variables "source", "institution", "experiment_id" and
"realization" (some of them are not considered as global attributes any
more), there shouldn't be any problem in being social. However, this
dataset has been created with the individual model outputs taking into
account the other contributors, which, as you point out, won't be the
general case.
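To make the identification idea concrete, here is a minimal sketch of how
a client might pick ensemble members by their labels rather than by
position on the axis. It does not describe how the server above works:
the member labels, their values and the helper select_members are all
invented for illustration, with only the four variable names taken from
the text.

```python
# Invented per-member labels along a hypothetical ensemble axis. Only the
# key names ("source", "institution", "experiment_id", "realization")
# come from the message; every value here is made up.
members = [
    {"source": "model_A", "institution": "centre_1", "experiment_id": "exp_1", "realization": 0},
    {"source": "model_B", "institution": "centre_2", "experiment_id": "exp_1", "realization": 0},
    {"source": "model_A", "institution": "centre_1", "experiment_id": "exp_1", "realization": 1},
    {"source": "model_B", "institution": "centre_2", "experiment_id": "exp_1", "realization": 1},
]


def select_members(members, **criteria):
    """Return the ensemble-axis indices whose labels match all criteria."""
    return [
        index
        for index, labels in enumerate(members)
        if all(labels.get(key) == value for key, value in criteria.items())
    ]


# The selected indices need not be contiguous on the ensemble axis.
model_a = select_members(members, source="model_A")
second_runs = select_members(members, realization=1)
print(model_a, second_runs)
```

Adding a further model under this scheme would only append entries with
new label values; existing selections by label would be unaffected.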
> 5. If implemented the proposal will create significant barriers to
> interoperability.
> * CF1.0 has created the highest level of model-sharing
> interoperability that our community has ever seen.
> Interoperability is arguably the greatest contribution that
> CF has made. (It is for this reason, for example, that ESRI
> products have begun to support CF.) Many clients that are
> currently capable of reading CF 1.0 will not be able to
> access model outputs that utilize this proposal. The scope
> of this problem -- weighing the benefits against the losses
> -- deserves to be discussed and assessed.
You're right. The use of a fifth dimension prevents GrADS and Ferret
(among others) from handling the files. However, to my knowledge, the
inability of those clients to work with a fifth dimension has already led
users of ensemble forecasts to abandon them and look for alternatives
such as R, IDL or MATLAB.
> 6. The proposal potentially compromises the future quality of CF
> because netCDF 4 will offer solutions that model the problem properly.
When is netCDF4 expected to be available to start writing ensemble
forecast files?
My message is that there is a demand for standardized NetCDF ensemble
forecast files. Shouldn't the adoption of a CF standard to write those
files depend on when an alternative, more adequate solution is available?
Best regards,
Paco
--
________________________________________
Francisco J. Doblas-Reyes
European Centre for Medium-Range
Weather Forecasting (ECMWF)
Shinfield Park, RG2 9AX
Reading, UK
Tel: +44 (0)118 9499 655
Fax: +44 (0)118 9869 450
f.doblas-reyes at ecmwf.int
_______________________________________
Received on Mon Feb 26 2007 - 07:46:30 GMT