
[CF-metadata] Getting back to ensembles

From: Simon Wood <simon.wood>
Date: Fri, 01 Dec 2006 14:43:54 +1300

Dear Paco, Jonathan and others,

At NIWA we are building a multi-model environmental forecasting system.
We may at some stage want to combine output in a fashion broadly similar
to these multi-model ensembles, so I have been watching this discussion
with interest. I'd like to add a couple of comments/suggestions on the
example Paco posted the other day.

My main comment concerns the time axis, which is unclear to me. I
understand the concepts of 'reference time' (aka 'analysis time') and
'forecast period' (ie elapsed time since analysis time). Here at NIWA
we also commonly refer to 'validity time', being the forecast time axis
(ie validity time = reference time + forecast period). This is what I
would have expected to see as the main time axis for the dataset. The
'leadtime' variable in the example is confusing: it carries the standard
name 'forecast_period' but holds dates in 'days since 1950-01-01'.
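
To make the relationship concrete, here is a minimal Python sketch (not part of the example file) that turns the first reference time and forecast period into calendar dates. It assumes the 'days since 1950-01-01 00:00:00' epoch from the example and a standard Gregorian calendar; the variable names are illustrative only.

```python
from datetime import datetime, timedelta

# Epoch taken from the units string used in the example file.
EPOCH = datetime(1950, 1, 1)

reference_time_days = 18659   # first reftime value in the example
forecast_period_days = 14     # elapsed days since that reference time

# validity time = reference time + forecast period
validity_time_days = reference_time_days + forecast_period_days

# Convert both offsets into calendar dates for easier reading.
reference_date = EPOCH + timedelta(days=reference_time_days)
validity_date = EPOCH + timedelta(days=validity_time_days)

print(validity_time_days)
print(reference_date.date(), validity_date.date())
```

The forecast period stays a plain elapsed time in days, while the reference and validity times are both offsets from the same epoch.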

As Jonathan pointed out, a variable with standard name 'forecast_period'
should have units of 'days' and hold the number of days elapsed since the
relevant value of 'reference time'.

With this change made, I would want to introduce a time coordinate
variable with standard name 'time' and units of 'days since <epoch>'
(taking values like those given in the example for leadtime). This
would not be monotonic in dimension 'time' so I don't think it should be
treated as a standard 1D coordinate variable (ie time(time)). However
time seems a sensible name and in fact it seems that it is the naming of
the 'time' dimension which is inappropriate. If the 'time' dimension
were renamed something like 'i' (to better reflect its meaning as simply
an index into a concatenation of an overlapping series of datasets) then
time(i) becomes a 1D *auxiliary* coordinate variable which should then
be referenced by a 'coordinates = time' attribute in the data variables.
The time_bnd variable should probably be attached to this time variable. This
seems more consistent with earlier postings on the subject (eg
http://www.cgd.ucar.edu/pipermail/cf-metadata/2006/001008.html).
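
The non-monotonicity can be checked by looking for places where the validity time steps backwards along the index. The following is a small Python sketch using the validity times from the example; the helper name `backward_steps` is hypothetical.

```python
# Validity times (days since 1950-01-01) taken from the example data.
time = [
    18673, 18701, 18732, 18762, 18793, 18823,
    18762, 18793, 18823, 18854, 18885, 18915,
    18854, 18885, 18915, 18946, 18976, 19007,
    18946, 18976, 19007, 19038, 19066, 19097,
]


def backward_steps(values):
    """Return each index k where values[k + 1] is smaller than values[k]."""
    return [k for k in range(len(values) - 1) if values[k + 1] < values[k]]


# Indices at which one forecast ends and the next, overlapping one begins.
drops = backward_steps(time)
is_monotonic = not drops

print(drops, is_monotonic)
```

Any backward step means the values cannot serve as a 1D coordinate variable, which is why an auxiliary coordinate on an index dimension fits better here.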

A modified example is included below (just focusing on the time stuff).
Note that I have made reftime an auxiliary coordinate variable too, but
made leadtime into an ancillary variable since conceptually the dataset
is surely dimensioned (ensemble, reftime, time, level, latitude,
longitude). Leadtime could be omitted altogether since it is trivially
derived from leadtime(i) = time(i) - reftime(i).
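
As a sketch of that derivation, the following Python snippet computes leadtime element by element from the reftime and time values listed in the example. Plain lists are used here for illustration; a real dataset would presumably be read with a netCDF library instead.

```python
# Reference times (days since 1950-01-01): four start dates, six months each.
reftime = [18659] * 6 + [18748] * 6 + [18840] * 6 + [18932] * 6

# Validity times (days since 1950-01-01) taken from the example data.
time = [
    18673, 18701, 18732, 18762, 18793, 18823,
    18762, 18793, 18823, 18854, 18885, 18915,
    18854, 18885, 18915, 18946, 18976, 19007,
    18946, 18976, 19007, 19038, 19066, 19097,
]

# leadtime(i) = time(i) - reftime(i), an elapsed time in days.
leadtime = [t - r for t, r in zip(time, reftime)]

print(leadtime)
```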

My second comment concerns the use of 'Conventions = "CF-1.0"' for
datasets which rely on extensions to CF-1.0. I think there needs to be
a mechanism to signal that a dataset uses new conventions not covered in
the CF-1.0 (28 October 2003) document and also that these new
conventions are provisional and might yet change. Perhaps "CF-1.1
preliminary" or "CF-1.0 plus local extensions" or similar?
(This point obviously needs further discussion which is probably more
relevant to the "Provisional standards" thread, so I'll leave it there
for now ;-) -- just wanted to note that it doesn't seem appropriate to
use "CF-1.0" in this case.)

regards

slw


netcdf MM_129_mon_2001-modified {
dimensions:
         longitude = 144 ;
         latitude = 71 ;
         level = 3 ;
         i = 24 ; // perhaps not best name, but not 'time'
         time_bnd = 2 ;
         ensemble = 63 ;
         string4 = 4 ;
         string15 = 15 ;
         string50 = 50 ;
variables:
         float longitude(longitude) ;
         float latitude(latitude) ;

         float reftime(i) ;
                 reftime:units = "days since 1950-01-01 00:00:00" ;
                 reftime:standard_name = "forecast_reference_time" ;
                 reftime:long_name = "forecast reference time" ;
         int leadtime(i) ;
                 leadtime:units = "days" ;
                 leadtime:standard_name = "forecast_period" ;
                 leadtime:long_name = "Time elapsed since the start of the forecast" ;
         int time(i) ;
                 time:units = "days since 1950-01-01 00:00:00" ;
                 time:standard_name = "time" ;
                 time:long_name = "Forecast Validity Time" ;
                 time:bounds = "time_bnd" ;
         int time_bnd(i, time_bnd) ;
                 time_bnd:units = "days since 1950-01-01 00:00:00" ;

         int realization(ensemble) ;
         char experiment_id(ensemble, string4) ;
         char source(ensemble, string50) ;
         char institution(ensemble, string15) ;
         float level(level) ;

         float geopotential(ensemble, i, level, latitude, longitude) ;
                 geopotential:data_type = "float" ;
                 geopotential:units = "m2 s-2" ;
                 geopotential:unit_long = "square_meter_per_square_second" ;
                 geopotential:standard_name = "geopotential" ;
                 geopotential:long_name = "geopotential" ;
                 geopotential:cell_methods = "time: mean (interval 1 day)" ;
                 geopotential:coordinates = "time reftime";
                 geopotential:ancillary_variables = "leadtime";
                 geopotential:_FillValue = 1.e+12f ;

data:
   reftime = 18659, 18659, 18659, 18659, 18659, 18659, 18748, 18748,
18748, 18748, 18748, 18748, 18840, 18840, 18840, 18840, 18840, 18840,
18932, 18932, 18932, 18932, 18932, 18932 ;

   leadtime = 14, 42, 73, 103, 134, 164, 14, 45, 75, ...

   time = 18673, 18701, 18732, 18762, 18793, 18823, 18762, 18793,
18823, 18854, 18885, 18915, 18854, 18885, 18915, 18946, 18976, 19007,
18946, 18976, 19007, 19038, 19066, 19097 ;



Francisco Doblas-Reyes wrote:
> Dear all,
>
> I have to confess that I'm slightly lost after the exchange of messages
> in the last few weeks (time during which I was away and unable to
> respond). However, I'll try to make some comments on the proposed
> structures and send an example of what I've understood.
>
> Using "realization" as the dimension to include ensemble data in the
> file seems a good option to me, although I went for a different
> dimension name (see below). String variables with this dimension should be
> able to do the job to include the metadata describing the simulations.
>
> What is not so clear to me is whether attributes such as "institution"
> or "source" or the one mentioned by Jamie "experiment_id" would be
> allowed as auxiliary variables, "realization" being for me the
> coordinate variable. Their inclusion in the list of accepted standard
> names would be the best for me, but having them as standard_metadata or
> external_vocabulary is also acceptable. However, the use of external
> dictionaries poses certain problems, as discussed. We can maintain at
> ECMWF those describing the simulations performed with all the models
> that are run and archived here (several European forecast models), but
> nothing guarantees that the format will be similar to the vocabularies
> kept at, say, NCEP. Reaching an agreement on the external dictionaries
> might again take a few months and require contacting people who are not
> part of the CF list.
>
> Below there is an example of multi-forecast system file which is close
> to what is being discussed here. It's been constructed from a set of
> multi-model seasonal forecasts for the year 2001 (started on the first
> of February, May, August and November). The variable is geopotential
> height. Each single model (there are 7) contributes with a 9-member
> 6-month simulation, where each member has been produced with slightly
> different initial conditions. You'll see that "ensemble" is a dimension,
> the label/coordinate variable (dimensioned with ensemble) being
> "realization", which is not monotonic (one of the alternatives discussed
> in Jonathan's message). Realization is a number because they correspond
> to the production order in the original single-model ensemble. Please,
> note the use of "reftime" and "leadtime" and let me know if you disagree
> with something. I included values for the variables "source" and
> "institution" (this last one being the institution providing the data),
> as well as for "experiment_id". Note that "experiment_id" is necessary
> because several experiments can be carried out with the same value of
> source, realization and institution. Keeping all these variables
> separate (instead of a long character chain) makes life easier for data
> handling and plot labelling.
>
> I could read and plot the file with ncBrowse (both as a stand-alone file
> and as a request to a Thredds server) and I'm in the process of doing
> the same with the NCO functions.
>
> Finally, as data providers we can rewrite the files once a final
> decision has been reached. In the meantime, we can serve the data with
> metadata similar to those below, but expressing that they can be changed
> in the near future.
>
> Best regards,
> Paco
>
>
> netcdf MM_129_mon_2001 {
> dimensions:
> longitude = 144 ;
> latitude = 71 ;
> level = 3 ;
> time = 24 ;
> time_bnd = 2 ;
> ensemble = 63 ;
> string4 = 4 ;
> string15 = 15 ;
> string50 = 50 ;
> variables:
> float longitude(longitude) ;
> longitude:data_type = "float" ;
> longitude:units = "degrees_east" ;
> longitude:axis = "X" ;
> longitude:standard_name = "longitude" ;
> longitude:topology = "circular" ;
> longitude:modulo = 360 ;
> longitude:valid_min = 0. ;
> longitude:valid_max = 359. ;
> float latitude(latitude) ;
> latitude:data_type = "float" ;
> latitude:units = "degrees_north" ;
> latitude:axis = "Y" ;
> latitude:standard_name = "latitude" ;
> latitude:valid_min = -89. ;
> latitude:valid_max = 89. ;
> float reftime(time) ;
> reftime:units = "days since 1950-01-01 00:00:00" ;
> reftime:standard_name = "forecast_reference_time" ;
> reftime:long_name = "forecast reference time" ;
> int leadtime(time) ;
> leadtime:units = "days since 1950-01-01 00:00:00" ;
> leadtime:standard_name = "forecast_period" ;
> leadtime:long_name = "Time elapsed since the start of
> the forecast" ;
> leadtime:bounds = "time_bnd" ;
> int time_bnd(time, time_bnd) ;
> time_bnd:units = "days since 1950-01-01 00:00:00" ;
> int realization(ensemble) ;
> realization:standard_name = "realization" ;
> realization:long_name = "Number of the simulation in the
> ensemble" ;
> char experiment_id(ensemble, string4) ;
> experiment_id:standard_name = "experiment_id" ;
> experiment_id:long_name = "Experiment identifier" ;
> char source(ensemble, string50) ;
> source:standard_name = "source" ;
> source:long_name = "Method of production of the data" ;
> char institution(ensemble, string15) ;
> institution:standard_name = "institution" ;
> institution:long_name = "Institution responsible for the
> forecast system" ;
> float level(level) ;
> level:data_type = "float" ;
> level:units = "hPa" ;
> level:axis = "Z" ;
> level:standard_name = "air_pressure" ;
> level:positive = "up" ;
> float geopotential(ensemble, time, level, latitude, longitude) ;
> geopotential:data_type = "float" ;
> geopotential:units = "m2 s-2" ;
> geopotential:unit_long = "square_meter_per_square_second" ;
> geopotential:standard_name = "geopotential" ;
> geopotential:long_name = "geopotential" ;
> geopotential:cell_methods = "leadtime: mean (interval 1
> day)" ;
> geopotential:_FillValue = 1.e+12f ;
>
> // global attributes:
> :Conventions = "CF-1.0" ;
> :Generator = "SeasPy v1.1" ;
> :Created = "Fri Nov 10 15:09:50 2006" ;
> :References =
> "http://www.ecmwf.int/research/demeter/index.html" ;
> :Comment = "Data interpolated from original model grid
> into a regular grid. Data restrictions: none" ;
>
> data:
>
> skip lat, lon and level
>
> reftime = 18659, 18659, 18659, 18659, 18659, 18659, 18748, 18748,
> 18748, 18748, 18748, 18748, 18840, 18840, 18840, 18840, 18840, 18840,
> 18932, 18932, 18932, 18932, 18932, 18932 ;
>
> leadtime = 18673, 18701, 18732, 18762, 18793, 18823, 18762, 18793,
> 18823, 18854, 18885, 18915, 18854, 18885, 18915, 18946, 18976, 19007,
> 18946, 18976, 19007, 19038, 19066, 19097 ;
>
> skip time_bnd
>
> realization = 0, 1, 2, 3, 4, 5, 6, 7, 8, 0, 1, 2, 3, 4, 5, 6, 7, 8, 0,
> 1, 2, 3, 4, 5, 6, 7, 8, 0, 1, 2, 3, 4, 5, 6, 7, 8, 0, 1, 2, 3, 4, 5, 6,
> 7, 8, 0, 1, 2, 3, 4, 5, 6, 7, 8, 0, 1, 2, 3, 4, 5, 6, 7, 8 ;
>
> experiment_id =
> "cnrm",
> "cnrm",
> "cnrm",
> "cnrm",
> "cnrm",
> "cnrm",
> "cnrm",
> "cnrm",
> "cnrm",
> "crfc",
> "crfc",
> "crfc",
> "crfc",
> "crfc",
> "crfc",
> "crfc",
> "crfc",
> "crfc",
> "lody",
> "lody",
> "lody",
> "lody",
> "lody",
> "lody",
> "lody",
> "lody",
> "lody",
> "scnr",
> "scnr",
> "scnr",
> "scnr",
> "scnr",
> "scnr",
> "scnr",
> "scnr",
> "scnr",
> "scwf",
> "scwf",
> "scwf",
> "scwf",
> "scwf",
> "scwf",
> "scwf",
> "scwf",
> "scwf",
> "smpi",
> "smpi",
> "smpi",
> "smpi",
> "smpi",
> "smpi",
> "smpi",
> "smpi",
> "smpi",
> "ukmo",
> "ukmo",
> "ukmo",
> "ukmo",
> "ukmo",
> "ukmo",
> "ukmo",
> "ukmo",
> "ukmo" ;
>
> source =
> "DEMETER, ARPEGE3/OPA8.2, System 0, Method 1 ",
> "DEMETER, ARPEGE3/OPA8.2, System 0, Method 1 ",
> "DEMETER, ARPEGE3/OPA8.2, System 0, Method 1 ",
> "DEMETER, ARPEGE3/OPA8.2, System 0, Method 1 ",
> "DEMETER, ARPEGE3/OPA8.2, System 0, Method 1 ",
> "DEMETER, ARPEGE3/OPA8.2, System 0, Method 1 ",
> "DEMETER, ARPEGE3/OPA8.2, System 0, Method 1 ",
> "DEMETER, ARPEGE3/OPA8.2, System 0, Method 1 ",
> "DEMETER, ARPEGE3/OPA8.2, System 0, Method 1 ",
> "DEMETER, ARPEGE3/OPA8.2, System 0, Method 1 ",
> "DEMETER, ARPEGE3/OPA8.2, System 0, Method 1 ",
> "DEMETER, ARPEGE3/OPA8.2, System 0, Method 1 ",
> "DEMETER, ARPEGE3/OPA8.2, System 0, Method 1 ",
> "DEMETER, ARPEGE3/OPA8.2, System 0, Method 1 ",
> "DEMETER, ARPEGE3/OPA8.2, System 0, Method 1 ",
> "DEMETER, ARPEGE3/OPA8.2, System 0, Method 1 ",
> "DEMETER, ARPEGE3/OPA8.2, System 0, Method 1 ",
> "DEMETER, ARPEGE3/OPA8.2, System 0, Method 1 ",
> "DEMETER, IFS23R4/OPA8.2, System 0, Method 1 ",
> "DEMETER, IFS23R4/OPA8.2, System 0, Method 1 ",
> "DEMETER, IFS23R4/OPA8.2, System 0, Method 1 ",
> "DEMETER, IFS23R4/OPA8.2, System 0, Method 1 ",
> "DEMETER, IFS23R4/OPA8.2, System 0, Method 1 ",
> "DEMETER, IFS23R4/OPA8.2, System 0, Method 1 ",
> "DEMETER, IFS23R4/OPA8.2, System 0, Method 1 ",
> "DEMETER, IFS23R4/OPA8.2, System 0, Method 1 ",
> "DEMETER, IFS23R4/OPA8.2, System 0, Method 1 ",
> "DEMETER, ECHAM4/OPA8.2, System 0, Method 1 ",
> "DEMETER, ECHAM4/OPA8.2, System 0, Method 1 ",
> "DEMETER, ECHAM4/OPA8.2, System 0, Method 1 ",
> "DEMETER, ECHAM4/OPA8.2, System 0, Method 1 ",
> "DEMETER, ECHAM4/OPA8.2, System 0, Method 1 ",
> "DEMETER, ECHAM4/OPA8.2, System 0, Method 1 ",
> "DEMETER, ECHAM4/OPA8.2, System 0, Method 1 ",
> "DEMETER, ECHAM4/OPA8.2, System 0, Method 1 ",
> "DEMETER, ECHAM4/OPA8.2, System 0, Method 1 ",
> "DEMETER, IFS23R4/HOPE-E, System 0, Method 1 ",
> "DEMETER, IFS23R4/HOPE-E, System 0, Method 1 ",
> "DEMETER, IFS23R4/HOPE-E, System 0, Method 1 ",
> "DEMETER, IFS23R4/HOPE-E, System 0, Method 1 ",
> "DEMETER, IFS23R4/HOPE-E, System 0, Method 1 ",
> "DEMETER, IFS23R4/HOPE-E, System 0, Method 1 ",
> "DEMETER, IFS23R4/HOPE-E, System 0, Method 1 ",
> "DEMETER, IFS23R4/HOPE-E, System 0, Method 1 ",
> "DEMETER, IFS23R4/HOPE-E, System 0, Method 1 ",
> "DEMETER, ECHAM5/OM1, System 0, Method 1 ",
> "DEMETER, ECHAM5/OM1, System 0, Method 1 ",
> "DEMETER, ECHAM5/OM1, System 0, Method 1 ",
> "DEMETER, ECHAM5/OM1, System 0, Method 1 ",
> "DEMETER, ECHAM5/OM1, System 0, Method 1 ",
> "DEMETER, ECHAM5/OM1, System 0, Method 1 ",
> "DEMETER, ECHAM5/OM1, System 0, Method 1 ",
> "DEMETER, ECHAM5/OM1, System 0, Method 1 ",
> "DEMETER, ECHAM5/OM1, System 0, Method 1 ",
> "DEMETER, GloSea, System 0, Method 1 ",
> "DEMETER, GloSea, System 0, Method 1 ",
> "DEMETER, GloSea, System 0, Method 1 ",
> "DEMETER, GloSea, System 0, Method 1 ",
> "DEMETER, GloSea, System 0, Method 1 ",
> "DEMETER, GloSea, System 0, Method 1 ",
> "DEMETER, GloSea, System 0, Method 1 ",
> "DEMETER, GloSea, System 0, Method 1 ",
> "DEMETER, GloSea, System 0, Method 1 " ;
>
> institution =
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF ",
> "ECMWF " ;
>


-- 
Simon Wood
National Institute of Water and Atmospheric Research, Wellington, NZ	
simon.wood at niwa.co.nz
http://www.niwa.co.nz
Received on Thu Nov 30 2006 - 18:43:54 GMT
