[CF-metadata] CF and a representation of probabilistic forecasts from Pamment, JA on 2006-10-12 (Archive of CF discussions from 2002 to 2019 on the cf-metadata mailing list)

From: Pamment, JA <J.A.Pamment>
Date: Thu, 12 Oct 2006 12:52:27 +0100

Hi All,

In the update of the standard name table that took place on the 26th
September 2006 the following name was added:
realization; 1; realization is used to label a dimension that can be
thought of as a statistical sample, e.g., labelling members of a model
ensemble.

This resulted from a proposal by Jamie Kettleborough (originally for the
name "sample" although "realization" was later substituted).

Jamie also requested a new standard name modifier which we would now
like to call "realization_weight" with units of "1". Please can this be
added to the list of modifiers in Appendix C of the CF 1.0 doc?

Alison

On 03 May 2006 10:30 Jamie Kettleborough wrote:
>
> Hello,
>
> There are a couple of projects (the Hadley Centre QUMP project and
> climateprediction.net) that will be distributing probabilistic
> forecasts of climate change based on ensembles of model runs. One way
> of representing these results will be a set of model runs and a set of
> weights that should be applied to each model run.
> I think this can be accommodated straightforwardly, and reasonably
> generally, in the CF standard using 'ancillary_variables' and the
> addition of
>
> 1) New standard name 'sample', used to label a dimension that can be
> thought of as a statistical sample (e.g. ensemble)
>
> 2) New standard name modifier 'sample_weight', used to label variables
> that are acting as weights for other quantities
>
> e.g. a coarse map of predictions of 21st century temperature change.
> In this example temperature is dimensioned by sample as well as the
> normal space and time dimensions. Each sample is the result of one
> model run. Some models are less realistic than others and so should be
> down-weighted in any subsequent analysis. The weights variable gives
> the weight for each ensemble member.
>
> dimensions:
>   lat = 18 ;
>   lon = 36 ;
>   time = 10 ;
>   sample = 10000 ; // sample points
> variables:
>   // each sample is the result from one ensemble member
>   float temp(sample,time,lat,lon) ;
>     temp:long_name = "Temperature at 1.5m" ;
>     temp:standard_name = "air_temperature" ;
>     temp:ancillary_variables = "weights" ;
>     temp:source = "perturbed physics ensemble of HadSM3" ;
>   // the weight applied to each ensemble member
>   float weights(sample) ;
>     weights:long_name = "likelihood weights for 1.5m air temperature" ;
>     weights:standard_name = "air_temperature sample_weight" ;
>
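
A minimal Python sketch of how software might consume such a weights variable: a weighted mean taken over the sample dimension at a single (time, lat, lon) point. This is an illustration only, not part of the proposal. The function name weighted_ensemble_mean and the numbers are hypothetical, and dividing by the sum of the weights is an assumption so that unnormalised weights also work.

```python
def weighted_ensemble_mean(values, weights):
    """Weighted mean over the sample (ensemble member) dimension.

    values  -- one number per sample, e.g. temp at one (time, lat, lon) point
    weights -- one non-negative weight per sample, in the same order
    """
    if len(values) != len(weights):
        raise ValueError("values and weights must share the sample dimension")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    # Dividing by the total means the weights need not already sum to 1.
    return sum(v * w for v, w in zip(values, weights)) / total


# Made-up numbers: three ensemble members at one grid point.
temp_change = [2.0, 3.0, 4.0]
member_weights = [0.5, 0.25, 0.25]
mean_change = weighted_ensemble_mean(temp_change, member_weights)
```

In a real file the same calculation would be repeated for every time, lat and lon index, with the weights read from the variable named in temp:ancillary_variables.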
> Notes:
> 1. The sample points can be generated from a perturbed physics
> ensemble or a detection and attribution exercise (or possibly some
> other statistical method), so I don't think you want to explicitly use
> the term 'ensemble'. 'sample' is better (though potentially confusing
> with grab samples or bucket samples? Maybe 'distribution_sample' is a
> better name?).
> 2. If the sample dimension is not identified by its standard name then
> there is an implied rule that the software has to infer which
> dimension to apply the weights to, based on the common dimension.
> 3. sample_weight variables have an implied valid_min=0 and valid_max=1
> (although the valid_max may be relaxed if you are prepared to
> renormalise later).
> 4. The 'ancillary_variables' attribute may point to more than one
> sample_weight. This might represent different sensitivity studies,
> different observations used for skill scores, or different
> methodologies. In this case each sample_weight should be thought of as
> applied on its own; they are not applied in sequence.
> 5. The same sample_weight variable can be referenced by more than one
> variable. This is useful for forming joint (multidimensional) pdfs
> between variables. In this case, although the ordering of the samples
> is arbitrary, it must be used consistently: the same order should be
> used for all variables in the file.
> 6. The creation method of the sample points and associated weights
> should be left to a description in the 'source' attribute (which may
> refer to a URL for more information). In the case of perturbed physics
> ensembles the derivation of weights can be complex, so reference to
> external documents to describe the method will avoid unnecessarily
> overloading the usage metadata.
> 7. In other examples the weights might be a function of space and time
> as well as sample member.
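
Notes 2 and 3 can be sketched in a few lines of Python. This is only one possible reading, not something the proposal specifies: the helper names common_dimensions and renormalise are hypothetical, dimensions are matched by name, and weights are rescaled so that they sum to 1.

```python
def common_dimensions(data_dims, weight_dims):
    """Dimensions shared by a data variable and its weights (note 2).

    The result keeps the order used by the data variable.
    """
    return [dim for dim in data_dims if dim in weight_dims]


def renormalise(weights):
    """Rescale non-negative weights so that they sum to 1 (note 3)."""
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    total = sum(weights)
    if total == 0:
        raise ValueError("at least one weight must be positive")
    return [w / total for w in weights]


# Dimension names taken from the example file above.
temp_dims = ("sample", "time", "lat", "lon")
weights_dims = ("sample",)
shared = common_dimensions(temp_dims, weights_dims)

# Made-up weights that exceed the implied valid_max of 1 before rescaling.
raw_weights = [2.0, 1.0, 1.0]
unit_weights = renormalise(raw_weights)
```

If the weights also depend on space and time, as in note 7, the shared list would contain more than one dimension and the rescaling would presumably be done separately for each space and time point.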
>
> I hope this all makes enough sense for people to make a judgement on
> whether this should be accepted or not. Obviously if I've been unclear
> let me know and I'll try to be more eloquent. If this all makes sense
> there will be a few follow-up e-mails with specific requests for
> standard names.
>
> There are a couple of other representations of probabilistic forecasts
> that might be used. These can be posted as separate suggestions as and
> when needed.
>
> Thanks,
>
> Jamie

------
Alison Pamment Tel: +44 1235 778065
NCAS/British Atmospheric Data Centre Fax: +44 1235 445858
Rutherford Appleton Laboratory Email: J.A.Pamment at rl.ac.uk
Chilton, Didcot, OX11 0QX, U.K.
Received on Thu Oct 12 2006 - 05:52:27 BST
