[CF-metadata] standards for probabilities

From: Vegard Bønes <vegard.bones>
Date: Tue, 15 Nov 2011 13:14:39 +0000 (UTC)

Thank you, Jonathan! :)

So, to be a bit more concrete, this is option 1:

float rain_25(time, y, x);
 rain_25:standard_name = "precipitation_amount";
 rain_25:cell_methods = "realization: percentile(25)";

The only problem I see with this is that, in the resulting CDM, realization is not used anywhere, apart from possibly in cell_methods. But maybe this is OK?


If I understand the second option correctly, this would lead to something like this:

float precipitation_amount(time, percentile, y, x);
 ...
float percentile(percentile);
 percentile:units = "1";
 percentile:standard_name = "cumulative_distribution_function_of_precipitation_amount";

But what is the purpose of explicitly referring to precipitation_amount in the standard name? Would not cumulative_distribution_function be better? Then the same dimension could be used for other data, such as air_temperature.

Or, if we want to add something about the nature of the source data for the function, it could be called something like cumulative_distribution_function_due_to_realization?


I am still a bit uncertain about which option is best, though.


-- Vegard




----- Original Message -----
From: "Jonathan Gregory" <j.m.gregory at reading.ac.uk>
To: "Vegard Bønes" <vegard.bones at met.no>
Cc: cf-metadata at cgd.ucar.edu
Sent: 15 November 2011 11:11:52
Subject: Re: [CF-metadata] standards for probabilities

Dear Vegard

> I want to express such things as "25th percentile precipitation amount" (based on ensemble data), and probability that air temperature will be within 2.5 degrees of the forecast. How should I do this?

You are right, this case has not yet been dealt with, although the guidelines
for construction of standard names foresee that needs like this might arise!

If the quantity is a precipitation_amount, it's fine to use that standard
name. The question is how to record that it is the 25th percentile. Two possible
ways to do this would be:

* To extend the possible syntax of cell_methods so that it can describe
percentiles. It is already possible to indicate a median in cell_methods, and
that is a particular percentile. The advantage of this way of doing it would
be that you would record whether the distribution of precipitation amounts
being considered was for time-variation, or spatial variation, or some other
kind of variation. Obviously you could have a probability distribution with
percentiles for many different independent variables.

* To use a size-1 or scalar coordinate variable to record the probability,
with a new standard_name, perhaps
cumulative_distribution_function_of_precipitation_amount.
The value of this coordinate would be 0.25 for the 25th percentile. The
advantage of this method would be that you could have several different
percentiles in the same variable, by having a multivalued probability coord.
If you wanted to be specific about what the independent variable was, that
would have to be included in the standard name as well e.g.
cumulative_distribution_function_of_precipitation_amount_over_time.
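
For illustration, the scalar-coordinate variant of the second option might be
written in CDL roughly as below. This is only a sketch: the standard name is
the one proposed in this message, not an established one, and the variable
name "probability", the file name, the dimension sizes and the units of
precipitation_amount are assumptions made for the example.

```
netcdf percentile_sketch {
dimensions:
  time = 1 ;
  y = 2 ;
  x = 2 ;
variables:
  // Scalar coordinate variable holding the probability level (assumed name).
  float probability ;
    probability:standard_name = "cumulative_distribution_function_of_precipitation_amount" ;  // proposed name only
    probability:units = "1" ;
  float precipitation_amount(time, y, x) ;
    precipitation_amount:standard_name = "precipitation_amount" ;
    precipitation_amount:units = "kg m-2" ;
    // Attach the scalar coordinate to the data variable.
    precipitation_amount:coordinates = "probability" ;
data:
  // 0.25 corresponds to the 25th percentile.
  probability = 0.25 ;
}
```

For several percentiles in one variable, probability would instead become a
dimension of precipitation_amount, with values such as 0.25, 0.5, 0.75.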

What do you think?

Cheers

Jonathan
Received on Tue Nov 15 2011 - 06:14:39 GMT
