
[CF-metadata] Return periods

From: Jonathan Gregory <j.m.gregory>
Date: Thu, 4 Sep 2014 14:13:33 +0100

Dear Dan

I agree with you that it would be better to store F(x) than to use your sign
convention for return periods. However it would be fine to split the return
periods into the two tails in different data variables and give them distinct
standard names. We have some standard names for such things e.g.
  spell_length_of_days_with_lwe_thickness_of_precipitation_amount_above_threshold
and you could propose suitable ones.
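
As a rough illustration of the split suggested above, the sketch below separates signed return periods into two lists, one per tail, with None marking points that belong to the other tail. It assumes the sign convention described in the forwarded message (positive for the wet tail, negative for the dry tail); the function name split_return_periods is hypothetical, and in a real file the None values would become missing data in two separate data variables.

```python
from typing import List, Optional, Tuple


def split_return_periods(
    signed: List[float],
) -> Tuple[List[Optional[float]], List[Optional[float]]]:
    """Split signed return periods (years) into wet-tail and dry-tail lists.

    Assumed convention: N > 0 refers to the right (wet) tail,
    N < 0 to the left (dry) tail. Both outputs hold positive values,
    with None where the point belongs to the other tail.
    """
    wet: List[Optional[float]] = []
    dry: List[Optional[float]] = []
    for n in signed:
        if n > 0:
            wet.append(n)
            dry.append(None)
        elif n < 0:
            wet.append(None)
            dry.append(-n)  # store the dry-tail period as a positive number
        else:
            raise ValueError("a return period of zero is not meaningful")
    return wet, dry


wet_tail, dry_tail = split_return_periods([10.0, -10.0, 2.0])
```

Each list would then map onto its own data variable with its own standard name, so the sign no longer carries meaning.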

If you store F(x), I think it would be a data variable, not a coordinate or
ancillary variable, and it should have a standard name. I believe the guidance
you quote is about probability distribution functions rather than cumulative
(probability) distribution functions. Following a similar approach, however,
we could have a standard name such as
  cumulative_distribution_function_of_precipitation_amount
for F(x), where x is precipitation_amount, which would be a coordinate. Is
that what you have in mind?
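
For reference, the relationship between the signed return period N and F(x) described in the forwarded message can be sketched as below. The function names are hypothetical. The handling of F(x) = 0.5, where both tails give a period of 2 years, is an assumption made here (it is assigned to the wet tail), and periods shorter than one year are rejected because they would imply a probability outside 0 to 1.

```python
def return_period_to_cdf(n: float) -> float:
    """Convert a signed return period N (years) to F(x).

    Wet tail (N > 0): F(x) = 1 - 1/N, e.g. N = 10 gives 0.9.
    Dry tail (N < 0): F(x) = 1/|N|, e.g. N = -10 gives 0.1.
    """
    if abs(n) < 1.0:
        # Assumption: |N| < 1 would give a probability outside [0, 1].
        raise ValueError("return period must have magnitude of at least 1 year")
    if n > 0:
        return 1.0 - 1.0 / n
    return 1.0 / -n


def cdf_to_return_period(f: float) -> float:
    """Convert F(x) back to a signed return period, for presentation only."""
    if not 0.0 < f < 1.0:
        raise ValueError("F(x) must lie strictly between 0 and 1")
    if f >= 0.5:
        # Assumption: the median is reported as a wet-tail period of 2 years.
        return 1.0 / (1.0 - f)
    return -1.0 / f


wet_cdf = return_period_to_cdf(10.0)
dry_cdf = return_period_to_cdf(-10.0)
```

Storing F(x) and deriving N only when drawing maps keeps the stored quantity continuous across the median, which the signed return period is not.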

Cheers

Jonathan


----- Forwarded message from "Hollis, Dan" <dan.hollis at metoffice.gov.uk> -----

> Dear all,
>
> Here is another question related to migrating our UK climate grids to NetCDF.
>
> As well as grids of the monthly rainfall total (in mm) we also generate grids of the estimated return period of the rainfall total (in years). Currently these two quantities are stored in separate files (with only the file name and location to tell us they are related). I've been trying to think how to store the return period information using CF-NetCDF and would be grateful for advice.
>
> Some further details:
>
> Our existing grids contain the return period in years, i.e. if the return period for a particular grid point is N years then this means that we estimate that the rainfall total for that grid point will be exceeded on average once every N years. This is equivalent to saying that each year there is a probability of 1/N of exceeding that rainfall amount, i.e. the cumulative distribution function, F(x) = 1 - 1/N. For example, if N = 10 then F(x) = 0.9. Additionally, as we are also interested in droughts, we have adopted our own convention of using negative values to refer to the left (dry) tail of the rainfall distribution. For example N = -10 is used to mean that F(x) = 0.1, i.e. we estimate that rainfall amounts *less* than the observed value will occur once every 10 years on average.
>
> This use of positive and negative values to indicate return periods relating to the right (wet) and left (dry) tails is convenient but unconventional. My initial thought is that we should store F(x) itself and only convert to return period for the purposes of presentation e.g. creating maps.
>
> So, how to store F(x)? The main problem is that the value to which the return period relates (i.e. the rainfall amount) varies from one grid point to another. Two possibilities occur to me, both of which involve storing F(x) alongside the rainfall total:
>
> - Store F(x) as an auxiliary coordinate
>
> - Store F(x) as ancillary data
>
> It's not clear to me whether one is better than the other, or even whether either approach is valid.
>
> The other question is what to call the F(x) values. The guidance for ancillary data says to use standard name modifiers to indicate the relationship, but there doesn't seem to be anything suitable for describing F(x).
>
> The other thing I've looked at is the guidance for constructing standard names. I can't seem to locate this on the current CF web site so I've referred to the archived copy available here:
>
> https://web.archive.org/web/20130728212039/http://cf-pcmdi.llnl.gov/documents/cf-standard-names/guidelines
>
> The section on transformations includes 'probability_distribution_of_X[_over_Z]' in the list, however it's unclear to me whether this is what I need, or even how I might use it in other circumstances. The notes state:
>
> "probability distribution (i.e. a number in the range 0.0-1.0 for each range of X) of variations (over Z) of X. The data variable should have an axis for X."
>
> The reference to 'each range of X' is the bit I find confusing. Is the idea to store F(X1), F(X2), F(X3) etc, or is it intended to be F(X2) - F(X1), F(X3) - F(X2), F(X4) - F(X3) etc? The former doesn't quite fit the description, but the latter has the problem that the number of ranges (= the number of data values) will be one less than the number of X values. I can't see any existing names that use this transformation to use as a guide.
>
> If anyone can help that would be much appreciated.
>
> Thanks,
>
> Dan
>
>
> Dan Hollis Climatologist
> Met Office Hadley Centre FitzRoy Road Exeter Devon EX1 3PB United Kingdom
> Tel: +44 (0)1392 886780 Fax: +44 (0)1392 885681
> E-mail: dan.hollis at metoffice.gov.uk Website: http://www.metoffice.gov.uk
> For UK climate and past weather information, visit http://www.metoffice.gov.uk/climate
>
>



----- End forwarded message -----
Received on Thu Sep 04 2014 - 07:13:33 BST
