[CF-metadata] statistic indices

From: Pamment, JA <J.A.Pamment>
Date: Tue, 24 Apr 2007 14:22:47 +0100

Hello Heinke,

>
> Hello again,
>
> Martina has forgotten to send a word file attachment.
>
> I would like to discuss the statistic indices. They come from HadEX
> and other projects.
>
> regards
> Heinke

For ease of reference, I have attached the original word file to this
posting. Apologies for this very lengthy posting, but there is a lot
of ground to cover here!

Forty-one new names are proposed for indices that describe aspects of
climate variability. The proposed names describe rather different
quantities to any that currently exist in the table. It has taken
me a while to analyse them but they seem to fall into five basic
categories (see below). Within each of the first four categories I
have included the definition of just one representative name to give
an example of the type of quantity being discussed. For the full
list of definitions please see the attachment.

(1) "index_per_time_period" names. These are counts of days on which
particular meteorological conditions existed.

> frost_days_index_per_time_period; 1; frost days index is the number of
> days where minimum temperature is below 0 degree Celsius. The time
> period should be defined by the bounds of the time coordinate.
> summer_days_index_per_time_period
> consecutive_frost_days_index_per_time_period
> consecutive_summer_days_index_per_time_period
> ice_days_index_per_time_period
> tropical_nights_index_per_time_period

These names all refer to conditions where the daily max or min
temperature falls above or below an absolute threshold. I think we
need not include "per_time_period" in the names as the period over
which the days are counted should be clear if the time bounds are
specified. I wonder if we even need to include the word "index" in
all the names - perhaps we could simply have:
number_of_frost_days
number_of_summer_days
number_of_ice_days
number_of_tropical_nights
For the other two names we could have:
consecutive_frost_days_index
consecutive_summer_days_index

> growing_season_length_index
I think this name is OK. As it is calculated over a calendar year the
time bounds would need to be specified appropriately.
 
> heating_degree_days_per_time_period
This could be:
number_of_heating_degree_days
I do not have sufficient expertise to comment on the definition of
heating degree days - is there a single standard for this or do
alternative definitions exist?

> wet_days_index_per_time_period
> heavy_precipitation_days_index_per_time_period
> very_heavy_precipitation_days_index_per_time_period
These all essentially describe the same concept, i.e., the number of
days when the daily accumulated precipitation exceeds a threshold.
The thresholds defining wet, heavy precipitation and very heavy
precipitation are 1 mm, 10 mm and 20 mm respectively. Perhaps we
could combine these names and use a scalar coordinate variable to
specify the threshold:
number_of_days_accumulated_precipitation_exceeds_threshold
Apart from reducing the number of names another advantage of this is
that we would not need to introduce qualitative terms such as "heavy"
and "very heavy" into the standard name table.
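
As a rough illustration of the combined name, the Python sketch below
counts days for each of the three thresholds in turn, with the
threshold playing the role of the scalar coordinate. The helper name
and sample data are assumptions, and "exceeds" is read here as
strictly greater than.

```python
def number_of_days_exceeding(daily_values, threshold):
    """Count the days on which the daily value is strictly above threshold."""
    return sum(1 for value in daily_values if value > threshold)


# Hypothetical daily accumulated precipitation in mm.
daily_precip_mm = [0.0, 0.4, 1.2, 12.0, 25.5, 10.0, 3.1]

# One count per threshold; each threshold would be stored as the
# scalar coordinate value alongside the corresponding data variable.
thresholds_mm = [1.0, 10.0, 20.0]
counts = {thr: number_of_days_exceeding(daily_precip_mm, thr)
          for thr in thresholds_mm}
```

With this reading a day with exactly 10.0 mm is not counted against
the 10 mm threshold; whether the comparison should be ">" or ">="
would need to be stated in the definition.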

As with the other "consecutive" names
> consecutive_dry_days_index_per_time_period
> consecutive_wet_days_index_per_time_period
could be simply:
consecutive_wet_days_index
consecutive_dry_days_index

> frost_days_where_no_snow_index_per_time_period
In standard names the qualifier where_type specifies that, rather
than applying to an entire grid box, the quantity applies only to the
part of the grid box of the named type. However, I don't think that
is what was intended here. My understanding is that this quantity
is a count of the number of days with minimum temperature less than
zero degrees Celsius when there is no lying snow anywhere in the
grid box. Is that correct? If so, I would suggest something like:
number_of_frost_days_without_snow_cover

> strong_breeze_days_index_per_time_period
> strong_gale_days_index_per_time_period
> hurricane_days_index_per_time_period
This is another case where the three names are essentially
describing the same concept, i.e., the number of days when the
maximum wind speed exceeds a particular threshold. The respective
definitions for "strong breeze", "strong gale" and "hurricane" are
maximum wind speeds exceeding 10.5 m s-1, 20.5 m s-1 and 32.5 m s-1.
As with the precipitation names I think we could combine these into:
number_of_days_maximum_wind_speed_exceeds_threshold
and use a scalar coordinate variable to give the value of the
threshold.

(2) "index_wrt_Nth_percentile|mean_of_reference_period" names, where N
is a number in the range 1-100 and the reference period is defined in
a variable attribute. These are counts of days when particular
meteorological conditions existed relative to a threshold defined from
the reference period.

> heat_wave_duration_index_wrt_mean_of_reference_period; 1; this is the
> number of days per time period where in intervals of at least 6
> consecutive days the daily maximum temperature is more than 5 degrees
> above a reference value. The reference value is calculated as the
> mean of maximum temperatures of a five day window centred on each
> calendar day of a given 30 year climate reference period. The time
> period should be defined by the bounds of the time coordinate. The
> climate reference period should be defined by the variable attribute,
> e.g., 1961-1990.
> cold_wave_duration_index_wrt_mean_of_reference_period
> warm_spell_days_index_wrt_90th_percentile_of_reference_period
> cold_spell_days_index_wrt_10th_percentile_of_reference_period

I am not clear as to which attribute you would use to specify the
reference period. Would you specify it as a string in the "comment"
attribute or perhaps include a non-standard attribute? Presumably
additional information would need to be specified in order to fully
describe the reference data, e.g., which model it came from?

I wonder if we actually need to include "wrt_Nth_percentile" as
part of the name since it is also part of the definition of the
indices. Would these indices ever be calculated with respect to
percentiles of the reference period other than those given here?

I think that terms such as "warm_spell", "cold_spell", etc., are
acceptable if they are used as part of the name of a defined index,
although I admit that I have never heard the term "cold_wave" before.
Do others agree?

(3) "percent of time wrt_Nth_percentile|mean_of_reference_period"
names, where N is a number in the range 1-100 and the reference period
is defined in a variable attribute. These names express the percent
of time per time period (specified by time bounds) that conditions
exist above or below a threshold calculated from the reference period.

> cold_nights_percent_wrt_10th_percentile_of_reference_period; 1; this
> is the per cent of time per time period where daily minimum
> temperature is below a reference value. The reference value is
> calculated as the 10th percentile of daily minimum temperatures of a
> five day window centred on each calendar day of a given 30 year
> climate reference period. The time period should be defined by the
> bounds of the time coordinate. The climate reference period should
> be defined by the variable attribute, e.g., 1961-1990.

> warm_nights_percent_wrt_90th_percentile_of_reference_period
> very_cold_days_percent_wrt_10th_percentile_of_reference_period
> cold_days_percent_wrt_10th_percentile_of_reference_period
> very_warm_days_percent_wrt_90th_percentile_of_reference_period
> warm_days_percent_wrt_90th_percentile_of_reference_period

(The following four names omit the word "percent" but are defined in
an analogous way to other names in this category)
> moderate_wet_days_wrt_75th_percentile_of_reference_period
> wet_days_wrt_90th_percentile_of_reference_period
> very_wet_days_wrt_95th_percentile_of_reference_period
> extremely_wet_days_wrt_99th_percentile_of_reference_period

The general comments I made in category (2) as to how much
information is required to describe the reference data also apply
here. Another general comment is that I would prefer to use
"fraction" rather than "percent" as it is more consistent with
existing standard names.

I feel that what is needed for all the names in this category is a
general way of describing the following concept:
fraction_of_days_when_min|max|mean|accumulated_temperature|precipitation_lies_above|below_reference_period_threshold
For example, the definitions of moderate_wet, wet, very_wet and
extremely_wet days differ only in the percentile of the reference
period to which they are compared. Whereas in the names of
indices I think it is OK to use terms like "warm_spell" I am far less
happy about having qualitative words like "very" and "extremely" when
referring to a physical quantity such as precipitation. In the case
of the four wet day names it seems natural to combine them as:
fraction_of_days_when_precipitation_lies_above_reference_period_threshold
and supply the appropriate threshold percentile in a scalar
coordinate variable.

The temperature related names could also be fitted into this pattern,
for example,
cold_nights_percent_wrt_10th_percentile_of_reference_period
would become
fraction_of_days_when_minimum_temperature_lies_below_reference_period_threshold
and the percentile would again be given as a scalar coordinate
variable.
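
A minimal Python sketch of the temperature case, for illustration
only. It assumes the per-day threshold values (e.g., the 10th
percentile for each calendar day) have already been derived from the
reference period; the function name and sample numbers are
hypothetical.

```python
def fraction_of_days_below(daily_tmin, thresholds):
    """Fraction (0 to 1) of days whose minimum temperature lies below
    that day's reference-period threshold."""
    if len(daily_tmin) != len(thresholds):
        raise ValueError("need one threshold per day")
    if not daily_tmin:
        return None  # undefined for an empty period
    below = sum(1 for tmin, thr in zip(daily_tmin, thresholds) if tmin < thr)
    return below / len(daily_tmin)


# Hypothetical four-day period with a constant threshold of -2 degrees.
fraction = fraction_of_days_below([-5.0, 2.0, -1.0, 4.0], [-2.0] * 4)
percent = 100.0 * fraction  # the value the original "percent" name would hold
```
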

Comments on this idea would be most welcome!

(4) "per cent of amount per time period" names. These names express
the percentage of a quantity, e.g., precipitation, that has occurred
during a time period (defined by time bounds) from days when
particular meteorological conditions existed.

> precipitation_percent_due_to_R75p_days; 1; percentage of total
> precipitation amount per time period due to
> moderate_wet_days_wrt_75th_percentile_of_reference_period (see
> category (3)).

> precipitation_percent_due_to_R90p_days
> precipitation_percent_due_to_R95p_days
> precipitation_percent_due_to_R99p_days

R75p, R90p, R95p and R99p refer to the wet days discussed in
Category (3). For example, R75p refers to "moderate wet" days so that
precipitation_percent_due_to_R75p_days means the percentage of
precipitation accumulated over the whole time period that fell on
moderate wet days. I think that it would be better to make the names
explicit rather than use these abbreviations. Also, I would
prefer to use "fraction" rather than "percent". Again, given the
similarity between all four names and definitions it seems reasonable
to combine them as:
fraction_of_accumulated_precipitation_due_to_days_exceeding_reference_period_threshold
and supply the threshold percentile in a coordinate variable.
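
For illustration, the quantity could be computed along the following
lines in Python. The sketch assumes a single threshold value already
derived from the chosen percentile of the reference period; the
function name and the sample data are hypothetical.

```python
def fraction_of_precipitation_from_days_above(daily_precip, threshold):
    """Fraction of the period's total precipitation that fell on days
    whose daily amount exceeded the threshold."""
    total = sum(daily_precip)
    if total == 0:
        return None  # undefined when no precipitation fell in the period
    return sum(p for p in daily_precip if p > threshold) / total


# Hypothetical daily amounts (mm) and a threshold of 10 mm.
fraction = fraction_of_precipitation_from_days_above(
    [0.0, 5.0, 15.0, 30.0], 10.0)
```
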

Please comment!

(5) physical variable names

> intra_period_extreme_temperature_range; K; difference between the
> absolute extreme temperatures in the observation period.

I think that the period would need to be specified using time bounds
and should not appear in the name. In standard names the quantity
"air_temperature" is used for temperature and a vertical coordinate
specifies the level in the atmosphere to which the temperature refers.
I think that perhaps we need to define a new cell_method of "range".
The standard name would then simply be:
air_temperature
(which, of course, is already in the table) with cell_methods =
"time: range". Do others agree?
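
What a "range" cell_method would compute is simply the maximum minus
the minimum over the cell. A small Python sketch, with a hypothetical
function name and sample values:

```python
def time_range(values):
    """Difference between the largest and smallest value in the period."""
    if not values:
        raise ValueError("need at least one value")
    return max(values) - min(values)


# Hypothetical air temperatures in K within one time cell.
temperature_range_k = time_range([271.15, 280.15, 300.15])
```
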

> highest_one_day_precipitation_amount_per_time_period; kg m-2; highest
> one day precipitation is the maximum of one day precipitation amount
> in a given time period.
> highest_five_day_precipitation_amount_per_time_period; kg m-2;
> highest precipitation amount for five day interval (including the
> calendar day as the last day).

I would not include "per_time_period" as the period information
should come from the time bounds. To describe the maximum of
precipitation over different time intervals we can use cell_methods
with the additional information in parentheses (see CF 1.0, section
7.3). Hence we would have just one name of:
precipitation_amount
(already in the standard name table) which would be specified with
appropriate time bounds and
cell_methods = "time: maximum (interval: 1 day)" or
cell_methods = "time: maximum (interval: 5 days)" as appropriate.
Please note that the units for a precipitation amount are kg m-2.
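
The one-day and five-day cases differ only in the length of the
accumulation window, as the Python sketch below tries to show. The
function name and data are assumptions made for illustration.

```python
def highest_n_day_precipitation(daily_precip, n_days):
    """Largest precipitation total over any n_days consecutive days."""
    if n_days < 1 or len(daily_precip) < n_days:
        raise ValueError("period is shorter than the accumulation window")
    window = sum(daily_precip[:n_days])
    best = window
    for i in range(n_days, len(daily_precip)):
        # Slide the window forward by one day: add the new day and
        # drop the oldest one.
        window += daily_precip[i] - daily_precip[i - n_days]
        best = max(best, window)
    return best


# Hypothetical daily amounts in kg m-2.
daily = [0.0, 2.0, 10.0, 0.0, 1.0, 7.0, 3.0, 0.0]
highest_one_day = highest_n_day_precipitation(daily, 1)
highest_five_day = highest_n_day_precipitation(daily, 5)
```
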

> simple_daily_intensity_index_per_time_period; kg m-2; simple daily
> intensity index is the mean of precipitation amount on wet days. A
> wet day is a day with precipitation sum exceeding 1 mm.

Again I would exclude "per_time_period". "Intensity" sounds to me
like a precipitation rate rather than an amount. I think we could
again use:
precipitation_amount
with appropriate time bounds and
cell_methods = "time: mean (over days with precipitation thickness > 1 mm)".
In this case there is no standard syntax for the information in
parentheses - it is just an explanation of how the cell_method was
applied to the data.
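
As a sketch of what that cell_method describes, the mean is taken
only over the days that qualify as wet. The Python below is an
illustration with a hypothetical function name and sample data.

```python
def mean_precipitation_on_wet_days(daily_precip_mm, wet_threshold_mm=1.0):
    """Mean daily precipitation over days with more than the threshold."""
    wet = [p for p in daily_precip_mm if p > wet_threshold_mm]
    if not wet:
        return None  # undefined when the period has no wet days
    return sum(wet) / len(wet)


# Hypothetical daily amounts in mm; only 2.0, 4.0 and 6.0 exceed 1 mm.
intensity = mean_precipitation_on_wet_days([0.0, 0.5, 2.0, 4.0, 1.0, 6.0])
```
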

> wind_chill_temperature; Celsius; windchill temperature describes the
> fact that low temperatures are felt to be even lower in case of wind.
> It is based on the rate of heat loss from exposed skin caused by wind
> and cold. It is calculated according to the empirical formula:
> 33+(T-33)*(0.478+0.237*SQRT(ff*3.6)-0.0124*ff*3.6)
> T = air temperature in degree Celsius, ff = 10 m wind speed in m/s.
> Windchill temperature is only defined for temperatures at or below 33
> degree Celsius and wind speeds above 1.39 m/s. It is mainly used for
> freezing temperatures.

"Wind_chill" is a commonly used term and I am happy to accept it.
For consistency with other temperature names perhaps we should have:
wind_chill_air_temperature
The canonical units would need to be Kelvin. Please can someone tell
me if it is OK to use the "add_offset" attribute to specify a
conversion between Kelvin and Celsius if that is required?
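
For reference, the quoted empirical formula could be coded roughly as
below. This is a sketch under assumptions: the square root is taken
to apply only to the wind speed term (wind speed converted to km/h by
the factor 3.6), the function names are hypothetical, and the
conversion to Kelvin is shown as a plain offset of 273.15.

```python
import math


def wind_chill_celsius(air_temp_c, wind_speed_ms):
    """Wind chill temperature in degrees Celsius, or None where the
    quoted definition says the quantity is not defined."""
    if air_temp_c > 33.0 or wind_speed_ms <= 1.39:
        return None
    wind_kmh = wind_speed_ms * 3.6  # m/s to km/h
    factor = 0.478 + 0.237 * math.sqrt(wind_kmh) - 0.0124 * wind_kmh
    return 33.0 + (air_temp_c - 33.0) * factor


def celsius_to_kelvin(temp_c):
    """Convert degrees Celsius to Kelvin."""
    return temp_c + 273.15


# Hypothetical case: 0 degrees Celsius with a 10 m/s wind.
chill_c = wind_chill_celsius(0.0, 10.0)
chill_k = celsius_to_kelvin(chill_c)
```
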

> hum_index; Celsius; humindex describes empirically in units of
> temperature how the temperature and humidity influence the wellness
> of a human being.
> HI = T + 5/9 *(A-10)
> with A = e * (6.112 * 10 ** ((7.5 *T)/(237.7 + T)) * R/100 )
> T = air temperature in degree Celsius, R = relative humidity in %,
> e= vapour pressure.

It strikes me as strange to have a name that refers to humidity with
units of Celsius. Is this index the same as the one I have heard
called "comfort index"? In any case I think that a more explanatory
name which is consistent with the units would be better. My
question regarding the conversion of Kelvin to Celsius also applies
to this name.

Best wishes,
Alison

------
Alison Pamment Tel: +44 1235 778065
NCAS/British Atmospheric Data Centre Fax: +44 1235 446314
Rutherford Appleton Laboratory Email: J.A.Pamment at rl.ac.uk
Chilton, Didcot, OX11 0QX, U.K.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: cdo_operators_names5.doc
Type: application/msword
Size: 67584 bytes
Desc: cdo_operators_names5.doc
URL: <http://mailman.cgd.ucar.edu/pipermail/cf-metadata/attachments/20070424/bb9215bf/attachment-0002.doc>
Received on Tue Apr 24 2007 - 07:22:47 BST
