[CF-metadata] statistic indices

From: Heinke Hoeck <heinke.hoeck>
Date: Fri, 04 May 2007 09:31:08 +0200

Hello Alison,

Here are some links for references and definitions:
ECA&D indices: http://eca.knmi.nl/ and HadEX data: http://www.hadobs.org/
The indices are recommended by the CCl/CLIVAR/JCOMM Expert Team on
Climate Change Detection and Indices (ETCCDI):
http://www.clivar.org/organization/etccdi/etccdi.php
We created the CF standard names for these indices accordingly.


Pamment, JA (Alison) wrote:

> Hello Heinke,
>
>
>> Hello again,
>>
>> Martina has forgotten to send a Word file attachment.
>>
>> I would like to discuss the statistic indices. They come from HadEX
>> and other projects.
>>
>> regards
>> Heinke
>>
>
> For ease of reference, I have attached the original word file to this
> posting. Apologies for this very lengthy posting, but there is a lot
> of ground to cover here!
>
> Forty-one new names are proposed for indices that describe aspects of
> climate variability. The proposed names describe rather different
> quantities to any that currently exist in the table. It has taken
> me a while to analyse them but they seem to fall into five basic
> categories (see below). Within each of the first four categories I
> have included the definition of just one representative name to give
> an example of the type of quantity being discussed. For the full
> list of definitions please see the attachment.
>
> (1) "index_per_time_period" names. These are counts of days on which
> particular meteorological conditions existed.
>
>
>> frost_days_index_per_time_period; 1; frost days index is the number of
>> days where minimum temperature is below 0 degree Celsius. The time
>> period should be defined by the bounds of the time coordinate.
>> summer_days_index_per_time_period
>> consecutive_frost_days_index_per_time_period
>> consecutive_summer_days_index_per_time_period
>> ice_days_index_per_time_period
>> tropical_nights_index_per_time_period
>>
>
> These names all refer to conditions where the daily max or min
> temperature falls above or below an absolute threshold. I think we
> need not include "per_time_period" in the names as the period over
> which the days are counted should be clear if the time bounds are
> specified.
per_time_period
===============
Sorry, but we don't agree.
index_per_time_period or number_per_time_period is not the same as
index or number_of. This has another dimension because you eliminate
the time dimension. The meanings are different.
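
As a rough illustration of what we mean by a count per time period, here is a minimal Python sketch. The function name and the data layout (a dict of dates and a list of period bounds) are only assumptions for this example and not part of any proposal. It returns one frost day count for each pair of time bounds.

```python
from datetime import date


def frost_days_per_period(daily_tmin_celsius, period_bounds):
    """Count days with minimum temperature below 0 degC in each period.

    daily_tmin_celsius: dict mapping datetime.date -> daily minimum (degC)
    period_bounds: list of (start, end) dates; start inclusive, end exclusive
    """
    counts = []
    for start, end in period_bounds:
        n = sum(
            1
            for day, tmin in daily_tmin_celsius.items()
            if start <= day < end and tmin < 0.0
        )
        counts.append(n)
    return counts


# Made-up sample values, two monthly periods.
tmin = {
    date(2007, 1, 30): -2.0,
    date(2007, 1, 31): 1.5,
    date(2007, 2, 1): -0.5,
    date(2007, 2, 2): -3.0,
}
bounds = [
    (date(2007, 1, 1), date(2007, 2, 1)),
    (date(2007, 2, 1), date(2007, 3, 1)),
]
print(frost_days_per_period(tmin, bounds))  # one value per time period
```

The result keeps one value per period, so the daily time axis is replaced by the axis of periods defined by the bounds.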

> I wonder if we even need to include the word "index" in
> all the names - perhaps we could simply have:
>
index/number_of
===============
We think that we should use index or number_of consistently.
An 'index' is a characteristic number and not only a number.
'index' seems to be a more precise term for the statistical indices.
Do you agree?

> number_of_frost_days
> number_of_summer_days
> number_of_ice_days
> number_of_tropical_nights
> For the other two names we could have:
> consecutive_frost_days_index
> consecutive_summer_days_index
>
>
>> growing_season_length_index
>>
> I think this name is OK. As it is calculated over a calendar year the
> time bounds would need to be specified appropriately.
>
>
>> heating_degree_days_per_time_period
>>
> This could be:
> number_of_heating_degree_days
> I do not have sufficient expertise to comment on the definition of
> heating degree days - is there a single standard for this or do
> alternative definitions exist?
>
========================================================================
Yes, there are alternative definitions (VDI, CAD); see
cdo_operators_names5.doc.
========================================================================
>
>> wet_days_index_per_time_period
>> heavy_precipitation_days_index_per_time_period
>> very_heavy_precipitation_days_index_per_time_period
>>
> These all essentially describe the same concept, i.e., the number of
> days when the daily accumulated precipitation exceeds a threshold.
> The thresholds defining wet, heavy precipitation and very heavy
> precipitation are 1 mm, 10 mm and 20 mm respectively. Perhaps we
> could combine these names and use a scalar coordinate variable to
> specify the threshold:
> number_of_days_accumulated_precipitation_exceeds_threshold
>
threshold
=========
I think we need more discussion. Do we have a structure group for this?
We have more than one opinion in our group.

If we use a threshold, we should use it consistently for temperature,
precipitation, maximum wind speed, category (3) ...

For temperature (frost_days, summer_days, ice_days, ...) you accepted the
standard names without a threshold, but not for precipitation and the
other indices. Why?
I think our definitions are more precise and are not covered by the
scalar coordinate variable in the header. For the whole definition we
need the operator (<, >, =, >=, <=, ...) and the threshold. You put the
operator into the standard name and the threshold into the scalar
coordinate variable.
The standard name with the threshold becomes more or less a category.

On the other hand, the number of standard names is a big problem. Do we
need categories?
The threshold could be a solution.
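
To make the point about operator plus threshold concrete, here is a small Python sketch. The function name is hypothetical. The thresholds of 1 mm, 10 mm and 20 mm and the 0 degree Celsius limit are the ones quoted in this thread, and "exceeds" is read as strictly greater than.

```python
import operator


def number_of_days_meeting_condition(daily_values, compare, threshold):
    """Count the days for which compare(value, threshold) is true.

    compare is the operator of the definition (e.g. operator.gt for ">"),
    threshold is the value that would go into a scalar coordinate variable.
    """
    return sum(1 for value in daily_values if compare(value, threshold))


# Made-up daily precipitation sums in mm.
precip_mm = [0.0, 0.4, 1.2, 12.0, 25.0, 10.0, 3.3]
wet_days = number_of_days_meeting_condition(precip_mm, operator.gt, 1.0)
heavy_days = number_of_days_meeting_condition(precip_mm, operator.gt, 10.0)
very_heavy_days = number_of_days_meeting_condition(precip_mm, operator.gt, 20.0)

# The same function covers frost days: minimum temperature below 0 degC.
tmin_celsius = [-1.0, 0.0, 2.0, -4.5]
frost_days = number_of_days_meeting_condition(tmin_celsius, operator.lt, 0.0)

print(wet_days, heavy_days, very_heavy_days, frost_days)
```

The sketch shows that the definition is only complete when both pieces are known: the operator, which would sit in the standard name, and the threshold, which would sit in the coordinate variable.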
> Apart from reducing the number of names another advantage of this is
> that we would not need to introduce qualitative terms such as "heavy"
> and "very heavy" into the standard name table.
>
> As with the other "consecutive" names
>
>> consecutive_dry_days_index_per_time_period
>> consecutive_wet_days_index_per_time_period
>>
> could be simply:
> consecutive_wet_days_index
> consecutive_dry_days_index
>
>
>> frost_days_where_no_snow_index_per_time_period
>>
> In standard names the qualifier where_type specifies that, rather
> than applying to an entire grid box, the quantity applies only to the
> part of the grid box of the named type. However, I don't think that
> is what was intended here. My understanding is that this quantity
> is a count of the number of days with minimum temperature less than
> zero degrees Celsius when there is no lying snow anywhere in the
> grid box. Is that correct? If so, I would suggest something like:
> number_of_frost_days_without_snow_cover
>
=======================================================================
You are right, but we would suggest something like:
frost_days_without_no_snow_cover_index_per_time_period
=======================================================================
>
>> strong_breeze_days_index_per_time_period
>> strong_gale_days_index_per_time_period
>> hurricane_days_index_per_time_period
>>
> This is another case where the three names are essentially
> describing the same concept, i.e., the number of days when the
> maximum wind speed exceeds a particular threshold. The respective
> definitions for "strong breeze", "strong gale" and "hurricane" are
> maximum wind speeds exceeding 10.5 m s-1, 20.5 m s-1 and 32.5 m s-1.
> As with the precipitation names I think we could combine these into:
> number_of_days_maximum_wind_speed_exceeds_threshold
> and use a scalar coordinate variable to give the value of the
> threshold.
>
> (2) "index_wrt_Nth_percentile|mean_of_reference_period" names, where N
> is a number in the range 1-100 and the reference period is defined in
> a variable attribute. These are counts of days when particular
> meteorological conditions existed relative to a threshold defined from
> the reference period.
>
>
>> heat_wave_duration_index_wrt_mean_of_reference_period; 1; this is the
>> number of days per time period where in intervals of at least 6
>> consecutive days the daily maximum temperature is more than 5 degrees
>> above a reference value. The reference value is calculated as the
>> mean of maximum temperatures of a five day window centred on each
>> calendar day of a given 30 year climate reference period. The time
>> period should be defined by the bounds of the time coordinate. The
>> climate reference period should be defined by the variable attribute,
>> e.g., 1961-1990.
>> cold_wave_duration_index_wrt_mean_of_reference_period
>> warm_spell_days_index_wrt_90th_percentile_of_reference_period
>> cold_spell_days_index_wrt_10th_percentile_of_reference_period
>>
>
> I am not clear as to which attribute you would use to specify the
> reference period. Would you specify it as a string in the "comment"
> attribute or perhaps include a non-standard attribute? Presumably
> additional information would need to be specified in order to fully
> describe the reference data, e.g., which model it came from?
>
=====================================================================
You are right. We have to reference the data and the period.
Do you have a proposal?
The indices are postprocessed data and not specific to one model. We
want to use them, for example, for CLM model output.
====================================================================
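
For reference, the counting rule in the heat wave definition quoted above can be sketched as follows. This is only an illustration: the function name is hypothetical, and the reference values are assumed to have been computed beforehand from the reference period (one value per day of the time period). How the reference data and period are identified in the file is exactly the open question.

```python
def heat_wave_duration_days(daily_tmax, reference, excess=5.0, min_run=6):
    """Count days that belong to runs of at least min_run consecutive days
    on which the daily maximum is more than `excess` degrees above the
    reference value for that day."""
    total = 0
    run = 0
    for tmax, ref in zip(daily_tmax, reference):
        if tmax > ref + excess:
            run += 1
        else:
            if run >= min_run:
                total += run
            run = 0
    # A qualifying run may still be open at the end of the period.
    if run >= min_run:
        total += run
    return total


# Made-up values: runs of 6, 3 and 4 warm days, separated by one cool day.
reference = [20.0] * 15
tmax = [26.0] * 6 + [20.0] + [26.0] * 3 + [20.0] + [27.0] * 4
print(heat_wave_duration_days(tmax, reference))  # only the 6-day run counts
```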
> I wonder if we actually need to include "wrt_Nth_percentile" as
> part of the name since it is also part of the definition of the
> indices. Would these indices ever be calculated with respect to
> percentiles of the reference period other than those given here?
>
====================================================================
Who knows?
====================================================================
> I think that terms such as "warm_spell", "cold_spell", etc., are
> acceptable if they are used as part of the name of a defined index,
> although I admit that I have never heard the term "cold_wave" before.
> Do others agree?
>
> (3) "percent of time wrt_Nth_percentile|mean_of_reference_period"
> names, where N is a number in the range 1-100 and the reference period
> is defined in a variable attribute. These names express the percent
> of time per time period (specified by time bounds) that conditions
> exist above or below a threshold calculated from the reference period.
>
>
>> cold_nights_percent_wrt_10th_percentile_of_reference_period; 1; this
>> is the per cent of time per time period where daily minimum
>> temperature is below a reference value. The reference value is
>> calculated as the 10th percentile of daily minimum temperatures of a
>> five day window centred on each calendar day of a given 30 year
>> climate reference period. The time period should be defined by the
>> bounds of the time coordinate. The climate reference period should
>> be defined by the variable attribute, e.g., 1961-1990.
>>
>
>
>> warm_nights_percent_wrt_90th_percentile_of_reference_period
>> very_cold_days_percent_wrt_10th_percentile_of_reference_period
>> cold_days_percent_wrt_10th_percentile_of_reference_period
>> very_warm_days_percent_wrt_90th_percentile_of_reference_period
>> warm_days_percent_wrt_90th_percentile_of_reference_period
>>
>
> (The following four names omit the word "percent" but are defined in
> an analogous way to other names in this category)
>
>> moderate_wet_days_wrt_75th_percentile_of_reference_period
>> wet_days_wrt_90th_percentile_of_reference_period
>> very_wet_days_wrt_95th_percentile_of_reference_period
>> extremely_wet_days_wrt_99th_percentile_of_reference_period
>>
>
> The general comments I made in category (2) as to how much
> information is required to describe the reference data also apply
> here. Another general comment is that I would prefer to use
> "fraction" rather than "percent" as it is more consistent with
> existing standard names.
>
=====================================================================
The data unit is 'percent' and not 'fraction'. Values are between 0 and 100.
=====================================================================
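
A short sketch of how such a value is produced may help. The function name is hypothetical and the per-day reference thresholds (e.g. the 10th percentile for each calendar day) are assumed to be already available. The result is a percentage between 0 and 100; dividing by 100 would give the corresponding fraction.

```python
def percent_of_days_below(daily_tmin, reference_threshold):
    """Percentage of days in the period whose daily minimum temperature is
    below the reference threshold for that day."""
    pairs = list(zip(daily_tmin, reference_threshold))
    if not pairs:
        raise ValueError("no days in the time period")
    below = sum(1 for tmin, ref in pairs if tmin < ref)
    return 100.0 * below / len(pairs)


# Made-up values in degC: one of four nights is below its threshold.
tmin = [-5.0, -1.0, 2.0, 3.0]
p10 = [-4.0, -4.0, -3.5, -3.5]
percent = percent_of_days_below(tmin, p10)
fraction = percent / 100.0
print(percent, fraction)
```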
> I feel that what is needed for all the names in this category is a
> general way of describing the following concept:
> fraction_of_days_when_min|max|mean|accumulated_temperature|precipitation_lies_above|below_reference_period_threshold
> For example, the definitions of moderate_wet, wet, very_wet and
> extremely_wet days differ only in the percentile of the reference
> period to which they are compared. Whereas in the names of
> indices I think it is OK to use terms like "warm_spell" I am far less
> happy about having qualitative words like "very" and "extremely" when
> referring to a physical quantity such as precipitation. In the case
> of the four wet day names it seems natural to combine them as:
> fraction_of_days_when_precipitation_lies_above_reference_period_threshold
> and supply the appropriate threshold percentile in a scalar
> coordinate variable.
>
> The temperature related names could also be fitted into this pattern,
> for example,
> cold_nights_percent_wrt_10th_percentile_of_reference_period
> would become
> fraction_of_days_when_minimum_temperature_lies_below_reference_period_threshold
> and the percentile would again be given as a scalar coordinate
> variable.
>
> Comments on this idea would be most welcome!
>
> (4) "per cent of amount per time period" names. These names express
> the percentage of a quantity, e.g., precipitation, that has occurred
> during a time period (defined by time bounds) from days when
> particular meteorological conditions existed.
>
>
>> precipitation_percent_due_to_R75p_days; 1; percentage of total
>> precipitation amount per time period due to
>> moderate_wet_days_wrt_75th_percentile_of_reference_period (see
>> category (3)).
>>
>
>
>> precipitation_percent_due_to_R90p_days
>> precipitation_percent_due_to_R95p_days
>> precipitation_percent_due_to_R99p_days
>>
>
> R75p, R90p, R95p and R99p refer to the wet days discussed in
> category (3). For example, R75p refers to "moderate wet" days so that
> precipitation_percent_due_to_R75p_days means the percentage of
> precipitation accumulated over the whole time period that fell on
> moderate wet days. I think that it would be better to make the names
> explicit rather than use these abbreviations. Also, I would
> prefer to use "fraction" rather than "percent". Again, given the
> similarity between all four names and definitions it seems reasonable
> to combine them as:
> fraction_of_accumulated_precipitation_due_to_days_exceeding_reference_period_threshold
> and supply the threshold percentile in a coordinate variable.
>
> Please comment!
>
> (5) physical variable names
>
>
>> intra_period_extreme_temperature_range; K; difference between the
>> absolute extreme temperatures in the observation period.
>>
>
> I think that the period would need to be specified using time bounds
> and should not appear in the name. In standard names the quantity
> "air_temperature" is used for temperature and a vertical coordinate
> specifies the level in the atmosphere to which the temperature refers.
> I think that perhaps we need to define a new cell_method of "range".
> The standard name would then simply be:
> air_temperature
> (which, of course, is already in the table) with cell_methods =
> "time: range". Do others agree?
>
=====================================================================
We don't agree, because it is not the air_temperature; it is the
difference of air temperatures.
=====================================================================
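
To show why we see this as a difference rather than a temperature, here is a minimal sketch. The function name follows the proposed standard name, and the argument names are assumptions for the example.

```python
def intra_period_extreme_temperature_range(daily_tmax_k, daily_tmin_k):
    """Difference (in K) between the highest daily maximum and the lowest
    daily minimum temperature within the period."""
    return max(daily_tmax_k) - min(daily_tmin_k)


# Made-up values in kelvin.
tmax_k = [280.0, 285.5, 283.0]
tmin_k = [270.0, 268.5, 272.0]
print(intra_period_extreme_temperature_range(tmax_k, tmin_k))
```

The result of 17 K here is a temperature difference. It has the same size in Celsius and must not be converted with an offset, which is not true for an air temperature.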

>
>> highest_one_day_precipitation_amount_per_time_period; kg m-2; highest
>> one day precipitation is the maximum of one day precipitation amount
>> in a given time period.
>> highest_five_day_precipitation_amount_per_time_period; kg m-2;
>> highest precipitation amount for five day interval (including the
>> calendar day as the last day).
>>
>
> I would not include "per_time_period" as the period information
> should come from the time bounds. To describe the maximum of
> precipitation over different time intervals we can use cell_methods
> with the additional information in parenthesis (see CF1.0 7.3). Hence
> we would have just one name of:
> precipitation_amount
> (already in the standard name table) which would be specified with
> appropriate time bounds and
> cell_methods = "time: maximum (interval: 1 day)" or
> cell_methods = "time: maximum (interval: 5 days)" as appropriate.
> Please note that the units for a precipitation amount are kg m-2.
>
========
o.k.
========
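
For clarity, the two quantities can be computed with one function, which fits the idea of one name plus an interval. This is only a sketch with a hypothetical function name. The daily amounts are in kg m-2, which for water corresponds to mm.

```python
def highest_n_day_precipitation_amount(daily_precip, n_days):
    """Maximum precipitation amount summed over any n_days consecutive days
    that lie completely inside the period."""
    if n_days < 1 or len(daily_precip) < n_days:
        raise ValueError("need at least n_days daily values")
    return max(
        sum(daily_precip[i:i + n_days])
        for i in range(len(daily_precip) - n_days + 1)
    )


# Made-up daily amounts in kg m-2.
precip = [0.0, 2.0, 10.0, 4.0, 0.0, 1.0, 8.0, 3.0]
print(highest_n_day_precipitation_amount(precip, 1))
print(highest_n_day_precipitation_amount(precip, 5))
```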
>
>> simple_daily_intensity_index_per_time_period; kg m-2; simple daily
>> intensity index is the mean of precipitation amount on wet days. A
>> wet day is a day with precipitation sum exceeding 1 mm.
>>
>
> Again I would exclude "per_time_period". "Intensity" sounds to me
> like a precipitation rate rather than an amount. I think we could
> again use:
> precipitation_amount
> with appropriate time bounds and
> cell_methods = "time: mean (over days with precipitation thickness
> > 1 mm)".
>
===========
not thickness -> amount
===========
> In this case there is no standard syntax for the information in
> parenthesis - it is just an explanation of how the cell_method was
> applied to the data.
>
===================

simple_daily_intensity_index is a commonly used term, but it is your decision.

===================
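
For reference, the definition quoted above amounts to the following. The function name is hypothetical, and a wet day is taken as a day with a precipitation sum exceeding 1 mm, as in the definition.

```python
def simple_daily_intensity(daily_precip_mm, wet_day_threshold=1.0):
    """Mean precipitation amount on wet days; None if there is no wet day."""
    wet = [p for p in daily_precip_mm if p > wet_day_threshold]
    if not wet:
        return None
    return sum(wet) / len(wet)


# Made-up daily sums in mm; the day with exactly 1.0 mm is not a wet day.
precip_mm = [0.0, 0.5, 2.0, 4.0, 1.0, 12.0]
print(simple_daily_intensity(precip_mm))
```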

>
>> wind_chill_temperature; Celsius; windchill temperature describes the
>> fact that low temperatures are felt to be even lower in case of wind.
>> It is based on the rate of heat loss from exposed skin caused by wind
>> and cold. It is calculated according to the empirical formula:
>> 33+(T-33)*(0.478+0.237*(SQRT(ff*3.6)-0.0124*ff*3.6))
>> T = air temperature in degree Celsius, ff = 10 m wind speed in m/s.
>> Windchill temperature is only defined for temperatures at or below 33
>> degree Celsius and wind speeds above 1.39 m/s. It is mainly used for
>> freezing temperatures.
>>
>
> "Wind_chill" is a commonly used term and I am happy to accept it.
> For consistency with other temperature names perhaps we should have:
> wind_chill_air_temperature
> The canonical units would need to be Kelvin. Please can someone tell
> me if it is OK to use the "add_offset" attribute to specify a
> conversion between Kelvin and Celsius if that is required?
>
>
===================================
Sorry, but we are not happy with wind_chill_air_temperature in Celsius
because it should be in Kelvin. The formula for the definition should be
in Kelvin and no 'add_offset' should be used. I think that is much better.
===================================
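
A sketch of the formula written directly in Kelvin could look like this. It assumes that the SQRT term and the linear wind term of the quoted formula are both inside the bracket multiplied by 0.237; that reading is an assumption. The function and constant names are hypothetical. Because the formula is linear in (T - 33), only the constant 33 degC has to be replaced by 306.15 K.

```python
import math

T_MAX_K = 306.15  # 33 degC expressed in kelvin
FF_MIN = 1.39     # m/s, wind speed must be above this value


def wind_chill_temperature_kelvin(t_kelvin, ff):
    """Wind chill temperature in K from air temperature (K) and 10 m wind
    speed (m/s); None where the index is not defined."""
    if t_kelvin > T_MAX_K or ff <= FF_MIN:
        return None
    v = ff * 3.6  # wind speed converted from m/s to km/h
    factor = 0.478 + 0.237 * (math.sqrt(v) - 0.0124 * v)
    return T_MAX_K + (t_kelvin - T_MAX_K) * factor


print(wind_chill_temperature_kelvin(268.15, 8.0))
```

Input and output are both in Kelvin, so no 'add_offset' is needed anywhere.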
>> hum_index; Celsius; humindex describes empirically in units of
>> temperature how the temperature and humidity influence the wellness
>> of a human being.
>> HI = T + 5/9 *(A-10)
>> with A = e * (6.112 * 10 ** ((7.5 *T)/(237.7 + T)) * R/100 )
>> T = air temperature in degree Celsius, R = relative humidity in %,
>> e= vapour pressure.
>>
>
> It strikes me as strange to have a name that refers to humidity with
> units of Celsius. Is this index the same as the one I have heard
> called "comfort index"? In any case I think that a more explanatory
> name which is consistent with the units would be better. My
> question regarding the conversion of Kelvin to Celsius also applies
> to this name.
>
=================
I agree totally with you. Referring to humidity with units of Celsius is
awful. I would like to withdraw the hum_index.
================
> Best wishes,
> Alison
>
> ------
> Alison Pamment Tel: +44 1235 778065
> NCAS/British Atmospheric Data Centre Fax: +44 1235 446314
> Rutherford Appleton Laboratory Email: J.A.Pamment at rl.ac.uk
> Chilton, Didcot, OX11 0QX, U.K.
>
Received on Fri May 04 2007 - 01:31:08 BST

This archive was generated by hypermail 2.3.0 : Tue Sep 13 2022 - 23:02:40 BST
