
[CF-metadata] Standard names status update

From: Pamment, JA <J.A.Pamment>
Date: Fri, 8 Dec 2006 03:17:00 -0000

It is now just over two months since the standard name table was updated and I am planning another update to take place on Thursday 14th December. The names listed below under the individual discussion threads will be added on that day unless any objections are received in the meantime. This is a lengthy posting because there are a lot of live discussion topics at the moment. However, I have tried hard to make this report as easily digestible as possible!

1. Standard names for altimeter data. Proposed by: Olivier Lauret.

Names to be added:
sea_surface_height_above_reference_ellipsoid; m
surface_eastward_geostrophic_sea_water_velocity; m s-1
surface_northward_geostrophic_sea_water_velocity; m s-1
surface_eastward_geostrophic_sea_water_velocity_assuming_sea_level_for_geoid; m s-1
surface_northward_geostrophic_sea_water_velocity_assuming_sea_level_for_geoid; m s-1

2. Names proposed by Beate Geyer and Alex Ruane in discussion thread
"new standard names 1-33".

Names to be added:
baseflow_amount; kg m-2
runoff_excluding_baseflow; kg m-2
surface_net_downward_radiative_flux_where_land; W m-2
volume_fraction_of_water_in_soil_at_critical_point; 1
normalized_difference_vegetation_index; 1
correction_for_model_negative_specific_humidity; 1

The addition of these names will mean that the proposals remaining to be resolved are those numbered 2, 5, 7, 10, 12, 17, 28, 29 and 30.

3. Standard names for aerosols and chemistry. Proposed by: Christiane Textor.

Christiane has produced version 5 of her table, dated 31st October 2006, at http://wiki.esipfed.org/index.php/CF_Standard_Names_-_Proposed_names_for_TF_HTAP. I would like to thank Christiane for all her hard work in producing and maintaining her table - it simplifies my task enormously.

Many of the proposals seem to have no further issues to resolve. If no objections are received I will include them in the 14th December update. The agreed names, under Christiane's headings, are as follows:

Dry deposition flux at the surface -
surface_dry_deposition_mass_flux_of_all_nitrogen_oxides_expressed_as_nitrogen; kg m-2 s-1

Wet deposition flux at the surface -
surface_wet_deposition_mass_flux_of_all_nitrogen_oxides_expressed_as_nitrogen; kg m-2 s-1

Emission fluxes -
atmosphere_emission_mass_flux_of_nox_expressed_as_nitrogen; kg m-2 s-1
atmosphere_emission_mass_flux_of_non_methane_volatile_organic_compounds_expressed_as_carbon; kg m-2 s-1
atmosphere_emission_mass_flux_of_anthropogenic_non_methane_volatile_organic_compounds_expressed_as_carbon; kg m-2 s-1
atmosphere_emission_mass_flux_of_biogenic_non_methane_volatile_organic_compounds_expressed_as_carbon; kg m-2 s-1

Volume mixing ratio -
mole_fraction_of_ozone_in_air; 1
mole_fraction_of_carbon_monoxide_in_air; 1
mole_fraction_of_nitrogen_monoxide_in_air; 1
mole_fraction_of_nitrogen_dioxide_in_air; 1
mole_fraction_of_nitric_acid_in_air; 1
mole_fraction_of_peroxyacetyl_nitrate_in_air; 1
mole_fraction_of_hydroxyl_radical_in_air; 1
mole_fraction_of_sulfur_dioxide_in_air; 1
mole_fraction_of_hexachlorobiphenyl_in_air; 1
mole_fraction_of_alpha_hexachlorocyclohexane_in_air; 1
mole_fraction_of_mercury_in_air; 1
mole_fraction_of_divalent_mercury_in_air; 1
mole_fraction_of_anthropogenic_non_methane_volatile_organic_compounds_in_air; 1
mole_fraction_of_biogenic_non_methane_volatile_organic_compounds_in_air; 1

Mass mixing ratio -
mass_fraction_of_pm10_aerosol_in_air; 1
mass_fraction_of_pm2p5_aerosol_in_air; 1
mass_fraction_of_pm1_aerosol_in_air; 1

Fluxes due to chemical reactions -
chemical_gross_production_rate_of_mole_concentration_of_ozone; mole m-3 s-1
chemical_destruction_rate_of_mole_concentration_of_ozone; mole m-3 s-1
chemical_destruction_rate_of_mole_concentration_of_methane; mole m-3 s-1
chemical_destruction_rate_of_mole_concentration_of_carbon_monoxide; mole m-3 s-1

Optical thickness -
pm10_ambient_aerosol_optical_depth; 1
pm2p5_ambient_aerosol_optical_depth; 1
pm1_ambient_aerosol_optical_depth; 1
nitrate_ambient_aerosol_optical_depth; 1
sulfate_ambient_aerosol_optical_depth; 1
ammonium_ambient_aerosol_optical_depth; 1
black_carbon_ambient_aerosol_optical_depth; 1
organic_carbon_ambient_aerosol_optical_depth; 1
seasalt_ambient_aerosol_optical_depth; 1
dust_ambient_aerosol_optical_depth; 1
water_in_ambient_aerosol_optical_depth; 1

Others -
cell_area; m2
cell_thickness; m
atmosphere_mass_of_air_per_unit_area; kg m-2

Proposals with issues remaining to be resolved, and which will _not_ be included in the 14th December update, relate to those names using the "expressed_as_such" construction.

The construction "expressed_as" was originally proposed by Roy Lowry (9th October 2006). There is general agreement on the mailing list that it is a useful and clear form of words in names such as surface_dry_deposition_mass_flux_of_all_nitrogen_oxides_expressed_as_nitrogen (meaning the mass flux of the nitrogen contained in the nitrogen oxides being deposited). However, a question mark remains over whether and in what circumstances "expressed_as_such" should be used. "as_such" was suggested by Jonathan (4th October 2006) for names needing the construction "X_expressed_as_Y" when X=Y. The aim was to avoid unnecessarily long names. I have tried to sum up what I see as the three key issues below.

(a) Why is "expressed_as_such" (or, alternatively, "X_expressed_as_Y" when X=Y) needed at all?
Jonathan posed this question on 21st September 2006:
>
> The construction X_as_Y seems fine to me, to indicate that you mean X when
> it comes as a constituent of Y. But in some cases I wonder if it is really
> necessary. What is meant by ammonium_as_ammonium?
>
Christiane replied on 26th September 2006:
>
> In our community, the mass of molecules like NH4 (or SO4) are sometimes
> given as N (or S). To avoid this ambiguity, I have added
> ammonium_as_ammonium or sulfate_as_sulfate.
>
I think this shows that there is a clear need to be able to state unambiguously that the mass flux of a molecule is being calculated using the mass of the molecule itself and not in terms of the mass of another species. This point seems to be generally accepted on the mailing list.
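
To make the ambiguity concrete, here is a minimal sketch in Python of the arithmetic behind "X_expressed_as_Y". It converts a mass flux of ammonium into the mass flux of the nitrogen it contains. The atomic masses are approximate, and the function names and the flux value are hypothetical choices made for illustration only; they do not come from the discussion threads.

```python
# Approximate atomic masses in g/mol (assumed values, for illustration only).
ATOMIC_MASS = {"N": 14.007, "H": 1.008, "S": 32.06, "O": 15.999}


def molar_mass(composition):
    """Molar mass of a species described as {element: number of atoms}."""
    return sum(ATOMIC_MASS[element] * count for element, count in composition.items())


def express_as(mass_flux, species, element):
    """Convert a mass flux of `species` into the mass flux of `element` within it."""
    element_mass = ATOMIC_MASS[element] * species[element]
    return mass_flux * element_mass / molar_mass(species)


AMMONIUM = {"N": 1, "H": 4}          # NH4
flux_as_ammonium = 1.0e-9            # kg m-2 s-1, hypothetical value
flux_as_nitrogen = express_as(flux_as_ammonium, AMMONIUM, "N")
```

With these assumed masses the nitrogen-equivalent flux is roughly 78% of the ammonium flux, which is why a name has to say which of the two quantities a variable holds.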

(b) Is it preferable to use "expressed_as_such" or "X_expressed_as_Y" when X=Y in constructing standard names?
Christiane, in her posting of 31st October 2006, reported that "expressed_as_such" had caused some confusion within the HTAP community:
>
>There was a comment on the mercury variables by Chris Holmes on the wiki
>page, which I copy here with my comments from
>http://wiki.esipfed.org/index.php/Talk:CF_Standard_Names_-_Proposed_names_for_TF_HTAP
>
> > For example:
> > surface_dry_deposition_mass_flux_of_divalent_mercury_expressed_as_such
> >
> > Does this mean that the corresponding variable gives the mass of the
> > divalent mercury compound that is dry deposited? Rather than the mass
> > of mercury within the deposited mercury compound?
> >
>
> I meant to refer to the mass of mercury within the deposited mercury
> compound, but is this appropriate?
>
In the same posting Christiane gave other examples of confusion arising from the use of "expressed_as_such". Given that members of the community for whom these standard names are being created are not entirely sure of their meaning I propose that the construction "expressed_as_such" should not be used. Instead, standard names should use the construction "X_expressed_as_Y" when X=Y. This will, on occasion, result in some fairly long names but I think that is infinitely preferable to having potential misinterpretations of the data. It also has the advantage that names will be constructed more consistently regardless of whether X and Y are the same or different species. Your comments are invited!

(c) When should "X_expressed_as_Y" when X=Y be included in a name?
Christiane (31st October 2006) wrote:
>
>I find the terms X_expressed_as_Y, or X_expressed_as_such, if X=Y very
>clear, but I am not sure when to apply it, because there is also just X.
>For example should it be dust_dry_aerosol or
>dust_expressed_as_such_dry_aerosol? In contrast to
> e.g. dust_expressed_as_silicate_dry_aerosol.
>
Jonathan (15th November 2006) suggested a pragmatic approach to this issue when considering the example of surface_dry_deposition_mass_flux_of_particulate_organic_matter_dry_aerosol_expressed_as_mass_of_particulate_organic_matter:
>
> could particulate_organic_matter_dry_aerosol be expressed as anything
> other than particulate_organic_matter_dry_aerosol? What alternatives are
> there? This may be a different situation from
> carbon_dioxide_expressed_as_carbon or expressed_as_carbon_dioxide, when
> there is an obvious ambiguity.
>
I agree with Jonathan on this. If a species is unlikely ever to be expressed as another then we could agree to omit from the name "expressed_as" and whatever follows. This, in some cases at least, will lead to shorter names. I would need to rely on Christiane's guidance on whether it is safe to omit "expressed_as" for any given species as I do not have any expertise in the field of aerosols and chemistry. Once again, your comments are invited.
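
The naming options in (b) and (c) can be sketched as a small Python helper. This is only an illustration of the constructions under discussion, not an agreed rule: the function name build_name is hypothetical, and the X=Y mercury name is simply what the "X_expressed_as_Y" proposal would produce for the example quoted above.

```python
def build_name(prefix, species, expressed_as=None):
    """Build <prefix>_of_<species>, optionally followed by _expressed_as_<Y>.

    Pass expressed_as=None for species that are unlikely ever to be
    expressed as anything else (the pragmatic approach in (c)).
    """
    name = f"{prefix}_of_{species}"
    if expressed_as is not None:
        name += f"_expressed_as_{expressed_as}"
    return name


PREFIX = "surface_dry_deposition_mass_flux"

# X differs from Y: the mass of nitrogen contained in the nitrogen oxides.
different = build_name(PREFIX, "all_nitrogen_oxides", "nitrogen")

# X equals Y, spelled out in full instead of using "expressed_as_such".
same = build_name(PREFIX, "divalent_mercury", "divalent_mercury")

# No realistic alternative for Y, so "expressed_as" is omitted altogether.
omitted = build_name(PREFIX, "particulate_organic_matter_dry_aerosol")
```

The same code path handles the case where X and Y differ and the case where they are equal, which is the consistency advantage noted in (b).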

4. CF and multi-forecast system ensemble data. Proposed by: Francisco Doblas-Reyes (Paco).

As everyone will doubtless be aware, this thread has led to a very high volume of posts on the mailing list ranging from discussion of standard names, via alternative methods for handling ensembles, to future development plans for the CF standard. Personally, I have learned a great deal from reading the various threads associated with this topic but in this summary I have confined myself to discussing it in relation only to standard names.

(a) What are the proposed standard names?
The initial proposals for standard names were discussed and refined by Paco, Jonathan Gregory and Jamie Kettleborough. They led to the following names being put forward:
experiment_id (STRING);
ensemble_member OR initial_condition (STRING);
institution (STRING);
source (STRING);
original_distributor (STRING);
production_status (STRING);
sst_specification (STRING);
real_time (CHARACTER);
archive_date (INTEGER, units=days from specific date).
The last five of these have not been the subject of any discussion. It was proposed that the first four names would be accompanied by a change to the CF1.0 standard to allow them to be used as either standard names or global variables.

(b) Why were the additional standard names proposed?
The name "realization" was added to the standard name table at the last update on 26th September 2006 as a means of indexing ensemble members. However, Paco explained that "realization" alone would not be sufficient to cope with multi-forecast system ensembles because, for example, one may wish to apply different statistical weights to ensemble members produced by different models or at different institutions and the various pieces of metadata necessary to allow this could not easily be conveyed in a single coordinate variable. The requirement for the additional metadata to be accommodated by the CF standard is widely accepted by those contributing to the discussion.

(c) Should ensemble metadata be given standard names?
There is general agreement on the mailing list that there is a need for ensemble data to be dimensioned within files as (realization,time,height,lat,lon). The "realization" dimension would span the individual ensemble members (these may form, for example, a multi-system forecast, an initial condition ensemble, or a perturbed physics ensemble). The additional metadata, such as the name of the model used to produce the ensemble member, could then be supplied in auxiliary coordinate variables with the same dimension as "realization" and having standard names of "source", "institution", etc.
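
As a rough illustration of this layout, the Python sketch below models a data variable dimensioned by "realization" together with auxiliary coordinate variables that share that dimension, and then selects ensemble members by model. It uses plain Python objects rather than a netCDF library. The class, function and variable names and the values are hypothetical, and character variables are simplified to lists of strings (a real file would also need a string-length dimension).

```python
from dataclasses import dataclass, field


@dataclass
class Variable:
    name: str
    dims: tuple
    attrs: dict = field(default_factory=dict)
    values: list = field(default_factory=list)


def aux_coords_for(data_var, variables, dim="realization"):
    """Auxiliary coordinates named in the 'coordinates' attribute that span `dim`."""
    names = data_var.attrs.get("coordinates", "").split()
    return [variables[n] for n in names if n in variables and dim in variables[n].dims]


def members_with(data_var, variables, standard_name, value, dim="realization"):
    """Indices along `dim` whose auxiliary coordinate with this standard_name equals value."""
    for aux in aux_coords_for(data_var, variables, dim):
        if aux.attrs.get("standard_name") == standard_name:
            return [i for i, v in enumerate(aux.values) if v == value]
    return []


variables = {
    "model_source": Variable("model_source", ("realization",),
                             {"standard_name": "source"},
                             ["model A", "model B", "model A"]),
    "centre": Variable("centre", ("realization",),
                       {"standard_name": "institution"},
                       ["centre X", "centre Y", "centre X"]),
}
temperature = Variable("temperature",
                       ("realization", "time", "height", "lat", "lon"),
                       {"standard_name": "air_temperature",
                        "coordinates": "model_source centre"})

model_a_members = members_with(temperature, variables, "source", "model A")
```

Selecting members this way is what would make it possible, for example, to apply different statistical weights to members produced by different models.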

Bryan Lawrence, posting on 27th October 2006, triggered a broad discussion on how CF metadata should be governed in the future and, in particular, whether it would be appropriate to make provision for the additional ensemble metadata via the standard name mechanism. Bryan argued that the metadata needed to describe the models and instruments that produce data should be treated separately from the metadata describing a (modelled or measured) physical variable such as air temperature. The latter are dealt with in CF1.0 by placing them in variables named according to the standard name table. Bryan was concerned that the CF community should not try to govern all the vocabulary needed to describe, for example, IPCC forcing scenarios used in climate model experiments or instrument characteristics that are already governed by other standards, such as those used in SensorML.

(d) Alternative proposal to using standard names for multi-forecast system ensemble metadata. Proposed by Bryan Lawrence.

N.B. Here I have drawn together and paraphrased material from Bryan's posts of 27th October, 31st October and 20th November 2006. The proposed methodology is general and need not apply only to ensemble metadata.

The proposal is that there should be separate controlled vocabularies to describe the scientific contents of a variable (using standard names, e.g., air_temperature) and to describe how the contents were produced (e.g., the name of a climate model or observing system). This would be achieved as follows:
(i) a new class of standard identifiers called "standard_metadata" should be created which would be governed within the CF standard, but separately from standard names. In standard_metadata would be items such as the current global file attributes;
(ii) where possible *for metadata* external vocabularies should be used. By definition, these would not be governed within CF. External vocabularies would be referenced using a URI contained within an auxiliary coordinate variable.

An example combining the standard_metadata and external vocabulary methods for supplying metadata is given below:
  temperature(realization,time,lat,lon):
    temperature:coordinates = 'time lat lon metadata1 metadata2' ;
  char metadata1(realization,len100):
    metadata1:standard_metadata="institution"; // for instance
  char metadata2(realization,len100):
    metadata2:external_vocabulary = "http://wmo.foo.int/identifierY" ;

Where suitable external vocabularies exist they should be used in preference to adding to the CF controlled vocabularies. If an item of metadata exists in a CF controlled vocabulary and an external vocabulary they should not both be attached to the same data variable as they may well have different meanings and this would only lead to confusion.
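
A simplified check along these lines could be sketched in Python as follows. It assumes, purely for illustration, that each metadata variable is represented as a dictionary of attributes and that it should carry exactly one of the two attributes from the example above. It tests only that narrow case; it does not attempt to decide whether two different metadata variables describe the same item. The function name is hypothetical.

```python
from urllib.parse import urlparse


def check_metadata_variable(attrs):
    """Return a list of problems found in one metadata variable's attributes."""
    problems = []
    has_cf = "standard_metadata" in attrs
    has_external = "external_vocabulary" in attrs
    if has_cf and has_external:
        problems.append("both standard_metadata and external_vocabulary are set")
    if not has_cf and not has_external:
        problems.append("neither standard_metadata nor external_vocabulary is set")
    if has_external:
        uri = urlparse(attrs["external_vocabulary"])
        # An external vocabulary is referenced by URI, so require scheme and host.
        if not (uri.scheme and uri.netloc):
            problems.append("external_vocabulary is not an absolute URI")
    return problems


metadata1 = {"standard_metadata": "institution"}
metadata2 = {"external_vocabulary": "http://wmo.foo.int/identifierY"}
mixed = {"standard_metadata": "institution",
         "external_vocabulary": "http://wmo.foo.int/identifierY"}

results = {name: check_metadata_variable(attrs)
           for name, attrs in [("metadata1", metadata1),
                               ("metadata2", metadata2),
                               ("mixed", mixed)]}
```

Here metadata1 and metadata2 pass, while the variable carrying both attributes is flagged.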

(e) Conclusion
Discussion by many list contributors following Bryan's 27th October post has shown that the CF community would be able to work with the proposed method and there seems to be agreement that it provides the functionality needed to accommodate multi-forecast system ensembles. My own opinion is that it offers a flexible solution which will allow CF users to benefit from the work done in creating other metadata standards while at the same time allowing vocabularies to be governed within CF when that is deemed to be necessary.

Given the direction that the discussion has taken since the initial proposals were made on 15th October 2006 I will now close "ensembles" as a standard names issue. The names will _not_ be added to the table. However, ensembles will most definitely remain open as a CF1.0 conventions issue.

5. New standard names for variables concerning sea surface waves. Proposed by Heinz Guenther and Beate Geyer.
These names are currently under discussion and will not be added to the table as yet.

6. NMAT (Nighttime Marine Air Temperature) names. Proposed by Julian Hill.
These names are currently under discussion and will not be added to the table as yet.

7. Proposed standard names for biological model outputs. Proposed by Michael Godin.
These names are currently under discussion and will not be added to the table as yet.

------
Alison Pamment Tel: +44 1235 778065
NCAS/British Atmospheric Data Centre Fax: +44 1235 445858
Rutherford Appleton Laboratory Email: J.A.Pamment at rl.ac.uk
Chilton, Didcot, OX11 0QX, U.K.
Received on Thu Dec 07 2006 - 20:17:00 GMT
