
[CF-metadata] CF Controlled Vocabulary

From: Roy Lowry <rkl>
Date: Mon, 06 Aug 2007 11:54:50 +0100

Dear Frank,

Can I start by saying thanks for taking this forward and that your proposal has my whole-hearted support.

In essence what you have put forward is a simple XML schema for encoding the knowledge of how an entity may be derived from several fields in a CF file and for assigning a label to that entity. I have three comments on your examples.

(1) The XML encoding has been done independently of any W3C standard for knowledge encoding, such as OWL. Adopting such a standard could bring advantages in terms of tooling availability and interoperability. What are your feelings about heading in that direction? Is there anyone reading the list interested in taking this road?

(2) I feel the schema should allow the derived entity to carry more than one label so that synonym issues and support for standardised abbreviations may be addressed.

(3) I didn't understand the encoding that specifies that the temperature should be taken at a height of 2m. I would have expected a standard name for a CF dimension specifying height to be encoded somewhere. Note that I am not a regular hands-on CF user, so it may be that I need a bit of education to understand this.
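
To make point (2) concrete, here is a minimal Python sketch of an entity that carries several labels. The names CvEntry and build_label_index and the abbreviation "T2M" are hypothetical; they are not part of the proposal or of CF. Only the controlled name and the standard name are taken from example 3 of the proposal quoted below.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CvEntry:
    """One derived entity; `labels` holds the preferred name plus synonyms."""
    labels: tuple
    standard_name: str


def build_label_index(entries):
    """Map every label (case-insensitive) to its entry, rejecting clashes."""
    index = {}
    for entry in entries:
        for label in entry.labels:
            key = label.lower()
            if key in index and index[key] is not entry:
                raise ValueError(f"label {label!r} is used by two entries")
            index[key] = entry
    return index


# "T2M" is an invented abbreviation, used only for illustration.
t2m = CvEntry(labels=("air_temperature-at2m", "T2M"),
              standard_name="air_temperature")
index = build_label_index([t2m])
```

With such an index, a lookup by synonym or abbreviation resolves to the same entry as a lookup by the preferred label.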

Cheers, Roy.

>>> Frank Toussaint <Frank.Toussaint at zmaw.de> 7/12/2007 11:59 am >>>
Dear all,

At the CF day in Paris there was a discussion on a Controlled Vocabulary (CV) system to specify variables in the CF header semantically. Please find below a first draft of a proposal for this.

On behalf of the World Data Center Climate (WDCC)
   Heinke Hoeck and Frank Toussaint



Proposal for a CF Controlled Vocabulary as additional CF Header Entry
----------------------------------------------------------------------
(WDCC, Hamburg, July 10th, 2007)

In the earth sciences, the NetCDF-CF standard is used more and more, not only in the field of climate and forecast but also in adjacent branches of Earth system research.

However, information that provides a definitive description of what the data in each variable represents is often spread over various attributes in the NetCDF-CF file: one standard name can well stand for different parameters if it is accompanied by modifiers like "variance" or by a CF scalar coordinate. Other CF standard_names, of course, are "standalone" in the sense that they represent a completely defined parameter.

On the other hand, climatologists and other scientists have named their quantities with identifiers that are convenient to use, e.g., in climate models, but that are often not intrinsically systematic.

To allow for common semantic designators that define file contents in the granularity needed, we propose a CF "Controlled Vocabulary" (CV). Each of these CV keywords should reflect a well defined set of attribute/value pairs of the NetCDF-CF standard including a CF standard_name. The CV entry should be included in the variable's section of the NetCDF-CF header in addition to the existing CF standard attributes.

To keep the CV keywords of a standard Controlled Vocabulary consistent with the associated attribute/value sets, a mapping between the two should be maintained on a central server, e.g. by the CF standard names committee. A reference to the CV mapping should be included in the file header information.

With this approach, neither the consistency of the existing CF attributes, nor the CF conformance of the file, nor the concept of CF standard_names is affected. Nevertheless, the CV will enable the CF community to join frequently used attribute/value sets into one semantic entity, making the handling of complex variables easier and better adapted to the community. E.g., climatological standard designators like ice day or tropical night, which are difficult to integrate into the existing CF standard_name scheme, can be addressed by their defining set of CF attributes wherever CV keywords are used (e.g., portals, catalogues). The consistency between CV keyword and CF standard_names plus attributes should be ensured by the CF checker.
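
The consistency check described above could look roughly like the following Python sketch. It is not the CF checker. CV_MAPPING and check_cv_consistency are hypothetical names, the mapping is hard-coded here rather than fetched from a central server, and its single entry mirrors example 2 of this proposal. Attribute values are compared as exact strings, which is a simplification (equivalent unit spellings would be reported as mismatches).

```python
# Hypothetical mapping from CV keyword to the CF attributes it implies.
# The single entry mirrors example 2 of this proposal.
CV_MAPPING = {
    "minimum surface temperature": {
        "standard_name": "air_temperature",
        "units": "K",
        "cell_methods": "time: minimum (interval: 6 hours)",
    },
}


def check_cv_consistency(keyword, variable_attrs, mapping=CV_MAPPING):
    """Return a list of problems; an empty list means the variable is consistent."""
    if keyword not in mapping:
        return [f"unknown CV keyword: {keyword!r}"]
    problems = []
    for name, expected in mapping[keyword].items():
        actual = variable_attrs.get(name)
        if actual != expected:
            problems.append(f"{name}: expected {expected!r}, found {actual!r}")
    return problems


# Attributes as they might be read from a variable in a NetCDF-CF header.
tmin_attrs = {
    "standard_name": "air_temperature",
    "units": "K",
    "cell_methods": "time: minimum (interval: 6 hours)",
}
problems = check_cv_consistency("minimum surface temperature", tmin_attrs)
```

Because the check only reads the existing CF attributes, a file without any CV entry stays exactly as CF conformant as before.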

Below we add a few concrete examples of candidates for a controlled vocabulary. These can be expressed more or less easily by existing CF standard names plus attributes. Some others are still under discussion.




Examples of CV candidates (xml notation)
----------------------------------------
<cfControlledVocabulary cvName="CERAtopics">

<!-- example 1: topic is alias for standard_name: radiation ========== -->
<item controlledName="toa incoming shortwave flux">
  <variable typeName="float tisf">
      <standardName>toa_incoming_shortwave_flux</standardName>
      <units>W m-2</units>
  </variable>
</item>

<!-- example 2: cell methods: min ==================================== -->
<item controlledName="minimum surface temperature">
  <variable typeName="float tmin">
     <longName>air temperature minima during coming 6 hours</longName>
     <standardName>air_temperature</standardName>
     <units>K</units>
     <cellMethods>time: minimum (interval: 6 hours)</cellMethods>
  </variable>
</item>

<!-- example 3: scalar coordinate: 2m level ========================== -->
<item controlledName="air_temperature-at2m">
  <variable typeName="float t2m">
     <longName>air temperature at 2 metres height</longName>
     <standardName>air_temperature</standardName>
     <coordinates>lev2m</coordinates>
  </variable>
  <variable typeName="float lev2m">
     <longName>level of 2 metres height</longName>
     <units>meter</units>
  </variable>
  <data variable="lev2m">2.</data>
</item>

<!-- example 4: cell methods + scalar coord: daily temp max at 2m ==== -->
<item controlledName="maximum 2-meter temperature">
  <variable typeName="float tmax_2m">
     <longName>daily air 2m temperature maxima</longName>
     <standardName>air_temperature</standardName>
     <units>K</units>
     <cellMethods>time: maximum (interval: 1 day)</cellMethods>
     <coordinates>lev2m</coordinates>
  </variable>
  <variable typeName="float lev2m">
     <longName>level of 2 metres height</longName>
     <units>meter</units>
  </variable>
  <data variable="lev2m">2.</data>
</item>

</cfControlledVocabulary>
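
As an illustration of how a tool might read this notation, the following Python sketch loads a vocabulary into a dictionary keyed by controlledName, using only the standard library. The function name load_cv and the shape of the returned dictionary are assumptions made for this sketch and are not part of the proposal. Example 2 is repeated inside the code so that it is self-contained.

```python
import xml.etree.ElementTree as ET

# Example 2 from above, reproduced so that the sketch is self-contained.
CV_XML = """
<cfControlledVocabulary cvName="CERAtopics">
<item controlledName="minimum surface temperature">
  <variable typeName="float tmin">
     <longName>air temperature minima during coming 6 hours</longName>
     <standardName>air_temperature</standardName>
     <units>K</units>
     <cellMethods>time: minimum (interval: 6 hours)</cellMethods>
  </variable>
</item>
</cfControlledVocabulary>
"""


def load_cv(xml_text):
    """Return {controlledName: {variable name: {element tag: text}}}."""
    root = ET.fromstring(xml_text)
    vocabulary = {}
    for item in root.iter("item"):
        variables = {}
        for variable in item.findall("variable"):
            # typeName holds "<type> <name>", e.g. "float tmin".
            _dtype, name = variable.get("typeName").split()
            variables[name] = {child.tag: child.text for child in variable}
        vocabulary[item.get("controlledName")] = variables
    return vocabulary


cv = load_cv(CV_XML)
```

The sketch ignores the data elements used in examples 3 and 4; a fuller reader would also have to attach those values to the scalar coordinate variables they refer to.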

-- 
/** Dr. Frank Toussaint      Max-Planck-Institut für Meteorologie
  *  Pfitznerstr. 69            M&D / World Data Center - Climate
  *  22761 Hamburg                   Bundesstr.53 - 20146 Hamburg
  *  priv.Tel.: 040-3861 9285      office phone: +49-40-41173-175
  *  www.Leuchtturm-Atlas.de      e-mail: Frank.Toussaint at zmaw.de */ 
Received on Mon Aug 06 2007 - 04:54:50 BST

