[CF-metadata] Encoding Errors on variables in CF

From: Brian Eaton <eaton>
Date: Thu, 3 Apr 2003 16:24:16 -0700

All,

On Sun, Mar 30, 2003 at 10:58:01PM +0100, Jonathan Gregory wrote:
> The idea of making links between variables depends on the assumption that the
> error variable can only exist in the company of the data variable it belongs
> to, because the links are the identification of its role. This idea is like
> bounds on coordinate variables, which are subsidiary to the coordinate
> variables and have no attributes of their own. A problem with this, in my view,
> is that we might choose to store the error variables apart from the variables
> they belong to.

I don't think CF needs to address all possible ways that someone might
choose to distribute data between files. For example, we don't provide
conventions to store coordinates in a separate file from the data
variables.

Error measures are used to interpret the data they are associated with, so
data and error variables must be associated in some manner. I think that
directly linking them via an attribute is the simplest way of doing this.
Consider the situation where some quantity is being measured by multiple
instruments in a field program. Following Jonathan's recent suggestion I
might have:

float no2_i1(time) ;
  no2_i1:long_name = "no2 from instrument 1" ;
  no2_i1:standard_name = "nitrogen_dioxide_volume_mixing_ratio" ;
  no2_i1:units = "1e-9" ;
float no2_i1_error_limit(time) ;
  no2_i1_error_limit:standard_name = "nitrogen_dioxide_volume_mixing_ratio" ;
  no2_i1_error_limit:units = "1e-9" ;
  no2_i1_error_limit:intent = "standard_error" ;
float no2_i2(time) ;
  no2_i2:long_name = "no2 from instrument 2" ;
  no2_i2:standard_name = "nitrogen_dioxide_volume_mixing_ratio" ;
  no2_i2:units = "1e-9" ;
float no2_i2_error_limit(time) ;
  no2_i2_error_limit:standard_name = "nitrogen_dioxide_volume_mixing_ratio" ;
  no2_i2_error_limit:units = "1e-9" ;
  no2_i2_error_limit:intent = "standard_error" ;

In this example the error variables can't be unambiguously associated with
the corresponding data via the standard_name, so additional metadata would
be required to make the connection. On the other hand, a linking attribute
makes this simple (I have pluralized the "error_variable" suggestion from
Bryan):

float no2_i1(time) ;
  no2_i1:long_name = "no2 from instrument 1" ;
  no2_i1:standard_name = "nitrogen_dioxide_volume_mixing_ratio" ;
  no2_i1:units = "1e-9" ;
  no2_i1:error_variables = "no2_i1_error_limit" ;
float no2_i1_error_limit(time) ;
  no2_i1_error_limit:units = "1e-9" ;
float no2_i2(time) ;
  no2_i2:long_name = "no2 from instrument 2" ;
  no2_i2:standard_name = "nitrogen_dioxide_volume_mixing_ratio" ;
  no2_i2:units = "1e-9" ;
  no2_i2:error_variables = "no2_i2_error_limit" ;
float no2_i2_error_limit(time) ;
  no2_i2_error_limit:units = "1e-9" ;
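
To make the ambiguity in the first example concrete, here is a minimal
Python sketch. It models the variables as plain dictionaries (a stand-in for
whatever netCDF reader is actually in use), and the helper name
candidates_by_standard_name is hypothetical. Matching on standard_name alone
returns both error variables for either data variable:

```python
# Hypothetical in-memory model of the first example: variable name -> attributes.
SN = "nitrogen_dioxide_volume_mixing_ratio"
variables = {
    "no2_i1": {"long_name": "no2 from instrument 1", "standard_name": SN},
    "no2_i1_error_limit": {"standard_name": SN, "intent": "standard_error"},
    "no2_i2": {"long_name": "no2 from instrument 2", "standard_name": SN},
    "no2_i2_error_limit": {"standard_name": SN, "intent": "standard_error"},
}


def candidates_by_standard_name(variables, data_name):
    """Return every variable carrying an 'intent' attribute whose
    standard_name equals that of the data variable."""
    target = variables[data_name]["standard_name"]
    return sorted(
        name
        for name, attrs in variables.items()
        if "intent" in attrs and attrs.get("standard_name") == target
    )


matches = candidates_by_standard_name(variables, "no2_i1")
print(matches)  # ['no2_i1_error_limit', 'no2_i2_error_limit']
```

Two candidates come back for no2_i1, so the standard_name by itself cannot
say which error variable belongs to which instrument.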

The attribute "error_variables" can take multiple names. This deals with
the following example from Ag:

float no2(time) ;
  no2:standard_name = "no2_mixing_ratio" ;
  no2:long_name = "Nitrogen Dioxide Mass Mixing Ratio" ;
  no2:units = "1e-9" ;
  no2:error_variables = "no2_error_limit no2_detection_limit" ;
float no2_error_limit(time) ;
  no2_error_limit:long_name = "Nitrogen Dioxide Error Limit" ;
  no2_error_limit:units = "1e-9" ;
  no2_error_limit:comment = "Units are given in parts per
    billion by volume. The error limit is quoted for 2 sigma random errors plus
    systematic uncertainties derived from cross-sectional fits." ;
float no2_detection_limit(time) ;
  no2_detection_limit:long_name = "Nitrogen Dioxide Detection Limit" ;
  no2_detection_limit:units = "1e-9" ;
  no2_detection_limit:comment = "Units are given in parts per
    billion by volume. The detection limit is quoted for 2 standard deviations." ;
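
As a rough illustration of how a reader might follow the link, the sketch
below splits the attribute value on blanks and looks up each name. The
dictionary model and the function name resolve_error_variables are
assumptions made for this example only; they are not part of any CF
software.

```python
# Hypothetical in-memory model of Ag's example: variable name -> attributes.
variables = {
    "no2": {
        "standard_name": "no2_mixing_ratio",
        "error_variables": "no2_error_limit no2_detection_limit",
    },
    "no2_error_limit": {"long_name": "Nitrogen Dioxide Error Limit"},
    "no2_detection_limit": {"long_name": "Nitrogen Dioxide Detection Limit"},
}


def resolve_error_variables(variables, data_name):
    """Return {name: attributes} for each variable named in the
    blank-separated error_variables attribute of data_name."""
    names = variables[data_name].get("error_variables", "").split()
    missing = [n for n in names if n not in variables]
    if missing:
        # A dangling link is treated as an error in this sketch.
        raise KeyError(f"error_variables names not found: {missing}")
    return {n: variables[n] for n in names}


linked = resolve_error_variables(variables, "no2")
print(list(linked))  # ['no2_error_limit', 'no2_detection_limit']
```

A data variable without the attribute simply yields an empty result, so the
link stays optional.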

The example sent by John Evans can also be dealt with using the linking
attribute.

The other issue besides linking is to actually describe the error measures.
Ag's suggestion of using the comment field to describe the error is a
solution that's already accommodated by CF (i.e., the comment attribute is
allowed on any variable). Jonathan has begun to standardize the
description with his "intent" attribute. But, if we don't use the
standard_name as a linking method, then it is available for its intended
use, i.e., describing the quantities contained in variables. In that case
the values being suggested as intents could instead be new standard names.

The use of flag variables is common and I think Jonathan's proposed
attributes "flag_values" and "flag_meanings" are a good idea. It seems
that if a variable has these attributes then further description via a
standard name, such as "data_quality", is unnecessary.
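
A minimal sketch of how such a flag variable could be decoded follows. The
exact form of the attributes is an assumption here: flag_values is taken to
be a list of integers and flag_meanings a blank-separated string with one
word per value. The sample values and the function name decode_flags are
invented for illustration.

```python
def decode_flags(flag_values, flag_meanings, data):
    """Translate each flag value in data to its meaning.

    Assumes flag_meanings is a blank-separated string whose words
    pair up, in order, with the entries of flag_values.
    """
    meanings = flag_meanings.split()
    if len(meanings) != len(flag_values):
        raise ValueError("flag_values and flag_meanings differ in length")
    lookup = dict(zip(flag_values, meanings))
    # Values outside flag_values are reported as "unknown" in this sketch.
    return [lookup.get(value, "unknown") for value in data]


# Invented sample attributes and data for a quality-flag variable.
decoded = decode_flags([0, 1, 2], "good suspect bad", [0, 2, 1, 0, 7])
print(decoded)  # ['good', 'bad', 'suspect', 'good', 'unknown']
```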

Summary of proposed changes to CF:
. Add the "error_variables" attribute to link data and error variables.
. Add new standard_name values to provide descriptions of the error types.
. Add the "flag_values" and "flag_meanings" attributes to describe flag
  variables.

Brian
Received on Thu Apr 03 2003 - 16:24:16 BST
