[CF-metadata] Encoding Errors on variables in CF

From: Jonathan Gregory <jonathan.gregory>
Date: Wed, 16 Apr 2003 23:12:53 +0100

Dear All

It appears there's quite a lot of agreement, but I also think there might have
been a small misunderstanding. The prefix I proposed of "ancillary_data_for"
was a prefix on the standard_name i.e. on the *value* of an attribute. It
wasn't intended to be the *name* of a new attribute and it wasn't naming a
variable. However, I agree with Russ that ancillary_variables is a better name
than ancillary_data for an attribute pointing to variables.

Bryan has suggested that in addition to this link from parent variable to
ancillary variables, there is also a need for a link from ancillary to parent
variables. What do others think? I am a bit uneasy about this further
redundancy, which could lead to internal inconsistency in the file. A parent
variable can be found unambiguously by searching all variables to find the one
which points to the ancillary variable of interest. But that is more laborious,
of course.
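
As a rough illustration of that search, here is a minimal Python sketch. The
dict-of-attributes layout and the function name find_parents are assumptions
made for this example, not part of any netCDF library.

```python
def find_parents(variables, ancillary_name):
    """Return names of variables whose ancillary_variables lists ancillary_name."""
    parents = []
    for name, attrs in variables.items():
        # The attribute holds a blank-separated list of variable names.
        listed = attrs.get("ancillary_variables", "").split()
        if ancillary_name in listed:
            parents.append(name)
    return parents


# Hypothetical in-memory stand-in for a file's variables and their attributes.
variables = {
    "wind_direction_uv": {
        "standard_name": "wind_from_direction",
        "ancillary_variables": "wind_direction_uv_stddev wdir_uv_qc",
    },
    "wind_direction_uv_stddev": {"intent": "standard_error"},
    "wdir_uv_qc": {"intent": "data_quality"},
}

print(find_parents(variables, "wdir_uv_qc"))
```

The search visits every variable once, which is the extra labour mentioned
above, but it needs no back-link that could fall out of step with the parent.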

Without this addition, the modified summary is:

(1) Point from a variable to its associated ancillary data variables (error
variables, data quality variables and others not yet thought of) through a
blank-separated list of variable names in an ancillary_variables attribute. Be
aware this link might get broken.

(2) Give an ancillary variable a standard_name which is constructed by adding
the prefix of "ancillary_data_for_" to the standard_name of its parent variable.
The ancillary variable should also have copies of all the other attributes of
its parent variable, so that it is fully self-describing.

(3) Give it also an intent attribute with a standardised value to define the
kind of ancillary data, and other attributes such as error_multiplier if
required to be precise.

(4) Use attributes flag_values and flag_meanings to provide interpretations of
ancillary variables containing flag data.
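
Points (2) to (4) could be sketched in Python roughly as follows. This is only
an illustration: the helper names, the "units" value, and the decision not to
copy the parent's ancillary_variables attribute are assumptions, not part of
the proposal.

```python
PREFIX = "ancillary_data_for_"


def ancillary_attributes(parent_attrs, intent):
    """Build attributes for an ancillary variable from its parent's attributes."""
    attrs = dict(parent_attrs)  # copy the parent's attributes, point (2)
    # Assumption: the link to ancillary variables stays on the parent only.
    attrs.pop("ancillary_variables", None)
    attrs["standard_name"] = PREFIX + parent_attrs["standard_name"]
    attrs["intent"] = intent  # kind of ancillary data, point (3)
    return attrs


def flag_lookup(flag_values, flag_meanings):
    """Pair each flag value with its blank-separated meaning, point (4)."""
    meanings = flag_meanings.split()
    if len(meanings) != len(flag_values):
        raise ValueError("flag_values and flag_meanings differ in length")
    return dict(zip(flag_values, meanings))


# Hypothetical parent attributes; the units value is assumed for illustration.
parent = {
    "standard_name": "wind_from_direction",
    "units": "degree",
    "ancillary_variables": "wind_direction_uv_stddev wdir_uv_qc",
}

stddev_attrs = ancillary_attributes(parent, "standard_error")
flags = flag_lookup(
    [0, 1, 2], "quality_good out_of_range sensor_nonfunctional"
)
print(stddev_attrs["standard_name"])
print(flags[1])
```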

What this means for John's new example is at the end.

There's a drawback with these "generalised" ancillary standard names: you
can't define the unit for the standard name. For example, the unit for a
standard error is the same as that of the parent quantity, but the unit for a
data quality variable is dimensionless. This breaks the principle that you can
deduce the unit from the standard_name (and the cell_methods). In this scheme
you have to strip off the prefix ancillary_data_for_. Is this OK?
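
To make the question concrete, an application might deduce the unit along the
lines of this Python sketch. The table contents, the per-intent unit rules, and
the function name deduce_unit are all assumptions made for illustration.

```python
PREFIX = "ancillary_data_for_"

# Hypothetical excerpt of a standard_name table; the unit shown is assumed.
STANDARD_NAME_UNITS = {"wind_from_direction": "degree"}

# Assumed rule per intent: "parent" means the parent's unit, else a fixed unit.
INTENT_UNITS = {"standard_error": "parent", "data_quality": "1"}


def deduce_unit(standard_name, intent=None):
    """Deduce the unit, stripping the ancillary prefix when it is present."""
    if standard_name.startswith(PREFIX):
        parent_name = standard_name[len(PREFIX):]
        parent_unit = STANDARD_NAME_UNITS[parent_name]
        rule = INTENT_UNITS[intent]
        return parent_unit if rule == "parent" else rule
    return STANDARD_NAME_UNITS[standard_name]


print(deduce_unit("ancillary_data_for_wind_from_direction", "standard_error"))
print(deduce_unit("ancillary_data_for_wind_from_direction", "data_quality"))
```

Note that the standard_name alone is no longer enough here: the intent
attribute is needed as well to settle the unit.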

One alternative is what we originally discussed, not to have a separate intent
attribute, but to have standard_names like
standard_error_of_wind_from_direction. We didn't like that because it would
make the standard_name table huge. But if an application can be expected to
recognise standard prefixes on standard_names, we wouldn't have to put all
these entries for ancillary variables in the standard_name table explicitly.

We could make this easier, perhaps, by extending the syntax so that the
standard_name attribute could contain two words e.g.
"wind_from_direction standard_error"
In that case, a generic application would find it easier to decompose the
standard name into two strings which it looks up in separate tables for
units information. Would that be any better?
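
A minimal Python sketch of that two-table lookup, under stated assumptions: the
table contents are hypothetical, and using "data_quality" as a second word is
assumed by analogy with the intent values above.

```python
# Hypothetical excerpt of the standard_name table; the unit is assumed.
STANDARD_NAME_UNITS = {"wind_from_direction": "degree"}

# Hypothetical second table; None means "same unit as the parent quantity".
MODIFIER_UNITS = {"standard_error": None, "data_quality": "1"}


def units_for(standard_name_attr):
    """Split a one- or two-word standard_name and look each word up separately."""
    parts = standard_name_attr.split()
    if len(parts) not in (1, 2):
        raise ValueError("expected one or two blank-separated words")
    base_unit = STANDARD_NAME_UNITS[parts[0]]
    if len(parts) == 1:
        return base_unit
    modifier_unit = MODIFIER_UNITS[parts[1]]
    return base_unit if modifier_unit is None else modifier_unit


print(units_for("wind_from_direction standard_error"))
```

Splitting on blanks avoids any prefix matching, so the standard_name table
itself would not need entries for the ancillary variants.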

Sorry to make the discussion more complicated again.

Jonathan

  float wind_direction_uv(time, wind_depth, lat, lon) ;
    wind_direction_uv:standard_name = "wind_from_direction" ;
    wind_direction_uv:ancillary_variables=
      "wind_direction_uv_stddev wdir_uv_qc";
  float wind_direction_uv_stddev(time, wind_depth, lat, lon) ;
    wind_direction_uv_stddev:intent = "standard_error" ;
  byte wdir_uv_qc(time, wind_depth, lat, lon) ;
    wdir_uv_qc:intent = "data_quality" ;
    wdir_uv_qc:flag_values = 0b, 1b, 2b ;
    wdir_uv_qc:flag_meanings = "quality_good out_of_range sensor_nonfunctional" ;

I'm not sure I quite understand the distinction between the two wind directions.
Is it that one is the direction of the time-average vector wind, and the other
the time-average of the direction of the vector wind? That's interesting - I
think there ought to be a way to distinguish these with cell_methods, but there
isn't at present.
Received on Wed Apr 16 2003 - 16:12:53 BST
