
[CF-metadata] interplay of standard name modifiers, cell_methods -- is there a problem?

From: Lowry, Roy K. <rkl>
Date: Wed, 3 Apr 2013 08:46:29 +0100

Hi Jim,

There are a lot of nails being hit on the head at the moment. The Standard Name attribute was conceived as a standardised label for the geophysical phenomenon - a sort of grouping term for what was being measured. Note that the Standard Name isn't a mandatory attribute in CF - the rules state that there needs to be either a long name OR a Standard Name!

Over time there have been various attempts to turn the Standard Name into something different - the 'single location to gather important information about the kind of data contained in a variable' as you put it (I like that description). For example, the OceanSites community made the Standard Name mandatory in their CF profile. This caused requests to be made for new Standard Names appropriate to your definition, but not the original Standard Name concept. Some of these got through: others didn't, which makes the entity definition of the Standard Name concept a little blurred.

Another strategy has been to provide a 'signpost' pointing out the location of all the various bits of information needed to fulfil your definition. There was a proposal called Common Concept (Trac ticket 24) designed to do this. Unfortunately, it required a small but significant amount of effort to set up that was never resourced. In fact, it seemed like it was cursed - I even had to hand back funding allocated for the purpose because a critical staff member left at a time when there was a total ban on UK public service recruitment. A 'resource light' version of this strategy - CF String Syntax (Trac ticket 94, which I'm moderating) - is currently ready to implement once it has been written up as a Conventions document update. If we get this finished, do you think it would resolve the problems you see?

Cheers, Roy.


Please note that I now work part-time from Tuesday to Thursday. E-mail response on other days is possible but not guaranteed!

From: CF-metadata [mailto:cf-metadata-bounces at cgd.ucar.edu] On Behalf Of Jim Biard
Sent: 02 April 2013 21:19
To: cf-metadata at cgd.ucar.edu
Cc: Jonathan Gregory
Subject: Re: [CF-metadata] interplay of standard name modifiers, cell_methods -- is there a problem?

Jonathan,

You haven't been unclear about how we got into the state we are currently in. We got to where we are now by adding bits here and there as needs arose without always thinking about the implications for the whole system down the road (which you did a great job of describing). A lot of good work has been done to get where we are today, and I appreciate that. I also think that there is room for improvement in how we represent values that result from operations applied to measurements.

Yes, we could add paragraphs to the documentation to tell users to go look in various places when they are trying to figure out what the contents of a variable are, but that is not user-friendly behavior. I want to make it easy and intuitive to understand what sort of information a variable contains. If we consider the standard name attribute as the location where the essence of the variable contents is described using a controlled vocabulary, it makes sense to provide a mechanism within that vocabulary for distinguishing between a direct measurement (air temperature) and information about a measurement (standard deviation of air temperature). We provide this for some operations on measurements (number of observations, standard error, etc), but not for others (standard deviation, variance, anomaly, etc).

Expanding the list of standard name modifiers in the CF Metadata Conventions would allow us to make variables more self-describing and less confusing, and allow a user (or software) to look in a single location to gather important information about the kind of data contained in a variable (which I see as the purpose of the standard_name attribute).
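
For illustration, the current situation can be sketched in Python. This is a minimal sketch, not part of the original message: the function name split_standard_name is hypothetical, and the modifier set is assumed to be the one listed in Appendix C of the CF Conventions (version 1.6). It shows that a value such as "air_temperature standard_error" is recognised today, while "air_temperature standard_deviation" is not.

```python
# Modifiers assumed from Appendix C of the CF Conventions (version 1.6).
KNOWN_MODIFIERS = {
    "detection_minimum",
    "number_of_observations",
    "standard_error",
    "status_flag",
}


def split_standard_name(value):
    """Split a standard_name attribute value into (name, modifier or None).

    Hypothetical helper: the attribute value is a standard name optionally
    followed by blanks and one modifier.
    """
    parts = value.split()
    if len(parts) == 1:
        return parts[0], None
    if len(parts) == 2 and parts[1] in KNOWN_MODIFIERS:
        return parts[0], parts[1]
    raise ValueError(f"unrecognised standard_name value: {value!r}")


print(split_standard_name("air_temperature"))
print(split_standard_name("air_temperature standard_error"))
```

Expanding the modifier list as proposed would amount to adding entries such as standard_deviation, variance or anomaly to the set above.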

Grace and peace,

Jim

Jim Biard
Research Scholar
Cooperative Institute for Climate and Satellites
Remote Sensing and Applications Division
National Climatic Data Center
151 Patton Ave, Asheville, NC 28801-5001

jim.biard at noaa.gov
828-271-4900

On Apr 2, 2013, at 1:05 PM, Jonathan Gregory <j.m.gregory at reading.ac.uk> wrote:


Dear all

Jim asked,

"As some examples of the confusing situation we have now, why do we have a
separate word modifier number_of_observations instead of a
number_of_observations_of_X transformation modifier? Why don't we have
variance_of_X or anomaly_of_X transformations (or separate word modifiers
variance or anomaly)? Why isn't there a cell method for standard error? I
can't discern any logic behind the current partitioning."

I've tried to explain how this came about, but perhaps I am not being clear,
so let me try again:

* We introduced the modifiers like number_of_observations for those situations
where it was thought likely that a large number of standard names would need
them. Factorising out this dimension thus avoids a large expansion of the
standard name table. So far, only four anomaly_of names have been requested,
so it seems the right judgement not to have a standard_name modifier for that.

* That was also one of the motivations for cell_methods: there would be vastly
more standard names if we had to include all the cell_methods information too.
The other motivation for cell_methods is that the statistical operations
relate to particular axes. For instance, just "mean" is too vague: does it
mean time-mean, zonal-mean, mean over radiation wavelength, or what? The same
is true for variance. The cell_methods attribute makes this precise.

* There is not a cell method for standard error because it does not relate to
a particular dimension. The standard error is a metadata property of the
individual data. The cell methods statistically describe the variation of the
quantity within cells. These are different purposes.

While you may not agree with the logic, does this help to explain what it is?
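
To make the point about axes concrete, here is a minimal Python sketch (not part of the original message) that reads a simple cell_methods value into (axes, method) pairs. The function name parse_cell_methods is hypothetical, and the sketch assumes only the plain "name: method" form; parenthesised comments and keywords such as "where" or "within" are not handled.

```python
import re

# One group is one or more "name:" tokens followed by a method word,
# e.g. "time: mean" or "lat: lon: variance".
_GROUP = re.compile(r"((?:\w+:\s+)+)(\w+)")


def parse_cell_methods(value):
    """Return a list of (axes, method) pairs from a simple cell_methods string."""
    result = []
    for names, method in _GROUP.findall(value):
        # "lat: lon: " -> ("lat", "lon")
        axes = tuple(name.rstrip(":") for name in names.split())
        result.append((axes, method))
    return result


print(parse_cell_methods("time: mean"))
print(parse_cell_methods("time: mean lat: lon: variance"))
```

A bare "mean" with no axis name yields no pairs at all, which mirrors the argument that the method is only meaningful once it is tied to a particular axis.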

If the situation is perceived as confusing and easily misunderstood, I am all
in favour of clarifying it by inserting more explanation and discussion in the
CF standard document. That could be done with a defect ticket. As Philip says,
it could shorten future discussions.

But we can also change the standard, of course. However, changes to existing
attributes are difficult for existing software. I do not think we need or
ought to change the existing attributes. While I appreciate the reason for the
suggestion, I feel that suffixing something to the standard_name to indicate
"something" has been done to it would not really help, because there is almost
*always* something done to it! Cell methods are recommended to be specified in
any case where the default "point" or "sum" is not correct. They should be
present if the quantity is a mean, in particular. A mean is also a
transformation, just like a standard deviation.

I am not convinced yet by the argument that we have to modify the CF standard
because the standard_name may be misunderstood or misused by software which
catalogues or serves datasets. CF introduced the standard_name attribute. If
it's being used now, software must already have been modified to support CF.
Well then, why can't it be modified again to support CF more fully or correctly?
If we explained more clearly in the standard what the intention was, that would
no doubt help with future software design.

Instead of changing what we have, I think we should add to it. It seems to me,
as I've said before, that the existing proposal for "CF strings" summarising
some essential metadata (similar to the earlier proposal for common concepts
in some ways) would solve this problem. It is *that* kind of string, not the
standard name, that the user should be offered to select an appropriate
variable. It's a combination of attributes. It's not hard to assemble that
information from the separate attributes, but if that's an obstacle, we could
help software over it by recommending that this extra attribute be included.
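
As a rough illustration of how little work that assembly is, here is a minimal Python sketch (not part of the original message). The function name summary_string, the choice of attributes and the "key=value; ..." layout are all assumptions made for this example; they are NOT the syntax proposed in Trac ticket 94.

```python
def summary_string(attrs, keys=("standard_name", "cell_methods", "units")):
    """Join selected attributes of one variable into a single string.

    Hypothetical layout for illustration only; attributes that are absent
    from the variable are simply skipped.
    """
    return "; ".join(f"{key}={attrs[key]}" for key in keys if key in attrs)


variable_attrs = {
    "standard_name": "air_temperature",
    "cell_methods": "time: mean",
    "units": "K",
}
print(summary_string(variable_attrs))
```

A catalogue or data server could offer such a combined string to the user for selecting a variable, rather than the standard name alone.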

Please have a look at https://cf-pcmdi.llnl.gov/trac/ticket/94 and add your
comments on it.

Best wishes

Jonathan
Received on Wed Apr 03 2013 - 01:46:29 BST
