[CF-metadata] interplay of standard name modifiers, cell_methods -- is there a problem? from Jim Biard on 2013-04-02 (Archive of CF discussions from 2002 to 2019 on the cf-metadata mailing list)

From: Jim Biard <jim.biard>
Date: Tue, 2 Apr 2013 16:19:23 -0400

Jonathan,

You haven't been unclear about how we got into the state we are currently in. We got here by adding bits here and there as needs arose, without always thinking about the implications for the whole system down the road (a history you did a great job of describing). A lot of good work has been done to get where we are today, and I appreciate that. I also think that there is room for improvement in how we represent values that result from operations applied to measurements.

Yes, we could add paragraphs to the documentation telling users to look in various places when they are trying to figure out what a variable contains, but that is not user-friendly behavior. I want to make it easy and intuitive to understand what sort of information a variable contains. If we consider the standard_name attribute as the location where the essence of the variable contents is described using a controlled vocabulary, it makes sense to provide a mechanism within that vocabulary for distinguishing between a direct measurement (air temperature) and information about a measurement (standard deviation of air temperature). We provide this for some operations on measurements (number of observations, standard error, etc.), but not for others (standard deviation, variance, anomaly, etc.).

Expanding the list of standard name modifiers in the CF Metadata Conventions would allow us to make variables more self-describing and less confusing, and allow a user (or software) to look in a single location to gather important information about the kind of data contained in a variable (which I see as the purpose of the standard_name attribute).
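
To make the current split concrete, here is a minimal Python sketch of what a user or a piece of software has to do today. It assumes the modifier list from Appendix C of the CF 1.6 conventions as I understand it, and a variable's attributes held in a plain dict; the function names split_standard_name and describe are hypothetical, not part of any CF library.

```python
# Standard name modifiers, assumed here to match Appendix C of CF 1.6.
CF_MODIFIERS = {
    "detection_minimum",
    "number_of_observations",
    "standard_error",
    "status_flag",
}


def split_standard_name(value):
    """Split a standard_name attribute into (name, modifier or None)."""
    parts = value.split()
    if len(parts) == 1:
        return parts[0], None
    if len(parts) == 2 and parts[1] in CF_MODIFIERS:
        return parts[0], parts[1]
    raise ValueError(f"unrecognised standard_name value: {value!r}")


def describe(attrs):
    """Summarise a variable from its standard_name and cell_methods attributes."""
    name, modifier = split_standard_name(attrs["standard_name"])
    pieces = [name]
    if modifier is not None:
        pieces.append(f"modifier={modifier}")
    methods = attrs.get("cell_methods")
    if methods:
        pieces.append(f"cell_methods={methods}")
    return "; ".join(pieces)


# A standard error is visible in standard_name itself ...
print(describe({"standard_name": "air_temperature standard_error"}))
# ... but a standard deviation only shows up in cell_methods.
print(describe({"standard_name": "air_temperature",
                "cell_methods": "time: standard_deviation"}))
```

In the second case the standard_name alone reads as plain air_temperature, which is the situation I find confusing.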

Grace and peace,

Jim

Jim Biard
Research Scholar
Cooperative Institute for Climate and Satellites
Remote Sensing and Applications Division
National Climatic Data Center
151 Patton Ave, Asheville, NC 28801-5001

jim.biard at noaa.gov
828-271-4900

On Apr 2, 2013, at 1:05 PM, Jonathan Gregory <j.m.gregory at reading.ac.uk> wrote:

> Dear all
>
> Jim asked,
>
> "As some examples of the confusing situation we have now, why do we have a
> separate word modifier number_of_observations instead of a
> number_of_observations_of_X transformation modifier? Why don't we have
> variance_of_X or anomaly_of_X transformations (or separate word modifiers
> variance or anomaly)? Why isn't there a cell method for standard error? I
> can't discern any logic behind the current partitioning."
>
> I've tried to explain how this came about, but perhaps I am not being clear,
> so let me try again:
>
> * We introduced the modifiers like number_of_observations for those situations
> where it was thought likely that a large number of standard names would need
> them. Factorising out this dimension thus avoids a large expansion of the
> standard name table. So far, only four anomaly_of names have been requested,
> so it seems the right judgement not to have a standard_name modifier for that.
>
> * That was also one of the motivations for cell_methods: there would be vastly
> more standard names if we had to include all the cell_methods information too.
> The other motivation for cell_methods is that the statistical operations
> relate to particular axes. For instance, just "mean" is too vague: does it
> mean time-mean, zonal-mean, mean over radiation wavelength, or what? The same
> is true for variance. The cell_methods attribute makes this precise.
>
> * There is not a cell method for standard error because it does not relate to
> a particular dimension. The standard error is a metadata property of the
> individual data. The cell methods statistically describe the variation of the
> quantity within cells. These are different purposes.
>
> While you may not agree with the logic, does this help to explain what it is?
>
> If the situation is perceived as confusing and easily misunderstood, I am all
> in favour of clarifying it by inserting more explanation and discussion in the
> CF standard document. That could be done with a defect ticket. As Philip says,
> it could shorten future discussions.
>
> But we can also change the standard, of course. However, changes to existing
> attributes are difficult for existing software. I do not think we need or
> ought to change the existing attributes. While I appreciate the reason for the
> suggestion, I feel that suffixing something to the standard_name to indicate
> "something" has been done to it would not really help, because there is almost
> *always* something done to it! Cell methods are recommended to be specified in
> any case where the default "point" or "sum" is not correct. They should be
> present if the quantity is a mean, in particular. A mean is also a
> transformation, just like a standard deviation.
>
> I am not convinced yet by the argument that we have to modify the CF standard
> because the standard_name may be misunderstood or misused by software which
> catalogues or serves datasets. CF introduced the standard_name attribute. If
> it's being used now, software must already have been modified to support CF.
> Well then, why can't it be modified again to support CF more fully or correctly?
> If we explained more clearly in the standard what the intention was, that would
> no doubt help with future software design.
>
> Instead of changing what we have, I think we should add to it. It seems to me,
> as I've said before, that the existing proposal for "CF strings" summarising
> some essential metadata (similar to the earlier proposal for common concepts
> in some ways) would solve this problem. It is *that* kind of string, not the
> standard name, that the user should be offered to select an appropriate
> variable. It's a combination of attributes. It's not hard to assemble that
> information from the separate attributes, but if that's an obstacle, we could
> help software over it by recommending that this extra attribute be included.
>
> Please have a look at https://cf-pcmdi.llnl.gov/trac/ticket/94 and add your
> comments on it.
>
> Best wishes
>
> Jonathan

Received on Tue Apr 02 2013 - 14:19:23 BST