[CF-metadata] Question from NODC about interplay of standard name modifiers, cell_methods, etc.

From: Jim Biard <jim.biard>
Date: Mon, 1 Apr 2013 11:15:28 -0400

Hi.

I think it makes good sense to provide a consistent location and controlled vocabulary for specifying that a variable contains a transformation of the original measurement that tells you more about the measurement. (That's the best way I could think of to describe it in general terms.) The standard name attribute seems to me to be a natural and obvious location for storing this information, and we already have an initial vocabulary of standard name modifiers that are used for this very purpose. We aren't violating the "essential" nature of the standard name by allowing it to clearly identify that a variable contains a variance or standard deviation of a measurement. We shouldn't send people (or machines) jumping around in an attempt to figure out what is actually stored in a variable.

It seems to me that we need to do a unification and/or realignment between the standard name modifiers listed in the main CF document, the cell methods listed in the main CF document, and the transformations listed in the "Guidelines for Construction of Standard Names" document. (We wouldn't delete anything, of course. Backwards-compatibility issues, and all that.)

As some examples of the confusing situation we have now, why do we have a separate word modifier number_of_observations instead of a number_of_observations_of_X transformation modifier? Why don't we have variance_of_X or anomaly_of_X transformations (or separate word modifiers variance or anomaly)? Why isn't there a cell method for standard error? I can't discern any logic behind the current partitioning.
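To make the current arrangement concrete, here is a minimal Python sketch of how a reader has to pull apart a standard_name attribute today. It assumes the CF 1.6 rule that a modifier follows the standard name after whitespace, and it hard-codes the four modifiers I believe Appendix C lists. The function name and the set name are my own inventions for illustration, not part of any CF library.

```python
# Modifiers as I understand them from CF 1.6 Appendix C (assumption:
# this list is complete for that version of the conventions).
CF_1_6_MODIFIERS = {
    "detection_minimum",
    "number_of_observations",
    "standard_error",
    "status_flag",
}


def split_standard_name(value):
    """Split a standard_name attribute into (name, modifier).

    Hypothetical helper: returns modifier=None when no modifier is
    present, and raises ValueError for anything it does not recognise.
    """
    parts = value.split()
    if len(parts) == 1:
        return parts[0], None
    if len(parts) == 2 and parts[1] in CF_1_6_MODIFIERS:
        return parts[0], parts[1]
    raise ValueError(f"unrecognised standard_name attribute: {value!r}")


print(split_standard_name("sea_water_temperature standard_error"))
print(split_standard_name("sea_water_temperature"))
```

Note that "sea_water_temperature variance" is rejected by this sketch, because variance is available only as a cell method, which is exactly the kind of split I am questioning.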

I think that squaring this up could make life much simpler for people trying to navigate this maze. (I know it would make my life simpler!)

Grace and peace,

Jim

Jim Biard
Research Scholar
Cooperative Institute for Climate and Satellites
Remote Sensing and Applications Division
National Climatic Data Center
151 Patton Ave, Asheville, NC 28801-5001

jim.biard at noaa.gov
828-271-4900

On Mar 29, 2013, at 8:33 PM, "Cameron-smith, Philip" <cameronsmith1 at llnl.gov> wrote:

> Hi All,
>
> I can think of two different cases:
>
> 1) Repeated measurements are made of a physical quantity. The best estimate of the physical quantity is then the mean, reported with its standard error. In this case the standard deviation is really a property of the measurement system rather than the physical quantity.
>
> 2) A system contains different values of the physical quantity in different times/places within the region of interest. In this case the standard deviation is clearly related to the physical quantity (IMHO). For example, the standard deviation of aerosol diameters in an air parcel, or the standard deviation of surface altitudes within a gridbox.
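The distinction in case 1 can be sketched numerically. The snippet below is only an illustration under stated assumptions: the function name is hypothetical, the sample standard deviation (n - 1 denominator) is assumed, and the standard error is taken as the usual standard deviation divided by the square root of the number of measurements.

```python
import math
import statistics


def mean_and_standard_error(samples):
    """Return (mean, sample standard deviation, standard error of the mean).

    Hypothetical helper for repeated measurements of one quantity.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two measurements")
    mean = statistics.fmean(samples)
    # Sample standard deviation: describes the scatter of the measurements.
    sd = statistics.stdev(samples)
    # Standard error: describes the uncertainty of the mean itself.
    se = sd / math.sqrt(n)
    return mean, sd, se


mean, sd, se = mean_and_standard_error([10.0, 12.0, 14.0, 16.0])
print(mean, sd, se)
```

With more repeated measurements the standard error shrinks while the standard deviation does not, which is one way to see why the two might be labelled differently.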
>
> For a time series of, say, temperature, I can imagine wanting to store various quantities, e.g.:
>
> 1) Instantaneous temperature.
> 2) Mean temperature over an interval.
> 3) std dev of temperature over that interval.
> 4) max over an interval.
> 5) min over an interval.
>
> Since I can imagine wanting to use these on most, if not all, of the physical quantities in the standard names, I think it makes sense to put these in a separate piece of metadata. This is exactly what cell_methods does, and indeed there are actually 10 such methods. This makes more sense to me than increasing the size of the standard name table by a factor of 10.
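The interval statistics listed above can be paired with cell_methods strings in a short Python sketch. This is an illustration only: the function name is hypothetical, the "time: method" string form follows my reading of CF 1.6, and the use of the population standard deviation is an assumption, since the choice of denominator is not settled in this thread.

```python
import statistics


def summarize_interval(values, axis="time"):
    """Map cell_methods-style strings to statistics of one interval.

    Hypothetical helper: each key is the cell_methods value that would
    accompany a variable holding the corresponding statistic.
    """
    return {
        f"{axis}: mean": statistics.fmean(values),
        # Assumption: population standard deviation (denominator n).
        f"{axis}: standard_deviation": statistics.pstdev(values),
        f"{axis}: maximum": max(values),
        f"{axis}: minimum": min(values),
    }


for cell_method, value in summarize_interval([1.0, 2.0, 3.0, 4.0]).items():
    print(cell_method, value)
```

In every case the standard_name of the variable would stay the same (for example air_temperature), and only the cell_methods attribute would change.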
>
> The standard error is a little different, because I usually associate it with the measurement instrument rather than the physical quantity (I can think of one odd exception).
>
> I think this leads to the reasoning that appears to distinguish 'std_name modifiers' from 'cell_methods':
>
> +) std_name modifiers relate to information about the instrumental measurement of the given physical quantity.
>
> +) cell_methods relate to calculations performed on the actual data series.
>
> I am not sure why one is included in the std_name string, while the other is in a separate attribute, but that is a different discussion.
>
> For reference, the std_name modifiers are listed at http://cf-pcmdi.llnl.gov/documents/cf-conventions/1.6/apc.html
>
> The cell_methods are listed at http://cf-pcmdi.llnl.gov/documents/cf-conventions/1.6/ape.html
>
> Best wishes,
>
> Philip
>
>
> -----------------------------------------------------------------------
> Dr Philip Cameron-Smith, pjc at llnl.gov, Lawrence Livermore National Lab.
> -----------------------------------------------------------------------
>
>
> From: CF-metadata [mailto:cf-metadata-bounces at cgd.ucar.edu] On Behalf Of Nan Galbraith
> Sent: Friday, March 29, 2013 6:58 AM
> To: Kenneth S. Casey - NOAA Federal
> Cc: cf-metadata at cgd.ucar.edu; Jonathan Gregory
> Subject: Re: [CF-metadata] Question from NODC about interplay of standard name modifiers, cell_methods, etc.
>
> I don't want to belabor this point, but from the practical point of view of someone
> who uses and generates data, which I think is fairly representative of this group, a
> mean is a representation of a geophysical property, and a stdev is not.
>
> We collect in situ data, and I know that MANY of our instruments output the mean
> of several measurements; few take single spot samples. It would surprise me to hear
> anyone claim that these data sets do not represent geophysical quantities.
>
> Again, I'm just suggesting that the rules for standard name modifiers might be
> tweaked to encourage user-friendly labeling of data. I suspect that most data
> publishers are already taking care not to share data that's labeled in a CF-compliant
> but misleading way.
>
> Regards - Nan
>
>
> On 3/29/13 9:08 AM, Kenneth S. Casey - NOAA Federal wrote:
> Nan - your statement below has me wondering about what a statistician would say. Would they say: A "mean" is still a statistical concept, and cannot be measured. It can only be computed, statistically, as sum/N. In that sense, it is not really any different than a standard deviation: the mean is where the distribution is centered, and the standard deviation is the width of that distribution. Neither is a discrete measurement, and each only makes sense as part of a distribution. But I am not a statistician, so I really do wonder what one would say.
>
> -Ken
>
> On Mar 27, 2013, at 4:23 PM, Nan Galbraith <ngalbraith at whoi.edu> wrote:
>
>
> I don't think the standard deviation of the temperature of sea water is really a
> geophysical property; it's a mathematical concept, while a temperature value
> represented as a mean is still a temperature.
>
>
>
>
> --
> *******************************************************
> * Nan Galbraith Information Systems Specialist *
> * Upper Ocean Processes Group Mail Stop 29 *
> * Woods Hole Oceanographic Institution *
> * Woods Hole, MA 02543 (508) 289-2444 *
> *******************************************************
>
>

Received on Mon Apr 01 2013 - 09:15:28 BST
