
[CF-metadata] per-variable metadata?

From: Steve Hankin <Steven.C.Hankin>
Date: Thu, 04 Aug 2011 16:25:58 -0700

On 8/4/2011 1:30 PM, Upendra Dadi wrote:
> Hi Steve & Nan,
> Is this allowed in CF? Isn't ancillary_variable meant to be used for
> per value metadata i.e. metadata for each and every value in the
> variable it is referring to? If so, shouldn't the ancillary_variable
> have the same set of dimensions and in the same order as the variable
> it is referring to?

Hi Upendra, Nan, Roy,

First off, just to comment that these topics seem appropriate as a
logical next step. The new Discrete Geometries chapter largely focused
on the file structure aspects -- storing the numbers. Community by
community the metadata needs vary, of course. Defining "standard
profiles" (e.g. OceanSites) is the natural approach to standardizing
metadata contents. So I think we are all asking the question, what are
the general rules and structures that should be followed so that generic
applications are best able to access and utilize the specialized
metadata encoded per a standardized CF profile? A second point is to
confess that I wasn't personally involved in the discussions that led
to the ancillary_variables machinery. I'm ready to stand corrected if I
misinterpret the words found in CF 1.5. Let's just get the ideas out on
the table and let others comment.

For the case that Nan has described, if one were using the techniques of
chapter 9
(https://cf-pcmdi.llnl.gov/trac/attachment/ticket/37/CFch9-may4.docx?format=raw)
the metadata would be tied to the variable "TEMP" by its station index,
rather than by an ancillary_variable attribute. In this example the
"station_info" variable is a model for "Instrument_manufacturer",
"Instrument_model", etc.

        A9.2.4 Contiguous ragged array representation of timeSeries
        dimensions:
            station = 23 ;
            obs = 1234 ;

        variables:
            float lon(station) ;
                lon:standard_name = "longitude";
                lon:long_name = "station longitude";
                lon:units = "degrees_east";
            float lat(station) ;
                lat:standard_name = "latitude";
                lat:long_name = "station latitude" ;
                lat:units = "degrees_north" ;
            char station_name(station, name_strlen) ;
                station_name:long_name = "station name" ;
                station_name:cf_role = "timeseries_id";
            int station_info(station) ;
                station_info:long_name = "some kind of station info" ;
            int row_size(station) ;
                row_size:long_name = "number of observations for this station" ;
                row_size:sample_dimension = "obs" ;

            double time(obs) ;
                time:standard_name = "time";
                time:long_name = "time of measurement" ;
                time:units = "days since 1970-01-01 00:00:00" ;
            float humidity(obs) ;
                humidity:standard_name = "specific_humidity" ;

Nan, I think in your example the "depth" dimension is effectively the
same as the "station" dimension in A9.2.4 (or 9.2.1) -- independent
instruments deployed at a list of depths (stations) with metadata
describing each depth. So the question is whether the association
of metadata through the station (or depth) dimension is sufficient? (I
think it is.) Or is there a use case that demonstrates that the
ancillary_variable machinery is needed, as well?
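
To make the station-index association concrete, here is a minimal Python sketch of how a reader could expand row_size from the contiguous ragged array layout above into a station index for every observation, and from there pick up the per-station metadata. The values are hypothetical (3 stations rather than 23), and the helper name station_index_per_obs is an illustration, not something defined by CF.

```python
def station_index_per_obs(row_size):
    """Expand per-station observation counts into a station index per obs.

    In the contiguous ragged array layout, the first row_size[0] observations
    belong to station 0, the next row_size[1] to station 1, and so on.
    """
    index = []
    for station, count in enumerate(row_size):
        index.extend([station] * count)
    return index


# Hypothetical values: 3 stations holding 2, 1 and 3 observations.
row_size = [2, 1, 3]
station_info = [101, 102, 103]  # one metadata value per station
humidity = [0.010, 0.011, 0.020, 0.030, 0.031, 0.029]  # one value per obs

# Station index for each element of the obs dimension.
obs_station = station_index_per_obs(row_size)

# Per-station metadata looked up for every observation.
info_per_obs = [station_info[s] for s in obs_station]
```

No ancillary_variables attribute is involved here: the shared station dimension alone is enough to reach station_info from any humidity value.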

Upendra, your point, "ancillary_variable meant to be used for per
value metadata i.e. metadata for each and every value in the variable it
is referring to" is a strict interpretation of the opening sentence of
3.4. Ancillary Data
(http://cf-pcmdi.llnl.gov/documents/cf-conventions/1.5/cf-conventions.html#ancillary-data),
"one data variable provides metadata about the individual values of
another data variable". That interpretation would rule out cases like
the following, which seem desirable to encode (imagining an instrument
such as a Doppler profiler, where the uncertainty in a velocity
measurement is a function of depth):

       float q(time, depth) ;
         q:standard_name = "upward_sea_water_velocity" ;
         q:ancillary_variables = "q_uncertainty" ;
       float q_uncertainty(depth) ;

Some word-smithing seems to be in order to clarify that opening sentence
of section 3.4.
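
As an illustration of the looser reading, the following Python sketch shows how a generic application might resolve an ancillary variable whose dimensions are a subset of the data variable's dimensions, matching them by name. The function ancillary_value and all numbers are assumptions made up for this example; CF 1.5 does not prescribe this lookup.

```python
def ancillary_value(anc_values, anc_dims, data_dims, data_index):
    """Return the ancillary value that applies to one data element.

    anc_values is a nested list indexed in anc_dims order. Every name in
    anc_dims must also appear in data_dims; otherwise a KeyError is raised.
    """
    position = dict(zip(data_dims, data_index))
    value = anc_values
    for dim in anc_dims:
        value = value[position[dim]]
    return value


# Hypothetical data shaped like q(time, depth) with q_uncertainty(depth).
q_dims = ("time", "depth")
q = [[0.10, 0.12, 0.15],
     [0.11, 0.13, 0.14]]
q_uncertainty_dims = ("depth",)
q_uncertainty = [0.01, 0.02, 0.05]

# Uncertainty that applies to q[1][2]: it depends only on the depth index.
u = ancillary_value(q_uncertainty, q_uncertainty_dims, q_dims, (1, 2))
```

Under the strict reading, q_uncertainty would instead have to be declared on (time, depth) and repeat the same three values at every time step.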

    - Steve

>
> Upendra
>
>
> On 8/4/2011 2:17 PM, Nan Galbraith wrote:
>> Hi Steve -
>>
>> I'm very interested in the background discussion on this -
>> any chance of bringing it into the foreground?
>>
>> I'm using ancillary variables in 2-D in situ data files to describe
>> instruments and things like precision, accuracy, sample scheme,
>> etc.. For temperature files from moorings where different sensor
>> types are at different depths, I'd like to use something like
>>
>> TEMP:ancillary_variables = "Instrument_manufacturer Instrument_model
>> Instrument_sample_scheme Instrument_serial_number
>> TEMP_qc_procedure
>> TEMP_accuracy TEMP_precision TEMP_resolution";
>> and then
>> short INST_SN(depth) ;
>> INST_SN:long_name = "instrument_serial_number" ;
>> ... etc., etc.
>>
>> If there's going to be a standard way to do this, I'd really like to
>> know about it - sooner rather than later.
>>
>> Thanks -
>> Nan
>>
>> On 8/4/11 11:35 AM, Steve Hankin wrote:
>>> Hi Jeff,
>>>
>>> Each variable in a CF file may possess an ancillary_variables
>>> attribute that points to variables that have relationships
>>> (http://cf-pcmdi.llnl.gov/documents/cf-conventions/1.5/cf-conventions.html#ancillary-data).
>>> To attach flags to a variable, use ancillary_variables to point to
>>> a variable that has flag_values and flag_meanings attributes
>>> (http://cf-pcmdi.llnl.gov/documents/cf-conventions/1.5/cf-conventions.html#flags).
>>>
>>> We have started a discussion in the background, whether an example
>>> that illustrates this should be included in the CF documentation.
>>>
>>> - Steve
>>>
>>> =====================================================
>>>
>>> On 7/14/2010 6:21 AM, Jeff deLaBeaujardiere wrote:
>>>> In another discussion, Steve Hankin wrote:
>>>> > CF generally favors attributes attached to variables over
>>>> attributes attached to files
>>>>
>>>> This reminds me of a question I wanted to ask: does CF have any
>>>> conventions regarding how to handle data that contains multiple
>>>> observed quantities with different quality flags, comment fields or
>>>> other attributes for each quantity?
>>>>
>>>> -Jeff DLB
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> CF-metadata mailing list
>>>> CF-metadata at cgd.ucar.edu
>>>> http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
>>>
>>>
>>
>>
>> --
>> *******************************************************
>> * Nan Galbraith (508) 289-2444 *
>> * Upper Ocean Processes Group Mail Stop 29 *
>> * Woods Hole Oceanographic Institution *
>> * Woods Hole, MA 02543 *
>> *******************************************************
>>
>>
>>
>>
>
>
>
Received on Thu Aug 04 2011 - 17:25:58 BST
