
[CF-metadata] Getting back to ensembles

From: Steve Hankin <Steven.C.Hankin>
Date: Tue, 02 Jan 2007 11:35:48 -0800

Hi Jonathan, Bryan, et al.,

I think Bryan has stated the essence of the problem well, below.
Jonathan, you have pointed out that the next challenge is to find words
that describe the semantic distinctions that we are trying to draw with
minimal ambiguity. So I suggest that we turn our attention to that
challenge. I do not have a specific proposal of wording to offer.
Instead (like a broken record (boy, that dates me!)) I'd suggest we
begin with a discussion of what our requirement is, and see if the
wording falls naturally out from that.

Borrowing some of Bryan's words:

*Requirement:*

    CF must support standardized terminology in multiple semantic
    domains. It must do so in a manner that will permit tools to be
    built that utilize the distinctions between these domains. The
    domains that must be kept distinct include at a minimum the ones
    listed just below; however, this list must be extensible.

       1. scientifically significant measured quantities
       2. parameters describing measurement techniques or processes
          (including who made the measurements)
       3. identification of CF data structures (grids, axes,
          coordinates, coordinate geometry information, ...)
       4. others (??)

I'm no metadata expert, so please correct me if the following assertion
is wrong: There is nothing to prevent the same name from existing in
multiple vocabularies. For example, if "platform_orientation"
(mentioned as an example in a previous email) really is both a
"scientifically significant measured quantity" and a "parameter
describing measurement technique", then it can exist separately under
both domains. I do not think it is a requirement that the distinction
between semantic domains can be inferred from the name alone. Stated
another way -- we do not need to determine the context of the name from
the name; we should always already know the context in which we are
encountering a name.
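
To make that assertion concrete, here is a minimal sketch in Python of
vocabularies kept per domain, so that one name can legitimately appear
in more than one. The Domain class, the VOCABULARIES table, and the two
helper functions are all hypothetical names invented for illustration,
and the vocabulary contents are made-up examples, not a proposed
classification of any existing standard names.

```python
from enum import Enum


class Domain(Enum):
    """Hypothetical labels for the semantic domains listed above."""
    MEASURED_QUANTITY = "scientifically significant measured quantity"
    MEASUREMENT_PARAMETER = "parameter describing measurement technique"
    DATA_STRUCTURE = "identification of CF data structures"


# Illustrative contents only (an assumption, not an agreed CF table).
# Each domain has its own vocabulary, so a name is unique within a
# domain but may recur across domains.
VOCABULARIES = {
    Domain.MEASURED_QUANTITY: {"sea_surface_temperature", "platform_orientation"},
    Domain.MEASUREMENT_PARAMETER: {"platform_orientation", "institution"},
    Domain.DATA_STRUCTURE: {"latitude", "longitude"},
}


def is_valid(domain, name):
    """Check a name in a context the caller already knows."""
    return name in VOCABULARIES.get(domain, set())


def domains_of(name):
    """List every domain whose vocabulary contains the name."""
    found = [d for d, names in VOCABULARIES.items() if name in names]
    return sorted(found, key=lambda d: d.name)
```

A tool that already knows which field it is populating would call
is_valid() with that field's domain; it never has to infer the domain
from the name. Here domains_of("platform_orientation") finds the name
under both the measured-quantity and the measurement-parameter
vocabularies without any conflict.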

    - Steve

================================================================

Bryan Lawrence wrote:
> Jonathan
>
> I think we all understand the distinction between things which are
> measured, and things which are about measurement techniques, or who made
> the measurements (or simulations).
>
> We want to start building smart tools that can assist data users
> understand these distinctions faster and (as Roy implies) more
> accurately, but for these smart tools to work we need to build
> semantics directly into our vocabs. One way of doing this (and it's
> simply a first step) is to separate the vocabs. It's a design decision
> as to whether one does that by rules within a vocab or between vocabs.
> Either way, in our ontology building, we can start with the science
> vocabs, and the job is that little bit more tractable (and I think
> Jonathan underestimates how hard this is going to be ... although it's only
> possible because of the quality of what we have now!)
>
> At the moment it is only your voice which is arguing against separating
> these vocabs. Are there any others who have a problem with the proposed
> separation? (Which is simply the creation of an additional table, to be
> called standard_metadata, which we will continue to control in the same
> way as we control standard_names ... and yes, at some future time we
> might deprecate some existing standard names and define them as standard
> metadata --- and to forestall Jonathan's objection: this won't hurt
> existing files because this will all be version controlled :-)
>
> Bryan
>
> On Fri, 2006-12-29 at 10:35 +0000, Jonathan Gregory wrote:
>
>> Dear Roy
>>
>>
>>> Consider a case where a metadata record has two fields, one for geographic
>>> coverage and one for parameter. If selection drop-downs for these are
>>> covered by two separate lists - either vocabs or within an ontology - then
>>> 'sea_temperature' will not appear in the geographic coverage drop-down and
>>> 'Atlantic_Ocean' will not appear in the parameter drop-down. Were both
>>> drop-downs covered by a single 'Standard Name list' then both terms would
>>> appear. This not only increases the risk of field population with nonsense
>>> (the type of error I was visualising - admittedly it's still possible to
>>> call temperature salinity), but also makes the drop-down appear eccentric to
>>> say the least.
>>>
>> We distinguish between lists for (a) standard names (b) the possible values of
>> quantities which have a standard name. "atlantic_ocean" is not a standard name;
>> it is a possible value for a variable whose standard name is "region". A menu
>> of standard names would include sea_surface_temperature, rainfall_flux,
>> latitude, region and land_cover (to list a few from the present table) and
>> also (if my proposal is agreed, to meet Paco's requirement) source, institution
>> and experiment_id. These are all names for things which a data variable or a
>> coordinate variable could contain. The *values* of these variables are dealt
>> with in other ways. sea_surface_temperature, rainfall_flux and latitude are
>> numeric, so no list is needed. The others are string-valued. At present only
>> region has standardised values; the possible values are given by
>> http://www.cgd.ucar.edu/cms/eaton/cf-metadata/region.html
>> It's quite likely we might develop a standard list for land_cover. As we have
>> discussed a lot, it would be useful to make links to other people's controlled
>> vocabularies if we can, and the proposal also includes a new attribute to point
>> to external lists of the possible values for a quantity (Bryan's suggestion).
>>
>>
>>> Jon's comment that we can carry on as we are now and change later worries me
>>> a little. So many times in my work with metadata I have found that
>>> aggregation is infinitely easier than teasing things apart.
>>>
>> It is right to be cautious, but I think this reasonable concern of yours is
>> that things should be sufficiently informative. I agree with you. That's why
>> we spend so much time making sure we know exactly what quantity is being
>> identified by a standard name, and why quantities with different physical
>> dimensions (units) have different standard names. In this case, I think we
>> are talking about a categorisation that can be introduced whenever we need it.
>> There are only 814 standard names at present, so it would not be a big job to
>> classify them in future, given a clear criterion for doing it - which is what
>> we lack, since we don't have a need for it (as far as I can see).
>>
>> Cheers
>>
>> Jonathan
>> _______________________________________________
>> CF-metadata mailing list
>> CF-metadata at cgd.ucar.edu
>> http://www.cgd.ucar.edu/mailman/listinfo/cf-metadata
>>

-- 
--
Steve Hankin, NOAA/PMEL -- Steven.C.Hankin at noaa.gov
7600 Sand Point Way NE, Seattle, WA 98115-0070
ph. (206) 526-6080, FAX (206) 526-6744
Received on Tue Jan 02 2007 - 12:35:48 GMT

This archive was generated by hypermail 2.3.0 : Tue Sep 13 2022 - 23:02:40 BST
