
[CF-metadata] Getting back to ensembles

From: Bryan Lawrence <b.n.lawrence>
Date: Mon, 20 Nov 2006 13:53:32 +0000

Hi Folks

I've said enough about process for now.

We should get back to discussing exactly what's on the table. I've
enumerated key paragraphs to help discussion. I will move this to trac
once Kyle reminds me what my account details are :-(

1) I think there is general acceptance that we can have something like:

float temperature(realization,time,lat,lon);
temperature:coordinates = "realization time lat lon" ;

2) I then muddied the waters by suggesting using ancillary variables for
metadata about the realizations. Jamie and Jonathan stressed that we
should use auxiliary coordinate variables. They were right.

3) Jonathan's example was then similar to the following. I've added an
extra metadata label to stress the point that, as I understand it, the
auxiliary coordinate variables provide a mechanism for slicing the
realizations in many ways without adding to the dimensionality of the
ensemble itself. (By the way, this could be useful for station data as
it stands, provided we don't overload too much meaning into
"realization".) This should allow a lot of the functionality Paco wants.

float temperature(realization,time,lat,lon);
    temperature:coordinates = "time lat lon metadata1 metadata2" ;
  char metadata1(realization,len100);
    metadata1:standard_name = "institution" ; // for instance
  char metadata2(realization,len100);
    metadata2:standard_name = "thing2" ; // for instance
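
As a rough illustration of the slicing described in 3), here is a minimal
Python sketch. Plain lists stand in for the netCDF variables, the time,
lat and lon dimensions are dropped for brevity, and every label and value
is made up; the helper name select is hypothetical.

```python
# Hypothetical stand-ins for the variables in 3); all values are made up.
# One entry per realization (ensemble member).
metadata1 = ["centre_a", "centre_b", "centre_a", "centre_c"]  # e.g. institution
metadata2 = ["run_x", "run_x", "run_y", "run_y"]              # e.g. "thing2"
temperature = [281.0, 282.5, 280.2, 283.1]                    # one value per member


def select(data, labels, wanted):
    """Return the members of `data` whose auxiliary label equals `wanted`."""
    return [value for value, label in zip(data, labels) if label == wanted]


# Two different slices of the same ensemble; the data keeps a single
# realization dimension in both cases.
by_institution = select(temperature, metadata1, "centre_a")
by_thing2 = select(temperature, metadata2, "run_y")
```

Each label array offers another way to cut the ensemble, while the
ensemble itself still has only the one realization dimension.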
    
4) As an aside, at this point it seems we don't need a standard name for
"realization" even though we have one ... (although we'd need it for the
case where there were no auxiliary coordinate variables).

5) I suggested that where we don't already have standard names for the
metadata here, that we should not invent them. I made the argument that
we were opening a huge metadata door here and that we ought not reinvent
the entire wheel within CF.

6) I further made the argument that whatever metadata we did want
internally for these should not be *standard names*, which in my mind
really ought to be about defining physical variables (things with
units), or at least things which we directly might expect to manipulate
(e.g. realization weights). I think the intention for this stuff is
also different from the flag concept ...

7) I accept that there is merit in standard "somethings" which map our
existing global attributes, which ought to help aggregation tools. John?

8) So, I was proposing something new, which I then think should be
provisional: firstly, that we create a new class of standard
identifiers, which we call "standard_metadata", and we put the global
file attribute things in here, and anything else we find external
vocabularies inadequate for.

9) So rather than the example in 3) above, we would have:

float temperature(realization,time,lat,lon);
    temperature:coordinates = "time lat lon metadata1 metadata2" ;
  char metadata1(realization,len100);
    metadata1:standard_metadata = "institution" ; // for instance
  char metadata2(realization,len100);
    metadata2:standard_metadata = "thing2" ; // for instance

Note that one could still have *standard_names* on auxiliary
coordinate variables, but when we use standard_metadata, we are
explicitly saying there is nothing physical about this information!
I think that would be of considerable aid to software!
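
To sketch how software might use that distinction, here is a minimal
Python example. It assumes the provisional standard_metadata attribute
proposed in 8), which is not an existing CF attribute; the variable
names, the "realization_weight" value and the classify helper are all
hypothetical.

```python
# Attributes of three hypothetical auxiliary coordinate variables, held as
# plain dicts. "standard_metadata" is the provisional attribute proposed in
# 8); the names and values below are illustrative only.
aux_variables = {
    "metadata1": {"standard_metadata": "institution"},
    "weight": {"standard_name": "realization_weight", "units": "1"},
    "notes": {},
}


def classify(attributes):
    """Label a variable by which identifier attribute it carries."""
    if "standard_metadata" in attributes:
        return "descriptive"  # explicitly nothing physical
    if "standard_name" in attributes:
        return "physical"     # something we might expect to manipulate
    return "unknown"


kinds = {name: classify(attrs) for name, attrs in aux_variables.items()}
```

A tool could then skip unit handling for anything labelled
"descriptive" and treat it purely as a label for selecting realizations.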

10) But, secondly, where possible *for metadata* we should use external
vocabularies, and *not* add new stuff into our controlled vocabularies.
I thought the mechanism for doing this would be (for anything actually,
not just for standard_metadata):

char metadata1(realization,len100);
   metadata1:external_vocabulary = "http://wmo.foo.int/identifierY" ;

11) Obviously if one wrote a file containing this stuff, one would
a) be documenting the file consistently; this doesn't necessarily rely
on external tables any more than using standard_names relies on the
external definitions.
b) want some faith that wmo.foo.int will keep that meaning around at
least as long as the data exists. That's no more or less a leap of faith
than in CF ... any given data producer can then make their own decision
about doing that.

12) If there is an existing CF standard name or CF standard metadata for
this, then *don't* use both it and the external vocabulary - that way
lies confusion (it's unlikely that the two will really be identical).
In this I differ strongly from Jonathan's last suggestion.
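
The rule in 12) could be checked mechanically. A minimal Python sketch,
assuming the proposed (not existing) external_vocabulary and
standard_metadata attributes and a variable's attributes held as a plain
dict; the function name conflicting_identifiers is hypothetical.

```python
# CF-side identifier attributes that, under 12), should not appear on a
# variable that also points at an external vocabulary.
# "standard_metadata" and "external_vocabulary" are the attributes proposed
# in this message, not existing CF attributes.
CF_IDENTIFIERS = ("standard_name", "standard_metadata")


def conflicting_identifiers(attributes):
    """Return the CF identifier attributes that sit alongside an
    external_vocabulary attribute; an empty list means no conflict."""
    if "external_vocabulary" not in attributes:
        return []
    return [name for name in CF_IDENTIFIERS if name in attributes]


ok = conflicting_identifiers(
    {"external_vocabulary": "http://wmo.foo.int/identifierY"}
)
clash = conflicting_identifiers(
    {
        "external_vocabulary": "http://wmo.foo.int/identifierY",
        "standard_name": "institution",
    }
)
```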

That's probably enough for now.

Bryan
Received on Mon Nov 20 2006 - 06:53:32 GMT
