[CF-metadata] Getting back to ensembles

From: John Caron <caron>
Date: Mon, 20 Nov 2006 11:12:00 -0700

Hello all:


Bryan Lawrence wrote:
> Hi Folks
>
> I've said enough about process for now.
>
> We should get back to discussing exactly what's on the table. I've
> enumerated key paragraphs to help discussion. I will move this to trac
> once Kyle reminds me what my account details are :-(
>
> 1) I think there is general acceptance that we can have something like:
>
> float temperature(realization,time,lat,lon):
> temperature:coordinates = 'realization time lat lon' ;
>
> 2) I then muddied the waters by suggesting using ancillary variables for
> metadata about the realizations. Jamie and Jonathan stressed that we
> should use auxiliary coordinate variables. They were right.
>
> 3) Jonathan's example then was similar to the following, where I've
> added an extra metadata label to stress the point that as I understand
> it the auxiliary coordinate variables provide a mechanism for slicing
> the realizations in many ways, without adding to the dimensionality of
> the ensemble itself (by the way, this could be useful for station data
> as it stands provided we don't overload much meaning into
> "realization"). This should allow a lot of the functionality Paco wants.
>
> float temperature(realization,time,lat,lon):
> temperature:coordinates = 'time lat lon metadata1 metadata2' ;
> char metadata1(realization,len100):
> metadata1:standard_name="institution"; // for instance
> char metadata2(realization,len100):
> metadata2:standard_name="thing2"; // for instance
>
> 4) As an aside at this point it seems we don't need a standard name for
> "realization" even though we have one ... (although we'd need it for the
> case where there were no auxiliary coordinate variables).

I'm not clear whether you are suggesting that we standardize the dimension name to be "realization". I will assume for now that you are not.

>
> 5) I suggested that where we don't already have standard names for the
> metadata here, that we should not invent them. I made the argument that
> we were opening a huge metadata door here and that we ought not reinvent
> the entire wheel within CF.
>
> 6) I further made the argument that whatever metadata we did want
> internally for these should not be *standard names*, which in my mind
> really ought to be about defining physical variables (things with
> units), or at least things which we directly might expect to manipulate
> (e.g. realization weights). I think the intention for this stuff is also
> different from the flag concept as well ...
>
> 7) I accept that there is merit in standard "somethings" which map our
> existing global attributes, which ought to help aggregation tools. John?

OK, let me restate the situation from the point of view of the Common Data Model, which is a bit different from CF, though compatible.

Let me munge your examples slightly:

1)
float temperature(rdim, time, lat, lon) ;
      temperature:coordinates = "realization time lat lon" ;

char realization(rdim, strlen) ;

Assuming that time, lat, and lon follow the usual CF conventions, the new thing is that I have a string-valued coordinate, but I don't know what it is intended to mean. But if I unambiguously identify it as:

char realization(rdim, strlen) ;
  realization:standard_name = "realization" ; // or whatever we decide

Now I know it's an ensemble dimension, and I can present it to the user for slicing and dicing in a way that lets the user know for sure what the dimension is used for. I would require that the strings be unique, so that there is a 1-1 mapping between strings and indices along the dimension. What the string values are doesn't matter to me.
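
To make the 1-1 requirement concrete, here is a minimal Python sketch of the check an application might run before offering the ensemble dimension for slicing. It is only an illustration: the function name and the label values are made up, and nothing here is part of CF or of any existing library.

```python
def build_realization_index(labels):
    """Map each realization label to its index along the ensemble dimension.

    Raises ValueError if two indices share a label, since the mapping
    would then no longer be 1-1.
    """
    index = {}
    for i, label in enumerate(labels):
        if label in index:
            raise ValueError(
                f"duplicate realization label {label!r} "
                f"at indices {index[label]} and {i}"
            )
        index[label] = i
    return index


# Hypothetical label values; what the strings say does not matter,
# only that they are distinct.
labels = ["run_a", "run_b", "run_c"]
index = build_realization_index(labels)
```

With unique labels, a user's choice of a string picks out exactly one slice along rdim; with duplicates, the choice would be ambiguous, which is why the sketch refuses them.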

2) If I understand correctly, you want to deal with the case where you do care what the strings mean: sometimes they indicate runs from different institutions, or maybe just perturbations of the same run. You can certainly add multiple (auxiliary) coordinate variables. I still need the standard name on one of them so I know it's an ensemble dimension, and I still need a 1-1 mapping. One possibility would be to treat the variable carrying the standard name as special and require its values to be unique. Alternatively, one could put a standard name on all of them and assume that the concatenation of their values is unique.

float temperature(rdim, time, lat, lon) ;
      temperature:coordinates = "metadata1 metadata2 time lat lon" ;

char metadata1(rdim, strlen) ;
  metadata1:standard_name = "institution" ; // for instance

char metadata2(rdim, strlen) ;
  metadata2:standard_name = "thing2" ; // for instance

I assume you are saying let's not standardize the values of these strings. However, I do need to know that "institution" and "thing2" are types of ensemble dimension, and are not intended to represent something else, e.g. a vector.
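
For the second option in 2), where uniqueness comes from the combination of several auxiliary coordinate variables, a rough Python sketch of the check might look like the following. The function name and the sample values are hypothetical. The sketch also compares tuples rather than literally concatenated strings, which is my own substitution to avoid accidental collisions such as "ab" + "c" versus "a" + "bc".

```python
def composite_keys(*columns):
    """Combine per-realization values from several auxiliary coordinate
    variables into one key per index along the ensemble dimension.

    Raises ValueError if the columns differ in length or if two indices
    end up with the same combined key.
    """
    if not columns:
        raise ValueError("need at least one coordinate variable")
    if len({len(col) for col in columns}) != 1:
        raise ValueError("all coordinate variables must have the same length")

    keys = list(zip(*columns))
    seen = {}
    for i, key in enumerate(keys):
        if key in seen:
            raise ValueError(
                f"combined key {key!r} repeats at indices {seen[key]} and {i}"
            )
        seen[key] = i
    return keys


# Hypothetical values: neither variable is unique on its own,
# but the combination is.
metadata1 = ["centre_a", "centre_a", "centre_b"]
metadata2 = ["control", "perturbed", "control"]
keys = composite_keys(metadata1, metadata2)
```

Here neither metadata1 nor metadata2 alone identifies a realization, but each pair does, so the 1-1 mapping still holds.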

Probably enough for now, for me.


>
> 8) So, I was proposing something new, which I then think should be
> provisional: firstly, that we create a new class of standard
> identifiers, which we call "standard_metadata", and we put the global
> file attribute things in here, and anything else we find external
> vocabularies inadequate for.
>
> 9) So the example would be rather than in 3) above, we would have:
> float temperature(realization,time,lat,lon):
> temperature:coordinates = 'time lat lon metadata1 metadata2' ;
> char metadata1(realization,len100):
> metadata1:standard_metadata="institution"; // for instance
> char metadata2(realization,len100):
> metadata2:standard_metadata="thing2"; // for instance
>
> Note that one could still have *standard_names* in as auxiliary
> coordinate variables, but when we use standard_metadata, we are
> explicitly saying there is nothing physical about this information!
> I think that would be of considerable aid to software!
>
> 10) But, secondly, where possible *for metadata* we should use external
> vocabularies, and *not* add new stuff into our controlled vocabularies.
> I thought the mechanism for doing this would be (for anything actually,
> not just for standard_metadata):
>
> char metadata1(realization,len100):
> metadata1:external_vocabulary = http://wmo.foo.int/identifierY
>
> 11) Obviously if one wrote a file containing this stuff, one would
> a) be documenting the file consistently, this doesn't necessarily rely
> on external tables any more than using standard_names relies on the
> external definitions.
> b) want some faith that www.foo.int will keep that meaning around at
> least as long as the data exists. That's no more or less a leap of faith
> than in CF ... any given data producer can then make their own decision
> about doing that.
>
> 12) If there is an existing cf standard name or cf standard metadata for
> this, then *don't* use them both - that way lies confusion (it's
> unlikely that the two will really be identical). In this I differ
> strongly from Jonathan's last suggestion.
>
> That's probably enough for now.
>
> Bryan
>
> _______________________________________________
> CF-metadata mailing list
> CF-metadata at cgd.ucar.edu
> http://www.cgd.ucar.edu/mailman/listinfo/cf-metadata
Received on Mon Nov 20 2006 - 11:12:00 GMT
