
[CF-metadata] Getting back to ensembles

From: Jonathan Gregory <j.m.gregory>
Date: Wed, 27 Dec 2006 18:09:58 +0000

Dear Roy

> I still think having separate vocabularies for separate metadata attributes is the better way to go. I would prefer two vocabularies with overlapping term sets to the possibility of populating a field with an inappropriate term. If the metadata attributes pertain to clearly different entities then so too will the vocabularies.
>
> Having separate vocabularies permits automated population protection on each attribute - it's amazing how easy it is to select the wrong item from a drop-down list through an unintentional slip of the mouse.

Thanks for this. I would be perfectly happy to use different attributes to name
"quantities" which can and can't be given a standard name, provided a clear
requirement for this distinction can be stated, and a way to make it in
practice. Otherwise I don't see how we would decide which is which. If the
two vocabularies could even be overlapping, it would confirm my suspicion that
there is no distinction to be drawn; we would in effect have two synonymous
attributes. That would be an unhelpful complexity.

Quite likely I am being unperceptive, and someone can say what distinction is
to be made. Below I repeat my earlier unsuccessful attempts to identify one.
So far this feels like trying to reach the end of the rainbow! But as Steve has
said, the way to introduce changes should be to start with a requirement (we
may have divergent views on the degree of formality needed in that step), not
try to invent one post-hoc to justify something.

Best wishes

Jonathan


Here are some distinctions which *don't* seem to work well:

* We could say string-valued things are not given standard names. But we have
region as a standard name with string values, identifying geographical areas
such as atlantic_ocean, for labelling the ocean overturning streamfunction.
This function is very similar to numerical geographical coordinates. It would
be strange to use standard names for latitude and longitude to describe a
rectangular region, but to say that the label for a non-rectangular region is
not a standard name. Similarly, land cover types have standard names. They are
labelling parts of gridboxes which aren't geographically delineated, so again
are acting as a sort of spatial coordinate.

* Things which you aren't going to "operate" on are not given standard names,
i.e. they're just labels. However, possibly the commonest thing to do with the
numerical spatiotemporal coordinates is to subset them, and you may well do
that with the ensemble metadata too. It is also possible that you might
combine them or process them in other ways.

* Quantities which could never be data variables should not be given standard
names. It does seem quite unlikely that source, institution and experiment_id
might form the contents of a data variable. (In fact it would be hard to do
until netCDF-4, since they are string-valued, but it could be done using
flag_values and flag_meanings.) But I don't think it's impossible. I can
imagine constructing a lat-lon field which indicates which model, at each
point, had the most realistic value of a quantity, for example.
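
As a rough illustration of the last point, here is a minimal sketch of how
such a "best model" field could be encoded with flag_values and
flag_meanings. The model names, the grid size and the error values are all
made up for the example, and plain NumPy arrays stand in for netCDF
variables, so this is only one possible arrangement, not a prescribed one.

```python
import numpy as np

# Hypothetical model names, for illustration only.
models = ["model_a", "model_b", "model_c"]

# CF-style flag attributes: one integer code per model, and a
# blank-separated string giving the meaning of each code in order.
flag_values = np.arange(len(models), dtype=np.int8)
flag_meanings = " ".join(models)

# Made-up absolute errors with shape (model, lat, lon).
rng = np.random.default_rng(0)
errors = rng.random((len(models), 4, 8))

# Lat-lon data variable: at each point, the code of the model with the
# smallest error.
best_model = flag_values[np.argmin(errors, axis=0)]


def decode(field, flag_values, flag_meanings):
    """Translate integer flag codes back into their string labels."""
    lookup = dict(zip(flag_values.tolist(), flag_meanings.split()))
    return [[lookup[int(code)] for code in row] for row in field]


best_model_names = decode(best_model, flag_values, flag_meanings)
```

Here best_model is an ordinary numerical lat-lon field, while
best_model_names recovers the string labels that it stands for.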

Received on Wed Dec 27 2006 - 11:09:58 GMT
