[CF-metadata] subgrid variation

From: Bryan Lawrence <b.n.lawrence>
Date: Thu, 24 Feb 2005 10:55:10 +0000

Hi Folks

Apologies in advance. This will read like a criticism of the outstanding work
that has been, and is being, done. It is, but hopefully both minor and constructive,
and the fact that I can do it at all is because of the outstanding start that
we have with CF ... and again apologies, this is long.

Ok, the subgrid variation.

I think this is one of the examples of things that no one is quite sure of the
status of now ...

So, the issues this post raises are

1) CF-1.0 was presumably frozen in place. I can't easily find a place where
all the proposals for CF-2.0 are collected separately. Whatever we decide to
do, we should cleanly separate proposals from CF-1.0. (This is not a
criticism of Brian or anyone else, this is hard to do, and takes time, so we
need to a) want to do it, and b) resource it).

2) The standard names. Some of you will know that I think we need to address
splitting CF into (at least) the following three components:
 - vocabularies
 - netcdf specific things
 - other stuff.

Having done that, we should separate the development tracks and possibly version
each separately. Anyway, limiting ourselves to vocabularies as they apply to
variables and axes, as Jonathan and Steve both state, we have an issue with
systemising modifiers (factorisation) of the standard names. We also have an
issue with name proliferation (not in and of itself a problem, but a problem
when we start to incorporate other namespaces and fixing them in place, e.g.
selecting part of an external gazetteer, which then evolves externally at
a different pace).

This is an area where there is a lot of work going on in other communities, and we
should look at what they are doing, and use it. In particular, GML has a
dictionary concept. Can we use that as a starting point? Can we then ask how
we use that in a CF compliant way? I think so, it could be as simple as
referencing the namespace associated with a particular standard name. That
would allow people to use externally maintained namespaces. For example: the
CF community are not the right people to maintain an atmospheric chemistry
namespace, but we do want atmospheric chemistry in our CF compliant files ...
(Is this going to be a big change from CF-1.0? No, not if we do it right, it
could even be backward compatible).
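
To make the namespace idea a little more concrete, here is a minimal sketch in
Python. Everything in it is an assumption made for illustration: the attribute
name standard_name_vocabulary, the urn:example identifiers and the contents of
the dictionaries are invented, and none of it is part of CF-1.0.

```python
# Hypothetical sketch: nothing here is part of CF-1.0.
CF_NAMESPACE = "urn:example:cf-standard-names"  # made-up identifier

# Stand-ins for dictionaries that would be maintained externally.
VOCABULARIES = {
    CF_NAMESPACE: {"air_temperature", "sea_water_x_velocity"},
    "urn:example:atmos-chem": {"ozone_mixing_ratio"},  # made-up entry
}


def resolve(attrs):
    """Return (namespace, name) for a variable's attributes.

    Raises ValueError if the name is not in the referenced dictionary.
    """
    name = attrs["standard_name"]
    # A file without the extra attribute falls back to the CF table,
    # which is what would keep existing CF-1.0 files valid.
    namespace = attrs.get("standard_name_vocabulary", CF_NAMESPACE)
    if name not in VOCABULARIES[namespace]:
        raise ValueError(f"{name!r} is not defined in {namespace}")
    return namespace, name
```

A variable carrying only standard_name resolves against the CF table as it does
today; one that also names an external dictionary is checked against that
dictionary instead.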

3) What about the modifiers issue? We need to come up with a method of
dealing with modifiers of any sort in an abstract way, rather than creating
epicycles every time we think of something new. OK, well, we're not going to
do that today, so this should be something we put on our CF development
track ...
 
On this specific proposal, if I quote 7.3 of CF-1.0:

        "Some methods (e.g., variance) imply a change of units of the variable, and
this also is specified by Appendix D. "

Was anyone else aware that where we don't have a cell method we have to know a
priori whether a quantity is extensive (depends on the size of the cell) or
intensive (doesn't) to know what to do with it? This makes perfect sense, except
that this implies that all standard names need to have this as an attribute
(or it is compulsory as an attribute of a variable, better but maybe even
better to imply that things are intensive except when they have an attribute
which says they're not). (Or do I have this all wrong?)
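
As a rough sketch of the "intensive unless told otherwise" option, in Python:
the attribute name extensive is hypothetical (CF-1.0 defines no such
attribute), and the fine cells are assumed to be of equal size so that a plain
mean is adequate.

```python
def is_extensive(attrs):
    """True only if the (hypothetical) 'extensive' attribute says so."""
    # An absent attribute means the quantity is treated as intensive.
    return str(attrs.get("extensive", "false")).lower() == "true"


def coarsen(values, attrs):
    """Combine the values of several equal-sized fine cells into one cell."""
    total = sum(values)
    # Extensive quantities accumulate over the cell; intensive ones average.
    return total if is_extensive(attrs) else total / len(values)
```

With this default, coarsen([1.0, 2.0, 3.0], {}) averages the three values,
while the same call with {"extensive": "true"} sums them.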

Meanwhile, this will be a nightmare to *use*.

So, what do I think we should do? Clearly we need to keep the separation
between vocabulary and usage (i.e. standard names and modifiers like cell
methods). As Steve points out, this is broken already with cell methods,
which can change the (implied/explicit) units of a standard name. I think we
need to take a step back and ask ourselves how we can do this in a cleaner
way. So this is another item of work we need for our CF development track ...

At that point, we probably should think about the structures of our standard
names. My personal point of view would be that we should try and identify the
semantic things we care about, and join them together into standard names.
For example, we have x_sea_water_velocity as an alias of
sea_water_x_velocity. I'd have to code every one of these up (but if we
really do expand, then take for example the BODC parameter dictionary which
has 10,000 entries, no way I'm going to do that manually)! Surely we can
divide this into two or three semantic parts, and the order is irrelevant.
Doing this could be as simple as stating that semantic content is separated
by _ and order is irrelevant (but it won't be that simple :-). Roy Lowry has
the wonderful example of how this can get out of hand: the advent of green
dogs (if we allow colour modifiers to be completely independent of animals).
I suppose our example could be orography velocity ...
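
The naive version of "separated by _ and order is irrelevant" can be sketched
in a few lines of Python. The function names are mine, and this is an
illustration of the idea and its limits rather than a proposal.

```python
def semantic_key(standard_name):
    """Order-independent key: the sorted '_'-separated parts of a name."""
    return tuple(sorted(standard_name.split("_")))


def same_quantity(a, b):
    """True if two names are built from the same parts in any order."""
    return semantic_key(a) == semantic_key(b)
```

same_quantity("x_sea_water_velocity", "sea_water_x_velocity") is true, so the
alias would no longer need coding by hand. The sketch also shows why it won't
be that simple: "sea_water" is split into "sea" and "water", so _ is not really
the semantic separator, and nothing stops orography_velocity from being
accepted as a well-formed name.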

I've got more thoughts on the standard names, but will save them for now. By
way of summary, I think this email has raised three things we
need to identify as needing more work, and for which we need someone to make
a proposal about how to proceed.

1) Versioning
2) Dealing with variable modifiers, units and standard names
3) Dealing with dictionaries of names, and relationship with external
namespaces.

We may need to decouple units and standard names to resolve these issues!
Clearly we can't do that in CF 1.x, but we might in a future CF ...

I'm deliberately not raising specific proposed solutions now, because what I
would like us to do is identify that these are structural problems that need
resolution in a future version of CF, not to be hacked into CF-1.0. Of course
folk (including me) can come up with ways of dealing with it for now, but
they won't be CF (standard).

Bryan

On Tuesday 08 February 2005 08:54, Jonathan Gregory wrote:
> Dear Steve
>
> > They
> > are now creating files which have "standard_name" attributes. In most
> > cases they lack cell_methods attributes. They believe that they are
> > creating CF 1.0 compliant files. Does your proposal "(2)" imply that
> > valid CF 1.0 files which use "standard_name" will become invalid CF 2.0
> > files? If so isn't this a serious backwards compatibility concern?
>
> Yes, it would. Karl has also made that point. Strangely, Karl's posting
> doesn't appear on the news archive.
>
> I agree, we can only make cell_methods strongly recommended, rather than
> mandatory, in order that previous CF files remain valid. I think that such
> a recommendation is needed, because really people ought to record what they
> intend a quantity to be, and when there is subgrid variation of surface
> types, for instance, there is a real ambiguity. The standard name alone is
> not sufficient metadata.
>
> > Before we introduce further complexity to the sub-grid cell_methods
> > machinery (which is not one of the more "transparent" aspects of CF
> > already), has there been a serious exploration of alternatives?
>
> ...
>
> > I'm tempted to think that a thoughtful discussion on how to systematize
> > these modifiers is in order.
>
> So far, I have not managed to think of better ways myself, and believe me,
> I have spent hours thinking about it! But anyone's welcome to make
> proposals, of course. Yes, the standard_name and its modifiers and the
> cell_methods together specify the quantity. This is a step towards some
> "factorisation" of the description, in order to limit the expansion of the
> standard name table, and I think it's quite systematic. We are following a
> path between the extremes of putting all the definition in one attribute,
> and breaking it down into many attributes.
>
> > Listening
> > to the general community chatter there's no doubt that plenty of groups
> > have picked up the "standard_name" attribute and are thrilled to have
> > found it.
>
> That is encouraging. Of course, it means we need to arrange ways of making
> it easier to add and modify standard names. That's something I hope we can
> discuss at the meeting at the GO-ESSP meeting at the BADC in June.
>
> Best wishes
>
> Jonathan
> _______________________________________________
> CF-metadata mailing list
> CF-metadata at cgd.ucar.edu
> http://www.cgd.ucar.edu/mailman/listinfo/cf-metadata

-- 
Bryan Lawrence,        Head NCAS/British Atmospheric Data Centre
Web: badc.nerc.ac.uk                      Phone: +44 1235 445012
CCLRC: Rutherford Appleton Laboratory, Chilton, Didcot, OX11 0QX
Received on Thu Feb 24 2005 - 03:55:10 GMT
