
[CF-metadata] some concerns about the "ensemble axis" proposal

From: Bryan Lawrence <b.n.lawrence>
Date: Wed, 07 Mar 2007 09:51:14 +0000

Hi Folks, especially Balaji and Steve

I'll make some general comments, and then take Balaji's questions.

Firstly, Thredds has nothing to do with this issue. That was my point
in the November email, which I was restating in reply to Steve's
point. If we have to appeal to *any* external *software* package to
define our metadata, then our convention is broken. (However, I have no
problem with appealing to external *definitions* of internal
identifiers.)

Secondly, netcdf4 is also a red herring, because folk have to use
netcdf3 now, and will have to do so for a while to come. (Further, I
can't seriously believe that on the one hand we have an argument that
adding another axis is an engineering problem, but using a different API
to the persistence format is not ... both ways, software will need
adjustment, but in the former case we are working on top of a known and
reliable persistence format. I can tell you for a fact that we won't be
accepting netcdf4 data in 2007 for the BADC ... not because I don't
like it, but because it has not yet got a track record!).

So where does that leave us?
* It leaves us with certain classes of ensemble data, that we have
available a priori (i.e. at file writing time), and that can be stored
in files in a certain way, and these are the ones that we are proposing
a solution for. These work fine with the proposed solution.
* There are also classes of ensemble data that we might want to
aggregate a posteriori (i.e. we don't have them at file write time, or that
cannot be stored into an array which has the same underlying coordinate
system). Well frankly, how is that different from *any* other existing
situation? (The Unified Model 4.5 had P and UV on different grids, but
we can still put them in the same file, I could even put them in the
same file with an ensemble axis for each). I can always find an example
where I want to add more data to a file later which isn't in the time
dimension (so I rewrite the file). Ensembles are simply not special in
this regard!
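
A minimal sketch of the a priori case, in Python, with plain dictionaries
standing in for a file's dimensions and variables. The dimension names and
sizes (`ensemble`, `lat_p`, `lat_uv` and so on) are hypothetical, chosen
only to show variables on different horizontal grids each carrying the
same ensemble dimension in one file.

```python
# Hypothetical dimension table for one file; names and sizes are
# illustrative assumptions, not taken from any real dataset.
dimensions = {
    "ensemble": 3,
    "time": 12,
    "lat_p": 73, "lon_p": 96,    # grid for P
    "lat_uv": 72, "lon_uv": 96,  # staggered grid for U and V
}

# Each variable lists the dimensions it is defined on, in order.
variables = {
    "p": ("ensemble", "time", "lat_p", "lon_p"),
    "u": ("ensemble", "time", "lat_uv", "lon_uv"),
    "v": ("ensemble", "time", "lat_uv", "lon_uv"),
}


def variable_shape(name):
    """Look up the array shape implied by a variable's dimension list."""
    return tuple(dimensions[d] for d in variables[name])


def has_ensemble_axis(name):
    """True if the variable is defined along the ensemble dimension."""
    return "ensemble" in variables[name]
```

Nothing about the ensemble dimension needs special handling here: it is
one more named dimension that each variable may or may not use.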

(Aggregation servers are a red herring: in the final analysis, what I
get from aggregation servers are files, so let's care about the
persistence format, not the interface definition.)

> Is the ensemble axis static? (i.e not UNLIMITED)? What happens if I want
> to increase the size of an ensemble later? (We recently added 2 members
> to a 3-member initial-condition ensemble we've submitted to IPCC AR4).

So rewrite the data, but nobody is saying you *have* to have ensembles
all in one file, just as you don't rely on having all the variables in
one file. In the latter case, for sure one needs external information to
make the links, but let's not appeal to any *specific* software to do
it.
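
The "rewrite" option can be sketched as follows, assuming a fixed-size
ensemble axis as the leading dimension. Nested Python lists stand in for
the stored array, and the helper names (`new_array`, `array_shape`,
`rewrite_with_members`) and the shapes are hypothetical. It is a sketch
of the idea only, not of any particular library's API.

```python
def new_array(shape, fill=0.0):
    """Nested-list array with the given shape, filled with `fill`."""
    if not shape:
        return fill
    return [new_array(shape[1:], fill) for _ in range(shape[0])]


def array_shape(array):
    """Shape of a non-empty rectangular nested-list array."""
    dims = []
    while isinstance(array, list):
        dims.append(len(array))
        array = array[0]
    return tuple(dims)


def rewrite_with_members(old, extra):
    """Return a fresh array holding the old members followed by the new ones.

    The ensemble axis is the leading, fixed-size dimension, so growing it
    means writing a new array rather than appending in place.
    """
    member_shape = array_shape(old)[1:]
    for member in extra:
        if array_shape(member) != member_shape:
            raise ValueError("new member is not on the same grid")
    return list(old) + list(extra)


# A 3-member ensemble with dimensions (ensemble, time, lat, lon) ...
original = new_array((3, 2, 4, 5))
# ... to which 2 members on the same grid are added later.
added = [new_array((2, 4, 5), fill=1.0) for _ in range(2)]
combined = rewrite_with_members(original, added)
```

The same pattern applies to growing any other fixed-size dimension, which
is the sense in which ensembles are not special.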

> For the kinds of ensembles we have in mind, can we stay within the file
> size limits?

No one is arguing that 1 file = 1 dataset.

> I certainly wasn't meaning to suggest, or even imply, any software
> choices or aggregation methods to go along with this. This is a comment
> about metadata only.

Fair enough, I agree with your perspective.

Cheers
Bryan
Received on Wed Mar 07 2007 - 02:51:14 GMT
