John, Jon, Thomas, et al.,
I will weigh in here with a vote _*against*_ creating another dimension
(a new axis type) to hold vector components. In higher-level code,
creating a multi-dimensional vector object may well be an elegant
approach -- but I will argue in the numbered points below that at the
file definition level it can add complexity and create a number of
significant inconsistencies in the code pipelines, as well as backwards
compatibility problems.
1. There will always be two classes of data access needs for vectors --
1) looking at the individual components; 2) looking at the
multi-component vector quantity. Accessing individual components is
*very common* (I'd speculate that it may be the more common of the
two modes.) If we group the components into a single variable
using an additional dimension it means that the code to treat the
individual vector components will become different from the scalar
variable code, despite the fact that there is a nearly identical
list of use cases for vector components and scalar variables. (This
would be a step away from elegance.)
2. We would almost certainly find that staggered grids become a
slippery slope of complexity. The specific index ranges needed for
the individual staggered components depend on the operation that is
being performed: vector plots, curl, divergence, volume integrals,
etc. ... These needs are not consistent with a single index range
applying to all components.
3. There are many use cases in which the analysis pipeline is different
for different components of a vector. Some examples: the vectors
may be stored in separate files (e.g. the entire CMIP5 archive ...
and we know what a challenge it is to get data providers to utilize
the aggregation tools); the Z vector component of ocean data is
often generated through an on-the-fly conservation-of-mass
analysis step, rather than stored in the file; the Z component often
requires special scaling -- e.g. when making vector plots. Such
cases illustrate why it is more elegant to make the vector
associations in higher level code, rather than at the file level.
4. 3-vector components are often plotted and analyzed in 2-dimensional
views. With a vector dimension of length 3, we cannot do a
multi-dimensional access in the XZ plane without reading the Y
component, too -- illustrating where the vector dimension at the
file level can add complexity.
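
To make points 1 and 4 a little more concrete, here is a minimal Python
sketch. It is only an illustration under stated assumptions: the variable
names (temp, u, v, w, wind), the tiny grid sizes, and the
(component, z, y, x) ordering are all hypothetical and are not taken from
any real file or proposed convention. Nested lists stand in for file
variables.

```python
# Hypothetical, tiny grid sizes chosen only for illustration.
NZ, NY, NX = 2, 3, 4


def make_field(offset):
    """Build a nested-list field indexed as field[k][j][i], i.e. (z, y, x)."""
    return [[[offset + 100 * k + 10 * j + i for i in range(NX)]
             for j in range(NY)]
            for k in range(NZ)]


# Layout A: one variable per component, as in existing files.
separate = {
    "temp": make_field(0),
    "u": make_field(1000),
    "v": make_field(2000),
    "w": make_field(3000),
}

# Layout B (assumed for illustration): components packed along an extra
# leading dimension, indexed as wind[c][k][j][i] with c = 0, 1, 2 for u, v, w.
packed = {
    "temp": separate["temp"],
    "wind": [separate["u"], separate["v"], separate["w"]],
}


def xz_section(field, j):
    """XZ section at row j of a (z, y, x) field; works for any scalar."""
    return [level[j] for level in field]


def xz_section_of_component(vector, c, j):
    """Same section for component c of a packed (component, z, y, x) variable."""
    return xz_section(vector[c], j)


# Layout A: scalars and vector components share one code path.
sections_a = {name: xz_section(separate[name], 1)
              for name in ("temp", "u", "w")}

# Layout B: the caller must know which variables carry a component
# dimension, and which index along it means which component.
sections_b = {
    "temp": xz_section(packed["temp"], 1),
    "u": xz_section_of_component(packed["wind"], 0, 1),
    "w": xz_section_of_component(packed["wind"], 2, 1),
}
```

Both layouts yield the same sections; the difference is that the second
one needs a separate code path for vector components, plus knowledge of
the component ordering, for an XZ view that only involves u and w.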
- Steve
=======================
On 12/9/2011 11:43 AM, John Caron wrote:
> On 12/9/2011 11:37 AM, Jonathan Gregory wrote:
>> Dear John
>>
>> I prefer the idea that Thomas has put forward of an umbrella, rather than
>> containing the vector/tensor components in one data variable, because
>>
>> * I really don't like the idea of units being mixed within a data variable.
>> I think things with different units must certainly be different quantities
>> and could not be regarded as the same field. You can get away with it if they
>> are all m s-1, for instance, but not so easily if the vertical one is orders
>> of magnitude different from the horizontal, and not at all if the vector is
>> expressed in polar coordinates.
>
> I think the common case is that the vector components have the same
> unit. One could restrict to that case.
>
>>
>> * I think it would be very inconvenient, and would break a lot of existing
>> software, if the coordinates were not what they appeared to be, because an
>> offset had to be added. Also, in general, the component fields of a staggered
>> grid do not have the same dimensions, as well as differing in the coordinates.
> I'm not sure what "an offset had to be added" means.
>
> I think the common case of staggered grids could be handled with a
> convention defining the staggering, rather than separate dimensions. I
> pull out the one Rich Signell and I came up with a long time ago, for
> its own sake.
>
>>
>> * It avoids the need to define a convention for labelling vector/tensor
>> components.
> I think this convention would be about as complex as the one you will
> need for Thomas' proposal.
>
>>
>> * It is completely backwards-compatible as regards the component fields, which
>> are exactly as before; we're just adding some separate information linking
>> them. This seems neat to me.
>
> I agree that's a strong reason for Thomas' method.
>
> OTOH, if we start thinking in terms of the extended model, a Structure
> ("compound type" in HDF5 parlance) might be useful. What do you think
> about starting to think about possible uses of the extended data model?
>
> Thanks for your thoughts, as always, interesting.
>
> John
>
> _______________________________________________
> CF-metadata mailing list
> CF-metadata at cgd.ucar.edu
> http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
Received on Tue Dec 13 2011 - 11:34:24 GMT