[CF-metadata] axis attribute from Sebastien Villaume on 2017-04-18 (Archive of CF discussions from 2002 to 2019 on the cf-metadata mailing list)

From: Sebastien Villaume <sebastien.villaume>
Date: Tue, 18 Apr 2017 10:08:08 +0000 (GMT-00:00)

Dear Jonathan and Jim,

thank you for your further comments.

@Jonathan: I don't want to define an axis without coordinates or put the "axis" attribute on more than one object; I only propose to decouple the axis definition from the coordinate definition. I don't see why I can't define a dummy object. It is exactly what is done to define a crs: a variable without dimensions and without data, only attributes:

  int crs ;
    crs:grid_mapping_name = "latitude_longitude" ;
    crs:semi_major_axis = 6371000.0 ;
    crs:inverse_flattening = 0 ;


Following the same principle, I could define:

  int axis_x ;
    axis_x:axis = "X" ;
    axis_x:standard_name = "axis" ;
    axis_x:long_name = "axis for the longitude coordinate" ;
    axis_x:units = "1" ;

and then have my longitude:

  float longitude(j,i) ;
    longitude:standard_name = "longitude" ;
    longitude:units = "degree_east" ;
    longitude:long_name = "longitude" ;
    longitude:axis_mapping = "X" ;

In that specific example, only axis_x has the "axis" attribute. I added an "axis_mapping" attribute to the longitude variable to indicate that it should be understood as the coordinate of the axis named in "axis_mapping".
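
As a rough illustration of how software might consume such a mapping, here is a minimal Python sketch. It assumes a hypothetical in-memory dictionary of variables (the layout and the function name coordinates_by_axis are invented for this example), and "axis_mapping" is the attribute proposed in this message, not an existing CF attribute.

```python
# Hypothetical in-memory picture of the variables proposed above.
# "axis_mapping" is the attribute suggested in this message, not part of CF.
variables = {
    "axis_x": {"dims": (), "attrs": {"axis": "X", "units": "1"}},
    "longitude": {
        "dims": ("j", "i"),
        "attrs": {
            "standard_name": "longitude",
            "units": "degree_east",
            "axis_mapping": "X",
        },
    },
}


def coordinates_by_axis(variables):
    """Group variable names by the axis named in their axis_mapping attribute."""
    # Axes are the values of the "axis" attribute on the dummy axis variables.
    declared = {v["attrs"]["axis"] for v in variables.values() if "axis" in v["attrs"]}
    result = {axis: [] for axis in declared}
    for name, var in variables.items():
        target = var["attrs"].get("axis_mapping")
        if target is None:
            continue  # not a coordinate attached to an axis
        if target not in declared:
            raise ValueError(f"{name} maps to undeclared axis {target!r}")
        result[target].append(name)
    return result


print(coordinates_by_axis(variables))
```

With the two variables above, the only declared axis is X and longitude is the only coordinate attached to it.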

I favour this solution over creating a dummy coordinate variable with gridbox indices. That seems artificial, and creating something artificial to comply with a rule is usually the sign of a design problem or of missing functionality to describe the use case.

@Jim: I agree with your last message. It would be beneficial to have an explicit mechanism to identify axes, coordinates, and their relationships when the use case does not conform to the common case dimension = axis = coordinate.

What is the CF mechanism to extend the convention or to propose semantic changes, rewrites of some portions of chapters, etc.?

Best wishes,
/Sébastien

----- Original Message -----
From: "Jim Biard" <jbiard at cicsnc.org>
To: cf-metadata at cgd.ucar.edu
Sent: Friday, 14 April, 2017 23:11:44
Subject: Re: [CF-metadata] axis attribute

Jonathan,

We've got no great options available. The axis attribute ought to be an indicator of an independent coordinate axis. Because of possible backward compatibility issues, I think we need to demote the axis attribute to being a plotting hint. I'm at a loss for a word to use for a new attribute, but overloading axis with the different meanings seems too confusing.

The existing convention for identifying 1D coordinate variables (variable name = dimension name) works well for identifying independent axes as far as that goes, so what are we missing with a fuzzy or demoted axis attribute? It seems to me that we lack a clean way to identify the x/y/z/t/etc relationships between the different independent axes, and we lack a way to clearly identify a 1-D variable as an independent axis if it doesn't conform to the variable name = dimension name convention.

I'm not sure what's the cleanest way to go about it, but I'll suggest a name for an attribute that would fill some portion of these purposes. How about basis (or bases)?

Grace and peace,

Jim

On 4/11/17 10:36 AM, Jonathan Gregory wrote:

Dear David and Sebastien

In a given model the level numbers are meaningful indeed, but there's no
universal convention for them. Maybe some atmosphere models number them from
the TOA downwards, for instance. Nothing would prohibit that, and the
standard_name of model_level_number doesn't prescribe any convention for it.
There is likewise no universal convention for how to number the gridboxes in
x and y, but in any given model there is a convention for it. Thus I think
it's the same. The standard name identifies a conventional concept, but not a
convention for the numbers themselves.

As for the axis attribute itself, maybe we could distinguish the two meanings
of "axis" and avoid backwards incompatibility by removing the restriction in
chapter 5 that it is not permissible for a data variable to have both a
coordinate variable and an auxiliary coordinate variable having an axis
attribute with any given value. That is, at present you can't have both a 1D
x-coordinate variable with axis="X" and a 2D longitude auxiliary coordinate
variable with axis="X". But we could allow them both, with different meanings.
We could say that the axis attribute of 1D coordinate variables labels the
index dimensions of the data variable, while the axis attribute of multi-
dimensional auxiliary coordinate variables labels them as spatiotemporal
dimensions, as a hint for plotting.
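
One way to picture the two readings suggested above is the following minimal Python sketch. The function name axis_role and the returned labels are invented for illustration; the rule it encodes is only the suggestion in this paragraph, not current CF behaviour.

```python
def axis_role(name, dims, attrs):
    """Classify what an axis attribute would mean under the reading suggested above.

    name: variable name; dims: tuple of dimension names; attrs: attribute dict.
    """
    if "axis" not in attrs:
        return None
    if len(dims) == 1 and dims[0] == name:
        # Coordinate variable (name equals its single dimension):
        # the attribute labels an index dimension of the data variable.
        return "index dimension"
    if len(dims) > 1:
        # Multidimensional auxiliary coordinate: the attribute is a plotting hint.
        return "plotting hint"
    # Scalar or 1D auxiliary coordinates are not covered by the suggestion.
    return "not covered"


print(axis_role("x", ("x",), {"axis": "X"}))
print(axis_role("longitude", ("y", "x"), {"axis": "X"}))
```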

Best wishes

Jonathan

----- Forwarded message from David Hassell <david.hassell at ncas.ac.uk> -----

Date: Tue, 11 Apr 2017 08:41:34 +0100
From: David Hassell <david.hassell at ncas.ac.uk>
To: Jonathan Gregory <j.m.gregory at reading.ac.uk>
CC: CF Metadata <cf-metadata at cgd.ucar.edu>
Subject: Re: [CF-metadata] axis attribute

Hello all,

I am still uncomfortable with creating a coordinate variable with arbitrary
values. The analogy with model_level_number is not quite there, I think, as
the model_level_number values are not arbitrary. For example, a value of 6
means (in one model I know) that this is the first level above the boundary
layer. That value and meaning is relevant however the data may have been
subspaced/sliced.

Perhaps an auxiliary coordinate variable could be used with missing data in
all of its values, an axis attribute (of "X" or "Y") and *no* standard
name? In the absence of a coordinate variable for that dimension, the axis
attribute of the auxiliary coordinate variable would give meaning to the
dimension.

This is still not ideal, though ...

All the best,

David

On 10 April 2017 at 18:25, Jonathan Gregory <j.m.gregory at reading.ac.uk> wrote:

Dear Sébastien

Yes, you're right, we can't have an axis without coordinates. This is because
CF is a netCDF convention, and there's no way to attach attributes to a
dimension other than by creating a coordinate variable. Similarly we don't
have data variables with just dimensions but no data. We could have conventions
for these concepts in CF-netCDF but it hasn't been proposed.

However it seems pretty good to me to do as you suggest and create a
coordinate variable with the gridbox indices in it, for which (as discussed)
we could define a standard name. This is analogous to model_level_number,
which is already a standard name and could be a vertical coordinate variable.
It can do what you want, and I agree it makes sense to label these coordinate
variables as X and Y. That indicates that they are the horizontal dimensions.
It's probably less important which is X and which Y. That's a plotting issue.

I think the confusion probably arises because we didn't define what we mean
by "axis". We used the word as an "obvious" concept, but there is an ambiguity
about whether it corresponds to a dimension of a data variable, or a physical
variable (not necessarily spatiotemporal) which could be an independent
variable on which the data depends. Latitude might be an axis in the second
sense even if it's not an axis in the first sense. I prefer the first sense,
which is the one the axis attribute originally had.

Best wishes

Jonathan

----- Forwarded message from Sebastien Villaume < sebastien.villaume at ecmwf.int > -----

Date: Fri, 7 Apr 2017 09:10:09 +0000
From: Sebastien Villaume <sebastien.villaume at ecmwf.int>
To: David Hassell <david.hassell at ncas.ac.uk>
CC: CF Metadata <cf-metadata at cgd.ucar.edu>, Jonathan Gregory <j.m.gregory at reading.ac.uk>
Subject: Re: [CF-metadata] axis attribute

Dear David,

I see your point and you are probably right that a plotting routine will
probably sort this out.

However this is not what I am after, I am more interested in metadata
discovery and indexing.

I need to discover what I have in a file without plotting it, without
having a human looking at it to confirm what it is and that it has been
plotted correctly.

I also would like to use this metadata to perform actions
like merging netCDF files, slicing, cropping, aggregating, interpolating,
comparing data in different grids and representations, etc.

I understand that implicit is fine and that explicit is not required for
some applications. I have no issue with this.

My personal point of view is that explicit is better than implicit: I
tend to prefer "mandatory" over "optional".

Being implicit means that the assumptions made need to be valid 100% of
the time to avoid accidents or corner cases.

I would like to be explicit so I need all the proper mechanisms
(variables, semantics, etc.) in place so I can use them.

Right now it feels that I am missing some functionality.

Let me copy below a few bits of the terminology section in the CF 1.7
draft document (very similar to 1.6). Please read it keeping in mind what
is really an axis, a coordinate, a spatio-temporal dimension and an
array dimension. Each time you read "coordinate", "dimension" or
"dimensional", ask yourself what is implied and whether it is unambiguous:

------------------------
variables
------------------------
auxiliary coordinate variable
    Any netCDF variable that contains coordinate data, but is not a
coordinate variable (in the sense of that term defined by the NUG and used
by this standard - see below). Unlike coordinate variables, there is no
relationship between the name of an auxiliary coordinate variable and the
name(s) of its dimension(s).

coordinate variable
    We use this term precisely as it is defined in section 2.3.1 of the
NUG. It is a one-dimensional variable with the same name as its dimension
[e.g., time(time)], and it is defined as a numeric data type with values
that are ordered monotonically. Missing values are not allowed in
coordinate variables.

grid mapping variable
    A variable used as a container for attributes that define a specific
grid mapping. The type of the variable is arbitrary since it contains no
data.

multidimensional coordinate variable
    An auxiliary coordinate variable that is multidimensional.

scalar coordinate variable
    A scalar variable (i.e. one with no dimensions) that contains
coordinate data. Depending on context, it may be functionally equivalent
either to a size-one coordinate variable (Section 5.7, "Scalar Coordinate
Variables") or to a size-one auxiliary coordinate variable (Section 6.1,
"Labels" and Section 9.2, "Collections, instances, and elements").

------------------------
dimensions
------------------------
latitude dimension
    A dimension of a netCDF variable that has an associated latitude
coordinate variable.

longitude dimension
    A dimension of a netCDF variable that has an associated longitude
coordinate variable.

spatiotemporal dimension
    A dimension of a netCDF variable that is used to identify a location
in time and/or space.

time dimension
    A dimension of a netCDF variable that has an associated time
coordinate variable.

vertical dimension
    A dimension of a netCDF variable that has an associated vertical
coordinate variable.

------------------------

So according to this terminology, I have in my file 2 auxiliary
coordinate variables, but no "real" coordinate variables (according to
the NUG), so my auxiliary coordinates are auxiliary to what?

What is a "multidimensional coordinate"? If dimension means
spatio-temporal dimension, it is nonsense because a coordinate can only
reference 1 spatio-temporal dimension; if it is meant to be
array dimensions, it is not clear...

What are my 2D array latitude and longitude then? Are they the latitude and
longitude dimensions defined in the terminology? Not really, because
there are no such things as latitude and longitude dimensions: you can
define latitude and longitude coordinates, associated with 2 axes that
themselves define 2 spatial dimensions, but the coordinates can be
defined in whatever n-D array.

I like the definition of "grid mapping variable", I could use a similar
variable to be a container for attributes for my "axis variable" with no
data!

I know that in day-to-day life and discussions we don't make the
effort to be precise (I don't) and that it is easy to overload the meaning
of things, but I think that the CF document needs to be very precise and
unambiguous, and cannot mix axes, coordinates, spatio-temporal and array
dimensions.

/Sébastien

----- Original Message -----
From: "David Hassell" <david.hassell at ncas.ac.uk>
To: "Sebastien Villaume" <sebastien.villaume at ecmwf.int>
Cc: "CF Metadata" <cf-metadata at cgd.ucar.edu>, "Jonathan Gregory" <j.m.gregory at reading.ac.uk>
Sent: Friday, 7 April, 2017 08:37:20
Subject: Re: [CF-metadata] axis attribute

Dear Sébastien,

Please bear with me when I ask to go right back to the beginning! I am not
sure what the benefit is in labelling the dimensions as X or Y. In the
original tripolar case we have:

dimensions:
    i = 96 ;
    j = 73 ;
variables:
    float latitude(j, i) ;
        latitude:units = "degrees_north" ;
    float longitude(j, i) ;
        longitude:units = "degrees_east" ;
    float sit(j, i) ;
        sit:units = "m" ;
        sit:standard_name = "sea_ice_thickness" ;
        sit:coordinates = "latitude longitude" ;

There is nothing stopping anything from seeing that this is a 2-d array of
size i*j, and there is nothing stopping software subspacing the data by i
and j indices.
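
To make the point about index-based subspacing concrete, here is a minimal Python sketch. It uses small nested lists as toy stand-ins for the (j, i) arrays above; the sizes, the values and the helper name subspace are invented for illustration and do not come from any real file.

```python
# Toy stand-ins for the (j, i) arrays in the CDL above, as nested lists.
nj, ni = 3, 4
latitude = [[10.0 * j + i for i in range(ni)] for j in range(nj)]
longitude = [[100.0 + 10.0 * j + i for i in range(ni)] for j in range(nj)]
sit = [[0.1 * (j * ni + i) for i in range(ni)] for j in range(nj)]


def subspace(field, j_slice, i_slice):
    """Return the sub-array of a (j, i) nested list selected by two slices."""
    return [row[i_slice] for row in field[j_slice]]


# The same index selection is applied to the data and to both 2-d
# auxiliary coordinates, so they stay aligned without any axis label.
j_sel, i_sel = slice(1, 3), slice(0, 2)
sit_sub = subspace(sit, j_sel, i_sel)
lat_sub = subspace(latitude, j_sel, i_sel)
lon_sub = subspace(longitude, j_sel, i_sel)
```

Nothing in this selection needs to know which of i or j is "X"; it only relies on the shared array dimensions.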

I don't think a plotting routine would benefit from knowing that the i
dimension was "X", because there are no 1-d coordinates it can use along
that dimension.

Many thanks and all the best,

David

On 6 April 2017 at 22:45, Sebastien Villaume <sebastien.villaume at ecmwf.int> wrote:

Dear Mark and Jonathan,

thank you for your comments.

@Mark:
the short answer: you can in principle put whatever you want in that
variable because in this case it is a dummy variable only there to hold the
axis attribute. But please read the long explanation!

the long, boring explanation:
As I understand it, the CF convention does not recognize an axis as a valid
object in its own right, as it does for "dimensions" and the various types of
"variables", and the convention seems to make it mandatory to attach to it a
variable that becomes a "coordinate" variable. Note that I say that it is the
coordinate variable that is attached to the axis and not the opposite.

From a mathematical point of view, it is perfectly possible to define an
axis without a coordinate on it (arguably it is not that useful). The
common case is that a 1-D array defines positions on that axis (the
coordinate). Then your 1-D data points are positioned with the help of the
coordinate, itself attached to the axis.

If you have one more axis, you can define a new coordinate on it. This
creates a 2-D space. Now you have the choice of how you represent your 2-D
data points:
if the dataset is totally irregular you will have a 1-D array of "n" data
points associated with a 1-D array of "n" positions for the first dimension
and a 1-D array of "n" positions for the second dimension. It works, it is
still a 2-D dataset stored in a long one-dimensional vector.

Imagine that you realize that your dataset is not as irregular as you
thought, it is in fact a regular grid! You identify that you only have i
possible values of the first coordinate and j possible values for the
second coordinate, and you also notice that i*j=n. Great, you can now
represent your dataset with 2 coordinates of length i and j respectively,
each of them associated with one of the 2 axes x and y, and your data is now
a 2-D array of size (i,j). You can position your data using the coordinates;
it is mapped using the indices within each coordinate array. Now you have a
2-D spatial dataset stored in a 2-D array with 2 supporting 1-D spatial
coordinates stored in one-dimensional vectors.

Let's say now that you take this regular grid and you distort it... your
regular grid is gone and you can no longer use i and j for partitioning!
Really? Well no, nobody says that you cannot slice your "n" long vectors
into i*j arrays! You could choose whatever you want for i and j as long as
i*j=n. Of course if you choose (2)*(n/2) or (n/2)*(2), it is a bit useless,
but you can also choose meaningful i and j because even if your grid became
irregular, it is not random points, it is still a grid of size i*j. This
is exactly my use case! And in that situation your coordinates can be
arranged in arrays of size i*j. What I need is 2 axes and 2 coordinates of
dimension 2 with lengths i and j. The catch here is that I have 2-D arrays
to store one "spatial" dimension! It is another case of overlapped
concepts: dimension is used transparently for the dimension of arrays, the
dimension of the geometrical space, and sometimes for the size of one of
the dimensions of an array!!
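
The re-arrangement of an "n" long vector into an i*j array can be sketched in a few lines of Python. This is only an illustration: the helper name as_grid is invented, and the row-major ordering (the last index varies fastest) is an assumption of the sketch, not something the argument above depends on.

```python
def as_grid(values, nj, ni):
    """Split a flat list of n = nj * ni values into nj rows of ni values."""
    if nj * ni != len(values):
        raise ValueError(f"cannot arrange {len(values)} values as {nj} x {ni}")
    # Row-major layout is assumed: consecutive values fill one row at a time.
    return [values[r * ni:(r + 1) * ni] for r in range(nj)]


# A 12-point "irregular" vector arranged as a 3 x 4 grid.
n_points = list(range(12))
grid = as_grid(n_points, 3, 4)
```

The same call would be applied to the data vector and to each coordinate vector, which gives 2-D coordinate arrays of the same shape as the data.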

Anyway, I should be able to define my axes like this:

int x;
    x:axis = "X";
    x:standard_name = "x_axis" ; // no standard name exists...
    x:units = "1" ; // no units, it will come with the coordinate
int y;
    y:axis = "Y";
    y:standard_name = "y_axis" ; // no standard name exists...
    y:units = "1" ; // no units, it will come with the coordinate
float longitude(j,i);
    longitude:standard_name = "longitude" ;
    longitude:units = "degrees" ;
    longitude:positive = "east" ;
    longitude:long_name = "longitude" ;
    longitude:axis_mapping = "X" ;
float latitude(j,i);
    latitude:standard_name = "latitude" ;
    latitude:units = "degrees" ;
    latitude:positive = "north" ;
    latitude:long_name = "latitude" ;
    latitude:axis_mapping = "Y" ;
float sit(j, i) ;
    sit:units = "m" ;
    sit:standard_name = "sea_ice_thickness" ;
    sit:long_name = "Ice thickness" ;
    sit:coordinates = "latitude longitude" ;

several comments:
notice how one could tell on which axis the coordinate should go using, for
instance, an "axis_mapping" attribute. Not a "coordinates" attribute: this one
should be used to tell the coordinates of my data variable!
I find this approach clearer and more flexible as it can probably cater
for any situation of axes, coordinates, etc.

But because in CF one cannot create a bare axis, I follow the rules and
create:

double x(i);
    x:axis = "X";
    x:standard_name = "..." ; // not an axis anymore, give me a standard name
    x:units = "1" ;
    x:long_name = "i-index of mesh grid" ;
double y(j);
    y:axis = "Y";
    y:standard_name = "..." ; // not an axis anymore, give me a standard name
    y:units = "1" ;
    y:long_name = "j-index of mesh grid" ;

and I have the choice of what I put in those arrays since it is somewhat
artificial.

I could populate the "primary" coordinates with 1 to i and 1 to j, which
would represent the indices, and if I subset the grid, I then retain the
information that the domain has been cropped because the indices left will
not be 1 to i/j but n to m.
I don't really like this but what can I do?
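
A minimal Python sketch of that behaviour, assuming 1-based index values and a hypothetical crop window (the size 96 is borrowed from the tripolar example earlier in the thread; the window itself is invented):

```python
ni = 96
x = list(range(1, ni + 1))   # index coordinate holding 1 .. ni

# Hypothetical crop: keep Python positions 9 .. 19 of the i dimension.
crop = slice(9, 20)
x_cropped = x[crop]

# The surviving values are the original indices, so the crop stays visible.
print(x_cropped[0], x_cropped[-1])
```

After the crop the coordinate no longer starts at 1, which is the retained information about the original domain.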

If we follow this idea, it means introducing a clear concept of "axis"
besides the other types of variables, defining a new attribute to "attach"
coordinates to axes, etc.

Another solution, much less disruptive, would be to heavily modify the
relevant chapters in the CF document to:
- completely decouple the concepts of "axis" and "coordinate": a
coordinate is not an axis and vice versa.
- completely decouple the concept of spatio-temporal dimension from array
dimension, and both from the size of the array dimension
- continue to use the "axis" attribute but on n-D array coordinates: the
array has n dimensions but the coordinate maps to 1 axis/spatial dimension
only!
- Whatever the dimensions of the array for the coordinate, all the values
contained in the array must be mapped on one given axis, the one defined in
the axis attribute. For instance, a 2-D latitude only contains values that are
latitudes and will only map on one axis.
- In principle one could have in the same file several coordinates of
possibly different "array" dimensions, different sizes and different units
defined for one axis. This means that the attribute "axis=z" for instance
can appear more than once in the file. The only restriction I see is that
2 data variables can only be plotted simultaneously if all their
coordinates share the same units (the coordinate mapped on one axis of the
first data variable must have the same units as the coordinate mapped on
the same axis for the other data variable). This allows 2 data variables
defined on two different grids sharing the same units to be in the same file
and plotted together.
- X and Y should be clearly decoupled from longitude and latitude. X and Y
are the axes, longitude and latitude are the coordinates!
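
The units restriction in the list above could be checked along these lines. This is a minimal Python sketch under stated assumptions: each data variable is summarised by a hypothetical dictionary from axis letter to the units string of the coordinate mapped on that axis, units are compared as plain strings (a real tool would need to treat equivalent spellings as equal), and only axes present on both variables are compared.

```python
def can_plot_together(axes_a, axes_b):
    """Return True if coordinates mapped on each shared axis have equal units.

    axes_a, axes_b: dicts mapping an axis letter ("X", "Y", ...) to the units
    string of the coordinate that a data variable maps on that axis.
    """
    shared = set(axes_a) & set(axes_b)
    return all(axes_a[axis] == axes_b[axis] for axis in shared)


# Two variables on different grids whose coordinates use the same units.
sit_axes = {"X": "degree_east", "Y": "degree_north"}
sst_axes = {"X": "degree_east", "Y": "degree_north"}

# A variable whose X coordinate is a projected distance instead.
projected_axes = {"X": "m", "Y": "m"}

print(can_plot_together(sit_axes, sst_axes))
print(can_plot_together(sit_axes, projected_axes))
```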

@Jonathan:
I think the whole confusion here comes from the overlapping of concepts:
axes and coordinates on one hand and dimensions of arrays and spatial
dimensions on the other hand. If the relevant chapters are rewritten
carefully to separate axes from coordinates and array dimensions from
spatio-temporal dimensions we are good.

@all: Reading more through the Trac ticket system, I noticed the nice
Trac ticket 117 about "multiple" time axis. This is a nice example of
mixing axes, coordinates, dimensions of arrays, the time dimension, etc!

/Sébastien

----- Original Message -----
From: "Jonathan Gregory" <j.m.gregory at reading.ac.uk>
To: cf-metadata at cgd.ucar.edu
Sent: Thursday, 6 April, 2017 16:49:56
Subject: Re: [CF-metadata] axis attribute

Dear Jim and Sebastien

The original intention of axis was to label the independent variables as 1D
xyzt axes of the data variables. This can be deduced from other attributes,
but it's more effort. It's partly a plotting hint, but also it's because you
might reasonably want to tell software, "give me the z-axis coordinates", or
"calculate a mean over the x-direction". The latter is often a zonal mean, but
it isn't with a rotated-pole or tripolar grid, yet the operation is still
performed sometimes.

It's useful that you've pointed out the confusion of purpose. If it were
regarded as an acceptable backwards-incompatibility, which I'm nervous about,
I'd be happy if we returned "axis" to its original purpose of identifying 1D
axes, and also for scalar coordinate variables (which are equivalent to axes
of size one), and provided another attribute to label aux coords as
horizontal.

I agree that if we have 1D x and y, with 2D lat and lon, the 1D variables are
the axes. That's consistent with the original purpose of the axis attribute.

> I also find the units of latitude and longitude confusing: it looks like
> it was a way to squeeze the direction of the coordinate inside the units. I
> have the same observation for the time coordinate that has its origin in
> the units!

This convention was kept in CF for backwards-compatibility with COARDS. CF does
not use units in any other case to identify the quantity or sense.

> It was done correctly for z coordinate using "units" and "positive",
> probably because there are many types of z coordinates with various origin
> and directions, and no real consensus. I note however that often the origin
> is not always clearly defined.

The positive attribute was also kept for backwards-compatibility with COARDS.
It has the advantage of being useful to identify the vertical axis, but this
can also be done with axis="Z". CF standard names provide information which
indicates the sign convention.

If coordinate_index is confusing, I think standard_names containing x_index
or y_index would be OK, provided we change the existing standard names
  magnitude_of_derivative_of_position_wrt_x_coordinate_index
  magnitude_of_derivative_of_position_wrt_y_coordinate_index
to remove "_coordinate".

Best wishes

Jonathan
_______________________________________________
CF-metadata mailing list
CF-metadata at cgd.ucar.edu
http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata

--
David Hassell
National Centre for Atmospheric Science
Department of Meteorology, University of Reading,
Earley Gate, PO Box 243, Reading RG6 6BB
Tel: +44 118 378 5613 http://www.met.reading.ac.uk/ 
----- End forwarded message -----
----- End forwarded message -----
-- 
Jim Biard
Research Scholar
Cooperative Institute for Climate and Satellites NC
North Carolina State University
NOAA National Centers for Environmental Information
formerly NOAA's National Climatic Data Center
151 Patton Ave, Asheville, NC 28801
e: jbiard at cicsnc.org
o: +1 828 271 4900

Received on Tue Apr 18 2017 - 04:08:08 BST

This archive was generated by hypermail 2.3.0 : Tue Sep 13 2022 - 23:02:42 BST