[CF-metadata] ragged arrays vs compression by gathering

From: John Caron <caron>
Date: Sat, 11 Oct 2008 16:22:04 -0600

I was looking at the "reduced horizontal grid" feature because it's really a way to store "ragged arrays" rather than the somewhat more general "compression by gathering". It's possible that a convention for storing ragged arrays would be quite useful in the point observation conventions that I've been trying to create.

Example 5.3. Reduced horizontal grid

dimensions:
  londim = 128 ;
  latdim = 64 ;
  rgrid = 6144 ;

variables:
  float PS(rgrid) ;
    PS:long_name = "surface pressure" ;
    PS:units = "Pa" ;
    PS:coordinates = "lon lat" ;
  float lon(rgrid) ;
    lon:long_name = "longitude" ;
    lon:units = "degrees_east" ;
  float lat(rgrid) ;
    lat:long_name = "latitude" ;
    lat:units = "degrees_north" ;
  int rgrid(rgrid);
    rgrid:compress = "latdim londim";

If one examines the rgrid values, I think you would see sequences of lat,lon indices like
        0,0 0,1 0,2 ... 0,row0size-1
        1,0 1,1 1,2 ... 1,row1size-1
        2,0 2,1 2,2 ... 2,row2size-1
        3,0 3,1 3,2 ... 3,row3size-1
        ...

that is, it could be completely specified by the set of rowSizes, everything else just being an enumeration of the indices of the ragged array.
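
As a rough illustration of that claim, here is a minimal Python sketch that rebuilds the list of gathered indices from the row sizes alone. It assumes that every row starts at longitude index 0 and that the compressed list holds row-major offsets into the (latdim, londim) array, i.e. lat_i * londim + lon_j. The function names and the tiny 3-row grid are made up for the example.

```python
def gather_indices(row_sizes, londim):
    """Rebuild the flattened (lat, lon) offsets from the row sizes alone.

    Assumes each row i holds lon indices 0 .. row_sizes[i]-1 and that the
    offsets are row-major: lat_i * londim + lon_j.
    """
    indices = []
    for lat_i, size in enumerate(row_sizes):
        if size > londim:
            raise ValueError("row %d is longer than londim" % lat_i)
        for lon_j in range(size):
            indices.append(lat_i * londim + lon_j)
    return indices


def unpack(index, londim):
    """Split one flattened offset back into its (lat_i, lon_j) pair."""
    return divmod(index, londim)


# Hypothetical tiny grid: 3 latitude rows, at most 4 longitudes per row.
row_sizes = [2, 4, 3]
rgrid = gather_indices(row_sizes, londim=4)
pairs = [unpack(k, 4) for k in rgrid]
```

Here `pairs` enumerates 0,0 0,1 then 1,0 ... 1,3 then 2,0 ... 2,2, so nothing beyond the row sizes was needed to produce the list.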

A more explicit data structure for ragged arrays might look like:

dimensions:
  londim = 128 ;
  latdim = 64 ;
  rgrid = 6144 ;

variables:
  float PS(rgrid) ;
    PS:long_name = "surface pressure" ;
    PS:units = "Pa" ;
    PS:coordinates = "lon lat" ;

  float lon(rgrid) ;
    lon:long_name = "longitude" ;
    lon:units = "degrees_east" ;

  float lat(latdim) ;
    lat:long_name = "latitude" ;
    lat:units = "degrees_north" ;

  int rowSize(latdim);
    rowSize:ragged= "latdim londim";
    rowSize:desc= "number of longitudes for each latitude row";

so that: rgrid size = sum(rowSize)

In this example, the lon coordinate would just be lon(rgrid). To figure out the lat coordinate, you have to form rowStart(latdim):

  rowStart(i) = 0 if i = 0
  rowStart(i) = rowStart(i-1) + rowSize(i-1) if i > 0

then find i such that rowStart(i) <= rgrid index < rowStart(i+1), where the upper bound for the last row is the size of rgrid.
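
The lookup above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the helper names are hypothetical, the row sizes and latitude values are invented, and the start list carries one extra trailing entry (the total number of points) so that the last row has an upper bound.

```python
from bisect import bisect_right
from itertools import accumulate


def row_starts(row_sizes):
    """rowStart(0) = 0, rowStart(i) = rowStart(i-1) + rowSize(i-1).

    One extra trailing entry holds sum(row_sizes), the size of rgrid.
    """
    return [0] + list(accumulate(row_sizes))


def lat_index(point, starts):
    """Find i such that starts[i] <= point < starts[i+1] by binary search."""
    if not 0 <= point < starts[-1]:
        raise IndexError("point index out of range")
    return bisect_right(starts, point) - 1


# Hypothetical example: 3 latitude rows with 2, 4 and 3 longitudes.
row_sizes = [2, 4, 3]
lat = [-30.0, 0.0, 30.0]
starts = row_starts(row_sizes)

# Latitude of every point along the rgrid dimension.
lat_of_point = [lat[lat_index(p, starts)] for p in range(starts[-1])]
```

A binary search keeps each lookup at O(log latdim); a reader that walks the points in order could instead just step through the rows.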

There are a number of variants that might be slightly easier to understand:

1) store rowStart directly

2) store rgrid(rgrid) with just the latitude index; this would allow random ordering of the points.
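
To make variant 2 concrete, here is a small Python sketch in which each point carries its own latitude (row) index. The function names and sample values are hypothetical and only meant to show the trade-off: the lookup becomes a direct index, the points may be stored in any order, and the row sizes can still be recovered by counting.

```python
def row_index_per_point(row_sizes):
    """Expand row sizes into one latitude index per point, in row order."""
    return [i for i, size in enumerate(row_sizes) for _ in range(size)]


def row_sizes_from_index(row_index, nrows):
    """Recover the row sizes by counting how often each row index occurs."""
    sizes = [0] * nrows
    for i in row_index:
        sizes[i] += 1
    return sizes


# Hypothetical example: 3 latitude rows with 2, 4 and 3 longitudes.
row_sizes = [2, 4, 3]
lat = [-30.0, 0.0, 30.0]
row_index = row_index_per_point(row_sizes)

# The per-point lookup is a direct index, with no search needed.
lat_of_point = [lat[i] for i in row_index]

# The points may be stored in any order; the sizes are still recoverable.
shuffled = [2, 0, 1, 1, 2, 0, 1, 2, 1]
recovered = row_sizes_from_index(shuffled, nrows=3)
```

The cost is one stored integer per point rather than one per row.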


I thought I'd get feedback to see if it's worth pursuing.
Received on Sat Oct 11 2008 - 16:22:04 BST
