[CF-metadata] Particle Track Feature Type (was: Re: point observation data in CF 1.4) from John Caron on 2010-11-19 (Archive of CF discussions from 2002 to 2019 on the cf-metadata mailing list)

From: John Caron <caron>
Date: Thu, 18 Nov 2010 20:15:35 -0700

I'm thinking that we need a new feature type for this. I'm calling it
"particleTrack", but there's probably a better name.

My reasoning is that the nested table representation of trajectories is:

Table {
   traj_id;
   Table {
      time;
      lat, lon, z;
      data;
   }
}

but this case has the inner and outer table inverted:

Table {
   time;
   Table {
      particle_id;
      lat, lon, z;
      data;
      data2;
   }
}

So, following that line of thought, the possibilities in CDL are:

1) If the average number of particles is close to the maximum number of
particles at any time step, then one could use multidimensional arrays:

dimensions:
   maxParticles = 1000 ;
   time = 7777 ; // may be UNLIMITED

variables:

   double time(time) ;

   int particle_id(time, maxParticles) ;
   float lon(time, maxParticles) ;
   float lat(time, maxParticles) ;
   float z(time, maxParticles) ;
   float data(time, maxParticles) ;

attributes:
   :featureType = "particleTrack";

Note that maxParticles is the maximum number of particles at any one time
step, not the total number of particle tracks. The particle trajectories
have to be found by examining the values of particle_id(time, maxParticles).
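
As a rough illustration of that last point, the sketch below groups the
samples of the multidimensional layout by particle id. It uses plain Python
lists in place of netCDF variables, and it assumes unused slots are marked
with a fill value of -1. The proposal above does not specify a fill value,
and the function name tracks_from_2d is hypothetical.

```python
from collections import defaultdict

FILL = -1  # assumed marker for unused slots; not specified in the proposal


def tracks_from_2d(time, particle_id, lon, lat):
    """Group (time, lon, lat) samples by particle id.

    particle_id, lon and lat are indexed [time step][slot], mirroring
    the (time, maxParticles) arrays in the CDL above.
    """
    tracks = defaultdict(list)
    for t, ids in enumerate(particle_id):
        for slot, pid in enumerate(ids):
            if pid == FILL:
                continue  # empty slot at this time step
            tracks[pid].append((time[t], lon[t][slot], lat[t][slot]))
    return dict(tracks)


# Tiny made-up example: 3 time steps, maxParticles = 2.
time = [0.0, 1.0, 2.0]
particle_id = [[7, 9], [9, FILL], [7, 9]]
lon = [[10.0, 20.0], [20.5, 0.0], [10.2, 21.0]]
lat = [[50.0, 60.0], [60.5, 0.0], [50.2, 61.0]]

tracks = tracks_from_2d(time, particle_id, lon, lat)
```

Every slot of every time step is visited once, so recovering even a single
trajectory costs a full pass over particle_id.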

2) The CDL of the ragged case would look like:

dimensions:
   obs = 500000; // UNLIMITED
   time = 7777 ;

variables:
   int time(time) ;
   int rowSize(time) ;

   int particle_id(obs) ;
   float lon(obs) ;
   float lat(obs) ;
   float z(obs) ;
   float data(obs) ;

attributes:
   :featureType = "particleTrack";

In this case, you don't have to know the max number of particles at any
one time step, but you do need to know the number of time steps
beforehand. The particle trajectories have to be found by examining the
values of particle_id(obs). The particles at time step i are contained
in the obs variables from start(i) to start(i) + rowSize(i) - 1, where
start(i) is the sum of rowSize over all earlier time steps.
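
A minimal sketch of that lookup, assuming start(i) is the running sum of
rowSize over the earlier time steps and using plain Python lists in place of
netCDF variables. The helper names row_starts and particles_at are
hypothetical.

```python
from itertools import accumulate


def row_starts(row_size):
    """Return start(i) for each time step: the sum of rowSize(0..i-1)."""
    return [0] + list(accumulate(row_size))[:-1]


def particles_at(i, row_size, starts, *obs_vars):
    """Slice each obs variable to the rows belonging to time step i."""
    lo = starts[i]
    hi = lo + row_size[i]  # exclusive upper bound
    return [v[lo:hi] for v in obs_vars]


# Made-up example: 3 time steps holding 2, 1 and 3 particles.
row_size = [2, 1, 3]
particle_id = [7, 9, 9, 7, 9, 11]
data = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]

starts = row_starts(row_size)
ids_t2, data_t2 = particles_at(2, row_size, starts, particle_id, data)
```

The start offsets are computed once and reused, so fetching all particles
for one time step is a single contiguous slice per variable.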

These layouts are optimized for processing all particles at a given
time, and for sequentially processing time steps. Processing individual
particle trajectories will be much slower. If you need to do that a lot,
you might want to rewrite the file. A more sophisticated application,
possibly a server, could write an index to speed it up.
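
One possible shape for such an index, sketched for the ragged layout: a
mapping from particle id to the obs positions where that particle appears.
This is only an illustration with plain Python lists; build_track_index is a
hypothetical name, and how the index would be stored is left open.

```python
from collections import defaultdict


def build_track_index(particle_id):
    """Map each particle id to the obs indices where it appears, in file order."""
    index = defaultdict(list)
    for k, pid in enumerate(particle_id):
        index[pid].append(k)
    return dict(index)


# Made-up ragged data: obs rows are stored time step by time step.
particle_id = [7, 9, 9, 7, 9, 11]
lon = [10.0, 20.0, 20.5, 10.2, 21.0, 30.0]

index = build_track_index(particle_id)
lon_track_9 = [lon[k] for k in index[9]]  # longitudes along particle 9's track
```

Because the obs rows are written in time order, each particle's index list
is already in time order, so one pass over particle_id is enough to build it.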
