
[CF-metadata] Particle Track Feature Type (was: Re: point observation data in CF 1.4)

From: Christopher Barker <Chris.Barker>
Date: Fri, 19 Nov 2010 13:46:20 -0800

John,

Thanks for putting this together, this is pretty close to what I was
thinking.

> I'm thinking that we need a new feature type for this. I'm calling it
> "particleTrack" but there's probably a better name.

Something that captures the sense of a collection of particles would be
better:

particleCollectionTrack ?

Not that I like that, either...

> but this case has the inner and outer table inverted:
yup.

> 1) If avg number of particles ~ max number of particles at any time
> step, then one could use multidimensional arrays:

> 2) The CDL of the ragged case would look like:

Are you proposing that both options should be supported, or that we need
to choose one? If we need to choose, I'd say the ragged representation
is the way to go -- it's more flexible. If you do have a model that adds
and removes particles, then you'd need to parse out the ID to follow a
particular particle anyway.

Which makes me think -- if we support the 2-d array approach, I think it
should be used only for data where the particle corresponding to a given
index along the particles axis does not change. If it does, then you
might as well use the ragged array approach.
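
To make that condition concrete, here is a rough sketch (Python with
numpy) of a check that a ragged file could be stored as a 2-d array with
a fixed particles axis. The names "ids" (a per-observation particle ID
variable) and "fixed_particle_axis" are hypothetical; nothing in the
proposal defines them.

```python
import numpy as np

def fixed_particle_axis(ids, row_size):
    """Return the per-column particle IDs if every time step holds the
    same particles in the same order, otherwise None.

    ids      -- hypothetical per-observation particle ID values (ragged)
    row_size -- number of particles at each time step
    """
    ids = np.asarray(ids)
    row_size = np.asarray(row_size)
    # Every time step must hold the same number of particles.
    if len(row_size) == 0 or not np.all(row_size == row_size[0]):
        return None
    # View the flat obs axis as (time, particle).
    table = ids.reshape(len(row_size), row_size[0])
    # Each column must refer to the same particle at every time step.
    if not np.all(table == table[0]):
        return None
    return table[0]
```

If this returns None, the index along the particles axis does not map to
a fixed particle, and the ragged layout is the natural choice.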

> variables:
> int time(time) ;
> int rowSize(time) ;

How about:

int time_step_index(time);

That would give you an index into the nth time step easily -- so you
could very quickly grab a given time step, without having to add up all
the rowSize values up to that time step.

Storing both rowSize and time_step_index would be redundant, which is
not a good idea, as they could get out of sync. If I were to choose one,
it would be time_step_index, as you could compute rowSize simply by
subtracting two adjacent values of time_step_index.
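
A minimal sketch of the relationship between the two (Python with numpy).
It assumes time_step_index holds the zero-based offset of the first
observation of each time step, and that the total length of the obs
dimension ("n_obs" here) is available to close off the last time step.
The function names are made up for illustration.

```python
import numpy as np

def row_size_to_start(row_size):
    """Offset of the first observation of each time step
    (an exclusive running sum of rowSize)."""
    row_size = np.asarray(row_size, dtype=np.int64)
    return np.cumsum(row_size) - row_size

def start_to_row_size(start, n_obs):
    """Recover rowSize by differencing adjacent start offsets.
    n_obs (length of the obs dimension) closes the last time step."""
    return np.diff(np.append(start, n_obs))

def particles_at_step(obs_var, start, n_obs, i):
    """Slice out the values of one obs variable for time step i,
    without summing rowSize over the earlier time steps."""
    end = start[i + 1] if i + 1 < len(start) else n_obs
    return obs_var[start[i]:end]
```

For example, rowSize = [3, 1, 4] corresponds to offsets [0, 3, 4], and
time step 2 is the slice obs[4:8].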

> The particles at time step i are contained
> in the obs variables between start(i) to start(i) + rowSize(i).

Where do you get "start"? Did you intend to include that? (I assume it's
like my time_step_index.)

> these layouts are optimized for processing all particles at a given
> time, and for sequentially processing time steps.

Which are the most common use-cases.

> If one wanted to
> process particle trajectories, that will be much slower. If you needed
> to do it a lot, you might want to rewrite the file. a more sophisticated
> application, possibly a server, could write an index to speed it up.

yup.
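
To illustrate why following one particle is the slow path in this
layout: a rough sketch (Python with numpy) that scans the whole obs axis
for a given ID. It assumes a per-observation ID variable ("ids") and
per-time-step start offsets ("start"), both hypothetical names, with the
obs variables already read into memory.

```python
import numpy as np

def trajectory_of(particle_id, ids, start, time, values):
    """Return (times, values) for one particle in a ragged layout.

    ids    -- hypothetical per-observation particle ID variable
    start  -- offset of the first observation of each time step
    time   -- time coordinate, one value per time step
    values -- any per-observation variable (e.g. longitude)
    """
    ids = np.asarray(ids)
    # Full scan of the obs axis: this is what makes trajectories slow.
    positions = np.flatnonzero(ids == particle_id)
    # Map each obs position back to the time step that contains it.
    step = np.searchsorted(start, positions, side="right") - 1
    return np.asarray(time)[step], np.asarray(values)[positions]
```

An application that needs many trajectories could build the "positions"
lookup once for all IDs and reuse it, which is the kind of index
mentioned above.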

One thing I'm not clear on:

Do the netCDF libs (the C lib in particular) have any built-in support
for ragged arrays? Or does the client code have to handle that?

-Chris



-- 
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception
Chris.Barker at noaa.gov
Received on Fri Nov 19 2010 - 14:46:20 GMT

