
[CF-metadata] CF-metadata Digest, Vol 168, Issue 24

From: Little, Chris <chris.little>
Date: Wed, 12 Apr 2017 08:49:38 +0000

In response to Sebastien and Jonathan's comments, you may find the W3C RDF Data Cube (https://www.w3.org/TR/vocab-data-cube/ ) concepts a helpful semantic framework, but it cunningly does not seem to define Dimension!

Rob Atkinson is working on an extension to distinguish which dimensions have coordinate properties (https://www.w3.org/TR/qb4st/ ).

RDF and triple stores are undoubtedly completely inappropriate for NetCDF-type data, but the verbosity may be acceptable for the metadata, considering the benefits that could be brought to bear (as outlined in Sebastien's post).

The OGC Coverage standard WCS2.1 (http://www.opengeospatial.org/standards/wcs ) also gives a more detailed 'metadata' framework, distinguishing geo-referenced coordinates and 'index-coordinates'. A variety of data payload formats, such as NetCDF3, and soon GRIB2, are supported.

I think that there may be long term benefit in trying to converge the concepts, terminologies and metadata, and even if that is not desired, an attempt at mapping the terminologies may help clarify issues.

Chris


Chris Little
Co-Chair, OGC Meteorology & Oceanography Domain Working Group

IT Fellow - Operational Infrastructures
Met Office  FitzRoy Road  Exeter  Devon  EX1 3PB  United Kingdom
Tel: +44(0)1392 886278  Fax: +44(0)1392 885681  Mobile: +44(0)7753 880514
E-mail: chris.little at metoffice.gov.uk  http://www.metoffice.gov.uk

I am normally at work Tuesday, Wednesday and Thursday each week




> -----Original Message-----
> From: CF-metadata [mailto:cf-metadata-bounces at cgd.ucar.edu] On Behalf
> Of cf-metadata-request at cgd.ucar.edu
> Sent: Monday, April 10, 2017 6:54 PM
> To: cf-metadata at cgd.ucar.edu
> Subject: CF-metadata Digest, Vol 168, Issue 24
>
> Send CF-metadata mailing list submissions to
> cf-metadata at cgd.ucar.edu
>
> To subscribe or unsubscribe via the World Wide Web, visit
> http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
> or, via email, send a message with subject or body 'help' to
> cf-metadata-request at cgd.ucar.edu
>
> You can reach the person managing the list at
> cf-metadata-owner at cgd.ucar.edu
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of CF-metadata digest..."
>
>
> Today's Topics:
>
> 1. Re: axis attribute (Jonathan Gregory)
> 2. Re: high sample rate (seismic) data conventions (Seth McGinnis)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Mon, 10 Apr 2017 18:25:29 +0100
> From: Jonathan Gregory <j.m.gregory at reading.ac.uk>
> To: cf-metadata at cgd.ucar.edu
> Subject: Re: [CF-metadata] axis attribute
> Message-ID: <20170410172529.GA6873 at met.reading.ac.uk>
> Content-Type: text/plain; charset=iso-8859-1
>
> Dear Sébastien
>
> Yes, you're right, we can't have an axis without coordinates. This is
> because CF is a netCDF convention, and there's no way to attach
> attributes to a dimension other than by creating a coordinate variable.
> Similarly we don't have data variables with just dimensions but no
> data. We could have conventions for these concepts in CF-netCDF but it
> hasn't been proposed.
>
> However it seems pretty good to me to do as you suggest and create a
> coordinate variable with the gridbox indices in it, for which (as
> discussed) we could define a standard name. This is analogous to
> model_level_number, which is already a standard name and could be a
> vertical coordinate variable.
> It can do what you want, and I agree it makes sense to label these
> coordinate variables as X and Y. That indicates that they are the
> horizontal dimensions.
> It's probably less important which is X and which Y. That's a plotting
> issue.
>
> I think the confusion probably arises because we didn't define what we
> mean by "axis". We used the word as an "obvious" concept, but there is
> an ambiguity about whether it corresponds to a dimension of a data
> variable, or a physical variable (not necessarily spatiotemporal) which
> could be an independent variable on which the data depends. Latitude
> might be an axis in the second sense even if it's not an axis in the
> first sense. I prefer the first sense, which is the one the axis
> attribute originally had.
>
> Best wishes
>
> Jonathan
>
>
> ----- Forwarded message from Sebastien Villaume
> <sebastien.villaume at ecmwf.int> -----
>
> > Date: Fri, 7 Apr 2017 09:10:09 +0000
> > From: Sebastien Villaume <sebastien.villaume at ecmwf.int>
> > To: David Hassell <david.hassell at ncas.ac.uk>
> > CC: CF Metadata <cf-metadata at cgd.ucar.edu>, Jonathan Gregory
> > <j.m.gregory at reading.ac.uk>
> > Subject: Re: [CF-metadata] axis attribute
> > X-Mailer: Zimbra 8.6.0_GA_1200 (ZimbraWebClient - FF50
> > (Linux)/8.6.0_GA_1200)
> >
> > Dear David,
> >
> > I see your point and you are probably right that a plotting routine
> > will probably sort this out.
> >
> > However this is not what I am after, I am more interested in metadata
> > discovery and indexing.
> > I need to discover what I have in a file without plotting it, without
> > having a human looking at it to confirm what it is and that it has been
> > plotted correctly.
> > I also would like to use this metadata information to perform
> > actions like merging netCDF files, slicing, cropping, aggregating,
> > interpolating, comparing data in different grids and representations,
> > etc.
> >
> > I understand that implicit is fine and that explicit is not required
> > for some applications. I have no issue with this.
> > My personal point of view is that explicit is better than implicit: I
> > tend to prefer "mandatory" over "optional".
> >
> > Being implicit means that the assumptions made need to be valid 100%
> > of the time to avoid accidents or corner cases.
> > I would like to be explicit so I need all the proper mechanisms
> > (variables, semantics, etc.) in place so I can use them.
> > Right now it feels that I am missing some functionality.
> >
> > Let me copy below a few bits of the terminology section in the CF 1.7
> > draft document (very similar to 1.6). Please read it keeping in mind
> > what is really an axis, a coordinate, a spatio-temporal dimension and
> > an array dimension. Each time you read "coordinate", "dimension" or
> > "dimensional", ask yourself what is implied and if it is not ambiguous:
> >
> > ------------------------
> > variables
> > ------------------------
> > auxiliary coordinate variable
> >     Any netCDF variable that contains coordinate data, but is not a
> >     coordinate variable (in the sense of that term defined by the NUG and
> >     used by this standard - see below). Unlike coordinate variables, there
> >     is no relationship between the name of an auxiliary coordinate variable
> >     and the name(s) of its dimension(s).
> >
> > coordinate variable
> >     We use this term precisely as it is defined in section 2.3.1 of
> >     the NUG. It is a one-dimensional variable with the same name as its
> >     dimension [e.g., time(time)], and it is defined as a numeric data type
> >     with values that are ordered monotonically. Missing values are not
> >     allowed in coordinate variables.
> >
> > grid mapping variable
> >     A variable used as a container for attributes that define a
> >     specific grid mapping. The type of the variable is arbitrary since it
> >     contains no data.
> >
> > multidimensional coordinate variable
> >     An auxiliary coordinate variable that is multidimensional.
> >
> > scalar coordinate variable
> >     A scalar variable (i.e. one with no dimensions) that contains
> >     coordinate data. Depending on context, it may be functionally
> >     equivalent either to a size-one coordinate variable (Section 5.7,
> >     "Scalar Coordinate Variables") or to a size-one auxiliary coordinate
> >     variable (Section 6.1, "Labels" and Section 9.2, "Collections,
> >     instances, and elements").
> >
> > ------------------------
> > dimensions
> > ------------------------
> > latitude dimension
> >     A dimension of a netCDF variable that has an associated latitude
> >     coordinate variable.
> >
> > longitude dimension
> >     A dimension of a netCDF variable that has an associated longitude
> >     coordinate variable.
> >
> > spatiotemporal dimension
> >     A dimension of a netCDF variable that is used to identify a
> >     location in time and/or space.
> >
> > time dimension
> >     A dimension of a netCDF variable that has an associated time
> >     coordinate variable.
> >
> > vertical dimension
> >     A dimension of a netCDF variable that has an associated vertical
> >     coordinate variable.
> > ------------------------
> >
> > So according to this terminology, I have in my file 2 auxiliary
> > coordinate variables, but no "real" coordinate variables (according
> > to the NUG), so my auxiliary coordinates are auxiliary to what?
> > What is a "multidimensional coordinate"? If dimension means
> > spatio-temporal dimension it is nonsense because a coordinate can only
> > reference 1 spatio-temporal dimension; if it is meant to be array
> > dimensions it is not clear...
> > What are my 2D array latitude and longitude then? Are they the latitude
> > and longitude dimensions defined in the terminology? Not really...
> > because there are no such things as latitude and longitude dimensions:
> > you can define latitude and longitude coordinates, associated with 2
> > axes that themselves define 2 spatial dimensions... but the coordinates
> > can be defined in whatever n-D array.
> > I like the definition of "grid mapping variable", I could use a
> > similar variable to be a container for attributes for my "axis
> > variable" with no data!
> >
> > I know that in day-to-day life and discussions we don't make the
> > effort to be precise (I don't) and that it is easy to overload the
> > meaning of things, but I think that the CF document needs to be very
> > precise and unambiguous, and cannot mix axes, coordinates,
> > spatio-temporal and array dimensions.
> >
> > /Sébastien
> >
> > ----- Original Message -----
> > From: "David Hassell" <david.hassell at ncas.ac.uk>
> > To: "Sebastien Villaume" <sebastien.villaume at ecmwf.int>
> > Cc: "CF Metadata" <cf-metadata at cgd.ucar.edu>, "Jonathan Gregory"
> > <j.m.gregory at reading.ac.uk>
> > Sent: Friday, 7 April, 2017 08:37:20
> > Subject: Re: [CF-metadata] axis attribute
> >
> > Dear Sébastien,
> >
> > Please bear with me when I ask to go right back to the beginning! I am
> > not sure what the benefit is in labelling the dimensions as X or Y. In
> > the original tripolar case we have:
> >
> > dimensions:
> >     i = 96 ;
> >     j = 73 ;
> > variables:
> >     float latitude(j, i) ;
> >         latitude:units = "degrees_north" ;
> >     float longitude(j, i) ;
> >         longitude:units = "degrees_east" ;
> >     float sit(j, i) ;
> >         sit:units = "m" ;
> >         sit:standard_name = "sea_ice_thickness" ;
> >         sit:coordinates = "latitude longitude" ;
> >
> > There is nothing stopping anything from seeing that this is a 2-d array
> > of size i*j, and there is nothing stopping software subspacing the data
> > by i and j indices.
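
As a rough illustration of the point above (not code from this thread), the sketch below subsets toy stand-ins for latitude, longitude and sit purely by i and j indices, with no 1-d coordinate variables involved. The helper name subspace, the toy values and the 3 x 4 grid size are assumptions made for the example; the arrays in the CDL are 73 x 96.

```python
def subspace(field, j_slice, i_slice):
    """Keep the rows selected by j_slice and, in each row, the columns in i_slice."""
    return [row[i_slice] for row in field[j_slice]]


# Toy (j, i) arrays standing in for latitude(j, i), longitude(j, i), sit(j, i).
nj, ni = 3, 4
latitude = [[10.0 * j + 0.1 * i for i in range(ni)] for j in range(nj)]
longitude = [[20.0 * i + 0.1 * j for i in range(ni)] for j in range(nj)]
sit = [[float(j * ni + i) for i in range(ni)] for j in range(nj)]

# The same index window is applied to the data and to both 2-d coordinates.
window = (slice(1, 3), slice(0, 2))
sit_sub = subspace(sit, *window)
lat_sub = subspace(latitude, *window)
lon_sub = subspace(longitude, *window)

print(sit_sub)
```

Because the data variable and its auxiliary coordinates share the (j, i) dimensions, one index window keeps all three consistent.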
> >
> > I don't think a plotting routine would benefit from knowing that the i
> > dimension was "X", because there are no 1-d coordinates it can use
> > along that dimension.
> >
> > Many thanks and all the best,
> >
> > David
> >
> > On 6 April 2017 at 22:45, Sebastien Villaume
> > <sebastien.villaume at ecmwf.int>
> > wrote:
> >
> > > Dear Mark and Jonathan,
> > >
> > > thank you for your comments.
> > >
> > > @Mark:
> > > the short answer: you can in principle put whatever you want in that
> > > variable because in this case it is a dummy variable only there to
> > > hold the axis attribute. But please read the long explanation!
> > >
> > > the long, boring explanation:
> > > As I understand it, the CF convention does not recognize an axis as a
> > > valid object on its own, as it does for "dimensions" and the various
> > > types of "variables", and the convention seems to make it mandatory to
> > > attach to it a variable that becomes a "coordinate" variable. Note that
> > > I say that it is the coordinate variable that is attached to the axis
> > > and not the opposite.
> > >
> > > From a mathematical point of view, it is perfectly possible to
> > > define an axis without a coordinate on it (arguably it is not that
> > > useful). The common case is that a 1-D array defines positions on
> > > that axis (the coordinate). Then your 1-D data points are positioned
> > > with the help of the coordinate, itself attached to the axis.
> > >
> > > If you have one more axis, you can define a new coordinate on it.
> > > This creates a 2-D space. Now you have the choice on how you
> > > represent your 2-D data points:
> > > if the dataset is totally irregular you will have a 1-D array of "n"
> > > data points associated with a 1-D array of "n" positions for the
> > > first dimension and a 1-D array of "n" positions for the second
> > > dimension. It works, it is still a 2-D dataset stored in a long
> > > one-dimensional vector.
> > >
> > > Imagine that you realize that your dataset is not as irregular as
> > > you thought, it is in fact a regular grid! You identify that you
> > > only have i possible values of the first coordinate and j possible
> > > values for the second coordinate, you also notice that i*j=n. Great,
> > > you can now represent your dataset with 2 coordinates of length i
> > > and j respectively, each of them associated with one of 2 axes x and
> > > y, and your data is now a 2-D array of size (i,j). You can position
> > > your data using the coordinates, it is mapped using the indices within
> > > each coordinate array. Now you have a 2-D spatial dataset stored in a
> > > 2-D array with 2 supporting 1-D spatial coordinates stored in
> > > one-dimensional vectors.
> > >
> > > Let's say now that you take this regular grid and you distort it...
> > > your regular grid is gone, you can no longer use i and j for
> > > partitioning! Really? Well no, nobody says that you cannot slice your
> > > "n" long vectors into i*j arrays! You could choose whatever you want
> > > for i and j as long as i*j=n. Of course if you choose (2)*(n/2) or
> > > (n/2)*(2), it is a bit useless, but you can also choose meaningful i
> > > and j because even if your grid became irregular, it is not random
> > > points, it is still a grid of size i*j. This is exactly my use
> > > case! And in that situation your coordinates can be arranged in
> > > arrays of size i*j. What I need is 2 axes and 2 coordinates of
> > > dimension 2 with lengths i and j. The catch here is that I have 2-D
> > > arrays to store one "spatial" dimension! It is another case of
> > > overlapped concepts: dimension is used transparently for the
> > > dimension of arrays, the dimension of the geometrical space, and
> > > sometimes for the size of one of the dimensions of an array!!
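
A minimal sketch of the reshaping step described above, assuming row-major ordering and toy values (neither comes from the thread). The helper name to_grid is hypothetical. It shows that n-long vectors of data and coordinates can be cut into i*j arrays as long as i*j = n, leaving each coordinate stored as a 2-D array.

```python
def to_grid(flat, nj, ni):
    """Slice an n-long vector into nj rows of ni values (row-major order)."""
    if nj * ni != len(flat):
        raise ValueError("nj * ni must equal the number of points")
    return [flat[r * ni:(r + 1) * ni] for r in range(nj)]


# Six points of a distorted grid, stored as three n-long vectors (toy values).
lon_1d = [0.0, 1.1, 2.3, 0.2, 1.4, 2.9]
lat_1d = [50.0, 50.2, 50.1, 51.3, 51.0, 51.4]
sit_1d = [1.5, 1.7, 1.2, 2.0, 2.2, 1.9]

nj, ni = 2, 3
lon_2d = to_grid(lon_1d, nj, ni)
lat_2d = to_grid(lat_1d, nj, ni)
sit_2d = to_grid(sit_1d, nj, ni)

# Grid position (j, i) and flat index j * ni + i refer to the same point.
j, i = 1, 2
print(lon_2d[j][i], lat_2d[j][i], sit_2d[j][i])
```

Both coordinates end up as 2-D arrays even though each still describes a single spatial dimension, which is the overlap of meanings the paragraph points out.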
> > >
> > > Anyway, I should be able to define my axes like this:
> > >
> > > int x;
> > >     x:axis = "X";
> > >     x:standard_name = "x_axis" ; // no standard name exists...
> > >     x:units = "1" ; // no units, it will come with the coordinate
> > > int y;
> > >     y:axis = "Y";
> > >     y:standard_name = "y_axis" ; // no standard name exists...
> > >     y:units = "1" ; // no units, it will come with the coordinate
> > > float longitude(j,i);
> > >     longitude:standard_name = "longitude" ;
> > >     longitude:units = "degrees" ;
> > >     longitude:positive = "east" ;
> > >     longitude:long_name = "longitude" ;
> > >     longitude:axis_mapping = "X" ;
> > > float latitude(j,i);
> > >     latitude:standard_name = "latitude" ;
> > >     latitude:units = "degrees" ;
> > >     latitude:positive = "north" ;
> > >     latitude:long_name = "latitude" ;
> > >     latitude:axis_mapping = "Y" ;
> > > float sit(j, i) ;
> > >     sit:units = "m" ;
> > >     sit:standard_name = "sea_ice_thickness" ;
> > >     sit:long_name = "Ice thickness" ;
> > >     sit:coordinates = "latitude longitude" ;
> > >
> > > several comments:
> > > notice how one could tell on which axis the coordinate should go
> > > using for instance an "axis_mapping" attribute. Not a "coordinates"
> > > attribute, this one should be used to tell the coordinates of my data
> > > variable!
> > > I find this approach clearer and more flexible as it can probably
> > > cater for any situation of axes, coordinates, etc.
> > >
> > > But because in CF one cannot create a bare axis, I follow the rules
> > > and create:
> > >
> > > double x(i);
> > >     x:axis = "X";
> > >     x:standard_name = "..." ; // not an axis anymore, give me a standard name
> > >     x:units = "1" ;
> > >     x:long_name = "i-index of mesh grid" ;
> > > double y(j);
> > >     y:axis = "Y";
> > >     y:standard_name = "..." ; // not an axis anymore, give me a standard name
> > >     y:units = "1" ;
> > >     y:long_name = "j-index of mesh grid" ;
> > >
> > > and I have the choice of what I put in those arrays since it is
> > > somehow artificial.
> > >
> > > I could populate the "primary" coordinates with 1 to i and 1 to j
> > > which would represent the indices and if I subset the grid, I then
> > > retain the information that the domain has been cropped because the
> > > indices left will not be 1 to i/j but n to m.
> > > I don't really like this but what can I do?
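
To make the cropping remark above concrete, here is a small sketch (toy sizes and values, hypothetical helper name crop) of an index coordinate populated with 1 to i. After subsetting, the surviving index values show which part of the original domain was kept.

```python
def crop(values, start, stop):
    """Keep the elements at zero-based positions start..stop-1."""
    return values[start:stop]


ni = 8
x = list(range(1, ni + 1))          # index coordinate holding 1..ni
row = [0.5 * k for k in range(ni)]  # toy data along the i dimension

# Apply the same window to the data and to its index coordinate.
x_sub = crop(x, 2, 6)
row_sub = crop(row, 2, 6)

# x_sub no longer starts at 1, so the file itself records the crop.
print(x_sub, row_sub)
```

This is only one possible content for such a coordinate variable; as the message says, the choice is somewhat artificial.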
> > >
> > > If we follow this idea, it means introducing a clear concept of "axis"
> > > besides the other types of variables, defining a new attribute to
> > > "attach" coordinates to axes, etc.
> > >
> > > Another solution, much less disruptive, would be to heavily modify
> > > the relevant chapters in the CF document to:
> > > - completely decouple the concepts of "axis" and "coordinate": a
> > > coordinate is not an axis and vice versa.
> > > - completely decouple the concept of spatio-temporal dimension from
> > > array dimension and from the size of the array dimension
> > > - continue to use the "axis" attribute but on n-D array
> > > coordinates: the array has n dimensions but the coordinate maps to
> > > 1 axis/spatial dimension only!
> > > - Whatever the dimensions of the array for the coordinate, all the
> > > values contained in the array must be mapped on one given axis, the
> > > one defined in the axis attribute. For instance, a 2-D latitude only
> > > contains values that are latitudes and will only map on one axis.
> > > - In principle one could have in the same file several coordinates
> > > of possibly different "array" dimensions, different sizes and
> > > different units defined for one axis. This means that the attribute
> > > "axis=z" for instance can appear more than once in the file. The
> > > only restriction I see is that 2 data variables can only be plotted
> > > simultaneously if all their coordinates share the same units (the
> > > coordinate mapped on one axis of the first data variable must have
> > > the same units as the coordinate mapped on the same axis for the
> > > other data variable). This allows 2 data variables defined on two
> > > different grids sharing the same units to be in the same file and
> > > plotted together.
> > > - X and Y should be clearly decoupled from longitude and latitude. X
> > > and Y are the axes, longitude and latitude are the coordinates!
> > >
> > >
> > > @Jonathan:
> > > I think the whole confusion here comes from the overlapping of
> > > concepts: axes and coordinates on one hand and dimensions of arrays and
> > > spatial dimensions on the other hand. If the relevant chapters are
> > > rewritten carefully to separate axes from coordinates and array
> > > dimensions from spatio-temporal dimensions we are good, I think.
> > >
> > > @all: Reading more through the Trac ticket system, I noticed the
> > > nice Trac ticket 117 about "multiple" time axis. This is a nice
> > > example of mixing axes, coordinates, dimensions of arrays, the time
> > > dimension, etc!
> > >
> > >
> > > /Sébastien
> > >
> > > ----- Original Message -----
> > > From: "Jonathan Gregory" <j.m.gregory at reading.ac.uk>
> > > To: cf-metadata at cgd.ucar.edu
> > > Sent: Thursday, 6 April, 2017 16:49:56
> > > Subject: Re: [CF-metadata] axis attribute
> > >
> > > Dear Jim and Sebastien
> > >
> > > The original intention of axis was to label the independent
> > > variables as 1D xyzt axes of the data variables. This can be
> > > deduced from other attributes, but it's more effort. It's partly a
> > > plotting hint, but also it's because you might reasonably want to
> > > tell software, "give me the z-axis coordinates", or "calculate a
> > > mean over the x-direction". The latter is often a zonal mean, but it
> > > isn't with a rotated-pole or tripolar grid, yet the operation is
> > > still performed sometimes.
> > >
> > > It's useful that you've pointed out the confusion of purpose. If it
> > > were regarded as an acceptable backwards-incompatibility, which I'm
> > > nervous about, I'd be happy if we returned "axis" to its original
> > > purpose of identifying 1D axes, and also for scalar coordinate
> > > variables (which are equivalent to axes of size one), and provided
> > > another attribute to label aux coords as horizontal.
> > >
> > > I agree that if we have 1D x and y, with 2D lat and lon, the 1D
> > > variables are the axes. That's consistent with the original purpose
> > > of the axis attribute.
> > >
> > > > I also find the units of latitude and longitude confusing: it
> > > > looks like it was a way to squeeze the direction of the coordinate
> > > > inside the units. I have the same observation for the time
> > > > coordinate that has its origin in the units!
> > >
> > > This convention was kept in CF for backwards-compatibility with
> > > COARDS. CF does not use units in any other case to identify the
> > > quantity or sense.
> > >
> > > > It was done correctly for z coordinate using "units" and
> > > > "positive", probably because there are many types of z coordinates
> > > > with various origins and directions, and no real consensus. I note
> > > > however that often the origin is not clearly defined.
> > >
> > > The positive attribute was also kept for backwards-compatibility
> > > with COARDS.
> > > It has the advantage of being useful to identify the vertical axis,
> > > but this can also be done with axis="Z". CF standard names provide
> > > information which indicates the sign convention.
> > >
> > > If coordinate_index is confusing, I think standard_names containing
> > > x_index or y_index would be OK, provided we change the existing
> > > standard names
> > > magnitude_of_derivative_of_position_wrt_x_coordinate_index
> > > magnitude_of_derivative_of_position_wrt_y_coordinate_index
> > > to remove "_coordinate".
> > >
> > > Best wishes
> > >
> > > Jonathan
> > > _______________________________________________
> > > CF-metadata mailing list
> > > CF-metadata at cgd.ucar.edu
> > > http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
> > >
> >
> >
> >
> > --
> > David Hassell
> > National Centre for Atmospheric Science Department of Meteorology,
> > University of Reading, Earley Gate, PO Box 243, Reading RG6 6BB
> > Tel: +44 118 378 5613
> > http://www.met.reading.ac.uk/
>
> ----- End forwarded message -----
>
>
> ------------------------------
>
> Message: 2
> Date: Mon, 10 Apr 2017 11:54:18 -0600
> From: Seth McGinnis <mcginnis at ucar.edu>
> To: "Maccarthy, Jonathan K" <jkmacc at lanl.gov>
> Cc: "cf-metadata at cgd.ucar.edu" <cf-metadata at cgd.ucar.edu>
> Subject: Re: [CF-metadata] high sample rate (seismic) data conventions
> Message-ID: <8b9bfadb-d72b-2ca2-9f8c-1ac83f6a6a8d at ucar.edu>
> Content-Type: text/plain; charset=utf-8
>
> Hi Jonathan,
>
> Oh, climate model outputs are also supposed to have a uniform sample
> rate for the whole time series -- emphasis on *SUPPOSED TO*. To my
> dismay, I have encountered multiple cases where something went wrong
> with the generation of the data files, resulting in missing or repeated
> or weirdly-spaced timesteps, and sorting out the resulting problems is
> how I came to appreciate the value of the explicit coordinate...
>
> As far as I know, you are correct that CF does not have a standardized
> way to represent a coordinate solely in terms of a formula without
> reference to a corresponding coordinate variable.
>
> However, that doesn't mean you couldn't do it and still have the file
> be CF-compliant. As far as I am aware (and somebody correct me if I'm
> wrong), coordinate variables are not actually mandatory.
>
> So if, for reasons of feasibility, you found it necessary to do
> something like the following, I believe that strictly speaking it would
> be not just allowed but fully CF-compliant:
>
> dimensions:
>     time = UNLIMITED ; // (1892160000 currently)
> variables:
>     double acceleration(time) ;
>         acceleration:long_name = "ground acceleration" ;
>         acceleration:units = "m s-2" ;
>         acceleration:start_time = "2017-01-01 00:00:00.01667" ;
>         acceleration:sampling_rate = "60 hz" ;
> data:
>     acceleration = 1.324145e-6, ...
>
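
A hedged sketch of how a reader of such a file might rebuild the implicit time coordinate from the two attributes in the example above. Note that start_time and sampling_rate are ad hoc attribute names from that example, not attributes defined by CF, and the parsing below (timestamp format, "value unit" split) is an assumption about how their strings are laid out. The helper name sample_time is hypothetical.

```python
from datetime import datetime, timedelta

# Attribute strings copied from the example; their layout is an assumption.
start_time_attr = "2017-01-01 00:00:00.01667"
sampling_rate_attr = "60 hz"

start_time = datetime.strptime(start_time_attr, "%Y-%m-%d %H:%M:%S.%f")
value, unit = sampling_rate_attr.split()
rate_hz = float(value)


def sample_time(k):
    """Time of the k-th sample (zero-based) under a strictly uniform rate."""
    return start_time + timedelta(seconds=k / rate_hz)


# One non-leap year of samples at this rate.
samples_per_year = int(rate_hz) * 365 * 24 * 3600

print(sample_time(0), sample_time(1), samples_per_year)
```

At 60 Hz, one non-leap year comes to 1892160000 samples, which matches the length of the time dimension in the example. The formula is only valid while the uniform-rate assumption holds, which is the risk raised earlier in the message.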
>
> I actually have some files without any coordinate variables sitting
> around from the intermediate stage of some processing I did; I checked
> one with Rosalyn Hatcher's cf-checker, and it didn't complain, so I
> think it is technically legal. It's kind of a letter-of-the-law rather
> than spirit-of-the-law thing, but it's at least theoretically
> compliant.
> Up to you whether that would count as sufficiently suitable for your
> use case.
>
> Cheers,
>
> --Seth
>
>
>
> On 4/10/17 10:54 AM, Maccarthy, Jonathan K wrote:
> > Hi Seth,
> >
> > Thanks for the very helpful response. I can understand the argument
> > for explicit coordinates, as opposed to using formulae; I think it
> > solves several problems. The assumption of a uniform sample rate for
> > the length of a continuous time series is deeply engrained in most
> > seismic software, however. Changing that assumption may lead to other
> > problems (but maybe not!). Data volumes for a single channel can be
> > 40-100 4-byte samples per second, which is something like 5-12 GB per
> > channel per year uncompressed. Commonly, dozens of channels are used
> > at once, though some of them may share time coordinates. It sounds
> > like this use-case is similar in volume to what you've used, and may
> > be worth trying out.
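
The volume estimate above can be checked with a few lines of arithmetic. This sketch assumes a non-leap year and decimal gigabytes (1 GB = 1e9 bytes); the function name bytes_per_year is hypothetical.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # non-leap year (assumption)
BYTES_PER_SAMPLE = 4                # 4-byte samples, as stated in the message


def bytes_per_year(samples_per_second):
    """Uncompressed size of one channel recorded continuously for a year."""
    return samples_per_second * BYTES_PER_SAMPLE * SECONDS_PER_YEAR


for rate in (40, 100):
    gigabytes = bytes_per_year(rate) / 1e9
    print(rate, "samples/s:", round(gigabytes, 1), "GB per channel per year")
```

Under these assumptions the two rates give roughly 5.0 GB and 12.6 GB, consistent with the quoted range of 5-12 GB.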
> >
> > Just to be clear, however, would I be correct in saying that CF has no
> > accepted way of representing the data as I've described?
> >
> > Thanks again,
> > Jonathan
> >
> >> On Apr 7, 2017, at 4:43 PM, Seth McGinnis <mcginnis at ucar.edu
> >> <mailto:mcginnis at ucar.edu>> wrote:
> >>
> >> Hi Jonathan,
> >>
> >> I would interpret the CF stance as being that the value in having
> >> explicit coordinate variables and other ancillary data to accompany
> >> the data outweighs the cost of increased storage.
> >>
> >> There are some cases where CF bends away from that for the sake of
> >> practicality (see, e.g., the discussion about external file
> >> references for cell_bounds in CMIP5), but overall, my sense is that
> >> the community feels that it's better to have things explicitly
> >> written out in the file than it is to provide them implicitly via a
> >> formula to calculate them.
> >>
> >> Based on my personal experiences, I think this is the right approach.
> >> (In fact, I take it even further: I prefer to avoid data compression
> >> entirely and to keep like data with like as much as possible, rather
> >> than splitting big files into smaller pieces.)
> >>
> >> I have endured far, far more suffering and toil from (a) trying to
> >> figure out what's wrong with a file that violates some implicit
> >> assumption (like "there are never gaps in the time coordinate") and
> >> (b) dealing with the complications of various tactics for keeping
> >> file sizes small than I ever have from storing and working with very
> >> large files.
> >>
> >> YMMV, of course. What are your data volumes like? I'm working at
> >> the terabyte scale, and as long as my file sizes stay under a few
> >> dozen GB, I don't really even bother thinking about anything that
> >> affects the file size by less than an order of magnitude.
> >>
> >> Cheers,
> >>
> >> Seth McGinnis
> >>
> >> ----
> >> NARCCAP / NA-CORDEX Data Manager
> >> RISC - IMAGe - CISL - NCAR
> >> ----
> >>
> >>
> >> On 4/7/17 9:55 AM, Maccarthy, Jonathan K wrote:
> >>> Hi all,
> >>>
> >>> I'm curious about the suitability of CF metadata conventions for
> >>> seismic sensor data. I've done a bit of searching, but can't find
> >>> any mention of how CF conventions would store high sample-rate
> >>> sensor data. I do see descriptions of time series conventions,
> >>> where hourly or daily sensor data samples are stored along with
> >>> their timestamps, but storing individual timestamps for each sample
> >>> of a high sample rate sensor would unnecessarily double the storage.
> >>> Seismic formats typically don't store time vectors, but instead just
> >>> store vectors of samples with an associated start time and sampling
> >>> rate.
> >>>
> >>> Could someone please point me towards a discussion or existing
> >>> conventions on this topic? Any help or suggestion is appreciated.
> >>>
> >>> Best, Jon
> >>> _______________________________________________
> >>> CF-metadata mailing list CF-metadata at cgd.ucar.edu
> >>> <mailto:CF-metadata at cgd.ucar.edu>
> >>> http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
> >>>
> >> _______________________________________________
> >> CF-metadata mailing list
> >> CF-metadata at cgd.ucar.edu <mailto:CF-metadata at cgd.ucar.edu>
> >> http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
> >
>
>
> ------------------------------
>
> Subject: Digest Footer
>
> _______________________________________________
> CF-metadata mailing list
> CF-metadata at cgd.ucar.edu
> http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
>
>
> ------------------------------
>
> End of CF-metadata Digest, Vol 168, Issue 24
> ********************************************
Received on Wed Apr 12 2017 - 02:49:38 BST

This archive was generated by hypermail 2.3.0 : Tue Sep 13 2022 - 23:02:42 BST
