[CF-metadata] CF Conventions and NetCDF4 string attributes

From: stephen.pascoe at stfc.ac.uk <stephen.pascoe>
Date: Wed, 22 Jan 2014 16:15:37 +0000

To my knowledge CF has so far not addressed how the conventions will be interpreted in NetCDF4 files. This is a big subject and it will take a long time to work through the many innovations in NetCDF4. However, we have recently received a NetCDF4 file which illustrates a small corner case I would like clarified.

The file is a full NetCDF4 file, not NetCDF4-classic (i.e. it does not contain the hidden HDF5 flag indicating it is NetCDF4-classic). However, it does use a flat data model and looks almost identical to a classic model file. The one difference is that it uses NC_STRING as the type of its string attributes. E.g.

netcdf4 {
dimensions:
        time = 108031 ;
...
// global attributes:
        string :Conventions = "CF-1.5 ACDD-1.0" ;
...
}

This file was produced by a recent version of IDL, which does not appear to support the NetCDF4-classic format [1]. In the IDL interface it is possible to create attributes of type NC_CHAR, which are equivalent to arrays of chars in the NetCDF-classic model. However, it is very easy to create attributes of type NC_STRING in IDL, so we can expect many further examples of this type of file in the future.

A strict interpretation of the CF conventions seems to suggest this file is not compliant, because of section 2.2 of the conventions:

2.2. Data Types
The netCDF data types char, byte, short, int, float or real, and double are all acceptable. [...]

NetCDF does not support a character string type, so these must be represented as character arrays. In this document, a one dimensional array of character data is simply referred to as a "string". An n-dimensional array of strings must be implemented as a character array of dimension (n,max_string_length), with the last (most rapidly varying) dimension declared large enough to contain the longest string in the array. All the strings in a given array are therefore defined to be equal in length. For example, an array of strings containing the names of the months would be dimensioned (12,9) in order to accommodate "September", the month with the longest name.
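As an illustration of the layout rule quoted above, here is a minimal Python sketch that pads the month names into a (12,9) character array. It uses plain lists rather than a netCDF library, and both the helper name `to_char_array` and the choice of a null character as padding are assumptions made for the example, not something the conventions text prescribes.

```python
# Sketch of the section 2.2 layout: an array of n strings stored as an
# (n, max_string_length) array of single characters.
MONTHS = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]


def to_char_array(strings, pad="\0"):
    """Pad each string to the longest length and split it into characters.

    The pad character is an assumption for this example.
    """
    width = max(len(s) for s in strings)
    return [list(s.ljust(width, pad)) for s in strings]


chars = to_char_array(MONTHS)
shape = (len(chars), len(chars[0]))
print(shape)
```

Every row has the same length, set by "September", so shorter names such as "May" carry trailing padding that a reader has to strip.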

IMO this section needs to be updated to refer to NetCDF4 in order to remove ambiguity.

There is a further complication: the current official cf-checker cannot read this file because it cannot handle attributes of type NC_STRING. This sort of problem is exactly what NetCDF4-classic was intended to mitigate. We can fix the tool, but in the meantime it would help if the CF document was crystal clear.

I suggest the following options:

1. We state that CF compliant files MUST be in NetCDF3 or NetCDF4-classic format (i.e. "ncdump -k" returns "classic", "64-bit offset" or "netCDF-4 classic model").

2. We state that CF compliant files MUST only use the types described in section 2.2, and update the text to make reference to the NetCDF4 types that are not allowed. A warning about string attributes would be helpful.

3. We state that an attribute whose value is a length-1 array of type NC_STRING is equivalent to an array of NC_CHAR in CF, and therefore the example above is acceptable.
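
To make options 1 and 3 concrete, here is a rough Python sketch of how a checking tool might apply them. It works on plain Python values only: the function names are hypothetical, the format string is assumed to have been obtained separately from "ncdump -k", and a length-1 list of str stands in for an NC_STRING attribute value. It is not how the cf-checker is implemented.

```python
# Option 1: the "ncdump -k" outputs listed above that would be accepted.
CLASSIC_MODEL_KINDS = {"classic", "64-bit offset", "netCDF-4 classic model"}


def allowed_under_option_1(kind):
    """Return True if an "ncdump -k" output string names a classic-model format."""
    return kind.strip() in CLASSIC_MODEL_KINDS


def as_cf_string(value):
    """Option 3: treat a length-1 array of strings as a plain string.

    A str stands in for an NC_CHAR attribute and a length-1 list or tuple
    of str for an NC_STRING attribute (an assumption of this sketch).
    """
    if isinstance(value, str):
        return value
    if (isinstance(value, (list, tuple)) and len(value) == 1
            and isinstance(value[0], str)):
        return value[0]
    raise ValueError("not a CF string attribute under option 3")


print(allowed_under_option_1("netCDF-4 classic model"))
print(as_cf_string(["CF-1.5 ACDD-1.0"]))
```

Under option 1 a file reported as "netCDF-4" would be rejected outright, whereas under option 3 the Conventions attribute in the example above would read the same as its NC_CHAR equivalent.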


Thanks,
Stephen.

[1] http://www.exelisvis.com/docs/ncdf_create.html#netCDF_2618656010_1006152



---
Stephen Pascoe  +44 (0)1235 445980
Centre of Environmental Data Archival
STFC Rutherford Appleton Laboratory, Harwell Oxford, Didcot OX11 0QX, UK
Received on Wed Jan 22 2014 - 09:15:37 GMT

This archive was generated by hypermail 2.3.0 : Tue Sep 13 2022 - 23:02:41 BST