[CF-metadata] valid_min and valid_max considered harmful?

From: Jon Blower <j.d.blower>
Date: Tue, 9 Jul 2013 12:42:48 +0000

Hi all,

On many occasions I have found problems with datasets in which the valid_min and valid_max attributes are not set correctly, either because the original data files are wrong or because some processing chain or aggregation machinery has produced incorrect values. This is a particular problem in time coordinate arrays.
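To make the aggregation case concrete, here is a minimal sketch in plain Python. It does not use any real NetCDF library: the dictionaries, the function names naive_aggregate and apply_valid_range, and the time values are all hypothetical stand-ins, and the "copy attributes from the first file" behaviour is an assumption about how a careless aggregator might work, not a description of any particular tool.

```python
# Hypothetical stand-ins for the time variable in two NetCDF files.
# Each file's valid_min/valid_max is correct for that file alone.
file_a = {"values": [0.0, 1.0, 2.0], "attrs": {"valid_min": 0.0, "valid_max": 2.0}}
file_b = {"values": [3.0, 4.0, 5.0], "attrs": {"valid_min": 3.0, "valid_max": 5.0}}


def naive_aggregate(first, second):
    """Concatenate the values but copy the attributes from the first file only."""
    return {
        "values": first["values"] + second["values"],
        "attrs": dict(first["attrs"]),
    }


def apply_valid_range(var):
    """Replace values outside [valid_min, valid_max] with None, as a reader might."""
    lo = var["attrs"].get("valid_min", float("-inf"))
    hi = var["attrs"].get("valid_max", float("inf"))
    return [v if lo <= v <= hi else None for v in var["values"]]


aggregated = naive_aggregate(file_a, file_b)
masked = apply_valid_range(aggregated)
print(masked)  # every time value from the second file is treated as missing
```

In this sketch the aggregated variable inherits valid_max = 2.0, so a client that honours the attribute discards all of the time values that came from the second file, even though none of them is actually invalid.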

In my experience, these occasions have outnumbered the times when these attributes are actually useful. In most cases the user either has only one missing value, which should be recorded as a _FillValue, as in section 2.5.1 of the CF documentation, or has no missing value at all.
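For the common single-missing-value case, a reader needs nothing beyond _FillValue. The following is a minimal sketch under the same assumptions as any plain-Python illustration: the dictionary layout, the function name mask_fill and the fill value -9999.0 are hypothetical and are not taken from any real dataset or library.

```python
# Hypothetical variable with a single missing-value marker and no valid_* attributes.
FILL = -9999.0
var = {"values": [1.5, FILL, 2.5], "attrs": {"_FillValue": FILL}}


def mask_fill(variable):
    """Replace values equal to _FillValue with None; leave everything else alone."""
    fill = variable["attrs"].get("_FillValue")
    return [None if v == fill else v for v in variable["values"]]


cleaned = mask_fill(var)
```

Because the test is equality against one value rather than a range check, concatenating two such variables that share the same _FillValue cannot cause valid data to be masked.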

I think this happens because data producers (with good intentions) feel obliged to populate their NetCDF files with as much metadata as possible and end up specifying some attributes that don't provide much value for their data. Is it worth adding some text to the CF docs to say something along the lines of:

"The attributes valid_min, valid_max and valid_range should only be used when necessary [or should be used with caution], as they can cause unexpected behaviour in situations such as aggregation. If only one missing value is needed for a variable then we recommend strongly that this value be specified using the _FillValue attribute."

The second sentence is already present in the standard. We may need to define what "when necessary" means...

Cheers,
Jon

--
Dr Jon Blower
Technical Director, Reading e-Science Centre
School of Mathematical and Physical Sciences
University of Reading, UK
Tel: +44 (0)118 378 5213
Mob: +44 (0)7919 112687
http://www.resc.reading.ac.uk
Received on Tue Jul 09 2013 - 06:42:48 BST
