[CF-metadata] missing_value vs. _FillValue from Russ Rew on 2003-11-06 (Archive of CF discussions from 2002 to 2019 on the cf-metadata mailing list)

From: Russ Rew <russ>
Date: Wed, 05 Nov 2003 22:40:28 -0700

Brian,

> _FillValue and missing_value are attributes that we inherit from the NUG.
> What is in the CF document is my best attempt at interpreting what's in the
> NUG. Unfortunately that's not easy, and what's currently in the NUG is not
> implemented in any generic software that I'm aware of (except maybe Harvey
> Davies' FAN operators since Harvey was the one responsible for the version
> 3 NUG conventions). Harvey wrote in a thread on the netCDF mail list that
> the intention was to deprecate the missing_value attribute and that's why I
> included that statement in CF.

The NUG (NetCDF User's Guide) does not deprecate the missing_value
attribute, but instead states that its purpose is different from the
purpose for _FillValue:

  missing_value: This attribute is not treated in any special way by
  the library or conforming generic applications, but is often useful
  documentation and may be used by specific applications. The
  missing_value attribute can be a scalar or vector containing values
  indicating missing data. These values should all be outside the
  valid range so that generic applications will treat them as missing.

Harvey was the last author of this section of the User's Guide on
attribute conventions, and chose not to explicitly deprecate the
missing_value attribute, although I think it was deprecated in an
earlier draft. Instead, he defined meanings for valid_range,
valid_min, and valid_max that would make any value outside of the
valid_range suitable for use as a missing value. You may be right
about Harvey deprecating it in an email message, but I think we can
use the NUG definition above as providing a useful meaning for it.
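
As a rough illustration of the distinction in the NUG text quoted above,
here is a minimal Python sketch. The function names, the sample data, and
the attribute values are all hypothetical; only the rule itself (generic
applications look at the valid range, specific applications may also
consult missing_value, which can be a scalar or a vector) comes from the
NUG definition.

```python
def generic_is_missing(value, valid_min, valid_max):
    """A generic application treats anything outside the valid range as missing."""
    return not (valid_min <= value <= valid_max)


def specific_is_missing(value, missing_value):
    """A specific application may consult missing_value (scalar or vector)."""
    if isinstance(missing_value, (list, tuple)):
        candidates = list(missing_value)
    else:
        candidates = [missing_value]
    return value in candidates


# Hypothetical attributes for a temperature variable.
valid_min, valid_max = -50.0, 60.0
missing_value = [-999.0, -9999.0]   # both lie outside the valid range

data = [12.5, -999.0, 61.0, -9999.0]

generic_mask = [generic_is_missing(v, valid_min, valid_max) for v in data]
specific_mask = [specific_is_missing(v, missing_value) for v in data]

print(generic_mask)
print(specific_mask)
```

Because every missing_value lies outside the valid range, the generic
check flags them without knowing about the attribute at all; it also
flags 61.0, which the missing_value check alone would not.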

I think when John Caron wrote:

> We think that the NUG deprecation is out-of-date and should be, uh,
> deprecated ;^}

he was referring to the old NUWG conventions
<http://www.unidata.ucar.edu/packages/netcdf/NUWG/>, last modified in
1995, which did imply that the missing_value attribute should not be
used for NUWG-compliant datasets.

> The netCDF library does use _FillValue to prefill data. But this does not
> make it easy to catch the mistake of incompletely written data for several
> reasons:
> 1. The use of data prefill can be turned off. Basically data prefill
> doubles the cost of outputting your data, so it's often turned off for
> efficiency reasons (we do this in our atmosphere model).
> 2. The default _FillValue is not defined in a portable way. It's a literal
> constant in the netcdf header file, hence it is possible that it may have
> different binary representations on different machines.

A minor point, but the default _FillValue was intended to be portable,
even for floats and doubles, by specifying it to be a value (15 *
2^119) that only requires three 1 bits followed by all 0 bits in a
normalized mantissa, so it can be represented exactly as an IEEE float
or double and in any floating point representation that has at least 4
bits of precision. We tested this on all platforms we had access to
when developing netCDF, and checked that an equality comparison
actually worked for this value, as specified in netcdf.h. I would be
interested in a platform where an equality test does not work for
this. There may be a more portable way to specify this constant in
netcdf.h that does not assume IEEE floating-point. If so, I would be
happy to change it.
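
The representability claim can be checked with a small Python sketch
that uses only the standard struct module. It packs 15 * 2^119 as an
IEEE single, inspects the exponent and fraction fields, and confirms
that the value survives a round trip and an equality test. The decimal
literal in the last check is the one I believe netcdf.h uses for the
default float fill; treat that as an assumption, not a quotation from
the header. The helper names are hypothetical.

```python
import struct

# 15 * 2^119, the value given above for the default floating-point fill.
FILL = 15.0 * 2.0 ** 119


def float32_bits(x):
    """Return the IEEE 754 single-precision bit pattern of x as an integer."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


def float32_roundtrip(x):
    """Pack x as an IEEE 754 single and unpack it back to a Python float."""
    return struct.unpack(">f", struct.pack(">f", x))[0]


bits = float32_bits(FILL)
exponent = (bits >> 23) & 0xFF      # biased exponent field (8 bits)
fraction = bits & 0x7FFFFF          # stored fraction field (23 bits)

print(hex(bits))
print(exponent, bin(fraction))

# Exactly representable as a single: the round trip loses nothing.
print(float32_roundtrip(FILL) == FILL)

# Assumed decimal spelling of the constant; it parses to the same double.
print(float("9.9692099683868690e+36") == FILL)
```

The stored fraction has exactly three leading 1 bits followed by zeros,
which together with the implicit leading 1 of a normalized IEEE value
gives the four bits of precision mentioned above.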

--Russ
Received on Wed Nov 05 2003 - 22:40:28 GMT
