[CF-metadata] use of _FillValue vs valid_range, and minimum and maximum variable attributes


From: Seth McGinnis <mcginnis>
Date: Thu, 23 May 2013 07:51:02 -0600

>> Computing the min & max on the fly is cheap, and approximating it is even
>> cheaper, so why introduce the uncertainty?
>
>... but computing min & max on the fly can also be very expensive.
>We have aggregated model output datasets where each variable is more
>than 1TB!

Sure, I can see that that's useful metadata about the dataset, and that
there's value in caching it somewhere. I just don't think it belongs with
the metadata inside the netCDF file. What's the use case for storing it
there?

Because the problem remains that, unless you're storing and serving
that dataset as a single 1 TB file that never gets modified or subset,
as soon as anything at all happens to the file, those min and max
values become tainted and unreliable, and ought to be recomputed.
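
To make the staleness concern concrete, here is a minimal sketch in plain Python. It does not use any real netCDF library: the dict-based "variable", the helper names (make_variable, subset, range_is_stale), and the sample temperatures are all hypothetical, and the subset helper assumes a tool that copies attributes verbatim while slicing the data.

```python
# Hypothetical stand-in for a netCDF variable: data values plus attributes.
def make_variable(values):
    values = list(values)
    # Cache the range at creation time, as an actual_range attribute would.
    return {"data": values,
            "attrs": {"actual_range": (min(values), max(values))}}

def subset(var, start, stop):
    # Assumption: the subsetting tool slices the data but copies
    # the attributes unchanged, so the cached range is not recomputed.
    return {"data": var["data"][start:stop], "attrs": dict(var["attrs"])}

def range_is_stale(var):
    # Compare the cached attribute against the range of the data present.
    cached = var["attrs"]["actual_range"]
    return cached != (min(var["data"]), max(var["data"]))

full = make_variable([271.3, 285.0, 302.8, 265.1, 290.4])
part = subset(full, 0, 2)

print(range_is_stale(full))  # False: attribute matches the data
print(range_is_stale(part))  # True: attribute still describes the full data
```

In this sketch the subset still advertises the range of the full dataset, and a reader can only detect that by recomputing the min and max, which is the cost the cached attribute was meant to avoid.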

I could probably get behind it if it were called something different that
highlighted that unreliability, like nominal_range, or display_range,
or something like that, but calling it actual_range just seems to me
like it's going to be misleading and incorrect dangerously often.

Cheers,

--Seth
Received on Thu May 23 2013 - 07:51:02 BST
