[CF-metadata] Add new integer types to CF?

From: Daniel Lee <Daniel.Lee>
Date: Mon, 11 Sep 2017 08:29:09 +0000

Hi Charlie,

I support opening up the acceptable types (this is not the issue here, but I'd like to see complex types and enums in CF in the future!). If the format gives people the option, they'll want to use the types, and forbidding them without good reason contributes to noncompliant data proliferation.

I also support the use of an Oxford comma, although they are admittedly cruel.

Like Mary Jo, I am confused by the restriction of packed data to signed types. With simple packing, all the packed values are normally non-negative. I don't see anything in the standard that requires add_offset to be the lowest number in the set, but I think it's good practice, and I have never encountered data that does not follow it. That's also the way WMO packs data in BUFR and GRIB. If I understand this correctly, perhaps the better route would be to specify that add_offset is signed and equal to the lowest value in the array, so that the lowest *packed* value would be 0?
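
To illustrate the point about add_offset, here is a minimal sketch in plain Python. It assumes 16-bit simple packing with add_offset set to the smallest unpacked value. The function names pack_unsigned and unpack and the sample temperatures are hypothetical, chosen for illustration only, and are not taken from CF, BUFR, or GRIB.

```python
from array import array

MAX_PACKED = 2**16 - 1  # largest value an unsigned 16-bit integer can hold


def pack_unsigned(values):
    """Pack floats into unsigned 16-bit integers (illustrative helper).

    add_offset is set to the smallest unpacked value, so the smallest
    packed value is 0 and no packed value is negative.
    """
    lo, hi = min(values), max(values)
    add_offset = lo
    scale_factor = (hi - lo) / MAX_PACKED if hi > lo else 1.0
    packed = [
        min(MAX_PACKED, round((v - add_offset) / scale_factor)) for v in values
    ]
    # Typecode "H" is an unsigned short; it would raise on a negative value.
    return array("H", packed), scale_factor, add_offset


def unpack(packed, scale_factor, add_offset):
    """Apply the usual unpacking rule: packed * scale_factor + add_offset."""
    return [p * scale_factor + add_offset for p in packed]


# Sample data in degrees Celsius, including negative values (made up).
temps_c = [-40.0, -12.5, 0.0, 25.5]
packed, scale_factor, add_offset = pack_unsigned(temps_c)
restored = unpack(packed, scale_factor, add_offset)
```

Even though the unpacked values include negative numbers, every packed value fits in an unsigned type, because the sign information is carried by add_offset rather than by the packed integers.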

Daniel

Dr Daniel Lee
Software and Data Formats Engineer
System Engineering and Projects Division

EUMETSAT
Eumetsat-Allee 1
64295 Darmstadt
Germany

Tel: +49 6151 807 3250
Fax: +49 6151 807 5550
E-mail: daniel.lee at eumetsat.int
Web: www.eumetsat.int

> -----Original Message-----
> Date: Fri, 8 Sep 2017 14:39:31 -0600
> From: Charlie Zender <zender at uci.edu>
> To: CF Metadata Mail List <cf-metadata at cgd.ucar.edu>
> Subject: [CF-metadata] Add new integer types to CF?
> Message-ID: <e5881aa4-e73e-eb87-0ce1-26116b0070de at uci.edu>
> Content-Type: text/plain; charset=utf-8; format=flowed
>
> People,
>
> CF explicitly supports types char, byte, short, int, float, and double.
> There are five "new" numeric types it could support:
> unsigned byte, unsigned short, unsigned int, int64, and unsigned int64.
> These new types are in netCDF3 (in the CDF5 encoding released in netCDF v.
> 4.4.0) and in netCDF4. I suggest that CF 1.8 merge support for the new
> numeric types. Please comment on this proposal.
>
> The current CF 1.8 draft reads (Section 2.2):
>
> "The netCDF data types char, byte, short, int, float or real, and double are all
> acceptable. The char type is not intended for numeric data. One byte
> numeric data should be stored using the byte data type. All integer types are
> treated by the netCDF interface as signed. It is possible to treat the byte type
> as unsigned by using the NUG convention of indicating the unsigned range
> using the valid_min, valid_max, or valid_range attributes."
>
> I suggest replacing that text with something like:
>
> "The netCDF data types char, byte, unsigned byte, short, unsigned short, int,
> unsigned int, int64, unsigned int64, float or real, and double are all
> acceptable. The char type is not intended for numeric data. One byte
> numeric data should be stored using the byte or unsigned byte data type. It
> is possible to treat the byte type as unsigned by using the NUG convention of
> indicating the unsigned range using the valid_min, valid_max, or valid_range
> attributes. The convention explicitly distinguishes between signed and
> unsigned integer types only where necessary. Unless otherwise noted, int is
> interchangeable with unsigned int, int64, and unsigned int64 in this
> convention, including examples and appendices. Similarly short is
> interchangeable with unsigned short, and byte with unsigned byte."
>
> Section 8.1 on Packed Data currently reads:
>
> "An additional restriction in this case is that the variable containing the
> packed data must be of type byte, short or int. It is not advised to unpack an
> int into a float as there is a potential precision loss."
>
> I suggest replacing that with something like:
>
> "An additional restriction in this case is that the variable containing the
> packed data must be of type byte, short, or int.
> Use of unsigned types to hold packed data is not permitted since they are
> incapable of representing negative numbers. It is not advised to unpack an
> int into a float as there is a potential precision loss."
>
> The insertion of an Oxford comma in this last change is optional, not intended
> to provoke an international incident.
>
> Unsigned,
> Charlie
> --
> Charlie Zender, Earth System Sci. & Computer Sci.
> University of California, Irvine 949-891-2429 )'(
>
>
> ------------------------------
>
> Message: 3
> Date: Fri, 8 Sep 2017 15:55:41 -0600 (MDT)
> From: Mary Jo Brodzik <brodzik at nsidc.org>
> To: Charlie Zender <zender at uci.edu>
> Cc: CF Metadata Mail List <cf-metadata at cgd.ucar.edu>
> Subject: Re: [CF-metadata] Add new integer types to CF?
> Message-ID: <alpine.OSX.2.11.1709081547420.5991 at vpn-nsidc209-dhcp.int.colorado.edu>
>
> Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
>
>
> Charlie, your unsigned proposal change looks good to me, except for one
> question.
>
> I must be very tired right now, but I don't understand why you are
> suggesting the prohibition on unsigned types for packed data:
>
> "Use of unsigned types to hold packed data is not permitted since
> they are incapable of representing negative numbers."
>
> Why should it matter? For example, Kelvin temperatures are always
> positive. If I have a variable that stores such temperatures, why
> should I be prohibited from storing it as a packed unsigned int?
>
> Mary Jo
>
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> Mary Jo Brodzik, Senior Associate Scientist, 303-492-8263
> NSIDC/CIRES, Univ. of Colo. at Boulder, 449 UCB, Boulder, CO 80309-0449
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> "We cannot solve our problems with the same thinking we used
> when we created them." --Albert Einstein

Received on Mon Sep 11 2017 - 02:29:09 BST