[CF-metadata] Using units with a scale factor

From: Martin Juckes - UKRI STFC <martin.juckes>
Date: Fri, 2 Nov 2018 21:54:21 +0000

Hello Dave,


I am not aware of any conformance checkers which do not accommodate units of the form "1e3 km3", though they may exist. So we don't need to make any adjustments there. The use of such units is consistent with the current CF conformance document, though not with the convention itself. So there is no need to change anything ... but I did raise the question below as to whether we want to remove this discrepancy between the CF checker and the standard. It would be simpler, from the CMIP6 perspective, if this was not done in a hurry.

regards,
Martin

________________________________
From: CF-metadata <cf-metadata-bounces at cgd.ucar.edu> on behalf of Dave Allured - NOAA Affiliate <dave.allured at noaa.gov>
Sent: 02 November 2018 20:39
To: cf-metadata at cgd.ucar.edu
Subject: Re: [CF-metadata] Using units with a scale factor

Martin,

> Units of this form ["1e3 km3"] are used when the community
> requests them, usually because that is the common practice
> within their community -- they probably exist in many NetCDF
> files outside CMIP.

> Is it reasonable to refuse units which are widely used in the
> community? I think it is worth taking some time to consider
> this, and I suggest that we allow this anomalous unit for CMIP6.

A fair question. Please clarify. Are you proposing to change the prohibition of scaled unit strings in CF section 3.1? Or do you just want to modify one of the conformance checkers to accommodate the CMIP6 case alone? I am opposed to the first, but the second would be okay if it was guarded by a software switch that is not advertised for general application.

I am concerned about backpedaling on a long standing CF restriction and precedent that I think is good as is. OTOH, if there really is wide support for scaled unit strings, I will drop my objection.

--Dave


On Fri, Nov 2, 2018 at 3:47 AM, Martin Juckes - UKRI STFC <martin.juckes at stfc.ac.uk> wrote:
Dear All,

"micron" (recognised by Udunits) might be a good alternative to "um".

There is a typo in the last line of my message below -- the question is whether to replace "1e6 km2" (not m2) with "Mm2".

regards,

Martin


From: Juckes, Martin (STFC,RAL,RALSP)
Sent: 02 November 2018 08:55

Dear Karl, Dave,

thanks, those are good suggestions.

As Karl says, 1e3 km3 is not hm3 and can't be represented with prefixes (Udunits does accept

As a compromise, I suggest the following changes for the CMIP6 data request:
1e-3 kg --> g
1e6 J --> MJ

and retain 1e-6 m, 1e3 km3, 1e6 km2 with a comment. For the standard names we can just drop the scale factor as Karl suggests.

The reason these are in there is because people were not aware of the restriction. As I noted below, the restriction is also omitted from the conformance document and from the compliance checker. Units of this form are used when the community requests them, usually because that is the common practice within their community -- they probably exist in many NetCDF files outside CMIP.

I believe that the statement in the convention to the effect that scale_factor and add_offset attributes can be used instead is misleading. These factors can be used, but they only affect the internal storage of the data; they do not modify what the user sees -- I think the truth is that packing is the only application of these attributes. The units attribute needs to be consistent with what the user sees, and will not be affected by use of a scale factor. If the motivation for a particular choice of units is to avoid issues with numerical precision by adding an offset, this underlying problem can be dealt with using the offset attribute, but if the choice of units has a different motivation these attributes don't help.
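
A minimal sketch of the packing behaviour described above, in plain Python. It assumes the usual CF Section 8.1 relation, unpacked = packed * scale_factor + add_offset; the function names and the sample values are hypothetical and only for illustration. The point is that the restored values, and therefore the units attribute, are the same as before packing.

```python
def pack(values, scale_factor, add_offset):
    """Convert physical values to the integers that would be stored."""
    return [round((v - add_offset) / scale_factor) for v in values]


def unpack(packed, scale_factor, add_offset):
    """Recover physical values: unpacked = packed * scale_factor + add_offset."""
    return [p * scale_factor + add_offset for p in packed]


# Hypothetical temperatures in K; the units attribute stays "K" throughout.
temperatures_k = [271.5, 273.25, 280.0]
scale_factor, add_offset = 0.25, 270.0

stored = pack(temperatures_k, scale_factor, add_offset)   # internal storage only
restored = unpack(stored, scale_factor, add_offset)       # what the user sees
```

Here the stored integers differ from the physical values, but the user never sees them, so the attributes cannot stand in for a scaled unit such as "1e3 km3".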

This leaves us with a problem that people want to store data in units of "1e3 km3" and we cannot express this in CF (except by "Mm km2", which looks obscure to me). Is it reasonable to refuse units which are widely used in the community? I think it is worth taking some time to consider this, and I suggest that we allow this anomalous unit for CMIP6.

For 1e-6 m and 1e6 m2 we could use "um" and "Mm2", but I think the results would be unsatisfactory for the users. Although the unit micro-meter is in reasonably common use, the ascii representation "um" is a little obscure, and "Mm" also looks obscure to me. I'm happy to take other views on these.

regards,

Martin


From: CF-metadata <cf-metadata-bounces at cgd.ucar.edu> on behalf of Dave Allured - NOAA Affiliate <dave.allured at noaa.gov>
Sent: 01 November 2018 20:43

It sounds like consistency is more important than CF perfection in a particular use case. In that case, I suggest continuing use of CMIP6 prescribed unit strings for CMIP6 purposes only, and let the CMIP6 community know of this inconsistency with CF 3.1. This will probably be the least confusing for all concerned. People and software that understand UDUNITS will understand those scaled unit strings without help from CF.

--Dave


On Thu, Nov 1, 2018 at 1:39 PM, Taylor, Karl E. <taylor13 at llnl.gov> wrote:
Hi again,

I think some groups may have already written CMIP6 fields with the currently specified units that violate the CF standard, but I'd rather have all CMIP6 datasets have comparable numeric values, so we shouldn't change the units themselves, just the units attribute, as suggested originally. This is possible, as Martin already indicated, with one exception:

1e3 km3 --> hm3

I think this may be incorrect. Wouldn't 1e3 km3 be interpreted as 1e3 (km)^3? If so, then
1e3 km3 = (10 km)^3

and there is no prefix equivalent to "10 km", so I think we're in a fix for this one variable.
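
The arithmetic above can be checked with a few lines of Python. This is only a sketch using exact integers in metres; the variable names are illustrative.

```python
KM = 1000   # metres per kilometre
HM = 100    # metres per hectometre

scaled_km3 = 1000 * KM**3        # "1e3 km3" in cubic metres
cube_of_10_km = (10 * KM) ** 3   # "(10 km)^3" in cubic metres
hm3 = HM**3                      # "hm3" in cubic metres

print(scaled_km3 == cube_of_10_km)   # same volume
print(scaled_km3 // hm3)             # how many hm3 fit in 1e3 km3
```

The first comparison holds, while the second shows that hm3 is smaller by a factor of one million, so it is not a valid substitute.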

Karl



On 11/1/18 11:03 AM, Dave Allured - NOAA Affiliate wrote:
Martin, thanks for a well stated summation of this problem.

My opinion is there are good reasons for prohibiting numeric scale and offset in the units string, as indicated in section 3.1. (Decimal prefixes are fine, as always.) Therefore I agree with Karl's #1 and #3, keep the prohibition and do not change 3.1.

The CMIP6 data request itself is a problem because it actually violates CF 3.1. Best remedy would be to get CMIP6 authority to agree to alternate units that do not use numeric scale factors. I would modify Karl's #2 like this:

2) Replace the scaled units in the CMIP6 data request with unscaled units and equivalent decimal prefix (e.g., replace "1e6 J" with "MJ"). However, if the result is deemed not user friendly in some way, then decide on some user friendly unscaled units, e.g. "J", and consumers will need to re-scale data later, for some purposes.

Does anyone know why CMIP6 requested units in violation of CF 3.1?

--Dave


On Thu, Nov 1, 2018 at 10:58 AM, Taylor, Karl E. <taylor13 at llnl.gov> wrote:
Hi Martin,

I think the main point of the relevant paragraph in section 3.1, which reads

"The Udunits syntax that allows scale factors and offsets to be applied to a unit is not supported by this standard. The application of any scale factors or offsets to data should be indicated by the scale_factor and add_offset attributes. Use of these attributes for data packing, which is their most important application, is discussed in detail in Section 8.1, "Packed Data"<http://cfconventions.org/Data/cf-conventions/cf-conventions-1.7/cf-conventions.html#packed-data>."

is that if you want to pack the data, the proper way to do that is through scale_factor and add_offset, not through the scale and offset options allowed by udunits in the units attribute. In general I find the "scale_factor" and "add_offset" attributes much easier to interpret than the scale and offset udunits options. I would therefore:

1) continue to forbid (or strongly discourage?) use of offset and scale in the units attribute (and modify the conformance document to be consistent with this).

2) replace the scaled units in the CMIP6 data request with units that might be less user friendly, but include equivalent prefix (e.g., replace "1e6 J" with "MJ")

3) replace in the standard names table all non-conforming units with conforming units. I don't think the new units need to be identical to the old (e.g., I would replace "1e-3 kg m-2" with "kg m-2", not "g m-2").

Regarding this last point, note that the so-called "Canonical units" in the standard names table are there to provide guidance on what the quantity represents (e.g., W m-2 indicates the quantity is a flux density, not a flux). CF does not recommend a particular unit among all equivalent (e.g., "kg" might appear in the canonical units, but "g" would be just as acceptable).

Do others have opinions about this?

best regards,
Karl


On 10/29/18 7:45 AM, Martin Juckes - UKRI STFC wrote:

Hello Karl, Alison,

As part of a separate discussion on 'months since' and 'years since' in time units<http://mailman.cgd.ucar.edu/pipermail/cf-metadata/2018/020648.html>, Klaus pointed out that the use of numerical scale factors in units strings, although allowed by Udunits, is prohibited by the CF convention in section 3.1. I'm raising this here because there are 3 standard names which make use of such scale factors in their canonical units, and a number of CMIP6 variables. The CF conformance document diverges from the standard and allows any string which is accepted by Udunits, and hence accepts such factors. The CF checker implements the version according to the conformance document, as does the cf-python code (and hence checks on the CMIP6 variables using cf-python didn't detect this problem).


The CF standard names are:

integral_wrt_depth_of_product_of_sea_water_density_and_salinity : 1e-3 kg m-2

ocean_salt_x_transport, ocean_salt_y_transport: 1e-3 kg s-1


In the CMIP6 data request, we have:

1.e6 J m-1 s-1 for atmospheric energy transport (intuadse, intvadse);

1e-3 kg m-2 for integral wrt depth of density and salinity (somint);

1e-6 m s-1 for saturated hydraulic conductivity;

1e3 km3 for sea ice volumes (sivoln, sivols);

1e6 km2 for sea ice areas (siarean, siareas, siextentn, siextents);
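
For illustration, units strings of this kind could be flagged with a simple pattern match. The sketch below is a rough heuristic written for this message, not the method used by the CF checker, cf-python or Udunits; the function name is hypothetical, and it only looks for a leading number followed by further text (it does not cover the Udunits offset syntax).

```python
import re

# Heuristic only: a leading number followed by whitespace and more text.
# A bare "1" (dimensionless) is deliberately not flagged.
_LEADING_SCALE = re.compile(r"^\s*[+-]?(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?\s+\S")


def has_leading_scale_factor(units):
    """Return True if the units string starts with a numeric scale factor."""
    return _LEADING_SCALE.match(units) is not None


# The CMIP6 data request units listed above.
cmip6_units = ["1.e6 J m-1 s-1", "1e-3 kg m-2", "1e-6 m s-1", "1e3 km3", "1e6 km2"]
flagged = [u for u in cmip6_units if has_leading_scale_factor(u)]
```

All five strings are flagged, whereas unscaled strings such as "kg m-2" or "MJ" pass.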


Should we stick to the statement in the standards document ... and bring the conformance document etc into line, or could the standards document be interpreted more loosely?


These scale factors could be replaced by prefixes, but I think there is some loss of legibility in some cases:

1e-3 kg --> g

1e6 J --> MJ

1e-6 m --> um

1e3 km3 --> hm3

1e6 km2 --> Mm2


(here "um" is a micrometer, "hm" a hectometer and "Mm" a megameter).
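
As a cross-check of the replacements listed above, the sketch below expresses each scaled form and its proposed prefixed form in SI base units and compares them. It uses exact rational arithmetic from the Python standard library; the table layout and names are illustrative assumptions.

```python
from fractions import Fraction

# Lengths in metres; exact rational arithmetic avoids float rounding.
um, hm, km, Mm = Fraction(1, 10**6), Fraction(100), Fraction(1000), Fraction(10**6)

# (scaled form, size in SI base units, prefixed form, size in SI base units)
candidates = [
    ("1e-3 kg", Fraction(1, 1000), "g", Fraction(1, 1000)),
    ("1e6 J", Fraction(10**6), "MJ", Fraction(10**6)),
    ("1e-6 m", Fraction(1, 10**6), "um", um),
    ("1e3 km3", 10**3 * km**3, "hm3", hm**3),
    ("1e6 km2", 10**6 * km**2, "Mm2", Mm**2),
]

equivalent = {scaled: old == new for scaled, old, _prefixed, new in candidates}
for scaled, ok in equivalent.items():
    print(f"{scaled}: {'equivalent' if ok else 'NOT equivalent'}")
```

Under these definitions every pair matches except "1e3 km3" and "hm3", which differ by a factor of one million.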


regards,

Martin
Received on Fri Nov 02 2018 - 15:54:21 GMT
