
[CF-metadata] string valued coordinates

From: Jim Biard <jbiard>
Date: Tue, 04 Nov 2014 11:45:17 -0500

Mark,

I agree that CF is currently ambiguous on this front, and I'm fine with
improving definitions going forward, but 'no_unit' smacks of the classic
'this page intentionally left blank' found in government documents. I
think it's overkill, as backward compatibility will pretty much require
that having no units attribute be interpretable as having a units
attribute saying 'no_unit'.
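
To make the backward-compatibility concern concrete, here is a minimal
Python sketch, not anything CF or udunits provides. The function name
classify_units, the category labels and the missing_means switch are all
hypothetical; 'no_unit' is only the string proposed in this thread. The
default follows the CF 1.6 section 3.1 wording quoted further down (no
units attribute means dimensionless); passing missing_means=NO_UNIT gives
the other reading, where a missing attribute is the same as 'no_unit'.

```python
# Hypothetical sketch: these names are illustrative, not part of CF or udunits.
DIMENSIONLESS = "dimensionless"
NO_UNIT = "no_unit"          # string proposed in this thread, not a udunits entry
OTHER = "other"              # anything else would go to a real units library


def classify_units(attrs, missing_means=DIMENSIONLESS):
    """Classify a variable from its attribute dictionary.

    missing_means selects how an absent units attribute is read:
    DIMENSIONLESS (CF 1.6 section 3.1) or NO_UNIT (the alternative reading).
    """
    if "units" not in attrs:
        return missing_means
    units = attrs["units"].strip()
    if units == "1":
        return DIMENSIONLESS
    if units == NO_UNIT:
        return NO_UNIT
    return OTHER


if __name__ == "__main__":
    print(classify_units({}))                          # CF 1.6 reading
    print(classify_units({}, missing_means=NO_UNIT))   # alternative reading
    print(classify_units({"units": "1"}))
    print(classify_units({"units": "no_unit"}))
```

Existing files cannot say which reading their producers intended, so in
this sketch the choice is left to the reader of the data.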

Grace and peace,

Jim

On 11/4/14, 11:38 AM, Hedley, Mark wrote:
> Hello Jim
>
> > A variable with no units attribute at all is also pretty
> unambiguously a marker for something that isn't intended to be even
> a pure number.
>
> If only this were the case. CF conventions state that:
> Units are not required for dimensionless quantities. A variable with
> no units attribute is assumed to be dimensionless. However, a units
> attribute specifying a dimensionless unit may optionally be included.
> http://cfconventions.org/Data/cf-conventions/cf-conventions-1.6/build/cf-conventions.html#units
>
> Thus, the absence of a unit is to be interpreted identically to a
> statement that
> units = '1'
>
> This is the current situation and it is likely that there is lots of
> data like this around.
>
> > Do we really need something more than a disambiguation of units =
> '1' vs no units attribute present?
>
> Yes, I think we do: this situation is not ambiguous in CF, they are
> the same thing.
>
> What I believe we require is a udunits entity which is clearly 'there
> is no unit of measure here, this is not dimensioned and not dimensionless'
>
> The udunits value
> ''
> delivers this functionality (I think), but it does not read very well,
> hence my suggestion that we ask for a new entry in udunits,
> 'no_unit'
> which is hopefully clear in its meaning and interpretation
> and which behaves the same as '' : failing all udunits processing
> attempts and operating as 'not a unit'
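
The behaviour requested here, a unit that fails every processing attempt,
can be mimicked in a few lines of Python. This is a toy model only: it
does not call udunits, and the NoUnit class and NoUnitError exception are
hypothetical names introduced for illustration.

```python
class NoUnitError(ValueError):
    """Raised when arithmetic is attempted on the 'no_unit' marker."""


class NoUnit:
    """Toy stand-in for the proposed 'no_unit' entry (hypothetical)."""

    def __repr__(self):
        return "no_unit"

    def _fail(self, *args):
        # Every arithmetic operation ends up here and is refused.
        raise NoUnitError("'no_unit' cannot take part in unit arithmetic")

    # Route multiplication, division and powers to the same failure.
    __mul__ = __rmul__ = __truediv__ = __rtruediv__ = __pow__ = _fail


NO_UNIT = NoUnit()

if __name__ == "__main__":
    try:
        NO_UNIT * NO_UNIT
    except NoUnitError as err:
        print("refused:", err)
```

With this model, multiplying the marker by itself raises NoUnitError
instead of producing a derived unit.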
>
> all the best
> mark
>
> ------------------------------------------------------------------------
> *From:* CF-metadata [cf-metadata-bounces at cgd.ucar.edu] on behalf of
> Jim Biard [jbiard at cicsnc.org]
> *Sent:* 31 October 2014 15:18
> *To:* cf-metadata at cgd.ucar.edu
> *Subject:* Re: [CF-metadata] string valued coordinates
>
> Mark,
>
> I'm not clear on what you are suggesting that udunits do with
> 'no_unit' or '?'.
>
> I had thought that the desire was to be able to differentiate between
> a pure number (as you mention below) and a value (whether a string or
> a bit pattern) that should not be interpreted as any number at all.
>
> As the situation stands, a units value of '1' is pretty unambiguously
> a marker for a pure number. We may need to modify docs to make this
> clearer, but I don't think that poses a problem. A variable with no
> units attribute at all is also pretty unambiguously a marker for
> something that isn't intended to be even a pure number. Again, we
> may need to modify docs to make this clearer. Because these two
> concepts are somewhat conflated in the current documentation and usage
> (area_type being an example), there is the issue of other places where
> cleanup would be good going forward, but even if you have a units
> value of '1' on a non-number, it doesn't hurt anything in practice.
>
> Do we really need something more than a disambiguation of units = '1'
> vs no units attribute present?
>
> Grace and peace,
>
> Jim
>
> On 10/31/14, 11:04 AM, Hedley, Mark wrote:
>> Thank you for all the responses, it sounds like 'all of the above' is
>> the preferred response to my suggestions of plausible next steps. I
>> will pursue all of these.
>>
>> Eizi's point about having no_unit in udunits is sound; I suggest we
>> request udunits use
>> 'no_unit'
>> as a representation of
>> '?'
>> such that the behaviour is consistent; 'no_unit' should always raise
>> an exception when used in the udunits processing rules, exactly as
>> '?' does.
>>
>> With regard to meaning, I have found the wikipedia entry useful:
>> http://en.wikipedia.org/wiki/Dimensionless_quantity
>> `In dimensional analysis
>> <http://en.wikipedia.org/wiki/Dimensional_analysis>, a *dimensionless
>> quantity* or *quantity of dimension one* is a quantity
>> <http://en.wikipedia.org/wiki/Quantity> without an associated
>> physical dimension
>> <http://en.wikipedia.org/wiki/Dimensional_analysis>. It is thus a
>> "pure" number, and as such always has a dimension of 1.^[1]
>> <http://en.wikipedia.org/wiki/Dimensionless_quantity#cite_note-1> '
>> which it has sourced from
>> "*1.8* (1.6) *quantity of dimension one* dimensionless quantity"
>> <http://www.iso.org/sites/JCGM/VIM/JCGM_200e_FILES/MAIN_JCGM_200e/01_e.html#L_1_8>.
>> /International vocabulary of metrology --- Basic and general concepts
>> and associated terms (VIM)/. ISO
>> <http://en.wikipedia.org/wiki/International_Organization_for_Standardization>.
>> 2008. Retrieved 2011-03-22.
>>
>> This is a good enough source for me.
>>
>> I will wait to give space for more comments, then, if people are
>> content, I will raise a change request with udunits.
>> Assuming this is accepted and processed I will raise a change request
>> for CF to add some text to 3.1.
>> Finally I will request a change for any standard_names which appear
>> not to follow this approach (I have only 'area_type' so far).
>>
>> I hope this seems like a reasonable response.
>>
>> ------------------------------------------------------------------------
>> *From:* Eizi TOYODA [toyoda at gfd-dennou.org]
>> *Sent:* 31 October 2014 08:44
>> *To:* John Graybeal
>> *Cc:* Hedley, Mark; CF Metadata List
>> *Subject:* Re: [CF-metadata] string valued coordinates
>>
>> Hi John
>>
>> > I think '?' is not a definition that is helpful to most users -- it
>> is more like an indication that the string -- the empty string in
>> this case for example -- has not provided a meaningful indication of
>> what the units are.
>>
>> I share the same impression. I was thinking it would be nicer for
>> the maintainer of udunits. We would have to ask for udunits to be
>> modified so that it refuses to process "no_units"; otherwise
>> ut_multiply("no_units", "no_units") returns "no_units 2". If I
>> remember right, the unit string "?" causes an immediate error, so
>> we don't have to change udunits.
>>
>> But I'm okay if the majority here agrees that sort of thing is not a
>> responsibility of udunits.
>>
>> Best,
>> Eizi
>>
>>
>>
>> Best Regards,
>> --
>> Eiji (aka Eizi) TOYODA
>> http://www.google.com/profiles/toyoda.eizi
>>
>> On Fri, Oct 31, 2014 at 9:45 AM, John Graybeal
>> <john.graybeal at marinexplore.com
>> <mailto:john.graybeal at marinexplore.com>> wrote:
>>
>> Thanks for summing this up so neatly Mark!
>>
>>> We could take the view that the conventions would benefit from
>>> the addition of some text into 3.1 to explicitly make the point
>>> about quantities which are not dimensioned or dimensionless.
>>> We could alternatively defer to udunits as most unit questions
>>> do, which already exhibits this behaviour, and just patch the
>>> 'area_type' and any similar names with erroneous canonical units.
>>> We could also request that udunits be updated with a clearer
>>> string for this case, given the need for it, such as including
>>> the term 'no_units' as a valid udunits term to mean there are no
>>> units here: this is not dimensionless, this is not dimensioned.
>>
>> Why is the first option exclusive to the others? Seems useful to
>> improve the documentation regardless.
>>
>> So I agree that '1' makes no sense for area_type. I'm wondering
>> if someone can crisply describe what is meant when we (or
>> UDUNITS) say a unit is dimensionless? I'm not entirely sure I get it.
>>
>> In any case, I think '?' is not a definition that is helpful to
>> most users -- it is more like an indication that the string --
>> the empty string in this case for example -- has not provided a
>> meaningful indication of what the units are.
>>
>> So my ideal solution has CF well aligned with UDUNITS, and a
>> clear concept and definition. Which I think suggests asking
>> UDUNITS for a term 'no_units', defined as "the values do not have
>> units; values are neither dimensioned nor dimensionless."
>>
>> John
>>
>>
>> On Oct 30, 2014, at 11:06, Hedley, Mark
>> <mark.hedley at metoffice.gov.uk
>> <mailto:mark.hedley at metoffice.gov.uk>> wrote:
>>
>>> > The unit of '1' is generally used to indicate fractions and
>>> the like. In cases where I am storing a raw binary value, I
>>> leave off the units attribute, as the 'number' isn't something
>>> that should be treated as a decimal quantity.
>>>
>>> This is the same behaviour as I was looking to adopt, but CF 3.1
>>> makes this incorrect, I believe, as a lack of a units attribute
>>> is to be interpreted as a units of '1'.
>>>
>>> I think a clear way to define that a quantity is not dimensioned
>>> and is not dimensionless is required. I would have liked to use
>>> the lack of a unit for this purpose, but this has already been
>>> taken, so something else is needed.
>>>
>>> >My preference is that one explicitly puts in the units. For
>>> dimensionless, "1" or "" is ok for udunits.
>>>
>>> udunits2 treats '1' and '' differently.
>>>
>>> a unit of '1' has a definition of '1'
>>> a unit of '' has a definition of '?'
>>>
>>> The CF conventions description of units (3.1) states that an
>>> absence of a units attribute is deemed to be equivalent to
>>> dimensionless, a unit of '1'. This is the convention, and it
>>> has been in force a long time.
>>>
>>> However CF makes no statement that I can find regarding a unit
>>> of ''. Thus I believe we defer back to udunits, which CF states
>>> is how units are defined. Udunits states that a unit of '' is
>>> undefined, the quantity is not dimensioned and is not
>>> dimensionless. We could adopt this to use for the cases in
>>> question.
>>>
>>> >area_type is given in the standard_name table as having a unit
>>> of 1. It is a categorical string-valued quantity.
>>>
>>> On the basis of the discussion, I would suggest that this is an
>>> error. If area_type is a categorical string-valued quantity, it
>>> is not dimensionless, it is not continuous and numerical, it is
>>> not a pure number and should not be treated as such. I think we
>>> should fix this.
>>>
>>> We could take the view that the conventions would benefit from
>>> the addition of some text into 3.1 to explicitly make the point
>>> about quantities which are not dimensioned or dimensionless.
>>> We could alternatively defer to udunits as most unit questions
>>> do, which already exhibits this behaviour, and just patch the
>>> 'area_type' and any similar names with erroneous canonical units.
>>> We could also request that udunits be updated with a clearer
>>> string for this case, given the need for it, such as including
>>> the term 'no_units' as a valid udunits term to mean there are no
>>> units here: this is not dimensionless, this is not dimensioned.
>>> I don't mind which route is preferred, I'm happy to put a change
>>> together and pursue it; whichever way seems better to people.
>>>
>>> cheers
>>> mark
>>>
>>> ------------------------------------------------------------------------
>>> *From:* CF-metadata [cf-metadata-bounces at cgd.ucar.edu
>>> <mailto:cf-metadata-bounces at cgd.ucar.edu>] on behalf of Jim
>>> Biard [jbiard at cicsnc.org <mailto:jbiard at cicsnc.org>]
>>> *Sent:* 30 October 2014 16:12
>>> *To:* cf-metadata at cgd.ucar.edu <mailto:cf-metadata at cgd.ucar.edu>
>>> *Subject:* Re: [CF-metadata] string valued coordinates
>>>
>>> CF says that if the units attribute is missing, then the
>>> quantity has no units.
>>>
>>> The Conventions document, section 3.1 says:
>>>
>>> The units attribute is required for all variables that represent
>>> dimensional quantities (except for boundary variables defined
>>> in Section 7.1, "Cell Boundaries"
>>> <http://cfconventions.org/Data/cf-conventions/cf-conventions-1.6/build/cf-conventions.html#cell-boundaries>
>>> and climatology variables defined in Section 7.4, "Climatological
>>> Statistics"
>>> <http://cfconventions.org/Data/cf-conventions/cf-conventions-1.6/build/cf-conventions.html#climatological-statistics>).
>>>
>>> and
>>>
>>> Units are not required for dimensionless quantities. A variable
>>> with no units attribute is assumed to be dimensionless. However,
>>> a units attribute specifying a dimensionless unit may optionally
>>> be included. The Udunits package defines a few dimensionless
>>> units, such as percent, but is lacking commonly used units such
>>> as ppm (parts per million). This convention does not support the
>>> addition of new dimensionless units that are not udunits
>>> compatible. The conforming unit for quantities that represent
>>> fractions, or parts of a whole, is "1". The conforming unit for
>>> parts per million is "1e-6". Descriptive information about
>>> dimensionless quantities, such as sea-ice concentration, cloud
>>> fraction, probability, etc., should be given in
>>> the long_name or standard_name attributes (see below) rather
>>> than the units.
>>>
>>> The unit of '1' is generally used to indicate fractions and the
>>> like. In cases where I am storing a raw binary value, I leave
>>> off the units attribute, as the 'number' isn't something that
>>> should be treated as a decimal quantity.
>>>
>>> Grace and peace,
>>>
>>> Jim
>>>
>>> On 10/30/14, 11:35 AM, John Caron wrote:
>>>> My preference is that one explicitly puts in the units. For
>>>> dimensionless, "1" or "" is ok for udunits. If the units
>>>> attribute isn't there, I assume that the user forgot to specify
>>>> it, so the units are unknown.
>>>>
>>>> I'm not sure what CF actually says, but it would be good to clarify.
>>>>
>>>> John
>>>>
>>>> On Thu, Oct 30, 2014 at 2:37 AM, Hedley,
>>>> Mark <mark.hedley at metoffice.gov.uk
>>>> <mailto:mark.hedley at metoffice.gov.uk>> wrote:
>>>>
>>>> Hello CF
>>>>
>>>> > From: CF-metadata [cf-metadata-bounces at cgd.ucar.edu
>>>> <mailto:cf-metadata-bounces at cgd.ucar.edu>] on behalf of
>>>> Jonathan Gregory [j.m.gregory at reading.ac.uk
>>>> <mailto:j.m.gregory at reading.ac.uk>]
>>>>
>>>> > Yes, there are some standard names which imply string
>>>> values, as Karl says. If the standard_name table says 1,
>>>> that means the quantity is dimensionless, so it's also fine
>>>> to omit the units, as Jim says.
>>>>
>>>> I would like to raise a question about this statement.
>>>> Omitting the units and stating that the units are '1' are
>>>> two very different things;
>>>> dimensionless != no_unit
>>>> is an important statement which should be clear to data
>>>> consumers and producers.
>>>>
>>>> If the standard name table defines a canonical unit for a
>>>> standard_name of '1' then I expect this quantity to be
>>>> dimensionless, with a unit of '1' or some multiple thereof.
>>>> If the standard name states that the canonical unit for a
>>>> standard_name is '' then I expect that quantity to have no
>>>> unit stated.
>>>> Any deviation from this behaviour is a break with the
>>>> conventions. I have code which explicitly checks this for
>>>> data sets.
>>>>
>>>> Are people aware of examples of the pattern of use
>>>> described by Jonathan, such as categorical quantities
>>>> identified by a standard_name with a canonical unit of '1'?
>>>>
>>>> thank you
>>>> mark
>>>>
>>>> _______________________________________________
>>>> CF-metadata mailing list
>>>> CF-metadata at cgd.ucar.edu <mailto:CF-metadata at cgd.ucar.edu>
>>>> http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
>>>>
>>>
>>> --
>>> CICS-NC <http://www.cicsnc.org/> Visit us on
>>> Facebook <http://www.facebook.com/cicsnc> *Jim Biard*
>>> *Research Scholar*
>>> Cooperative Institute for Climate and Satellites
>>> NC <http://cicsnc.org/>
>>> North Carolina State University <http://ncsu.edu/>
>>> NOAA's National Climatic Data Center <http://ncdc.noaa.gov/>
>>> 151 Patton Ave, Asheville, NC 28801
>>> e: jbiard at cicsnc.org <mailto:jbiard at cicsnc.org>
>>> o: +1 828 271 4900
>>>
>>
>

-- 
CICS-NC <http://www.cicsnc.org/> Visit us on
Facebook <http://www.facebook.com/cicsnc> 	*Jim Biard*
*Research Scholar*
Cooperative Institute for Climate and Satellites NC <http://cicsnc.org/>
North Carolina State University <http://ncsu.edu/>
NOAA's National Climatic Data Center <http://ncdc.noaa.gov/>
151 Patton Ave, Asheville, NC 28801
e: jbiard at cicsnc.org
o: +1 828 271 4900
Received on Tue Nov 04 2014 - 09:45:17 GMT

This archive was generated by hypermail 2.3.0 : Tue Sep 13 2022 - 23:02:42 BST
