[CF-metadata] Usage of the 'Conventions' attribute

From: Lowry, Roy K. <rkl>
Date: Mon, 28 Jan 2013 20:41:41 +0000

Hi Nan,

Would the CF web site be an appropriate place for communities to post the attributes they have added to CF - either with or without namespace prefixes?

Cheers, Roy.

________________________________________
From: CF-metadata [cf-metadata-bounces at cgd.ucar.edu] On Behalf Of Nan Galbraith [ngalbraith at whoi.edu]
Sent: 28 January 2013 20:25
To: cf-metadata at cgd.ucar.edu
Subject: Re: [CF-metadata] Usage of the 'Conventions' attribute

Any of these special characters, other than the '_', would probably
cause problems for code that deals with NetCDF files. The '.' is used
in Matlab for access to structures, and the '@' is used to identify a
variable as a function handle. There are work-arounds, but they
likely wouldn't add efficiency or elegance to our code.
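
The same friction appears in Python, as a minimal sketch shows. It
assumes a plain object standing in for a netCDF variable; the class
name FakeVariable and the attribute values are hypothetical, and no
netCDF library is involved.

```python
class FakeVariable:
    """Hypothetical stand-in for an object that exposes netCDF attributes."""


var = FakeVariable()

# Both names can be stored, because setattr accepts any string.
setattr(var, "ukmo__stashcode", "m01s01i123")
setattr(var, "ukmo.stashcode", "m01s01i123")

# A name without special characters works with ordinary dotted access.
plain = var.ukmo__stashcode

# A name containing '.' does not: Python looks for an attribute called
# 'ukmo' first, which does not exist on this object.
try:
    var.ukmo.stashcode
    dotted_access_works = True
except AttributeError:
    dotted_access_works = False

# The work-around is to pass the full name as a string.
workaround = getattr(var, "ukmo.stashcode")
```

Names containing '.' can still be stored and read back, but only
through string-based access, which is the kind of work-around
mentioned above.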

I agree with Roy that CF should be the default namespace in a CF
compliant file, and that this problem belongs to groups that are writing
extensions. In OceanSITES we've mostly ignored this problem-waiting-
to-happen, but our code checks the versions of CF (and our OTS spec) in
our files, which I hope offers some protection.

Should more of these community conventions be added to CF? I'm sure
there are SDN and NCADD (data discovery) attributes that would be
helpful to some CF users; it would be awfully nice to have a list of
already-defined attributes - in one place - to choose from when putting
together a CF-based spec for a project.

Cheers - Nan



On 1/28/13 1:17 PM, Dennis Heimbigner wrote:
> With respect to netcdf (at least the C version),
> it is the case that these characters can appear
> unescaped: _.@+-
>
> It should be noted however that dot in particular
> causes problems for accessing remote datasets
> through DAP because the dot character is used
> in DAP constraints to specify fields inside
> DAP Sequences or Structures or Grids.
>
> The problem you have is that no matter what
> choice of character(s) you make, someone may
> use the characters in a different way.
> This means that whatever choice you make, you need
> to enshrine it in a standard somewhere so that at
> least there is a chance that people will avoid it.
>
> Personally, I would think that a two character sequence
> is least likely to be used by others, but two underscores
> is probably not a good choice. I would think something
> like @@ or ++ might be a better choice.
>
>
> =Dennis Heimbigner
> Unidata
>
> Bentley, Philip wrote:
>> Roy et al.,
>>> Martin's comments on namespace highlight a concern I identified
>>> whilst doing the research for the SeaDataNet specification. Several
>>> communities have added large numbers of both global and variable
>>> attributes with no indication of namespace. Not only does this make
>>> it difficult to tease out what is CF and what is a community
>>> extension, but it creates an accident in waiting. What happens if
>>> CF creates a new attribute with a name already in community usage?
>>> In my view it's too late to introduce a CF namespace and prefer the
>>> idea that for a CF-compliant file CF should be the default
>>> namespace, with communities taking responsibility for their
>>> extensions. This is what I've done for SeaDataNet.
>>
>> In working up a local metadata profile of CF for use here at the Met
>> Office, we also spent much time thinking about the 'namespace problem'.
>> In an early draft of our metadata profile, and after having reviewed
>> previous discussions (e.g. https://cf-pcmdi.llnl.gov/trac/ticket/27), we
>> elected to use the double underscore character sequence ('__') as a
>> namespace separator. Our namespace prefixes were then mnemonics like
>> 'ukmo' for the Met Office, 'dc' for Dublin Core, 'cim' for the Common
>> Information Model, and so on. And we devised additional (fairly simple)
>> machinery to associate the prefixes with target namespaces, just as in
>> the XML world.
>>
>> Thus, we envisaged using netcdf attributes along the lines of:
>>
>> variables:
>>     float myvar(t, y, x) ;
>>         myvar:ukmo__stashcode = "m01s01i123" ;
>>         myvar:ukmo__runid = "abcde" ;
>>
>> // global attributes
>>     :dc__rights = "Copyright (c) 2013, Acme Wind and Rain Corp." ;
>>     :dc__created = "2013-01-01 ..." ;
>>
>> In the end, driven by a practical need to release a simpler, more
>> digestible release 1.0 of our metadata specification, we dropped all the
>> aforementioned namespace stuff.
>>
>> As part of some subsequent low-level netcdf work, however, I chanced
>> upon the fact that the '.' character is not treated in any special way
>> within netcdf names (or rather, it is one of netcdf's original special
>> characters, but not one that needs to be escaped in the way that, say,
>> the ':' character does).
>>
>> This got me to thinking that the '.' character might be the ideal
>> namespace separator for use in CF/netCDF attribute names. Since '.' is
>> not in the set of characters currently permitted in CF attribute names,
>> we can be reasonably sure that it is not being used in existing
>> CF-compliant netcdf files.
>>
>> The '.' character also has collateral appeal for python/java developers
>> in that it is the familiar namespace separator used by those languages.
>>
>> Applied to the previous example, then, we'd now have netcdf attributes
>> such as ukmo.stashcode, ukmo.runid, dc.rights, dc.created, and so on.
>> Which looks considerably more elegant, IMO.
>>
>> While in your context, Roy, you might elect to use namespace'd
>> attributes called sdn.conventions, sdn.foo, sdn.bar, etc. Or bodc.foo,
>> bodc.bar, etc. for BODC stuff.
>>
>> Clearly there are several technical issues that would need to be
>> addressed (e.g. how/when to use the 'cf.' prefix, what would the default
>> namespace be, how would prefixes and their namespaces be associated, how
>> should software interpret namespaces, and so on).
>>
>> But, assuming these could be resolved, what do people think about use of
>> '.' as a namespace separator? Good idea? Bad idea?
>>
>> Some recent postings to this list have suggested using a 'cf_' prefix,
>> with the implied suggestion of a '_' namespace separator. IMHO, this
>> approach has the limitation that client software would not be able to
>> disambiguate existing names which include the '_' character. For
>> example, would the name 'cell_methods' refer to a property called
>> 'cell_methods' in some default namespace, or a property called 'methods'
>> in the 'cell' namespace? Likewise for some possible new attribute
>> called, e.g. 'cf_my_new_thing', what namespace would that be in? cf?
>> cf_my? cf_my_new?
>>
>> Regards,
>> Phil
>> _______________________________________________
>> CF-metadata mailing list
>> CF-metadata at cgd.ucar.edu
>> http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
>
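
Phil's point about the '_' separator can be illustrated with a short
Python sketch. The helper candidate_splits is hypothetical; it simply
enumerates every way a name could be divided into a non-empty
namespace prefix and a non-empty local name for a given separator.

```python
from typing import List, Tuple


def candidate_splits(name: str, sep: str) -> List[Tuple[str, str]]:
    """List every (namespace, local_name) reading of name for a separator."""
    parts = name.split(sep)
    splits = []
    for i in range(1, len(parts)):
        prefix = sep.join(parts[:i])
        local = sep.join(parts[i:])
        if prefix and local:  # skip readings with an empty half
            splits.append((prefix, local))
    return splits


# With '_' an existing CF name already looks as if it had a namespace,
# and a new name has several competing readings.
underscore_existing = candidate_splits("cell_methods", "_")
# [('cell', 'methods')]
underscore_new = candidate_splits("cf_my_new_thing", "_")
# [('cf', 'my_new_thing'), ('cf_my', 'new_thing'), ('cf_my_new', 'thing')]

# With '.', which CF attribute names do not currently contain, there is
# exactly one reading for a prefixed name and none for an existing name.
dot_prefixed = candidate_splits("ukmo.stashcode", ".")
# [('ukmo', 'stashcode')]
dot_existing = candidate_splits("cell_methods", ".")
# []
```

A separator that never occurs in existing names is what makes the
split unambiguous; the sketch says nothing about whether '.' is safe
for DAP or Matlab users.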

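The machinery for associating prefixes with target namespaces is not
spelled out in the thread. One possible shape, by analogy with XML
namespace declarations, is a lookup table from prefix to namespace
identifier. Everything in this sketch is an assumption: the PREFIXES
table, the placeholder example.org URIs, the expand helper, the use of
'.' as the separator, and the treatment of unprefixed names as
belonging to a default CF namespace.

```python
from typing import Dict, Tuple

# Hypothetical prefix table; the URIs are placeholders, not real
# registered namespaces.
PREFIXES: Dict[str, str] = {
    "ukmo": "http://example.org/ns/ukmo",
    "dc": "http://example.org/ns/dc",
}

# Assumed default namespace for names that carry no prefix.
DEFAULT_NAMESPACE = "http://example.org/ns/cf"


def expand(name: str, sep: str = ".") -> Tuple[str, str]:
    """Return (namespace, local_name) for an attribute name."""
    prefix, found, local = name.partition(sep)
    if not found:
        # No separator present: the name lives in the default namespace.
        return DEFAULT_NAMESPACE, name
    if prefix not in PREFIXES:
        raise KeyError(f"undeclared namespace prefix: {prefix!r}")
    return PREFIXES[prefix], local


resolved = [expand(n) for n in ("ukmo.stashcode", "dc.rights", "cell_methods")]
```

Rejecting undeclared prefixes is one design choice among several; a
reader could equally decide to pass unknown prefixes through untouched.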

--
*******************************************************
* Nan Galbraith                        (508) 289-2444 *
* Upper Ocean Processes Group            Mail Stop 29 *
* Woods Hole Oceanographic Institution                *
* Woods Hole, MA 02543                                *
*******************************************************
Received on Mon Jan 28 2013 - 13:41:41 GMT
