
[CF-metadata] Expanding the standard_name metadata

From: John Graybeal <jgraybeal>
Date: Tue, 11 Sep 2012 14:36:55 -0700

> What you are suggesting sounds almost like you wanted to replace the standard_names by some other mechanism of controlled vocabulary, a collection of URIs from different fields and different servers which would point to the actual reference term in each case?

No, definitely not replace! CF *is* a controlled vocabulary, one of the best I know of. My comments are strictly about the compound attributes.

You have seen Jonathan's response, which raised some of my own questions also, and suggests recording semantic markers. We are opening the door on decomposing "what's inside" each standard name, which would be very useful, though (as Roy Lowry can attest) quite detailed.

I just wasn't sure what you wanted to use the compound_name item for. Assuming it is not the unique identifier for machine linking of the compound concepts (which the compound_codelist seems to provide), it appears to define a label. In which case, I don't know if CF wants to start asserting CF-defined labels for all the compounds, or other objects of interest within a standard name. As you say, "There may be other tags which could be useful to add", and however gradually we may choose to consider such tags, I wondered if CF was ready to take that on.

> a collection of URIs from different fields and different servers which would point to the actual reference term in each case?

I think that last is exactly what compound_codelist will become. Unless CF stipulates in advance that there is a single vocabulary from a single server for compounds, general semantic practice and experience say that different groups will use different authorities. Still, either way forward is OK, with its own pros and cons.
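
To make the linking idea concrete, here is a minimal Python sketch, assuming the standard_name table carried the proposed compound_codelist tag. The table fragment is hypothetical: only the first entry's tags come from Martin's example quoted below, while the wrapping <standard_name_table> element, the tags on the other entries, and the helper name group_by_compound are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical table fragment. The first entry follows the proposal quoted
# in this thread; the root element and the other entries are assumptions.
TABLE = """<standard_name_table>
  <entry id="atmosphere_mass_content_of_carbon_monoxide">
    <compound_name>Carbon monoxide</compound_name>
    <compound_codelist>http://rdfdata.eionet.europa.eu/airquality/components/10</compound_codelist>
    <canonical_units>kg m-2</canonical_units>
  </entry>
  <entry id="mole_fraction_of_carbon_monoxide_in_air">
    <compound_name>Carbon monoxide</compound_name>
    <compound_codelist>http://rdfdata.eionet.europa.eu/airquality/components/10</compound_codelist>
    <canonical_units>1</canonical_units>
  </entry>
  <entry id="air_temperature">
    <canonical_units>K</canonical_units>
  </entry>
</standard_name_table>"""


def group_by_compound(xml_text):
    """Map each compound URI to the standard names that reference it."""
    root = ET.fromstring(xml_text)
    groups = defaultdict(list)
    for entry in root.iter("entry"):
        # findtext returns None when the (conditional) tag is absent
        uri = entry.findtext("compound_codelist")
        if uri:
            groups[uri.strip()].append(entry.get("id"))
    return dict(groups)


for uri, names in group_by_compound(TABLE).items():
    print(uri, names)
```

Grouping on the URI rather than on the label keeps entries linked even if the compound_name text drifts between them. If different groups point to different authorities for the same compound, a separate mapping between those URIs would still be needed.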

John


On Sep 11, 2012, at 01:23, Schultz, Martin wrote:

> Hi John,
>
> the problem is that the compound name is obvious for a human, but very hard to extract for a machine, because we don't have a strict set of grammar rules. What you are suggesting sounds almost like you wanted to replace the standard_names by some other mechanism of controlled vocabulary, a collection of URIs from different fields and different servers which would point to the actual reference term in each case? Perhaps I got you wrong here, but I would feel rather uneasy about going too far in this direction at present. We were very happy to find out in Dublin that the community (of atmospheric chemists) is beginning (!) to recognize standard_names as a valuable resource enabling them to speak about the same thing with the same words (even though sometimes a bit clumsy), and to have one "master list" of terms seems much simpler and more resilient to me at present. Yet, it may be good to reflect within the standard_name list what is often brought up in the list discussions anyhow, that is that some communities have established controlled vocabulary for their field, and (as far as I follow the discussions) this is usually a good argument for accepting a standard_name proposal, unless it is in conflict with other rules.
>
> The specific situation in atmospheric chemistry (maybe not so specific but at least very prominent) is that the "variable name space" is not 1-dimensional, but multi-dimensional, i.e. for each (new) compound we can easily add a dozen or more new terms (= standard_names) which describe the molar fraction or mass content in the atmosphere, emission or deposition fluxes (due to a myriad individual processes if need be), chemical reaction rates or turnover rates, etc. My proposal to add the compound_name and a URI/URL to the accepted standard vocabulary list for compounds merely aims at making sure we can link the various compound properties together, so that an application can understand that "mole_fraction_of_trimethylbenzene_in_air" is linked to "tendency_of_atmosphere_mass_content_of_trimethylbenzene_in_air_due_to_emissions_from_traffic", for example. If you show me a parser that can extract all compound names from the standard_name table and which would work for all future versions of the standard_name table, then we might not need this (although the reference to a controlled vocabulary list might still be useful and take a little responsibility away from CF).
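
The parsing difficulty described above can be illustrated with a small Python sketch. The regular expression is a deliberately naive, hypothetical rule (nothing CF defines), and guess_compound is an illustrative helper name. The rule happens to work for the first of the two example names and fails for the second.

```python
import re

# Naive assumption: the compound is whatever sits between "_of_" and "_in_air".
NAIVE_RULE = re.compile(r"_of_(.+?)_in_air")


def guess_compound(standard_name):
    """Return the guessed compound, or None if the rule does not match."""
    match = NAIVE_RULE.search(standard_name)
    return match.group(1) if match else None


names = [
    "mole_fraction_of_trimethylbenzene_in_air",
    "tendency_of_atmosphere_mass_content_of_trimethylbenzene_in_air"
    "_due_to_emissions_from_traffic",
    "atmosphere_mass_content_of_carbon_monoxide",
]
for name in names:
    print(name, "->", guess_compound(name))
```

The first name yields "trimethylbenzene". The second yields "atmosphere_mass_content_of_trimethylbenzene", because the first "_of_" belongs to "tendency_of". The third yields nothing at all, because the name contains no "_in_air". Each such case can be patched with another rule, but without a strict grammar there is no guarantee that the rules hold for future versions of the table.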
>
> Cheers,
>
> Martin
>
>
>
> From: John Graybeal [mailto:jgraybeal at ucsd.edu]
> Sent: Monday, 10 September 2012 18:28
> To: Schultz, Martin
> Cc: Lowry, Roy K.; cf-metadata at cgd.ucar.edu
> Subject: Re: [CF-metadata] Expanding the standard_name metadata
>
> Congratulations on your great meeting!
>
> Concur that when the name is derivable fairly obviously from the other matter, it should not be required. In this case the CF name is supposed to be clear enough that the compound name should be within it already. Suggest this be available as an option if you value it highly (it is perhaps as much the label, as the unique identifier?).
>
> We are bootstrapping best semantic practices for a long lifetime of their use (hopefully), and so having a URL (well, URI/IRI; yours works) is the principal computational reference. (How does the computer know with some confidence what the thing is?) Yes, definitely a web 2.0 kind of answer. Although a particular unique identifier may no longer be maintained in 10 or 20 years, it is likely enough of a 'standard reference' that it has been mapped to its replacement, or even forward linked from the old URL. Absolute worst case, a web search should find traces of it.
>
> To generalize this (for creatures, phenomena, etc.), could we call it not "compound_codelist", but "object_codelist" or "object_IRI", as the compound is the direct object of the prepositional phrase? OK, that's pretty grammar-centric and therefore obscure, but I see the names quickly described via their mapped components (a great thing!). This is very much the first step of that.
>
> John
>
> On Sep 10, 2012, at 02:35, Schultz, Martin wrote:
>
>
> Hi Roy,
>
> thanks for supporting this idea. Why include the "compound_name"? I didn't really think about this, but only copied what is common practice in ISO metadata files. They usually pair a name with the link to the controlled vocabulary list. It could have to do with resilience. What do you do if the controlled vocabulary server doesn't work at the time when you need it? Actually, I would tend to think that the "compound_name" tag is the more important one, and I would see the URL more in the sense of a bibliographic reference. In a sense, this bibliographic reference lends some weight to the name. But perhaps I am still living too much in the web 1.0 world?
>
> Cheers,
>
> Martin
>
>
> From: Lowry, Roy K. [mailto:rkl at bodc.ac.uk]
> Sent: Monday, 10 September 2012 11:03
> To: Schultz, Martin; cf-metadata at cgd.ucar.edu
> Subject: RE: Expanding the standard_name metadata
>
> Hello Martin,
>
> I really like the idea of linking the Standard Name to a resolvable URL for the compound, but would question the need for adding the compound name to the standard name table as well as the URL. The plaintext compound name has to be included in the Standard Name and is available through resolution of the URL. Why introduce a further duplicate of the information with the inherent risk of discrepancies creeping in?
>
> In a similar vein, should Standard Names get deeper into biological parameters it would be good to include a link to the World Register of Marine Species (WoRMS) for the taxon.
>
> Cheers, Roy.
> From: CF-metadata [cf-metadata-bounces at cgd.ucar.edu] On Behalf Of Schultz, Martin [m.schultz at fz-juelich.de]
> Sent: 10 September 2012 09:33
> To: cf-metadata at cgd.ucar.edu
> Subject: [CF-metadata] Expanding the standard_name metadata
>
> Dear all,
>
> last week, we had a rather successful workshop on "Metadata for air quality and atmospheric composition" in Dublin. It was nice to see that the community (i.e. those present) seemed to agree without much discussion that ISO 19115 (-1) is the way to go for discovery metadata, while CF is the way forward for descriptive metadata to be stored in (usually) netCDF data files. The main discussions at the workshop centered around ISO issues, but there was one interesting point that came up with respect to CF standard_names and their relation to controlled vocabulary:
>
> We did have discussions on this list earlier about a more grammar-oriented approach, and this was also brought up at our workshop again, mainly in light of the "threat" that the atmospheric composition group will soon begin to flood this email list with hundreds of new names in order to add additional chemical compounds. As we have seen with the problem of standard_names for emissions, this is stretching the limits of the current ways to operate and publish new standard_names. I don't want to argue against the concept of one "flat" master list (we have been through this and there are good reasons for sticking to this concept), but I would like to stimulate a discussion about adding more "metadata" to the standard_name table in order to better link it to other controlled vocabulary lists and avoid confusing inconsistencies, for example in the naming of chemical compounds. Specifically, I would like to propose two "conditional" tags, compound_name and compound_codelist, in the standard_name list, which shall appear for all standard_names having to do with chemical compounds. Example:
>
> <entry id="atmosphere_mass_content_of_carbon_monoxide">
>   <compound_name>Carbon monoxide</compound_name>
>   <compound_codelist>http://rdfdata.eionet.europa.eu/airquality/components/10</compound_codelist>
>   <canonical_units>kg m-2</canonical_units>
>   <description>"Content" indicates a quantity per unit area. The "atmosphere content" of a quantity refers to the vertical integral from the surface to the top of the atmosphere. For the content between specified levels in the atmosphere, standard names including content_of_atmosphere_layer are used. The chemical formula of carbon monoxide is CO.</description>
> </entry>
>
> In a way, this may be seen as duplication of information, but it would really help to tie ends together, because it is practically impossible to parse the standard_names in order to extract such information (due to the lack of a strict grammar). There may be other tags which could be useful to add, and one will have to decide about the pros and cons in each case. However, for compound names I would see a clear need arising now.
>
> Best regards,
>
> Martin
>
>
> PD Dr. Martin G. Schultz
> IEK-8, Forschungszentrum Jülich
> D-52425 Jülich
> Ph: +49 2461 61 2831
>
>
>
>
> ----------------
> John Graybeal <mailto:jgraybeal at ucsd.edu> phone: 858-534-2162
> Product Manager
> Ocean Observatories Initiative Cyberinfrastructure Project: http://ci.oceanobservatories.org
> Marine Metadata Interoperability Project: http://marinemetadata.org
>
>
>
>
>
>


Received on Tue Sep 11 2012 - 15:36:55 BST

