[CF-metadata] Another potentially useful extension to the standard_name table from Cameron-smith, Philip on 2012-09-22 (Archive of CF discussions from 2002 to 2019 on the cf-metadata mailing list)

From: Cameron-smith, Philip <cameronsmith1>
Date: Sat, 22 Sep 2012 11:41:38 -0700

Hi All,

I agree too. The challenge is: how do we get there from here, and who will do it, while maintaining backwards compatibility? That barrier has been too big.

Hence, my proposal is to leave what we have in place, and take a step which will make our current lives easier, while gently refining the grammar over time so that the next step will be easier.

One aspect of my proposal is that the parser/generator doesn't need to be perfect, which will save a huge amount of effort. If we require that the parser/generator always correctly identifies whether a new or existing std_name conforms to the grammar/vocab rules, I am sure a huge effort will be needed to deal with a few difficult cases, such as those you have identified, i.e. 90% of the effort will go into 10% of the cases. If we only require that the parser/generator report whether a std_name is "probably valid" or "can't be determined", those difficult cases are left to people on the email list as normal. Thus, 10% of the effort will produce 90% of the return, which is much more likely to happen, and will put us on a better road for the future :-).
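
To make the "imperfect parser" idea concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the tiny lexicon, the single <medium>_<property> pattern, and the function name check_standard_name are all hypothetical, and none of this is taken from Jonathan's program or from the real standard_name table. The point is that the checker never says "invalid"; anything it does not recognise is handed back to the list.

```python
import re

# Hypothetical mini-lexicon (an assumption for illustration only);
# the real CF vocabulary is far larger.
MEDIA = {"air", "sea_water", "soil"}
PROPERTIES = {"temperature", "pressure", "density"}

PROBABLY_VALID = "probably valid"
UNDETERMINED = "can't be determined"


def check_standard_name(name: str) -> str:
    """Return PROBABLY_VALID for names matching one simple known pattern.

    Everything else is UNDETERMINED, never "invalid": hard cases are
    deliberately left for human review.
    """
    # Basic lexical shape: lowercase words joined by single underscores.
    if not re.fullmatch(r"[a-z][a-z0-9]*(_[a-z0-9]+)*", name):
        return UNDETERMINED
    # The only pattern this sketch knows: <medium>_<property>.
    for medium in MEDIA:
        prefix = medium + "_"
        if name.startswith(prefix) and name[len(prefix):] in PROPERTIES:
            return PROBABLY_VALID
    return UNDETERMINED
```

With this sketch, check_standard_name("air_temperature") gives "probably valid", while a name outside the known pattern, such as "tendency_of_air_temperature", gives "can't be determined" and would go to the list as usual.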

Best wishes,

       Philip

-----------------------------------------------------------------------
Dr Philip Cameron-Smith, pjc at llnl.gov, Lawrence Livermore National Lab.
-----------------------------------------------------------------------

-----Original Message-----
From: CF-metadata [mailto:cf-metadata-bounces at cgd.ucar.edu] On Behalf Of Schultz, Martin
Sent: Saturday, September 22, 2012 9:37 AM
To: Lowry, Roy K.; cf-metadata at cgd.ucar.edu
Subject: Re: [CF-metadata] Another potentially useful extension to the standard_name table

Hi Roy,

     exactly! Just how can we get there?

Cheers,

Martin

-----Original Message-----
From: Lowry, Roy K. [mailto:rkl at bodc.ac.uk]
Sent: Saturday, 22 September 2012 18:24
To: Schultz, Martin; cf-metadata at cgd.ucar.edu
Subject: RE: [CF-metadata] Another potentially useful extension to the standard_name table

Hello Martin,

I understand exactly what you want - or at least I think I do. I think that you would like to enter a URL representing the concept 'carbon monoxide' and get back a document giving you all the Standard Names pertaining to carbon monoxide. Am I right?

My vision - which I'm pretty sure John Graybeal shares - is of a grammar in which each element is populated from a controlled vocabulary comprising concepts that are included in a thesaurus or more likely a full-blown ontology.

Does that sound like what you need?

Cheers, Roy.

________________________________________
From: CF-metadata [cf-metadata-bounces at cgd.ucar.edu] On Behalf Of Schultz, Martin [m.schultz at fz-juelich.de]
Sent: 22 September 2012 16:26
To: cf-metadata at cgd.ucar.edu
Subject: Re: [CF-metadata] Another potentially useful extension to the standard_name table

Dear Philip, John and others,

      I take the point that indeed a grammar approach would be the solution to my problem. However, the grammar as it once stood, based on Jonathan's python program (which indeed works quite nicely), unfortunately doesn't help with the problem that I intended to solve with the addition of <attribute> tags (specifically <compound>). The problem is that the current grammar, derived from parsing the standard_name table, does not take into account semantic relations, but is strictly rule-based. Although I am not able to prove this now, the experience I gathered with Jonathan's tool and the associated lexicon suggests that it would require a major overhaul of the standard_name table to make it "parseable" in the sense that the relations among terms are not mere (computer) rule constructs, but make sense to the human reader. In essence, this is why I opened Trac ticket #91. Unfortunately, I haven't found the time yet to take this any further...

    Personally, I am much less worried about the procedures for suggesting and accepting standard_names. I fully agree that a grammar-based approach would also help in this regard, but that is a different issue.

    If I were in charge of creating a new standard_name table from scratch, I would go for a rigorous grammar-based syntax, where (sorry to bring this up again) the standard_name for "air_temperature" would be "temperature_of_air" in order to identify the relation <property> of <medium>, etc. Indeed, in this hypothetical standard_name table, one would define aliases and give them a more prominent role than now, i.e. it would be fine to use "air_temperature" (aliases should not be considered deprecated, as is often the case in the current table). The interoperable application could then look up the real standard_name behind the alias and find something that can indeed be parsed - et voilà: you get what you need, i.e. you will know that you have a property and a medium, and that the property is "temperature" and the medium is "air".
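
The alias lookup followed by parsing could look roughly like the Python sketch below. Everything in it is hypothetical: the ALIASES table, the Parsed type, and the functions resolve and parse are made up for illustration, and the "grammar" is reduced to the single relation <property>_of_<medium>.

```python
from typing import NamedTuple, Optional

# Hypothetical alias table (an assumption): familiar name -> grammar-based name.
ALIASES = {"air_temperature": "temperature_of_air"}


class Parsed(NamedTuple):
    prop: str    # e.g. "temperature"
    medium: str  # e.g. "air"


def resolve(name: str) -> str:
    """Return the grammar-based name behind an alias, or the name itself."""
    return ALIASES.get(name, name)


def parse(name: str) -> Optional[Parsed]:
    """Split a <property>_of_<medium> name; return None if it does not fit."""
    canonical = resolve(name)
    # Split at the first "_of_" only; nested relations are out of scope here.
    prop, sep, medium = canonical.partition("_of_")
    if not sep or not prop or not medium:
        return None
    return Parsed(prop, medium)
```

Here parse("air_temperature") first resolves the alias to "temperature_of_air" and then yields the property "temperature" and the medium "air". Names with more than one "_of_" would need a richer grammar than this sketch attempts.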

    Of course, I am not in charge of creating a new standard_name table (and I am sure no one would like me to be in charge ;-), but I hope this illustrates the problem we have with the current table. Sad as it seems, I really see only two options. Option A): if most people agree that a grammar-based approach is the way to go, then we need to start overhauling the standard_name table (Trac ticket #91) and slowly transform it into something that "makes sense" (please don't misunderstand this phrase!). Option B): we leave things as they are, but then we would indeed have to further discuss the <attribute> idea, because this would provide a way of interpreting standard_names without having to parse them (which, as I hope to have made clear, is impossible at present).

      I agree with the precautions that were raised in that the <attribute>s pose some danger of becoming uncontrolled and simply too many. However, perhaps it is not so bad, because the standard_names usually consist of no more than 6 lexical tokens, and if we could agree that there should be not more than one <attribute> per lexical token (and these would anyhow be optional), then it appears manageable and finite.
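
As a rough illustration of option B, the Python sketch below stores each standard_name as a list of lexical tokens, each carrying at most one optional attribute, so that a lookup by <compound> needs no parsing of the name itself. The layout, the attribute names "property" and "medium", the token splits, and the helper names are assumptions made for this example only; only <compound> comes from the proposal, and the entries are illustrative rather than copied from the table.

```python
from typing import Dict, List, Optional, Tuple

# One (token, attribute) pair per lexical token; None means "no attribute".
# Pairing them this way enforces "at most one attribute per token" by construction.
Tokens = List[Tuple[str, Optional[str]]]

# Hypothetical entries (assumed token splits and attribute names).
ENTRIES: Dict[str, Tokens] = {
    "mole_fraction_of_carbon_monoxide_in_air": [
        ("mole_fraction", "property"),
        ("of", None),
        ("carbon_monoxide", "compound"),
        ("in", None),
        ("air", "medium"),
    ],
    "mass_fraction_of_ozone_in_air": [
        ("mass_fraction", "property"),
        ("of", None),
        ("ozone", "compound"),
        ("in", None),
        ("air", "medium"),
    ],
    "air_temperature": [("air", "medium"), ("temperature", "property")],
}


def is_consistent(name: str, tokens: Tokens) -> bool:
    """Check that the tokens, joined by underscores, rebuild the name."""
    return "_".join(token for token, _ in tokens) == name


def names_with(attribute: str, token: str) -> List[str]:
    """Return all names containing the given token tagged with the attribute."""
    return sorted(
        name for name, tokens in ENTRIES.items() if (token, attribute) in tokens
    )
```

For example, names_with("compound", "carbon_monoxide") returns the one carbon monoxide entry, and names_with("medium", "air") returns all three entries, without the application ever having to parse a name.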

With somewhat Quichotte'sque feelings,

Martin

_______________________________________________
CF-metadata mailing list
CF-metadata at cgd.ucar.edu
http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
Received on Sat Sep 22 2012 - 12:41:38 BST