[CF-metadata] new chemical species

From: Philip J. Cameron-smith <cameronsmith1>
Date: Thu, 2 Aug 2007 18:32:59 -0700 (PDT)

Hi,

I have been at meetings of atmospheric chemists where the subject of
standard species names and output variables has been discussed, and there
has always seemed to be general support for the idea. In
fact, it was one such discussion earlier this year, and someone mentioning
Christiane's work, that inspired me to join the CF email list.

Unfortunately I can't really take this on right now, but I might be able
to take it on in the future.

It might not be too hard.

I think a pre-requisite would be for CF to agree to at least a simple
ontological system, where species is a category. Christiane's web site
would seem like a good starting point of moderate complexity:
http://wiki.esipfed.org/index.php/CF_Standard_Names_-_Construction_of_Atmospheric_Chemistry_and_Aerosol_Terms
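
As a rough illustration of what "species as a category" could mean in
practice, here is a minimal Python sketch. The template strings, the
species list and the build_standard_name helper are hypothetical names
chosen for this example, not an agreed CF vocabulary; the only point is
that the species slot is filled from a controlled list rather than typed
freehand.

```python
# Hypothetical sketch: the species is a controlled category that fills
# a slot in a name template. Templates and species are illustrative
# assumptions, not an authoritative CF list.

TEMPLATES = {
    "mole_fraction": "mole_fraction_of_{species}_in_air",
    "mass_fraction": "mass_fraction_of_{species}_in_air",
}

APPROVED_SPECIES = {"ozone", "carbon_monoxide", "acetone", "acetaldehyde"}


def build_standard_name(quantity, species):
    """Return a name built from a quantity template and an approved species."""
    if species not in APPROVED_SPECIES:
        raise ValueError(f"species not in approved list: {species!r}")
    if quantity not in TEMPLATES:
        raise ValueError(f"unknown quantity: {quantity!r}")
    return TEMPLATES[quantity].format(species=species)


print(build_standard_name("mole_fraction", "ozone"))
```

An abbreviation such as "ACET" would be rejected here because it is not
in the approved list, which is the behaviour one would want from a
managed vocabulary.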

Christiane's website also already has a list of 70 of the more common
species.

Perhaps even easier would be to use/borrow/appropriate the list of species
from one of the master mechanism groups, since they must already have had
to deal with this issue for large numbers of species (1,000s). If that
doesn't work then I also have a list of species (100s) from the Harvard
group of Daniel Jacob that could be a starting point. Other lists may
also be floating around.
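
If lists from several groups were merged, one early check would be for
abbreviations that mean different things in different codes (the ACET
case quoted below). The following Python sketch shows one way that check
might look; the model names and the table contents are made up for the
example and do not describe any real model.

```python
from collections import defaultdict

# Hypothetical (model, abbreviation, species) entries from two imaginary codes.
MODEL_ABBREVIATIONS = [
    ("model_a", "ACET", "acetone"),
    ("model_b", "ACET", "acetaldehyde"),
    ("model_a", "O3", "ozone"),
    ("model_b", "O3", "ozone"),
]


def find_ambiguous(entries):
    """Return abbreviations that map to more than one species name."""
    meanings = defaultdict(set)
    for _model, abbrev, species in entries:
        meanings[abbrev].add(species)
    # Keep only abbreviations with conflicting meanings, sorted for stable output.
    return {a: sorted(s) for a, s in meanings.items() if len(s) > 1}


print(find_ambiguous(MODEL_ABBREVIATIONS))
```

Any abbreviation flagged this way would need a full, unambiguous name
before it could go into a shared list.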

Best wishes,

     Philip

On Thu, 2 Aug 2007, Roy Lowry wrote:

> Dear Jonathan,
>
> Having a managed, authoritative list of approved chemical entities
> including formalised names, common names and accepted abbreviations for
> inclusion into standard names, ideally with content governance support
> from an interested group of chemists, would be an extremely valuable
> asset, not just to CF but to anyone concerned with semantic
> interoperability.
>
> Of course, this has already been done by CAS, who have a database of
> some 32,000,000 substances each with its own identifying key (the CAS
> number) and set of formal/common labels, as part of the international
> chemical regulatory infrastructure. Trouble is they charge two bucks a
> query.
>
> I worry a little about this area of content governance in CF.
> Christiane is making valiant efforts to harness the expertise of the
> scientists in her project through her Wiki but I wonder how many of them
> have standardised nomenclature at the top of their agenda. Getting
> nomenclature right requires fully motivated people with the right
> knowledge and I know only too well how hard these are to find,
> particularly when there is no financial incentive.
>
> There is already evidence of weakness in our chemical content
> governance for the CF Standard Names. When an e-mail to Christiane
> whilst trying to map the chemical standard names revealed my ignorance
> as to the nature of 'hexachlorobiphenyl', I did some research and
> discovered that this is in fact a term for 42 different PCB congeners
> (PCB128-PCB169 - see http://www.epa.gov/toxteam/pcbid/table.htm) each
> with a different IUPAC name and CAS number. Whilst I'm not saying it's
> wrong to have a Standard name covering a group of chemical entities and
> am not proposing that we revisit this specific example (other than maybe
> adding a short explanation to the definition), I feel it's something
> that should have been at least raised for consideration when the
> standard name was first proposed.
>
> Is there anyone (Philip?) willing to champion the development of such a
> list and who else believes it would be a useful thing to do? I would be
> happy to support the serving of the terms assembled, including their
> semantic interrelationships.
>
> Cheers, Roy.
>
>>>> Jonathan Gregory <j.m.gregory at reading.ac.uk> 8/1/2007 2:23 pm >>>
> Dear Philip
>
>> 1) One of the current problems with chemistry output is that there is
>> currently NO agreed upon list of species names: many species have various
>> common names, and those names are often abbreviated in codes, which can
>> lead to confusion and incompatible files (eg ACET could refer to acetone
>> or acetaldehyde)
>>
>> It would be really good if a list of names could be agreed
>> upon by CF.
>
> I think that we are doing this, in effect, in CF, because we would always use
> the same name for a given species in the standard name table. Of course we have
> to be careful to choose the species names in a reasonable way. Christiane has
> thought about this e.g. when a common name can be used instead of an IUPAC name.
> We would not decide in advance a complete set of species names, however; as
> usual, we add them as the need arises. Use of codes and abbreviations would not
> be consistent with the self-describing intentions of CF, so I don't think we
> ought to do that.
>
>> 2) The internal structure containing chemicals in models does NOT have to
>> be the same as that used in the NetCDF output, and my preference is to
>> have them be different.
> ...
>> If the
>> netCDF structure just uses a dimension for species, it is a pain to create
>> new input files: I often find myself needing to add or remove species, or
>> scale some of the species. For analysis, I also usually find myself
>> wanting to extract a long time history of just one or two species (out of
>> a hundred), which is also a real nuisance with a single array for all
>> species, and the process is prone to error. It is also a
>> nuisance that if a single species is extracted, the degenerate dimension
>> needs to be handled.
>>
>> Lastly, it is usually sufficient to output a small subset of species
>> (saving a lot of file-space), in which case the output array will not
>> match the internal array anyway.
> ...
>> I have used codes that handled this situation both ways. Personally, I
>> prefer to have each species separate in the netCDF: the ease and accuracy
>> of post-processing easily outweighs the disadvantages.
>
> Those are interesting points. There is no reason why CF should not support
> both approaches. At the moment we are following your preferred approach, which
> means giving standard names with the species names in them. That means the
> standard name table is larger, but I don't think it matters.
>
> Cheers
>
> Jonathan
> _______________________________________________
> CF-metadata mailing list
> CF-metadata at cgd.ucar.edu
> http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
>

------------------------------------------------------------------------
Dr Philip Cameron-Smith        Energy & Environment Directorate
pjc at llnl.gov                Lawrence Livermore National Laboratory
+1 925 4236634                 7000 East Avenue, Livermore, CA 94550, USA
------------------------------------------------------------------------
Received on Thu Aug 02 2007 - 19:32:59 BST
