
[CF-metadata] proposed standard names for Enterococcus and Clostridium perfringens

From: Lowry, Roy K. <rkl>
Date: Mon, 25 Mar 2013 17:04:06 +0000

Thanks Jonathan,

I was indeed responsible for introducing 'green dogs' to discussions in CF, but since then my experience has expanded further into biological data and, in particular, into the world of contaminants in biota through EMODNET and our work in BODC with the Sea Mammal Research Unit. This has shown me that you are exactly right: invalid combinations are much less of an issue for taxa. It has also shown me that protection against 'green dogs' can in some circumstances become an unaffordable luxury.

There are a couple of points in your message where I would do things slightly differently.

First, I would prefer 'number_concentration_of_taxon_in_sea_water' to 'number_concentration_of_biological_species_in_sea_water', because not all biological data are identified to the species level. Often the counts are at the level of genus, class or even phylum.

Secondly, I think that CF setting up its own controlled vocabulary for taxa would be a needless duplication that would cause us a lot of work and take us out of our domain expertise comfort zone. In the marine domain, there is an almost universally accepted taxonomic controlled vocabulary with lashings of accompanying metadata. It is extremely well governed by internationally recognised experts in the field and has high quality technical governance in the form of tools, including a web service API. This is the World Register of Marine Species (WoRMS). I fully appreciate that CF covers more than the marine domain, but there is an alternative governance in the form of the Integrated Taxonomic Information System (ITIS), which is aimed more at terrestrial life than marine. If we say that names used in CF should be registered in at least one of these then we should be OK.
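The "registered in at least one of these" rule can be sketched in a few lines of Python. This is only an illustration: the two sets below are hypothetical local snapshots with made-up contents, standing in for lookups against WoRMS and ITIS, and the function names are assumptions rather than any real API.

```python
# Hypothetical local snapshots with illustrative contents only.
# A real check would consult WoRMS and ITIS themselves.
WORMS_NAMES = {"Enterococcus", "Clostridium perfringens", "Calanus"}
ITIS_NAMES = {"Clostridium perfringens", "Quercus robur"}


def registries_containing(taxon_name):
    """Return the labels of the registries that list taxon_name."""
    registries = {"WoRMS": WORMS_NAMES, "ITIS": ITIS_NAMES}
    return [label for label, names in registries.items() if taxon_name in names]


def is_acceptable(taxon_name):
    """A name is acceptable if at least one registry lists it."""
    return bool(registries_containing(taxon_name))
```

Because the check is on the name rather than on its rank, a genus such as "Calanus" passes in the same way as a full species name.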

As you will see in a message that has just been released, I'm proposing taking this forward through a Trac ticket.

Cheers, Roy.



________________________________________
From: CF-metadata [cf-metadata-bounces at cgd.ucar.edu] On Behalf Of Jonathan Gregory [j.m.gregory at reading.ac.uk]
Sent: 25 March 2013 09:00
To: cf-metadata at cgd.ucar.edu
Subject: Re: [CF-metadata] proposed standard names for Enterococcus and Clostridium perfringens

Dear all

I agree with Philip that cfu should be spelled out. I was also going to make
the same point about Roy's proposal being different from our treatment of
chemical species, which are encoded in the standard name; this system seems to
be working. One reason for keeping this approach was the "green dog" problem.
That particular phrase is actually Roy's, if I remember correctly. That is, we
wish to prevent nonsensical constructions, by approving each name which makes
(chemical) sense individually.

However Roy argues that there is an order of magnitude more biological species
to deal with than chemical. I don't think that keeping the same approach
(encoding in the standard name) would break the system, but it would make the
standard name table very large. Perhaps more importantly, if there were so
many species, I expect that data-writers would simply assume that each of the
possible combinations of pattern and species did already exist in the standard
name table, without bothering to check or have them approved. That would defeat
the object of the system of individual approval.

We don't have to follow the chemical approach. For named geographical
regions and surface area types (vegetation types etc.) we use string-valued
coordinate variables, rather like Roy proposes here. To follow that approach
we would need a new table, subsidiary to the standard name table, containing
a list of controlled names of biological species. We would use the same
approval process to add names to this list as we do for the standard name
table. (This is what we do for geographical regions and area types.) We would
then have a standard_name such as
  number_concentration_of_biological_species_in_sea_water
whose definition would note that a data variable with this standard_name must
have a string-valued auxiliary coordinate variable of biological_species
containing a valid name from the biological species table. If there is just
one species, the auxiliary coordinate variable wouldn't need a dimension,
but this construction would also allow a single data variable to contain data
for several species, by having a dimension of size greater than one.
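
The construction described above can be sketched in Python using plain dictionaries in place of a netCDF library. Everything here is an assumption for illustration: the table contents, the variable names (conc, species_name), the dictionary layout and the function name are hypothetical, and only the standard_name strings come from the message.

```python
# Hypothetical stand-in for the proposed subsidiary table of controlled names.
BIOLOGICAL_SPECIES_TABLE = {"Enterococcus", "Clostridium perfringens"}

STANDARD_NAME = "number_concentration_of_biological_species_in_sea_water"


def check_data_variable(name, variables):
    """Return a list of problems found for the data variable `name`."""
    var = variables[name]
    if var["attrs"].get("standard_name") != STANDARD_NAME:
        return []  # the rule only applies to this standard_name

    # Auxiliary coordinates are listed by name in the `coordinates` attribute.
    aux_names = var["attrs"].get("coordinates", "").split()
    species_coords = [
        variables[n]
        for n in aux_names
        if n in variables
        and variables[n]["attrs"].get("standard_name") == "biological_species"
    ]
    if not species_coords:
        return ["no biological_species auxiliary coordinate variable"]

    problems = []
    for coord in species_coords:
        values = coord["values"]
        # A scalar coordinate (no dimension) holds one string, not a list.
        if isinstance(values, str):
            values = [values]
        for value in values:
            if value not in BIOLOGICAL_SPECIES_TABLE:
                problems.append(f"{value!r} is not in the biological species table")
    return problems


# One species: the auxiliary coordinate needs no dimension.
single = {
    "conc": {"dims": ("time",),
             "attrs": {"standard_name": STANDARD_NAME, "coordinates": "species_name"}},
    "species_name": {"dims": (),
                     "attrs": {"standard_name": "biological_species"},
                     "values": "Enterococcus"},
}

# Several species: the auxiliary coordinate has a dimension of size two.
several = {
    "conc": {"dims": ("time", "species"),
             "attrs": {"standard_name": STANDARD_NAME, "coordinates": "species_name"}},
    "species_name": {"dims": ("species",),
                     "attrs": {"standard_name": "biological_species"},
                     "values": ["Enterococcus", "Clostridium perfringens"]},
}
```

Both example datasets pass the check, while a name outside the table, or a data variable with no biological_species auxiliary coordinate, is reported as a problem.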

Cheers

Jonathan
_______________________________________________
CF-metadata mailing list
CF-metadata at cgd.ucar.edu
http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata

Received on Mon Mar 25 2013 - 11:04:06 GMT
