
[CF-metadata] Fw: Standard Names to support Trac ticket 99

From: Lowry, Roy K. <rkl>
Date: Fri, 20 Apr 2018 09:42:39 +0000

Hello Daniel,


The point that seems to have been missed is that the LSID is an umbrella covering BOTH ITIS and WoRMS plus other taxonomies from other parts of the world. For oceanographic data my strong recommendation would be that WoRMS be the first port of call because, as your colleagues rightly state, the coverage for marine organisms is so much better. In this case, the LSID is built from the AphiaID, which should be familiar to your colleagues, by adding a simple fixed prefix. As I explained to John Graybeal, if identifiers that aren't encodable into LSIDs are required, then we simply need to add another Standard Name for the taxon dimension.
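
A minimal sketch of that construction, assuming the fixed prefix is the one visible in the WoRMS example LSID quoted further down this thread ("urn:lsid:marinespecies.org:taxname:"). The function name is hypothetical and is not part of any WoRMS or CF tooling.

```python
# Prefix taken from the WoRMS example LSID quoted later in this thread.
WORMS_LSID_PREFIX = "urn:lsid:marinespecies.org:taxname:"


def aphia_id_to_lsid(aphia_id: int) -> str:
    """Build a WoRMS-based LSID by prepending the fixed prefix to an AphiaID."""
    if aphia_id <= 0:
        raise ValueError("AphiaID must be a positive integer")
    return f"{WORMS_LSID_PREFIX}{aphia_id}"


# 104669 is the object ID used for Calocalanus pavo in the thread's example.
print(aphia_id_to_lsid(104669))
```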


Regarding your other point, I am following the model set up by the EurOBIS community for encoding biological oceanographic data into SeaDataNet, based on the Darwin Core format. In this the 'metadata' parameters you mention are split between the taxon dimension (gender, stage) and the 'time' (i.e. sample) dimension (size, size minimum, size maximum, number in sample, etc., etc.). All that is needed to fit them into what I'm proposing is the creation of appropriate Standard Names. However, the CF standard practice is not to set up Standard Names until the need arises - i.e. somebody wants to actually encode a plankton survey into CF NetCDF. My crystal ball tells me this pressure will come before too long from SeaDataNet/SeaDataCloud.


Cheers, Roy.


Please note that I partially retired on 01/11/2015. I am now only working 7.5 hours a week and can only guarantee e-mail response on Wednesdays, my day in the office. All vocabulary queries should be sent to enquiries at bodc.ac.uk. Please also use this e-mail if your requirement is urgent.


________________________________
From: CF-metadata <cf-metadata-bounces at cgd.ucar.edu> on behalf of Daniel Neumann <daniel.neumann at io-warnemuende.de>
Sent: 20 April 2018 10:12
To: cf-metadata at cgd.ucar.edu
Subject: Re: [CF-metadata] Fw: Standard Names to support Trac ticket 99


Hi Roy, Hi List,


I talked to people from the data management department of my current institution (IOW, Germany) today. Some researchers at IOW do quite detailed Phyto- and Zooplankton surveys and also some other "life form" surveys in the Baltic Sea. Therefore, our data management people have some experience in the requirements of this field of research.


They would favor a concept which is extensible to further databases. ITIS seems to lack a lot of marine species living in the Baltic Sea. At least in the past it seemed to be focused on North American waters. WoRMS seems to include most of them. The German Federal Maritime and Hydrographic Agency also seems to require WoRMS IDs when my colleagues submit data to them. Maybe there are specific databases for Australian or Chinese marine regions. They don't have experience with LSIDs and did not comment on that.


The life stage (not sure if this is the correct English word) and the size class are parameters which are recorded in our plankton abundance surveys. The biomass is often estimated as the product of "abundance" and a "size-class-and-region-dependent factor". Therefore, size class seems to be relevant metadata. Different research communities, and researchers in different regions, seem to use different size categorizations. If we look at fish, the life stage and maybe also the gender are important parameters. The life stage and size class are probably also important for modelers. Therefore, we could consider including attributes (or further variables?) like 'size class', 'gender' and 'life stage'.
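
As a rough illustration of the biomass estimate described above, the product can be sketched as a lookup plus a multiplication. The factor table, its keys and its numbers are placeholders invented for the sketch, not values from any survey.

```python
# Hypothetical conversion factors (kg per individual), keyed by
# (region, size class). The numbers are placeholders, not survey values.
BIOMASS_FACTOR = {
    ("baltic", "small"): 2.0e-9,
    ("baltic", "large"): 5.0e-8,
}


def estimate_biomass(abundance_per_m3, region, size_class):
    """Return biomass (kg m-3) as abundance (m-3) times the lookup factor."""
    if abundance_per_m3 < 0:
        raise ValueError("abundance must be non-negative")
    return abundance_per_m3 * BIOMASS_FACTOR[(region, size_class)]


print(estimate_biomass(1500.0, "baltic", "small"))
```

Because the factor depends on the size class, an abundance value without its size class cannot be converted, which is why size class matters as metadata here.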


Cheers

Daniel



On 19.04.2018 18:02, Lowry, Roy K. wrote:

Hi John,


To my thinking your arguments add support for 'biological_taxon_lsid' over 'biological_taxon_identifier' defined as LSID, which Jonathan prefers and I am starting to feel better about. LSIDs backed by the WoRMS and ITIS taxonomies cover the biological oceanography use cases known to me. Should use cases within the sphere of CF with the need for additional identifiers come along, we can accommodate these through the straightforward mechanism of Standard Name creation. Remember Standard Name creation is on a 'current needs' basis, not setting up to cover possible future needs.


Note the initial solution is designed to be semantically verifiable through cross-checking the name against the identifier, so having a single 'identifier' that can take many forms would not be helpful.
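
One way such a cross-check might look, as a sketch only: a small local table stands in for a lookup against the authoritative taxonomy, and the function name is hypothetical. The two entries use the Calocalanus pavo LSIDs given as examples elsewhere in this thread.

```python
# Local stand-in for a query to the taxonomy behind each LSID (assumption:
# a real check would ask WoRMS or ITIS rather than use a hard-coded table).
ACCEPTED_NAMES = {
    "urn:lsid:marinespecies.org:taxname:104669": "Calocalanus pavo",
    "urn:lsid:itis.gov:itis_tsn:85335": "Calocalanus pavo",
}


def name_matches_identifier(name, lsid, registry=ACCEPTED_NAMES):
    """Return True if the taxon name agrees with the name registered for the LSID."""
    expected = registry.get(lsid)
    if expected is None:
        return False  # unknown identifier: cannot be verified
    return expected.casefold() == name.strip().casefold()


print(name_matches_identifier("Calocalanus pavo",
                              "urn:lsid:marinespecies.org:taxname:104669"))
```

The check only works because each identifier has one known form; an 'identifier' that could be an LSID, a bare number or a URL would first need its type guessed.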


Cheers, Roy.




________________________________
From: CF-metadata <cf-metadata-bounces at cgd.ucar.edu> on behalf of John Graybeal <jbgraybeal at mindspring.com>
Sent: 18 April 2018 18:23
To: CF Metadata List
Subject: Re: [CF-metadata] Fw: Standard Names to support Trac ticket 99

(Please note caveats in my final paragraph.)

Because I have been in at least one (biomedical) meeting in the last year where LSIDs have been dismissed as not good solutions for that community's purposes, and because I know there are multiple taxonomy classification systems *and* multiple taxonomy identifier systems, I would not support *only* allowing an LSID identifier. I think that is inappropriately restrictive. (For those who wonder, I assure you there are biomedical applications throughout environmental science, and plenty of earth science applications using biomedical resources and identifiers. So I think this data point is applicable.)

The way I think about it is that there should be an identifier for the taxon that is at a minimum globally unique. And, if at all possible that identifier (as is) should be resolvable using standard DNS resolution. As described, an LSID URN by itself is not resolvable, but requires an additional prefix. (And in some cases some unpleasant escape sequences, but technology adoption will slowly overcome that part.)

So I would support one of the following options:
* biological_taxon_identifier by itself, defined as a globally unique and resolvable (_as is_) identifier (LSIDs would have to be entered using a resolvable form)
* biological_taxon_identifier by itself, recognizing that some identifiers may not be resolvable as is (which makes the whole solution less automatically computable)
* biological_taxon_identifier (either of the above) plus biological_taxon_lsid - that is, provide both options, the user can specify one or both. This way the LSID fans can specify the identifier in their terms, and those with a more expansive identifier palette can use their identifier of choice. Semantic resolution, if not immediately available using existing mappings, can be achieved as needed through additional post hoc mappings, consistent with best semantic web practices.

In that vein, I would propose a different definition for the biological_taxon_identifier. I am adding that it has to be globally unique, and subtracting that it is opaque (because that is a best practice for many situations, but not all), but also not using the word label, because that implies semantic meaning to me. I specify IRI rather than URI because I understand that is the modern form. I am also not favoring LSID in the definition based on my historical and present understanding that these are not universally accepted in all communities (it is conceivable this is no longer true, despite the cited case above); because IMHO the identifier should be an IRI in the first place; and because we can specifically offer an LSID identifier as well. Finally, I am open to the existence of biological classification systems that are not hierarchical. So:

"A globally unique string, most usefully an IRI that resolves to an authoritative information source, referencing a specific biological taxon. Biological taxon is a defined entity representing an organism or a group of organisms as a (typically hierarchical) unit of biological classification."

This is pretty radically different than what has been proposed, and my knowledge in the taxonomic classification space is much more from a technical perspective, not scientifically authoritative at all. So if I'm the only one concerned on these points, I'll get out of the way. Even though I am pretty firmly convinced of the importance of using unambiguously semantically interoperable and resolvable identifiers, and of the extensive use of non-LSID taxonomic identifiers (search "organism" in BioPortal to find both good and bad examples of this).

John

---------------------------------------
John Graybeal
jbgraybeal at mindspring.com
650-450-1853
skype: graybealski
linkedin: http://www.linkedin.com/in/johngraybeal/

On Apr 16, 2018, at 03:10, Lowry, Roy K. <rkl at bodc.ac.uk> wrote:

Forgot to do reply all....



________________________________
From: Lowry, Roy K.
Sent: 16 April 2018 11:09
To: Daniel Neumann
Subject: Re: [CF-metadata] Standard Names to support Trac ticket 99

Thanks Daniel,

To clarify, LSID isn't a database; it's an identifier for an organism that neatly brings together multiple taxonomies under the single umbrella of the Catalogue of Life project. It also resolves, actually in multiple ways, into a URL that then provides access into a database providing information on that organism.

We came to the conclusion that we should use LSIDs in CF in the first round of discussions on Trac 99. My quandary is not whether we should use them, but whether the Standard Name should specify 'lsid' or just 'identifier'. 'Identifier' is what we discussed, but 'lsid' opens the door for future Standard Names based on other governances should there be a need to deal with entities not covered by lsids. I'm aware of one possible issue related to coccoliths plus the possibility of dealing with organism parts (e.g. cod livers).

Cheers, Roy.



________________________________
From: Daniel Neumann <daniel.neumann at io-warnemuende.de>
Sent: 16 April 2018 10:44
To: Lowry, Roy K.
Subject: Re: [CF-metadata] Standard Names to support Trac ticket 99

Dear Roy,

Thank you for bringing this topic forward!

I contacted the person responsible for our institute's data publishing and metadata policy and will talk to her about the choice of the LSID database. She is more into that topic than I am. It may take some days.

Cheers,
Daniel


On 13.04.2018 16:02, Lowry, Roy K. wrote:
Dear All,

Here is an initial batch of 8 Standard Names to support the CF taxon dimension. Two are dimension labels whilst the other six are measurements to which the taxon is a co-ordinate. Five of these are to cover Daniel's proposal that prompted the resurrection of Ticket 99.

I've presented a summary list followed by a full list with units and definitions. I have one uncertainty in my mind (biological_taxon_identifier versus biological_taxon_lsid) where I would really appreciate input.

Cheers, Roy.

biological_taxon_name
biological_taxon_identifier or biological_taxon_lsid - any preferences?
number_concentration_of_biological_taxon_in_sea_water
mass_concentration_of_biological_taxon_expressed_as_carbon_in_sea_water
mass_concentration_of_biological_taxon_expressed_as_chlorophyll_in_sea_water
mass_concentration_of_biological_taxon_expressed_as_nitrogen_in_sea_water
mole_concentration_of_biological_taxon_expressed_as_carbon_in_sea_water
mole_concentration_of_biological_taxon_expressed_as_nitrogen_in_sea_water




biological_taxon_name

A plaintext human-readable label, usually a Latin binomial such as Calanus finmarchicus, applied to a biological taxon. Biological taxon is a name or other label identifying an organism or a group of organisms as belonging to a unit of classification in a hierarchical taxonomy.

dimensionless

biological_taxon_identifier

An opaque label, most usefully a URI that resolves to an authoritative information source, applied to a biological taxon. Biological taxon is a name or other label identifying an organism or a group of organisms as belonging to a unit of classification in a hierarchical taxonomy. The identifier adopted for CF is the Life Science Identifier (LSID), a URN with the syntax "urn:lsid:<Authority>:<Namespace>:<ObjectID>[:<Version>]". For example, the copepod Calocalanus pavo may be represented by LSIDs "urn:lsid:marinespecies.org:taxname:104669" (based on WoRMS) and "urn:lsid:itis.gov:itis_tsn:85335" (based on ITIS). These URNs may be converted to URLs delivering RDF by prefixing with 'http://lsid.tdwg.org/'.

dimensionless

OR

biological_taxon_lsid

The Life Science Identifier (LSID) is a standard URI for a biological taxon. Biological taxon is a name or other label identifying an organism or a group of organisms as belonging to a unit of classification in a hierarchical taxonomy. The LSID is a URN with the syntax "urn:lsid:<Authority>:<Namespace>:<ObjectID>[:<Version>]". For example, the copepod Calocalanus pavo may be represented by LSIDs "urn:lsid:marinespecies.org:taxname:104669" (based on WoRMS) and "urn:lsid:itis.gov:itis_tsn:85335" (based on ITIS). These URNs may be converted to URLs delivering RDF by prefixing with 'http://lsid.tdwg.org/'.

dimensionless
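
The LSID syntax and the URL conversion in the definitions above can be sketched as follows. This is an illustration under assumptions: the regular expression is a simplified reading of the quoted syntax rather than a full implementation of the LSID specification, the function names are hypothetical, and nothing here checks whether the resolver actually answers.

```python
import re

# Simplified pattern for urn:lsid:<Authority>:<Namespace>:<ObjectID>[:<Version>]
LSID_PATTERN = re.compile(
    r"^urn:lsid:(?P<authority>[^:]+):(?P<namespace>[^:]+)"
    r":(?P<object_id>[^:]+)(?::(?P<version>[^:]+))?$",
    re.IGNORECASE,
)


def parse_lsid(lsid):
    """Split an LSID into its parts; version is None when absent."""
    match = LSID_PATTERN.match(lsid)
    if match is None:
        raise ValueError(f"not a well-formed LSID: {lsid!r}")
    return match.groupdict()


def lsid_to_url(lsid, resolver="http://lsid.tdwg.org/"):
    """Prefix a validated LSID with the resolver named in the definition."""
    parse_lsid(lsid)  # raises ValueError on malformed input
    return resolver + lsid


print(parse_lsid("urn:lsid:itis.gov:itis_tsn:85335"))
print(lsid_to_url("urn:lsid:marinespecies.org:taxname:104669"))
```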

number_concentration_of_biological_taxon_in_sea_water

Number concentration means the count of an entity per unit volume and is used in the construction "number_concentration_of_X_in_Y", where X is a material constituent of Y. Biological taxon is a name or other label identifying an organism or a group of organisms as belonging to a unit of classification in a hierarchical taxonomy. Number concentration of biota is also referred to as abundance.

m-3

mass_concentration_of_biological_taxon_expressed_as_carbon_in_sea_water

Mass concentration means mass per unit volume and is used in the construction "mass_concentration_of_X_in_Y", where X is a material constituent of Y. A chemical species denoted by X may be described by a single term such as 'nitrogen' or a phrase such as 'nox_expressed_as_nitrogen'. The phrase 'expressed_as' is used in the construction "A_expressed_as_B", where B is a chemical constituent of A. It means that the quantity indicated by the standard name is calculated solely with respect to the B contained in A, neglecting all other chemical constituents of A. Mass concentration of biota expressed as carbon is also referred to as carbon biomass. Biological taxon is a name or other label identifying an organism or a group of organisms as belonging to a unit of classification in a hierarchical taxonomy.

kg m-3


mass_concentration_of_biological_taxon_expressed_as_chlorophyll_in_sea_water

Mass concentration means mass per unit volume and is used in the construction "mass_concentration_of_X_in_Y", where X is a material constituent of Y. A chemical or biological species denoted by X may be described by a single term such as 'nitrogen' or a phrase such as 'nox_expressed_as_nitrogen'. The phrase 'expressed_as' is used in the construction "A_expressed_as_B", where B is a chemical constituent of A. It means that the quantity indicated by the standard name is calculated solely with respect to the B contained in A, neglecting all other chemical constituents of A. Chlorophyll means all naturally occurring pigments of the chlorophyll group. Biological taxon is a name or other label identifying an organism or a group of organisms as belonging to a unit of classification in a hierarchical taxonomy.

kg m-3

mass_concentration_of_biological_taxon_expressed_as_nitrogen_in_sea_water

Mass concentration means mass per unit volume and is used in the construction "mass_concentration_of_X_in_Y", where X is a material constituent of Y. A chemical species denoted by X may be described by a single term such as 'nitrogen' or a phrase such as 'nox_expressed_as_nitrogen'. The phrase 'expressed_as' is used in the construction "A_expressed_as_B", where B is a chemical constituent of A. It means that the quantity indicated by the standard name is calculated solely with respect to the B contained in A, neglecting all other chemical constituents of A. Mass concentration of biota expressed as nitrogen is also referred to as nitrogen biomass. Biological taxon is a name or other label identifying an organism or a group of organisms as belonging to a unit of classification in a hierarchical taxonomy.

kg m-3

mole_concentration_of_biological_taxon_expressed_as_carbon_in_sea_water

Mole concentration means number of moles per unit volume, also called "molarity", and is used in the construction "mole_concentration_of_X_in_Y", where X is a material constituent of Y. A chemical species denoted by X may be described by a single term such as 'nitrogen' or a phrase such as 'nox_expressed_as_nitrogen'. The phrase 'expressed_as' is used in the construction "A_expressed_as_B", where B is a chemical constituent of A. It means that the quantity indicated by the standard name is calculated solely with respect to the B contained in A, neglecting all other chemical constituents of A. Biological taxon is a name or other label identifying an organism or a group of organisms as belonging to a unit of classification in a hierarchical taxonomy.

mol m-3

mole_concentration_of_biological_taxon_expressed_as_nitrogen_in_sea_water

Mole concentration means number of moles per unit volume, also called "molarity", and is used in the construction "mole_concentration_of_X_in_Y", where X is a material constituent of Y. A chemical species denoted by X may be described by a single term such as 'nitrogen' or a phrase such as 'nox_expressed_as_nitrogen'. The phrase 'expressed_as' is used in the construction "A_expressed_as_B", where B is a chemical constituent of A. It means that the quantity indicated by the standard name is calculated solely with respect to the B contained in A, neglecting all other chemical constituents of A. Biological taxon is a name or other label identifying an organism or a group of organisms as belonging to a unit of classification in a hierarchical taxonomy.

mol m-3
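
Since the mass and mole names above express the same quantity in different units, the relationship between them can be sketched as a division by the molar mass of the element. The function name is hypothetical; the molar masses are the standard values for carbon (12.011 g/mol) and nitrogen (14.007 g/mol), written in kg/mol to match the canonical units.

```python
# Molar masses in kg per mole, so kg m-3 divided by them gives mol m-3.
MOLAR_MASS_KG_PER_MOL = {
    "carbon": 0.012011,
    "nitrogen": 0.014007,
}


def mass_to_mole_concentration(mass_conc_kg_m3, element):
    """Convert a mass concentration (kg m-3) expressed as an element to mol m-3."""
    return mass_conc_kg_m3 / MOLAR_MASS_KG_PER_MOL[element]


# A carbon biomass of 0.012011 kg m-3 corresponds to 1 mol m-3 of carbon.
print(mass_to_mole_concentration(0.012011, "carbon"))
```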




________________________________
This message (and any attachments) is for the recipient only. NERC is subject to the Freedom of Information Act 2000 and the contents of this email and any reply you make may be disclosed by NERC unless it is exempt from release under the Act. Any material supplied to NERC may be stored in an electronic records management system.
________________________________



_______________________________________________
CF-metadata mailing list
CF-metadata at cgd.ucar.edu
http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata


--
Daniel Neumann
Leibniz Institute for Baltic Sea Research Warnemuende
Physical Oceanography and Instrumentation
Seestrasse 15
18119 Rostock
Germany
phone:  +49-381-5197-287
fax:    +49-381-5197-114 or 440
e-mail: daniel.neumann at io-warnemuende.de
Received on Fri Apr 20 2018 - 03:42:39 BST

This archive was generated by hypermail 2.3.0 : Tue Sep 13 2022 - 23:02:42 BST
