[CF-metadata] Fw: Standard Names to support Trac ticket 99

From: John Graybeal <jbgraybeal>
Date: Wed, 18 Apr 2018 10:23:17 -0700

(Please note caveats in my final paragraph.)

Because I have been in at least one (biomedical) meeting in the last year where LSIDs have been dismissed as not good solutions for that community's purposes, and because I know there are multiple taxonomy classification systems *and* multiple taxonomy identifier systems, I would not support *only* allowing an LSID identifier. I think that is inappropriately restrictive. (For those who wonder, I assure you there are biomedical applications throughout environmental science, and plenty of earth science applications using biomedical resources and identifiers. So I think this data point is applicable.)

The way I think about it is that there should be an identifier for the taxon that is at a minimum globally unique. And, if at all possible that identifier (as is) should be resolvable using standard DNS resolution. As described, an LSID URN by itself is not resolvable, but requires an additional prefix. (And in some cases some unpleasant escape sequences, but technology adoption will slowly overcome that part.)
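
To make the "additional prefix" point concrete, here is a minimal Python sketch that checks a string against the LSID syntax quoted later in this thread and builds the prefixed URL form. The function names are hypothetical, the resolver prefix is simply the one quoted in the proposal below (no claim is made that it is still operational), and no percent-escaping is attempted.

```python
import re

# Resolver prefix as quoted in the proposal further down this thread.
# Assumption: used here only for string construction, not as a live service.
LSID_RESOLVER_PREFIX = "http://lsid.tdwg.org/"

# urn:lsid:<Authority>:<Namespace>:<ObjectID>[:<Version>]
LSID_PATTERN = re.compile(
    r"^urn:lsid:"
    r"(?P<authority>[^:]+):"
    r"(?P<namespace>[^:]+):"
    r"(?P<object_id>[^:]+)"
    r"(?::(?P<version>[^:]+))?$",
    re.IGNORECASE,
)


def parse_lsid(lsid):
    """Split an LSID URN into its named parts (hypothetical helper)."""
    match = LSID_PATTERN.match(lsid.strip())
    if match is None:
        raise ValueError("not an LSID URN: %r" % lsid)
    return match.groupdict()


def lsid_to_url(lsid):
    """Return the prefixed form of an LSID; the bare URN is not a URL."""
    parse_lsid(lsid)  # raises ValueError if the syntax does not match
    return LSID_RESOLVER_PREFIX + lsid.strip()


if __name__ == "__main__":
    worms = "urn:lsid:marinespecies.org:taxname:104669"
    print(parse_lsid(worms))
    print(lsid_to_url(worms))
```

The two example LSIDs in the quoted proposal (WoRMS and ITIS) both match this pattern; anything needing escape sequences is outside what the sketch handles.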

So I would support one of the following options:
* biological_taxon_identifier by itself, defined as a globally unique and resolvable (_as is_) identifier (LSIDs would have to be entered using a resolvable form)
* biological_taxon_identifier by itself, recognizing that some identifiers may not be resolvable as is (which makes the whole solution less automatically computable)
* biological_taxon_identifier (either of the above) plus biological_taxon_lsid; that is, provide both options, so the user can specify one or both. This way the LSID fans can specify the identifier in their terms, and those with a more expansive identifier palette can use their identifier of choice. Semantic resolution, if not immediately available using existing mappings, can be achieved as needed through additional post hoc mappings, consistent with best semantic web practices.
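
A rough Python sketch of how a data checker might tell these options apart. Everything here is an assumption for illustration: the function names are hypothetical, "resolvable as is" is approximated as "an absolute http or https IRI with a host part" (the code never contacts the network), and the LSID check only looks at the urn:lsid: prefix.

```python
from urllib.parse import urlsplit


def is_resolvable_as_is(identifier):
    """Approximation: True for an absolute http(s) IRI with a host part."""
    parts = urlsplit(identifier.strip())
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def check_taxon_attributes(identifier=None, lsid=None):
    """Return a list of problems under the 'one or both' option.

    An empty list means nothing was flagged. Both parameters are
    hypothetical stand-ins for the two proposed standard names.
    """
    problems = []
    if identifier is None and lsid is None:
        problems.append("neither identifier nor LSID supplied")
    if identifier is not None and not is_resolvable_as_is(identifier):
        problems.append("identifier is not resolvable as is")
    if lsid is not None and not lsid.strip().lower().startswith("urn:lsid:"):
        problems.append("LSID does not start with 'urn:lsid:'")
    return problems


if __name__ == "__main__":
    print(check_taxon_attributes(lsid="urn:lsid:itis.gov:itis_tsn:85335"))
    print(check_taxon_attributes(identifier="urn:lsid:itis.gov:itis_tsn:85335"))
```

Under the second option (identifiers that may not be resolvable as is), the resolvability check would be downgraded from a problem to a warning.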

In that vein, I would propose a different definition for the biological_taxon_identifier. I am adding that it has to be globally unique, and subtracting that it is opaque (because that is a best practice for many situations, but not all), but also not using the word label, because that implies semantic meaning to me. I specify IRI rather than URI because I understand that is the modern form. I am also not favoring LSID in the definition based on my historical and present understanding that these are not universally accepted in all communities (it is conceivable this is no longer true, despite the cited case above); because IMHO the identifier should be an IRI in the first place; and because we can specifically offer an LSID identifier as well. Finally, I am open to the existence of biological classification systems that are not hierarchical. So:

"A globally unique string, most usefully an IRI that resolves to an authoritative information source, referencing a specific biological taxon. Biological taxon is a defined entity representing an organism or a group of organisms as a (typically hierarchical) unit of biological classification."

This is pretty radically different than what has been proposed, and my knowledge in the taxonomic classification space is much more from a technical perspective, not scientifically authoritative at all. So if I'm the only one concerned on these points, I'll get out of the way, even though I am pretty firmly convinced of the importance of using unambiguously semantically interoperable and resolvable identifiers, and of the extensive use of non-LSID taxonomic identifiers (search "organism" in BioPortal to find both good and bad examples of this).

John

---------------------------------------
John Graybeal
jbgraybeal at mindspring.com
650-450-1853
skype: graybealski
linkedin: http://www.linkedin.com/in/johngraybeal/

> On Apr 16, 2018, at 03:10, Lowry, Roy K. <rkl at bodc.ac.uk> wrote:
>
> Forgot to do reply all....
>
> Please note that I partially retired on 01/11/2015. I am now only working 7.5 hours a week and can only guarantee e-mail response on Wednesdays, my day in the office. All vocabulary queries should be sent to enquiries at bodc.ac.uk. Please also use this e-mail if your requirement is urgent.
>
>
> From: Lowry, Roy K.
> Sent: 16 April 2018 11:09
> To: Daniel Neumann
> Subject: Re: [CF-metadata] Standard Names to support Trac ticket 99
>
> Thanks Daniel,
>
> To clarify, LSID isn't a database; it's an identifier for an organism that neatly brings together multiple taxonomies under the single umbrella of the Catalogue of Life project. It also resolves, actually in multiple ways, into a URL that then provides access into a database providing information on that organism.
>
> We came to the conclusion that we should use LSIDs in CF in the first round of discussions on Trac 99. My quandary is not whether we should use them, but whether the Standard Name should specify 'lsid' or just 'identifier'. 'Identifier' is what we discussed, but 'lsid' opens the door for future Standard Names based on other governances should there be a need to deal with entities not covered by lsids. I'm aware of one possible issue related to coccoliths plus the possibility of dealing with organism parts (e.g. cod livers).
>
> Cheers, Roy.
>
>
> From: Daniel Neumann <daniel.neumann at io-warnemuende.de>
> Sent: 16 April 2018 10:44
> To: Lowry, Roy K.
> Subject: Re: [CF-metadata] Standard Names to support Trac ticket 99
>
> Dear Roy,
>
> Thank you for bringing this topic forward!
>
> I contacted the person responsible for our institute's data publishing and metadata policy and will talk to her about the choice of the LSID database. She is more into that topic than I am. It may take some days.
>
> Cheers,
> Daniel
>
>
> On 13.04.2018 16:02, Lowry, Roy K. wrote:
>> Dear All,
>>
>> Here is an initial batch of 8 Standard Names to support the CF taxon dimension. Two are dimension labels whilst the other six are measurements to which the taxon is a co-ordinate. Five of these are to cover Daniel's proposal that prompted the resurrection of Ticket 99.
>>
>> I've presented a summary list followed by a full list with units and definitions. I have one uncertainty in my mind (biological_taxon_label versus biological_taxon_lsid) where I would really appreciate input.
>>
>> Cheers, Roy.
>>
>> biological_taxon_name
>> biological_taxon_identifier or biological_taxon_lsid - any preferences????
>> number_concentration_of_biological_taxon_in_sea_water
>> mass_concentration_of_biological_taxon_expressed_as_carbon_in_sea_water
>> mass_concentration_of_biological_taxon_expressed_as_chlorophyll_in_sea_water
>> mass_concentration_of_biological_taxon_expressed_as_nitrogen_in_sea_water
>> mole_concentration_of_biological_taxon_expressed_as_carbon_in_sea_water
>> mole_concentration_of_biological_taxon_expressed_as_nitrogen_in_sea_water
>>
>>
>> biological_taxon_name
>>
>> A plaintext human-readable label, usually a Latin binomial such as Calanus finmarchicus, applied to a biological taxon. Biological taxon is a name or other label identifying an organism or a group of organisms as belonging to a unit of classification in a hierarchical taxonomy.
>>
>> dimensionless
>>
>> biological_taxon_identifier
>>
>> An opaque label, most usefully a URI that resolves to an authoritative information source, applied to a biological taxon. Biological taxon is a name or other label identifying an organism or a group of organisms as belonging to a unit of classification in a hierarchical taxonomy. The identifier adopted for CF is the Life Science Identifier (LSID), a URN with the syntax "urn:lsid:<Authority>:<Namespace>:<ObjectID>[:<Version>]". For example, the copepod Calocalanus pavo may be represented by LSIDs "urn:lsid:marinespecies.org:taxname:104669" (based on WoRMS) and "urn:lsid:itis.gov:itis_tsn:85335" (based on ITIS). These URNs may be converted to URLs delivering RDF by prefixing with 'http://lsid.tdwg.org/'.
>>
>> dimensionless
>>
>> OR
>>
>> biological_taxon_lsid
>>
>> The Life Science Identifier (LSID) is a standard URI for a biological taxon. Biological taxon is a name or other label identifying an organism or a group of organisms as belonging to a unit of classification in a hierarchical taxonomy. The LSID is a URN with the syntax "urn:lsid:<Authority>:<Namespace>:<ObjectID>[:<Version>]". For example, the copepod Calocalanus pavo may be represented by LSIDs "urn:lsid:marinespecies.org:taxname:104669" (based on WoRMS) and "urn:lsid:itis.gov:itis_tsn:85335" (based on ITIS). These URNs may be converted to URLs delivering RDF by prefixing with 'http://lsid.tdwg.org/'.
>>
>> dimensionless
>>
>> number_concentration_of_biological_taxon_in_sea_water
>>
>> Number concentration means the count of an entity per unit volume and is used in the construction "number_concentration_of_X_in_Y", where X is a material constituent of Y. Biological taxon is a name or other label identifying an organism or a group of organisms as belonging to a unit of classification in a hierarchical taxonomy. Number concentration of biota is also referred to as abundance.
>>
>> m-3
>>
>> mass_concentration_of_biological_taxon_expressed_as_carbon_in_sea_water
>>
>> Mass concentration means mass per unit volume and is used in the construction "mass_concentration_of_X_in_Y", where X is a material constituent of Y. A chemical species denoted by X may be described by a single term such as 'nitrogen' or a phrase such as 'nox_expressed_as_nitrogen'. The phrase 'expressed_as' is used in the construction "A_expressed_as_B", where B is a chemical constituent of A. It means that the quantity indicated by the standard name is calculated solely with respect to the B contained in A, neglecting all other chemical constituents of A. Mass concentration of biota expressed as carbon is also referred to as carbon biomass. Biological taxon is a name or other label identifying an organism or a group of organisms as belonging to a unit of classification in a hierarchical taxonomy.
>>
>> kg m-3
>>
>>
>> mass_concentration_of_biological_taxon_expressed_as_chlorophyll_in_sea_water
>>
>> Mass concentration means mass per unit volume and is used in the construction "mass_concentration_of_X_in_Y", where X is a material constituent of Y. A chemical or biological species denoted by X may be described by a single term such as 'nitrogen' or a phrase such as 'nox_expressed_as_nitrogen'. The phrase 'expressed_as' is used in the construction "A_expressed_as_B", where B is a chemical constituent of A. It means that the quantity indicated by the standard name is calculated solely with respect to the B contained in A, neglecting all other chemical constituents of A. Chlorophyll means all naturally occurring pigments of the chlorophyll group. Biological taxon is a name or other label identifying an organism or a group of organisms as belonging to a unit of classification in a hierarchical taxonomy.
>>
>> kg m-3
>>
>> mass_concentration_of_biological_taxon_expressed_as_nitrogen_in_sea_water
>>
>> Mass concentration means mass per unit volume and is used in the construction "mass_concentration_of_X_in_Y", where X is a material constituent of Y. A chemical species denoted by X may be described by a single term such as 'nitrogen' or a phrase such as 'nox_expressed_as_nitrogen'. The phrase 'expressed_as' is used in the construction "A_expressed_as_B", where B is a chemical constituent of A. It means that the quantity indicated by the standard name is calculated solely with respect to the B contained in A, neglecting all other chemical constituents of A. Mass concentration of biota expressed as nitrogen is also referred to as nitrogen biomass. Biological taxon is a name or other label identifying an organism or a group of organisms as belonging to a unit of classification in a hierarchical taxonomy.
>>
>> kg m-3
>>
>> mole_concentration_of_biological_taxon_expressed_as_carbon_in_sea_water
>>
>> Mole concentration means number of moles per unit volume, also called "molarity", and is used in the construction "mole_concentration_of_X_in_Y", where X is a material constituent of Y. A chemical species denoted by X may be described by a single term such as 'nitrogen' or a phrase such as 'nox_expressed_as_nitrogen'. The phrase 'expressed_as' is used in the construction "A_expressed_as_B", where B is a chemical constituent of A. It means that the quantity indicated by the standard name is calculated solely with respect to the B contained in A, neglecting all other chemical constituents of A. Biological taxon is a name or other label identifying an organism or a group of organisms as belonging to a unit of classification in a hierarchical taxonomy.
>>
>> mol m-3
>>
>> mole_concentration_of_biological_taxon_expressed_as_nitrogen_in_sea_water
>>
>> Mole concentration means number of moles per unit volume, also called "molarity", and is used in the construction "mole_concentration_of_X_in_Y", where X is a material constituent of Y. A chemical species denoted by X may be described by a single term such as 'nitrogen' or a phrase such as 'nox_expressed_as_nitrogen'. The phrase 'expressed_as' is used in the construction "A_expressed_as_B", where B is a chemical constituent of A. It means that the quantity indicated by the standard name is calculated solely with respect to the B contained in A, neglecting all other chemical constituents of A. Biological taxon is a name or other label identifying an organism or a group of organisms as belonging to a unit of classification in a hierarchical taxonomy.
>>
>> mol m-3
>>
>>
>>
>> This message (and any attachments) is for the recipient only. NERC is subject to the Freedom of Information Act 2000 and the contents of this email and any reply you make may be disclosed by NERC unless it is exempt from release under the Act. Any material supplied to NERC may be stored in an electronic records management system.
>>
>>
>> _______________________________________________
>> CF-metadata mailing list
>> CF-metadata at cgd.ucar.edu
>> http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
> --
> Daniel Neumann
>
> Leibniz Institute for Baltic Sea Research Warnemuende
> Physical Oceanography and Instrumentation
> Seestrasse 15
> 18119 Rostock
> Germany
>
> phone: +49-381-5197-287
> fax: +49-381-5197-114 or 440
> e-mail: daniel.neumann at io-warnemuende.de
Received on Wed Apr 18 2018 - 11:23:17 BST
