
[CF-metadata] Web reference to a standard name?

From: Benno Blumenthal <benno>
Date: Tue, 21 Dec 2010 01:15:25 -0500

On Sat, Dec 18, 2010 at 1:40 AM, John Graybeal <jbgraybeal at mindspring.com> wrote:
>
> The proliferation/versioning topic was one we thought hard about in MMI,
> and there are some writeups on the site that may be of interest to a few.
> [1] It includes a bit more detail than we have pursued here.
>
> In case anyone cares: Knowing which vocabulary version a term came from
> can reveal more than you might realize. While we probably didn't say this
> explicitly anywhere, the thought in my head for versioning every term when
> the file changed will seem like an angels on a pin argument to all you
> practical data folks. It goes as follows: if additional terms are added that
> are more detailed than the original term (say, sea_surface_skin_temperature
> and sea_surface_foundation_temperature added to the original term
> sea_surface_temperature), I would expect people to use the more specific
> term when possible. (Similarly, 'gay' still means happy, but you don't hear
> people use it that way.) Thus, even though the definition of a term hasn't
> changed, its application and implicit meaning can change. Historians have to
> account for this when reading and understanding older material, and
> historical data analysts the same.
>


This is an important example, and this argument is the one I cryptically
referred to earlier. I would argue that versioning is not the best way to
handle this particular problem.

One of the core principles of CF evolution that has been frequently stated
in these pages is to not invalidate current (and earlier) dataset metadata
-- many proposed CF changes have been rejected by arguing that they
unnecessarily invalidate datasets described under the current version of CF.

This particular case of sophistication -- two new terms giving more precise
subsets of an old term -- is essential in describing Science's evolving
understanding of a particular topic. In the sense of making the best
choice of cfatt:standard_name in tagging a dataset, it allows a new choice
to be made in describing the data, so it is true that what the appearance
of sea_surface_temperature as such an attribute signifies has changed.

As a practical matter, knowing when a dataset was tagged is not sufficient
to figure out what set of terms was being chosen from: the requester of the
change would probably use the new term before it was official, while people
like me would probably lag the implementation of the standard by some
considerable time. If one is serious about versions, then the version has
to be part of the tag, which in CF it is not.

Fortunately, no one is arguing that sea_surface_temperature ceases to
describe the dataset because there are more precise terms; it is just that
in tagging a dataset, we use the most precise term available. Roy, in fact,
is serving relationships between terms, and I would expect that
sea_surface_temperature skos:narrower sea_surface_skin_temperature would
be one of them.

What this means in RDF is that having the relationship dataset
term:isDescribedBy sea_surface_skin_temperature implies the relationship
dataset term:isDescribedBy sea_surface_temperature, and an inferencing
triple store would return both relationships in a query, presuming that it
understood SKOS.
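
A minimal sketch of that inference, in plain Python rather than a real
triple store. The BROADER table and the function names are hypothetical;
only the three standard names come from this thread. The SKOS vocabulary
itself names the hierarchy properties skos:broader and skos:narrower, and
SKOS alone does not say that a relation such as term:isDescribedBy
propagates along them, so the propagation rule below is an assumption of
the sketch.

```python
# Hypothetical table: each term maps to the terms directly broader than it.
BROADER = {
    "sea_surface_skin_temperature": {"sea_surface_temperature"},
    "sea_surface_foundation_temperature": {"sea_surface_temperature"},
}


def implied_terms(term, broader=BROADER):
    """Return the term itself plus every term reachable via broader links."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        # Follow the hierarchy upward; terms with no entry have no parents.
        stack.extend(broader.get(current, ()))
    return seen


def matches(dataset_tag, query_term):
    """True if a dataset tagged with dataset_tag is also described by query_term."""
    return query_term in implied_terms(dataset_tag)


if __name__ == "__main__":
    # A dataset tagged with the narrower term answers a query for the broader one.
    print(matches("sea_surface_skin_temperature", "sea_surface_temperature"))
    # The reverse does not hold: the broad tag says nothing about skin temperature.
    print(matches("sea_surface_temperature", "sea_surface_skin_temperature"))
```

The point of the design is that the tag stored with the dataset never
changes; only the hierarchy grows, and older, broader tags stay true.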

Several consequences of this are particularly important. First of
all, someone with incomplete information (most of us, in the long run; it is
science, after all) could successfully tag a dataset with a term more
general than the expert would use, and still make a correct statement.
Similarly, someone adhering to the original standard could make a correct
statement about the data at some earlier date. Most importantly,
*additional* metadata could be included, such as a precise specification of
the sensor/platform that produced the measurement, that would allow someone
to infer at a later date (when the standard has become more specific, as in
this example) a more specific standard_name to tag the dataset with. Yes,
at that point the one best choice would have changed, but none of the
earlier statements would be invalidated.
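
As a rough illustration of that last point, the sketch below refines a
broad standard_name once extra metadata is available. Everything here is
hypothetical: the REFINEMENTS table, the "sensor_type" key and its value
are invented for the example and are not CF attributes or an agreed
mapping.

```python
# Hypothetical rules: (current standard_name, sensor_type) -> narrower name.
REFINEMENTS = {
    ("sea_surface_temperature", "infrared_radiometer"):
        "sea_surface_skin_temperature",
}


def refine(standard_name, extra_metadata, rules=REFINEMENTS):
    """Return a narrower standard_name if the extra metadata justifies one.

    Falls back to the original name, which remains a correct (if less
    precise) description of the data.
    """
    sensor = extra_metadata.get("sensor_type")
    return rules.get((standard_name, sensor), standard_name)


if __name__ == "__main__":
    # Enough metadata to pick the more specific term.
    print(refine("sea_surface_temperature",
                 {"sensor_type": "infrared_radiometer"}))
    # No sensor information: the original, broader tag is kept.
    print(refine("sea_surface_temperature", {}))
```

Nothing recorded earlier is rewritten or invalidated; the narrower name is
derived after the fact from metadata that was already there.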

Benno


>
> John
>
> [1]
> http://marinemetadata.org/apguides/ontprovidersguide/ontguideconstructinguris
> (see the Versioned and Unversioned URLs section toward the bottom)
>
> On Dec 17, 2010, at 08:37, Lowry, Roy K. wrote:
>
> > Hi Richard,
> >
> > This echoes a discussion in BODC this afternoon and a thought I had some
> > while ago in that having terms inherit the version number of the list in
> > which they reside was not the best of ideas. However, we concluded that
> > maintenance of list versions was a good idea, particularly if the list
> > governance permits deprecation.
> >
> > I'm not aware of anybody maintaining version numbers at the term level
> > for the Standard Names. Establishing them retrospectively wouldn't be a
> > pleasant task and I don't think there's anything to be gained from their
> > introduction from this point onwards.
> >
> > Cheers, Roy.
> >
> > -----Original Message-----
> > From: cf-metadata-bounces at cgd.ucar.edu
> > [mailto:cf-metadata-bounces at cgd.ucar.edu] On Behalf Of Hattersley, Richard
> > Sent: 17 December 2010 16:13
> > To: John Graybeal; Jeff deLaBeaujardiere
> > Cc: cf-metadata at cgd.ucar.edu
> > Subject: Re: [CF-metadata] Web reference to a standard name?
> >
> > It would seem that the two conflicting requirements, which I'll
> > caricature as "rigorous version control" and "pragmatic usability",
> > could go some way to being reconciled by using term-specific version
> > numbers.
> >
> > Instead of:
> > <some_prefix>/<cf_vocab_number>/<term> (e.g.
> > .../P071/16/CFSN0023)
> > Or:
> > <some_prefix>/<term> (e.g. .../parameter/air_temperature)
> >
> > You have:
> > <some_prefix>/<term>/<term_version>
> >
> > Where the term_version is only updated when the definition of the term
> > changes - not when a release of the standard name table occurs. Thus
> > avoiding the proliferation of identifiers.
> >
> > This would still leave a discussion on what constitutes a "change", as
> > it's quite possible one may wish to allow minor edits (eg. spelling
> > corrections) for pragmatic reasons.
> >
> >
> > Richard Hattersley AVD Expert Software Developer
> > Met Office FitzRoy Road Exeter Devon EX1 3PB United Kingdom
> > Tel: +44 (0)1392 885702 Fax: +44 (0)1392 885681
> > Email: richard.hattersley at metoffice.gov.uk Website:
> > www.metoffice.gov.uk
> >
> >
> > -----Original Message-----
> > From: cf-metadata-bounces at cgd.ucar.edu
> > [mailto:cf-metadata-bounces at cgd.ucar.edu] On Behalf Of John Graybeal
> > Sent: 17 December 2010 00:16
> > To: Jeff deLaBeaujardiere
> > Cc: cf-metadata at cgd.ucar.edu
> > Subject: Re: [CF-metadata] Web reference to a standard name?
> >
> > I offer my two cents on versioned terms, prompted by the 'absolutely
> > right' phrasing :->. I am firmly straddling the fence on this question.
> >
> > There are multiple science users and many technical opinions that say
> > not having versions is absolutely wrong. The circumstances that could
> > make 'current' *not* what you want include:
> > - you need to understand what definition (or other statements) was in
> > effect when the tag was applied
> > - you want to understand the transitions that the definition (or other
> > statements) has undergone over time
> > - the meaning of a term actually is significantly different than it used
> > to be
> > - additional meanings are associated with a term (e.g., an acronym is
> > repurposed by another organization) at a later date
> >
> > I believe the last happens much more often than your confidence suggests
> > -- perhaps especially in emerging fields or those that are newly
> > developing documented vocabularies, extremely advanced or subjective
> > fields, and concepts that get 'culturally adopted', e.g., turned into a
> > pejorative (slang; that last not our problem, for the most part). I
> > don't see how the exclusive use of non-versioned terms supports these
> > situations.
> >
> > So while I appreciate the motivations for not including versions, I
> > think versions have to be offered by the system, and ideally should be
> > used where unique persistent identifiers are required.
> >
> > John
> >
> >
> > On Dec 16, 2010, at 13:08, Jeff deLaBeaujardiere wrote:
> >
> >> Actually, my recollection is that EPSG & OGC proposed to include
> >> version numbers, and several of us argued against it and managed to
> >> convince them. I would have to dig up old emails to find out for
> >> certain who was in which camp, however.
> >>
> >> Regards,
> >> Jeff DLB
> >>
> >> On 2010-12-16 15:57, Lowry, Roy K. wrote:
> >>> Hi Jeff,
> >>>
> >>> It's interesting to see the difference of opinion between the
> >>> standards developers (the idea of version number in URI came from the
> >>> OGC URN specification: interesting how EPSG came to a different
> >>> conclusion) and those who have to live with the consequences. The more I
> >>> think about it, the more I think you and Benno are absolutely right.
> >>>
> >>> Cheers, Roy.
> >>> ________________________________________
> >>> From: cf-metadata-bounces at cgd.ucar.edu
> >>> [cf-metadata-bounces at cgd.ucar.edu] On Behalf Of Jeff deLaBeaujardiere
> >>> [Jeff.deLaBeaujardiere at noaa.gov]
> >>> Sent: 16 December 2010 19:40
> >>> To: John Graybeal
> >>> Cc: cf-metadata at cgd.ucar.edu
> >>> Subject: Re: [CF-metadata] Web reference to a standard name?
> >>>
> >>> On 2010-12-14 12:56, John Graybeal wrote:
> >>>> Just to be crystal clear, the places where you have '16' could also
> >>>> have 'current' (if I understand correctly what Roy was saying about
> >>>> their server), and the mmisw one could also be served with a particular
> >>>> version ID (analogous to the NERC example).
> >>>
> >>> I think it is of the utmost importance to have a URI that does not
> >>> include a version number and always provides the latest answer.
> >>> Otherwise you have a proliferation of identifiers that mean the same
> >>> thing but appear to change every time the overall vocabulary is updated.
> >>> You can also have a version-specific entry if desired.
> >>>
> >>> There were similar discussions regarding coordinate reference system
> >>> identifiers from EPSG (European Petroleum Survey Group), and it was
> >>> fortunately recognized that a version-less URI was essential.
> >>>
> >>> -Jeff
> >>>
> >>>
> >>> _______________________________________________
> >>> CF-metadata mailing list
> >>> CF-metadata at cgd.ucar.edu
> >>> http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
> >>> --
> >>> This message (and any attachments) is for the recipient only. NERC is
> >>> subject to the Freedom of Information Act 2000 and the contents of
> >>> this email and any reply you make may be disclosed by NERC unless it
> >>> is exempt from release under the Act. Any material supplied to NERC
> >>> may be stored in an electronic records management system.
> >
> >
> >
> > John Graybeal <mailto:jgraybeal at ucsd.edu>
> > phone: 858-534-2162
> > System Development Manager
> > Ocean Observatories Initiative Cyberinfrastructure Project:
> > http://ci.oceanobservatories.org
> > Marine Metadata Interoperability Project: http://marinemetadata.org
> >
>
>
>
>
>



-- 
Dr. M. Benno Blumenthal          benno at iri.columbia.edu
International Research Institute for climate and society
The Earth Institute at Columbia University
Lamont Campus, Palisades NY 10964-8000   (845) 680-4450
Received on Mon Dec 20 2010 - 23:15:25 GMT

This archive was generated by hypermail 2.3.0 : Tue Sep 13 2022 - 23:02:41 BST
