Jonathan
I think we all understand the distinction between things which are
measured, and things which are about measurement techniques, or who made
the measurements (or simulations).
We want to start building smart tools that can help data users
understand these distinctions faster and (as Roy implies) more
accurately, but for these smart tools to work we need to build
semantics directly into our vocabs. One way of doing this (and it's
simply a first step) is to separate the vocabs. It's a design decision
as to whether one does that by rules within a vocab or between vocabs.
Either way, in our ontology building, we can start with the science
vocabs, and the job is that little bit more tractable (and I think
Jonathan underestimates how hard this is going to be ... although it's only
possible because of the quality of what we have now!).
At the moment it is only your voice which is arguing against separating
these vocabs. Are there any others who have a problem with the proposed
separation? (Which is simply the creation of an additional table, to be
called standard_metadata, which we will continue to control in the same
way as we control standard_names ... and yes, at some future time we
might deprecate some existing standard names and define them as standard
metadata --- and to forestall Jonathan's objection: this won't hurt
existing files because this will all be version controlled :-)
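
As a rough illustration of the separation proposed above, here is a
minimal Python sketch of two version-controlled tables. It is only a
sketch under stated assumptions: the Term and Vocab classes, the version
numbers, and the move of experiment_id from one table to the other are
all hypothetical, chosen to show the idea rather than anything agreed on
this list. The term names themselves are taken from the discussion below.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Term:
    name: str
    added_in: int                        # table version where the term first appears (hypothetical)
    deprecated_in: Optional[int] = None  # first version that no longer offers it


@dataclass
class Vocab:
    title: str
    terms: List[Term] = field(default_factory=list)

    def menu(self, version: int) -> List[str]:
        """Names valid at a given table version, e.g. to fill a drop-down."""
        return sorted(
            t.name
            for t in self.terms
            if t.added_in <= version
            and (t.deprecated_in is None or version < t.deprecated_in)
        )


# Science vocab: names for measured (or simulated) quantities.
standard_names = Vocab("standard_names", [
    Term("sea_surface_temperature", 1),
    Term("rainfall_flux", 1),
    Term("latitude", 1),
    Term("region", 1),
    Term("land_cover", 1),
    # Hypothetical: a name that is later deprecated here and moved.
    Term("experiment_id", 1, deprecated_in=2),
])

# Separate vocab: names describing techniques or who produced the data.
standard_metadata = Vocab("standard_metadata", [
    Term("source", 2),
    Term("institution", 2),
    Term("experiment_id", 2),
])

if __name__ == "__main__":
    for version in (1, 2):
        print(version, standard_names.menu(version), standard_metadata.menu(version))
```

A file written against version 1 can still be checked against
standard_names.menu(1), which is the sense in which version control
protects existing files, and a parameter drop-down built from
standard_names never offers entries that live only in standard_metadata.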
Bryan
On Fri, 2006-12-29 at 10:35 +0000, Jonathan Gregory wrote:
> Dear Roy
>
> > Consider a case where a metadata record has two fields, one for geographic
> > coverage and one for parameter. If selection drop-downs for these are
> > covered by two separate lists - either vocabs or within an ontology - then
> > 'sea_temperature' will not appear in the geographic coverage drop-down and
> > 'Atlantic_Ocean' will not appear in the parameter drop-down. Were both
> > drop-downs covered by a single 'Standard Name list' then both terms would
> > appear. This not only increases the risk of field population with nonsense
> > (the type of error I was visualising - admittedly it's still possible to
> > call temperature salinity), but also makes the drop-down appear eccentric to
> > say the least.
>
> We distinguish between lists for (a) standard names (b) the possible values of
> quantities which have a standard name. "atlantic_ocean" is not a standard name;
> it is a possible value for a variable whose standard name is "region". A menu
> of standard names would include sea_surface_temperature, rainfall_flux,
> latitude, region and land_cover (to list a few from the present table) and
> also (if my proposal is agreed, to meet Paco's requirement) source, institution
> and experiment_id. These are all names for things which a data variable or a
> coordinate variable could contain. The *values* of these variables are dealt
> with in other ways. sea_surface_temperature, rainfall_flux and latitude are
> numeric, so no list is needed. The others are string-valued. At present only
> region has standardised values; the possible values are given by
> http://www.cgd.ucar.edu/cms/eaton/cf-metadata/region.html
> It's quite likely we might develop a standard list for land_cover. As we have
> discussed a lot, it would be useful to make links to other people's controlled
> vocabularies if we can, and the proposal also includes a new attribute to point
> to external lists of the possible values for a quantity (Bryan's suggestion).
>
> > Jon's comment that we can carry on as we are now and change later worries
> > me a little. So many times in my work with metadata I have found that
> > aggregation is infinitely easier than teasing things apart.
>
> It is right to be cautious, but I think this reasonable concern of yours is
> that things should be sufficiently informative. I agree with you. That's why
> we spend so much time making sure we know exactly what quantity is being
> identified by a standard name, and why quantities with different physical
> dimensions (units) have different standard names. In this case, I think we
> are talking about a categorisation that can be introduced whenever we need it.
> There are only 814 standard names at present, so it would not be a big job to
> classify them in future, given a clear criterion for doing it - which is what
> we lack, since we don't have a need for it (as far as I can see).
>
> Cheers
>
> Jonathan
> _______________________________________________
> CF-metadata mailing list
> CF-metadata at cgd.ucar.edu
> http://www.cgd.ucar.edu/mailman/listinfo/cf-metadata
Received on Fri Dec 29 2006 - 06:26:18 GMT