[CF-metadata] CF standard names for chemical constituents and aerosols from Jonathan Gregory on 2008-10-21 (Archive of CF discussions from 2002 to 2019 on the cf-metadata mailing list)

From: Jonathan Gregory <j.m.gregory>
Date: Tue, 21 Oct 2008 22:58:20 +0100

Dear all

I think there are two kinds of difficulty with additions to the standard name
table which need to be distinguished. First, there is the problem of the
standard name table getting very large because of the possibly large number of
chemical species. That presents a problem of organisation of metadata, but it
does not cause delays in assigning standard names. It is easy to add lots more
standard names which follow the same patterns as existing ones. Second, there
is the problem of delays when requests are made for new standard names. This
problem is caused by the intellectual difficulties of working out what the
concepts are and what the names for them should be.

As discussed e.g. by Heinke, Martin and Philip, we could avoid the first
problem by adopting species-independent standard names. I would favour a
syntax which identifies where to look for the species name,
e.g. mass_concentration_of_[VAR]_in_air, where VAR is the name of the
string-valued coordinate variable or scalar coordinate variable that names the
species, and [] is a special syntax. That kind of syntax would allow more than
one place-holder, which may be necessary because some quantities might
identify more than one species e.g. reaction rates or the existing ones of the
kind mole_concentration_of_[VAR1]_in_sea_water_expressed_as_[VAR2], where VAR1
could have the value "mesozooplankton" and VAR2 the value "nitrogen". The
lists of what can fill the gaps could, as has been suggested, be maintained by
groups with the relevant expertise.
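
To make the place-holder idea concrete, here is a minimal sketch in Python of
how software might expand such a template. It assumes the bracket syntax
proposed above; the function names and the dictionary of coordinate values are
hypothetical, chosen only for illustration, and nothing here is part of CF.

```python
import re

# Matches a place-holder such as [VAR] or [VAR1]; the bracket syntax is the
# one proposed in this email, not an existing CF convention.
PLACEHOLDER = re.compile(r"\[([A-Za-z0-9_]+)\]")


def placeholders(template):
    """Return the place-holder names in the order they appear."""
    return PLACEHOLDER.findall(template)


def expand_standard_name(template, coords):
    """Replace each [VAR] with the value held by the coordinate variable VAR.

    `coords` is a hypothetical mapping from coordinate variable name to the
    species string it contains.
    """
    def lookup(match):
        name = match.group(1)
        if name not in coords:
            raise KeyError(f"no coordinate variable named {name!r}")
        return coords[name]

    return PLACEHOLDER.sub(lookup, template)


one = expand_standard_name("mass_concentration_of_[VAR]_in_air",
                           {"VAR": "ozone"})
two = expand_standard_name(
    "mole_concentration_of_[VAR1]_in_sea_water_expressed_as_[VAR2]",
    {"VAR1": "mesozooplankton", "VAR2": "nitrogen"},
)
```

For a string-valued coordinate variable holding several species, the same
expansion would simply be repeated for each value along that dimension.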

I think that a stronger argument for this than the size of the standard name
table, which should be no problem for software, is that chemical models may
internally have array dimensions for species, in which case it would be
natural to write out arrays of results. Are the models indeed like that?

If we take this kind of approach, any combination of standard name and species
would be possible, as only the contents of the lists would be regulated, not
the combinations. There would not be a way to prevent nonsense such as mass
concentration of mesozooplankton in air. However, I don't think that's a
problem really. We currently have no way to prevent sea_water_temperature with
a height coordinate of 10 km above the ground, and that's not a problem.
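
As a rough illustration of what "only the contents of the lists would be
regulated" might mean for a checker, the sketch below validates species values
against a controlled list and nothing else. The list contents and the function
name are assumptions made up for this example.

```python
# Hypothetical controlled list; in practice it would be maintained by a group
# with the relevant expertise.
SPECIES = {"ozone", "nitrogen", "mesozooplankton"}


def unknown_species(values, allowed=SPECIES):
    """Return the values that do not appear in the controlled list."""
    return [v for v in values if v not in allowed]


# The species used with mass_concentration_of_[VAR]_in_air are checked only
# for list membership, so "mesozooplankton" passes even though the
# combination is nonsense.
accepted = unknown_species(["ozone", "mesozooplankton"])
rejected = unknown_species(["ozone", "unobtainium"])
```

The check deliberately knows nothing about which standard name the species is
combined with, which is the trade-off described above.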

The second issue is more difficult. As I have argued before, I do not think it
can be helped by allowing projects to develop independent tables if we want to
use standard names to compare data from different sources. That is one of the
main reasons they are useful, I think, as Seth says too, and it's why they are
called "standard". If there were many tables, of course it would become
easier to add new names within projects, but interoperability would be lost
among projects. Interoperability can be maintained across tables by
mappings (ontologies) but that is hard work. With more tables it would be
harder work. Who would do it? Dividing up the standard name table would
compound the intellectual difficulty, rather than easing the problem.

So why is agreement of new standard names slow? I think it is because it is
difficult. It is not principally because we are arguing about syntax (though
it is partly), but because we are working out what we actually mean, and how
to describe it in ways consistent with other quantities we have defined. That
is, I think the slowness is mostly about the definition, not about the
meaningful identifier, in Bryan's terms. It is largely a scientific and
communication problem, not a technological one. (As an example, at the end of
this email I have listed some of the issues that Stephen Griffies and I have
just been discussing in order to make proposals for standard names for ocean
quantities to be requested by CMIP5.) I do not see any easy way to dissolve
this difficulty. We can move it around or conceal it, but not easily get rid
of it. If we want to reduce the difficulty of the problem, we could choose to
lower the standards currently applied to the clarity of concepts. That would
mean that projects using standard names would have to decide for themselves
more about what they meant before using them, or suffer more confusion through
not doing so, and interoperability would be reduced too. CF would do less
work, and would be less useful as a result. But if we decide to go that way, I
for one won't complain about doing less work. I don't do it for fun!

I agree with Steve H that technology could help to ease the problem, though,
by providing more tools. Could we provide tools that make it easier to
search standard names in cleverer ways? It might be that the ocean names I've
been discussing with Stephen G could have been chosen more quickly if it had
been easier to search the existing names, as many of the quantities that
appeared to be new did actually have existing names. Could tools be written
to digest the table into those phrases and words from which the existing names
are constructed, and to present menus which allow construction of names from
the existing elements, with the possibility of proposing new elements to be
inserted in existing patterns? That would be a big help.
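
A minimal sketch of the kind of tool meant here, in Python: it splits names
into their constituent words, counts them, and finds the names containing all
of a set of words. The short list of names is only an illustrative stand-in
for the real table, and the function names are hypothetical.

```python
from collections import Counter, defaultdict

# Illustrative sample only; a real tool would read the full standard name
# table instead.
NAMES = [
    "mass_concentration_of_ozone_in_air",
    "mole_concentration_of_ozone_in_air",
    "sea_water_temperature",
    "sea_floor_depth_below_geoid",
    "ocean_mixed_layer_thickness",
]


def word_counts(names):
    """Count how often each underscore-separated word occurs in the names."""
    return Counter(word for name in names for word in name.split("_"))


def word_index(names):
    """Map each word to the set of names that contain it."""
    index = defaultdict(set)
    for name in names:
        for word in name.split("_"):
            index[word].add(name)
    return index


def search(index, *words):
    """Return, sorted, the names that contain every one of the given words."""
    if not words:
        return []
    matches = [index.get(word, set()) for word in words]
    return sorted(set.intersection(*matches))


index = word_index(NAMES)
ozone_in_air = search(index, "ozone", "air")
sea_floor = search(index, "sea", "floor")
```

Working on whole words is the simplest choice; recognising multi-word phrases
such as "in_sea_water" as single elements would need a further step that is
not attempted here.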

Best wishes

Jonathan

Some issues in defining ocean standard names:
- Basin masks for tracer and velocity are the same geophysical quantity, but
distinguished by coordinates. Grids are not identified by standard names; that
is an issue of how CF organises metadata.
- We say "sea floor", not "ocean bottom". You could say either, but in a set
of definitions it is important to be consistent about terminology, or the
reader will wonder if a distinction is being drawn.
- If you speak of the mass of the ocean, does it include sea-ice?
- We do not need a separate name for global-mean sea water temperature; we can
use sea_water_temperature and indicate the mean in cell_methods. That's another
issue of organisation of metadata.
- What does "ideal age" of sea water mean?
- Is the vertical integral of mass transport in an ocean model with a free
surface to be regarded as the same geophysical quantity as the vertical
integral of volume transport in a rigid-lid model multiplied by density?
- Is the mixed-layer depth determined by a buoyancy criterion the same concept
as mixed-layer depth determined by sigma-theta?
- Is the sea water "mixing depth" the same concept as mixed-layer depth defined
by the mixing scheme?
- Transports across various straits are all the same geophysical quantity, and
the strait should be identified by some string-valued coordinate.
- Do we want to know the rainfall flux over the whole grid box, or just the
part that falls into the liquid water (and not on the sea-ice)? These can be
distinguished by cell_methods.
- What's a clear way to describe the heat flux associated with the temperature
of rainfall not being the same as the temperature of the ocean it falls into?
Received on Tue Oct 21 2008 - 15:58:20 BST
