Tools and CF compliance Was: CF Conventions 1.2 (Jon Blower)
Hi,
Concerning compliance and the CF compliance checker: in my view there are two very different ways in which this can/should be interpreted:
(1) Check whether (or to what extent) a given file adheres to the convention. This assumes that the file is in principle meant to adhere to the convention, and it can be quite useful for detecting errors.
(2) Support the development of scripts, CMOR tables, etc. that are meant to generate CF-compliant files. Here, it cannot be expected a priori that the file adheres to CF at all (it may even lack the "Conventions" attribute), so the tool should give the developer hints as to what changes he/she should make. While of course most CF attributes are "optional", individuals and projects should nevertheless strive to implement a good part of them. Thus, a good "compliance test" would go beyond criticizing what is there but wrong, and would also notify the user of what is not there but should perhaps be added.
At present the CF checker operates under principle number 1, but in order to promote wider use of CF I would suggest considering some way of moving towards variant 2. I believe that most of the testing that is necessary for this is embedded in the CF checker anyway, so it would probably be mostly a matter of some program logic and the generation of verbose messages. Perhaps this could be realized on the web interface with a simple check box:
[ ] suggest further improvements to the file's attribute structure
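To make the difference between the two modes concrete, here is a minimal Python sketch. It does not reflect how the existing CF checker is implemented; the function name `check_variable`, the `suggest` flag (standing in for the check box), and the short list of recommended attributes are all assumptions chosen for illustration.

```python
# Hypothetical sketch: strict checking (mode 1) versus checking plus hints (mode 2).
# The attribute lists below are illustrative, not the full CF rule set.

VALID_AXIS_VALUES = {"X", "Y", "Z", "T"}
RECOMMENDED = ("units", "standard_name", "long_name")  # assumed selection


def check_variable(name, attrs, suggest=False):
    """Return (errors, hints) for one variable's attribute dictionary."""
    errors = []
    hints = []
    # Mode 1: criticize what is present but wrong.
    axis = attrs.get("axis")
    if axis is not None and axis not in VALID_AXIS_VALUES:
        errors.append(f"{name}: axis={axis!r} is not one of X, Y, Z, T")
    # Mode 2: additionally point out what is absent but perhaps worth adding.
    if suggest:
        for key in RECOMMENDED:
            if key not in attrs:
                hints.append(f"{name}: consider adding a '{key}' attribute")
    return errors, hints


if __name__ == "__main__":
    errors, hints = check_variable(
        "temperature", {"axis": "Q", "units": "K"}, suggest=True
    )
    for line in errors + hints:
        print(line)
```

With `suggest=False` the function only reports what is wrong; with `suggest=True` the same pass also lists what is missing, which is the sense in which most of the required testing may already be in place.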
Best regards,
Martin
< Dr. Martin G. Schultz, ICG-2, Forschungszentrum Jülich >
< D-52425 Jülich, Germany >
< ph: +49 (0)2461 61 2831, fax: +49 (0)2461 61 8131 >
< email: m.schultz at fz-juelich.de >
< web: http://www.fz-juelich.de/icg/icg-2/m_schultz >
----------------------------------------------------------------------
Message: 1
Date: Thu, 08 May 2008 11:53:44 +0100
From: Philip Bentley <philip.bentley at metoffice.gov.uk>
Subject: Re: [CF-metadata] CF Conventions 1.2
To: cf-metadata at cgd.ucar.edu
Message-ID:
<1210244025.25310.87.camel at eld414.desktop.frd.metoffice.com>
Content-Type: text/plain; charset="us-ascii"
Hi Jonathan, Ethan,
> Dear Ethan
>
> I think that the current rules are a good compromise between the needs
> of people who write and analyse data, and the needs of developers of
> analysis and other software. The former group of people would like CF
> to be modified fairly rapidly, when they are about to start producing
> data from a project, and they want that data to have proper metadata.
> As you will have seen from previous discussions, our discussions are
> too slow as it is sometimes. Hence we decided the rules so that changes could be made, but marked as provisional.
Indeed. I think the 4-plus years between CF 1.0 and CF 1.1 - according to the date stamps on the documents - says it all. Perhaps the recent flurry of CF proposal activity in part reflects a general desire to 'play catch-up'.
>
> For provisional changes to become permanent depends on at least two
> applications supporting them. That requires some development effort to
> be invested. CF doesn't have staff resources of its own to commit to
> it. I think the most likely applications to make changes first are the
> cf-checker and libcf. It will be interesting to see how long it takes
> for the changes so far agreed to be implemented in these or other applications.
>
> I fear that if we followed this approach:
>
> > 2) Don't add changes to the upcoming version of the specification
> > document "until at least two applications have successfully
> > interpreted the test data".
>
> development of CF would effectively be halted altogether. It would be
> impossible for writers of data to agree changes to the CF standard on
> a short enough timescale. Consequently they would bypass CF, and write
> and analyse data with their own metadata conventions, and the
> usefulness of CF in providing a common standard would be undermined.
I agree 100% with this. If, as a community, we set the barrier to progress [of the CF conventions] too high then people will necessarily devise local, incompatible solutions - not out of willfulness, but simply to meet project deadlines.
>
> Applications don't have to keep entirely up to date, do they? I think
> the value of the Conventions attribute should be that it is easy to be
> clear about what conventions are being implemented in data and metadata.
>
> I agree about the test data. We should construct a file which contains
> some test data for the changes of CF 1.2. (The changes of CF 1.1 did
> not introduce any new attribute.) We'll need a place to deposit such files.
> As moderator of that ticket, I'll discuss it with Phil and Velimir.
I can produce some simple test files for the changes at CF 1.2. But the question of what constitutes application conformance is, I suggest, not easily defined. For instance, I could create a noddy netcdf file with two new grid mapping attributes, as follows:
    float temperature(t,z,lat,lon);
        :grid_mapping = "crs";
    char crs;
        :grid_mapping_name = "latitude_longitude";
        :semi_major_axis = "92389234"; // new at CF 1.2
        :semi_minor_axis = "78682347"; // new at CF 1.2
And I could read this file today using, say, ncdump and ncview. Which clearly doesn't tell us much. Yet a proposer of a given CF change cannot force the hands of software developers to produce compliant software within a particular time frame, if at all. In some (many?) circumstances I think we have to take it as an act of faith that a particular update to the CF convention will be advantageous. Plus I believe that the robustness of the CF peer review and challenge mechanisms is sufficient to ensure that those updates will be advantageous.
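A test file of this kind could be generated from a small script. The following Python sketch only builds the CDL text. The function name, dataset name and dimension sizes are assumptions, and numeric WGS 84 ellipsoid values are substituted for the placeholder strings in the example above, on the understanding that CF defines these two attributes as numeric.

```python
# Hypothetical sketch: build the CDL text for a tiny CF 1.2 test file.
# Dataset name, dimension sizes and ellipsoid values are assumptions
# (WGS 84 numbers replace the placeholder strings of the example above).


def make_test_cdl(name="cf12_grid_mapping"):
    """Return CDL text describing a minimal file with the new attributes."""
    lines = [
        f"netcdf {name} {{",
        "dimensions:",
        "    t = 1 ;",
        "    z = 1 ;",
        "    lat = 2 ;",
        "    lon = 2 ;",
        "variables:",
        "    float temperature(t, z, lat, lon) ;",
        '        temperature:grid_mapping = "crs" ;',
        "    char crs ;",
        '        crs:grid_mapping_name = "latitude_longitude" ;',
        "        crs:semi_major_axis = 6378137.0 ;  // new at CF 1.2",
        "        crs:semi_minor_axis = 6356752.314245 ;  // new at CF 1.2",
        "",
        "// global attributes:",
        '        :Conventions = "CF-1.2" ;',
        "}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(make_test_cdl(), end="")
```

The printed text can be saved to a file and converted with ncgen, for example `ncgen -o cf12_grid_mapping.nc cf12_grid_mapping.cdl`. As noted above, being able to read the result says little about whether an application interprets the new attributes.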
Regards,
Phil
------------------------------
Message: 2
Date: Thu, 8 May 2008 12:04:14 +0100
From: Jonathan Gregory <j.m.gregory at reading.ac.uk>
Subject: [CF-metadata] CF Conventions 1.2
To: cf-metadata at cgd.ucar.edu
Message-ID: <20080508110414.GA1858 at met.reading.ac.uk>
Content-Type: text/plain; charset=us-ascii
Dear Phil
> Perhaps the recent
> flurry of CF proposal activity in part reflects a general desire to
> 'play catch-up'.
Yes, I think that is the case. It certainly is the case for the two proposals I have made, on the axis and cell_methods attributes. These were discussed on the email list and in abeyance for a long time because we had no way to adopt them formally until we agreed the new rules.
> I can produce some simple test files for the changes at CF 1.2. But
> the question of what constitutes application conformance is, I
> suggest, not easily defined. For instance, I could create a noddy
> netcdf file with two new grid mapping attributes, as follows:
Yes, I think such a file would be useful, because it does at least provide input data that the cf-checker can check for conformance, and other applications could likewise check that they can read in and interpret, if they are interested in these features. I agree with you that what "compliance" actually means for an application is ill-defined. This is an issue which has come up before, of course. Since most of CF is optional, in one sense (but not a very useful sense) an application is compliant even if it ignores all that optional metadata. On the other hand I am sure no application currently exists which interprets all the metadata. But I don't think that means the metadata is not useful. It can still be read by humans, it describes the data properly, and we only add features when people have a need for them (usually people who intend to produce data).
Best wishes
Jonathan
------------------------------
Message: 3
Date: Thu, 8 May 2008 13:23:02 +0100
From: "Jon Blower" <jdb at mail.nerc-essc.ac.uk>
Subject: [CF-metadata] Tools and CF compliance Was: CF Conventions 1.2
To: "Philip Bentley" <philip.bentley at metoffice.gov.uk>
Cc: cf-metadata at cgd.ucar.edu
Message-ID:
<2bb6ee950805080523s1b315d15i2e7cd0dfb0fbe867 at mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
Hi Philip and list,
(I've started a new thread as this is probably a new topic for discussion.)
> And I could read this file today using, say, ncdump and ncview. Which
> clearly doesn't tell us much.
This is a really important point. It would be very difficult, in the general case, to ascertain whether a certain piece of software actually interprets a certain CF attribute correctly. Conversely it is perhaps unreasonable to expect a piece of software to implement correctly every feature of a certain CF version.
What a tool user really wants to know (I think) is, for a given NetCDF file, which attributes in the file are correctly interpreted by the tool. I can't think of a neat way to do this - perhaps tool developers could publish a list of attributes that they claim to be able to interpret for each version of the tool they produce? A given tool might then implement 100% of CF1.0 but 50% of CF1.2 for example.
Then the CF community could maintain a list of tools that users could consult to find out which tools might be best suited to their purpose.
An add-on to the CF compliance checker could be created that, having scanned a file for CF attributes, produces a list that says "Tool X understands all of the attributes in this file, but Tool Y only understands 7 out of 9".
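A minimal Python sketch of such an add-on follows. It assumes the attribute names have already been extracted from the file and that each tool publishes a plain list of the attributes it claims to interpret; the function name, tool names and claimed lists are invented for illustration.

```python
# Hypothetical sketch of the add-on: compare the CF attributes found in a file
# with the attributes each tool claims to interpret. Tool names and their
# attribute lists are invented for illustration.


def coverage_report(file_attrs, tool_claims):
    """Return one summary line per tool, in the order the tools were given."""
    found = set(file_attrs)
    report = []
    for tool, claimed in tool_claims.items():
        understood = found & set(claimed)
        if understood == found:
            report.append(f"{tool} understands all of the attributes in this file")
        else:
            missing = ", ".join(sorted(found - understood))
            report.append(
                f"{tool} only understands {len(understood)} out of {len(found)}"
                f" (not interpreted: {missing})"
            )
    return report


if __name__ == "__main__":
    attrs_in_file = ["units", "standard_name", "grid_mapping", "cell_methods"]
    claims = {
        "Tool X": ["units", "standard_name", "grid_mapping", "cell_methods", "axis"],
        "Tool Y": ["units", "standard_name"],
    }
    for line in coverage_report(attrs_in_file, claims):
        print(line)
```

The comparison is only as trustworthy as the published lists: it records what a tool claims to interpret, not whether it does so correctly.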
All this requires effort of course, but I think it's useful to consider what we really mean when we call for "CF compliance". How can we help users to judge which tools they should use and how can we help data providers to ensure that their data can be interpreted by a wide community?
Jon
--
--------------------------------------------------------------
Dr Jon Blower              Tel: +44 118 378 5213 (direct line)
Technical Director         Tel: +44 118 378 8741 (ESSC)
Reading e-Science Centre   Fax: +44 118 378 6413
ESSC                       Email: jdb at mail.nerc-essc.ac.uk
University of Reading
3 Earley Gate
Reading RG6 6AL, UK
--------------------------------------------------------------
------------------------------
Message: 4
Date: Thu, 08 May 2008 09:23:10 -0600
From: John Caron <caron at unidata.ucar.edu>
Subject: Re: [CF-metadata] CF Conventions 1.2
Cc: cf-metadata at cgd.ucar.edu
Message-ID: <48231ADE.2080401 at unidata.ucar.edu>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Jonathan Gregory wrote:
> Dear Phil
>
>> Perhaps the recent
>> flurry of CF proposal activity in part reflects a general desire to
>> 'play catch-up'.
>
> Yes, I think that is the case. It certainly is the case for the two proposals
> I have made, on the axis and cell_methods attributes. These were discussed on
> the email list and in abeyance for a long time because we had no way to adopt
> them formally until we agreed the new rules.
>
>> I can produce some simple test files for the changes at CF 1.2. But
>> the question of what constitutes application conformance is, I
>> suggest, not easily defined. For instance, I could create a noddy
>> netcdf file with two new grid mapping attributes, as follows:
>
> Yes, I think such a file would be useful, because it does at least provide
> input data that the cf-checker can check for conformance, and other
> applications could likewise check that they can read in and interpret, if they
> are interested in these features. I agree with you that what "compliance"
> actually means for an application is ill-defined. This is an issue which has
> come up before, of course. Since most of CF is optional, in one sense (but not
> a very useful sense) an application is compliant even if it ignores all that
> optional metadata. On the other hand I am sure no application currently exists
> which interprets all the metadata. But I don't think that means the metadata is
> not useful. It can still be read by humans, it describes the data properly,
> and we only add features when people have a need for them (usually people who
> intend to produce data).
As a tool developer, I find a real netcdf file that has the new feature(s) in it extremely useful. In
fact I don't even try to implement a feature until I have a real example of it.
So on a practical level, requiring an example netcdf file before the final acceptance of a feature
seems to me to be reasonable. Proving that software "correctly conforms" is difficult in the general
case.
We have a repository at Unidata of sample CF files, but they don't document which features they use.
It would be very useful to start that documentation, and tie it back to CF section numbers or anchors.
I propose we start a repository of sample files, ideally on the CF site, documented as to what CF
features they use. It would be good if that documentation is a wiki (or equivalent), so that the
initial person can make a start, then others can augment and comment on it.
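As a rough illustration of what such documentation might record, here is a Python sketch of a catalogue that maps each sample file to the CF features it exercises and then inverts it, so that a developer can look up example files by feature. The file names are placeholders rather than entries from the Unidata repository, and the section references are illustrative and should be checked against the CF document.

```python
# Hypothetical sketch of a sample-file catalogue. File names are placeholders,
# and the section references are illustrative only.

from collections import defaultdict

SAMPLE_FILES = {
    "example_grid_mapping.nc": [
        {"feature": "grid_mapping", "cf_section": "5.6"},
    ],
    "example_cell_methods.nc": [
        {"feature": "cell_methods", "cf_section": "7.3"},
        {"feature": "axis", "cf_section": "4"},
    ],
}


def files_by_feature(catalogue):
    """Invert the catalogue: feature name -> sorted list of sample files."""
    index = defaultdict(list)
    for filename, entries in catalogue.items():
        for entry in entries:
            index[entry["feature"]].append(filename)
    return {feature: sorted(names) for feature, names in index.items()}


if __name__ == "__main__":
    for feature, names in sorted(files_by_feature(SAMPLE_FILES).items()):
        print(f"{feature}: {', '.join(names)}")
```

On a wiki the same information would simply be a table that others can extend; the point of the sketch is only the two-way link between files and features.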
------------------------------
Message: 5
Date: Thu, 08 May 2008 10:07:23 -0600
From: Ethan Davis <edavis at unidata.ucar.edu>
Subject: Re: [CF-metadata] CF Conventions 1.2
To: John Caron <caron at unidata.ucar.edu>
Cc: cf-metadata at cgd.ucar.edu
Message-ID: <4823253B.2040702 at unidata.ucar.edu>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
John Caron wrote:
> We have a repository at Unidata of sample CF files, but they don't document which features they use.
> It would be very useful to start that documentation, and tie it back to CF section numbers or anchors.
>
> I propose we start a repository of sample files, ideally on the CF site, documented as to what CF
> features they use. It would be good if that documentation is a wiki (or equivalent), so that the
> initial person can make a start, then others can augment and comment on it.
I think it would be useful for each approved change to have a document
detailing the changes to be made to the CF spec. This document could be
referenced by the test data documents. As it stands, we have the trac
ticket discussions, which can be very voluminous, and even after approval
it isn't always clear exactly what changes are to be made to the specification.
Perhaps the closing comment to the trac ticket should be a detailed
change request. Though I think a separate (wiki?) document would be more
useful (both during the trac ticket discussion and afterwards for
documentation purposes).
Ethan
--
Ethan R. Davis Telephone: (303) 497-8155
Software Engineer Fax: (303) 497-8690
UCAR Unidata Program Center E-mail: edavis at ucar.edu
P.O. Box 3000
Boulder, CO 80307-3000 http://www.unidata.ucar.edu/
---------------------------------------------------------------------------
------------------------------
_______________________________________________
CF-metadata mailing list
CF-metadata at cgd.ucar.edu
http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
End of CF-metadata Digest, Vol 62, Issue 2
******************************************
Received on Fri May 09 2008 - 01:03:29 BST