Opened 9 years ago

Closed 4 years ago

#85 closed defect (fixed)

Link Appendix H from section 9, and clarify missing data requirements

Reported by: jonathan
Owned by: davidhassell
Priority: medium
Milestone:
Component: cf-conventions
Version:
Keywords:
Cc:

Description

We moved most of the detail regarding the storage of discrete sampling geometries from the text in section 9 to Appendix H, but as far as I can see we forgot to add links to Appendix H. To correct this defect, I propose that we:

  • Add this sentence to the end of the first paragraph of section 9.1, just before the table: "Details and examples of storage of each of these feature types are provided in Appendix H, as indicated in the table."
  • Add a column to Table 9.1, headed "Links", in which the cells indicate and provide links to the relevant sections H.1-H.6.

In section 9.6, we introduced the possibility of missing data in auxiliary coordinate variables for discrete sampling geometries, and Appendix A notes that missing data is allowed only in the case described in section 9.6. To be clear, I think we should also note this in section 2.5.1 "Missing data". I propose that we append this sentence to the first paragraph of 2.5.1:

Missing data is not allowed in coordinate variables or auxiliary coordinate variables, except for auxiliary coordinate variables in discrete sampling geometries (Section 9.6, "Missing data").

Also, I propose that in the "Links" column of Appendix A, we add links to section 2.5.1 and 9.6 for _FillValue (in addition to the link to the NUG), and to section 9.6 for missing_value (2.5.1 is already linked).

Jonathan

Change History (8)

comment:1 Changed 9 years ago by jonathan

In a discussion on the email list with the subject "CF-1.6 Conformance Requirements/Recommendations?", there has been general agreement to change the rules for missing data in auxiliary coordinate variables. Therefore I withdraw the paragraph above starting "In section 9.6". (The first and third paragraphs above still stand.) Instead, I propose that we append the following to the first paragraph of section 2.5.1:

Missing data is allowed in data variables and auxiliary coordinate variables. Generic applications should treat the data as missing where any auxiliary coordinate variables have missing values; special-purpose applications might be able to make use of the data. Missing data is not allowed in coordinate variables.
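
As an illustration of the "generic application" behaviour described above, here is a minimal Python sketch using NumPy masked arrays. The function name, the fill value of -999.0 and the sample values are assumptions made for this example only, and it assumes that each auxiliary coordinate has the same shape as the data variable.

```python
import numpy as np

FILL = -999.0  # hypothetical fill value, chosen only for this example


def mask_where_aux_coords_missing(data, aux_coords):
    """Mask data wherever the data or any auxiliary coordinate is missing.

    Assumes every auxiliary coordinate has the same shape as the data.
    """
    combined = np.ma.getmaskarray(data).copy()
    for coord in aux_coords:
        combined |= np.ma.getmaskarray(coord)
    return np.ma.masked_array(data, mask=combined)


# Four elements: lat is missing at index 1, lon is missing at index 3.
temperature = np.ma.masked_values([280.1, 281.4, 279.9, 282.0], FILL)
lat = np.ma.masked_values([10.0, FILL, 12.0, 13.0], FILL)
lon = np.ma.masked_values([50.0, 51.0, 52.0, FILL], FILL)

usable = mask_where_aux_coords_missing(temperature, [lat, lon])
print(usable)
```

A special-purpose application could skip this step and work with the unmasked data values directly.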

Section 9.6 currently reads:

Auxiliary coordinate variables (spatial and time) must contain missing values to indicate a void in data storage in the file but must not have missing data for any other reason. This situation may arise for unused elements in the incomplete multidimensional array representation, and in any representation if the instance dimension is set to a larger size than the number of features currently stored. It is not permitted for auxiliary coordinate variables to have missing values for elements where there is non-missing data. Where any auxiliary coordinate variable contains a missing value, all other coordinate, auxiliary coordinate and data values corresponding to that element should also contain missing values. Data variables should (as usual) also contain missing values to indicate when there is no valid data available for the element, although the coordinates are valid.

For consistency with the above, this should be changed to read:

Wherever there is a void in data storage, the data variable and all its auxiliary coordinate variables (spatial and time) must contain missing values. This situation may arise for unused elements in the incomplete multidimensional array representation, and in any representation if the instance dimension is set to a larger size than the number of features currently stored. Data variables should (as usual) also contain missing values to indicate when there is no valid data available for the element, although the coordinates are valid.

Jonathan

comment:2 Changed 9 years ago by jonathan

Following Karl's email posting, here is a new proposal for the final paragraph.

Wherever there are unused elements in data storage, the data variable and all its auxiliary coordinate variables (spatial and time) must contain missing values. This situation may arise for the incomplete multidimensional array representation, and in any representation if the instance dimension is set to a larger size than the number of features currently stored. Data variables should (as usual) also contain missing values to indicate when there is no valid data available for the element, although the coordinates are valid.
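
To make the distinction in this paragraph concrete, here is a rough Python sketch of an incomplete multidimensional array holding two profiles of different lengths. The helper `pad_features`, the fill value of -9999.0 and the sample values are hypothetical. Unused trailing elements are filled in both the data variable and the auxiliary coordinate, whereas a missing observation at a valid coordinate is filled in the data variable only.

```python
import numpy as np

FILL = -9999.0  # hypothetical fill value, chosen only for this example


def pad_features(features, fill=FILL):
    """Pack variable-length features into a 2-D (instance, element) array.

    Unused trailing elements of shorter features are set to the fill value.
    """
    width = max(len(values) for values in features)
    out = np.full((len(features), width), fill, dtype=float)
    for i, values in enumerate(features):
        out[i, :len(values)] = values
    return out


# Profile 0 has three levels. Profile 1 has two levels, and its second
# observation is missing although the depth there is valid.
temp = pad_features([[10.0, 9.5, 9.0], [12.0, FILL]])
depth = pad_features([[0.0, 5.0, 10.0], [0.0, 5.0]])

# Unused elements: the auxiliary coordinate is missing.
unused = depth == FILL
# The data variable must also be missing at every unused element.
assert np.all(temp[unused] == FILL)

# Missing observations: the data is missing but the coordinate is valid.
no_observation = (temp == FILL) & ~unused
print(unused)
print(no_observation)
```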

Jonathan

comment:3 follow-up: Changed 9 years ago by taylor13

I am concerned (perhaps without good reason) that someone will read the first sentence, "Wherever there are unused elements in data storage, the data variable and all its auxiliary coordinate variables .... must contain missing values", out of context and assume that if they have gridded data with missing values, they should set any corresponding coordinate variables to "missing". I don't think we want this, even when an entire plane of the gridded data is missing. There is no need to set the coordinate variables to "missing" in this case, is there? If not, then perhaps indicating at the beginning of the above paragraph that these rules apply only to the variables discussed in section 9 would help avoid confusion.

disclaimer: I haven't been able to read all of the discussion, so if I'm totally missing something, please ignore.

comment:4 in reply to: ↑ 3 Changed 9 years ago by jonathan

Dear Karl

Yes, this rule about setting all aux coord vars to missing where there is missing data just applies to unused elements in discrete sampling geometry feature types. This para appears in sect 9.6, but you're right, it could be misleading if someone came across it just skimming through the CF standard (which is long enough that people might be tempted to skim it rather than read it all intently :-). We could begin, "In data for discrete sampling geometries written according to the rules of this section, wherever there are unused ...". Would that be sufficient?

Cheers

Jonathan

comment:5 Changed 9 years ago by taylor13

(Karl Taylor)

Yes, I think your revision would be sufficient. Having been guilty of "skimming" turns out to have been of some value here, I think, since this slight rewording should help to prevent confusion.

thanks, Karl

comment:6 Changed 9 years ago by jonathan

For clarity, here is a restatement of the correction.

  • Add this sentence to the end of the first paragraph of section 9.1, just before the table: "Details and examples of storage of each of these feature types are provided in Appendix H, as indicated in the table."
  • Add a column to Table 9.1, headed "Links", in which the cells indicate and provide links to the relevant sections H.1-H.6.
  • Append the following to the first paragraph of section 2.5.1:

Missing data is allowed in data variables and auxiliary coordinate variables. Generic applications should treat the data as missing where any auxiliary coordinate variables have missing values; special-purpose applications might be able to make use of the data. Missing data is not allowed in coordinate variables.

  • Change the first paragraph of Section 9.6 to read:

In data for discrete sampling geometries written according to the rules of this section, wherever there are unused elements in data storage, the data variable and all its auxiliary coordinate variables (spatial and time) must contain missing values. This situation may arise for the incomplete multidimensional array representation, and in any representation if the instance dimension is set to a larger size than the number of features currently stored. Data variables should (as usual) also contain missing values to indicate when there is no valid data available for the element, although the coordinates are valid.

  • In the "Links" column of Appendix A, add links to section 2.5.1 and 9.6 for _FillValue (in addition to the link to the NUG), and to section 9.6 for missing_value (2.5.1 is already linked).
  • In the "Description" column of Appendix A, in the entries for _FillValue and missing_value, replace "Not allowed for coordinate data except in the case of auxiliary coordinate varibles in discrete sampling geometries." with "Allowed for auxiliary coordinate variables but not allowed for coordinate variables."

The last point is new - I have just noticed the need for it.

Jonathan

comment:7 Changed 4 years ago by davidhassell

  • Owner changed from cf-conventions@… to davidhassell
  • Status changed from new to accepted

comment:8 Changed 4 years ago by painter1

  • Resolution set to fixed
  • Status changed from accepted to closed