
[CF-metadata] Use of CF standard name region

From: martin.juckes at stfc.ac.uk <martin.juckes>
Date: Thu, 26 May 2016 09:05:54 +0000

Dear All,

I can see some sense in Jonathan's interpretation that a string-valued concept could be represented in a file by integers and flag_values/flag_meanings. If we want to follow that interpretation, however, the current standard name definition, which states that the variable contains strings, is unhelpful. If we adopt this approach, it would help to modify the standard name definition slightly:
From: A variable with the standard name of region contains strings which indicate geographical regions. These strings must be chosen from the standard region list.
To: A variable with the standard name of region contains strings which indicate geographical regions, or refers to them through the flag_values/flag_meanings construct. These strings must be chosen from the standard region list.

I have a slight preference for Karl's approach, as I feel that the above is putting too many technical requirements into the standard name definition. However, rather than use "index", it might be clearer to have a standard name modifier "flag":

basin: standard_name = "region flag"
basin: flag_values = "....."

Using a variable "region_flag" or "region_index" would have the advantage that we could keep the standard name definitions reasonably clear and transparent.

There is a more general question here about the treatment of formatting constraints which are expressed in standard name definitions but not explicitly represented in the convention text or the corresponding conformance document. Would it be helpful to add an appendix of requirements associated with specific standard names, so that the implications of whichever option is chosen can be spelt out with an example?
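
As a rough sketch of what such a per-name requirement might look like when checked by software, assuming the option where "region" may be attached to an integer variable with flag attributes: the function name check_region_variable, the attrs dictionary and the short region set below are all hypothetical (the set is only the names quoted in this thread, not the full standard region list), and flag_meanings is taken to be a single blank-separated string as in Section 3.5.

```python
# Hypothetical subset of the standard region list: only the names quoted
# in this thread. A real check would load the full published list.
REGION_LIST_SUBSET = {
    "global_land", "southern_ocean", "atlantic_ocean", "pacific_ocean",
    "arctic_ocean", "indian_ocean", "mediterranean_sea", "black_sea",
    "hudson_bay", "baltic_sea", "red_sea",
}


def check_region_variable(attrs, is_string_valued):
    """Return a list of problems for a variable with standard_name "region".

    attrs is a plain dict of the variable's attributes (an assumption made
    for this sketch, not the API of any particular netCDF library).
    """
    if attrs.get("standard_name") != "region":
        return ["standard_name is not 'region'"]
    if is_string_valued:
        # String data would be checked against the region list directly.
        return []
    values = attrs.get("flag_values")
    meanings = attrs.get("flag_meanings")
    if values is None or meanings is None:
        return ["integer region variable lacks flag_values/flag_meanings"]
    names = meanings.split()
    problems = []
    if len(names) != len(values):
        problems.append("flag_values and flag_meanings differ in length")
    for name in names:
        if name not in REGION_LIST_SUBSET:
            problems.append(f"'{name}' is not in the region list")
    return problems


good = {
    "standard_name": "region",
    "flag_values": [0, 1, 2],
    "flag_meanings": "global_land southern_ocean atlantic_ocean",
}
bad = dict(good, flag_meanings="global_land southern_ocean my_private_lake")

good_problems = check_region_variable(good, is_string_valued=False)
bad_problems = check_region_variable(bad, is_string_valued=False)
```

With these inputs the first variable passes and the second is reported for using a meaning outside the list, which is the kind of implication an appendix entry could spell out.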

regards,
Martin

#####################################

Dear Karl

> This is why I suggested defining a new name modifier, "index". We
> could then write:
>
> basin: standard_name = "region index"
>
> alternatively we could just define a new standard name:
> standard_name="region_index"
>
> You suggest that we should simply allow the standard name "region"
> be used for both string variables or for integer variables when they
> are associated with strings with the flag_meanings attribute.

Yes, that's right. We have previously recommended this treatment for area
type variables too. The flag attributes provide a self-describing encoding
mechanism that doesn't alter the intention of the data.
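
To illustrate the encoding direction, here is a minimal sketch in Python of turning a grid of region strings into integers plus the two flag attributes. The function name encode_regions and the small input grid are hypothetical, the integer codes are simply assigned in order of first appearance, and flag_meanings is written as one blank-separated string as in Section 3.5.

```python
def encode_regions(region_grid):
    """Encode a 2-D list of region names as integers plus flag attributes."""
    codes = {}    # region name -> integer code, in order of first appearance
    encoded = []
    for row in region_grid:
        encoded_row = []
        for name in row:
            if name not in codes:
                codes[name] = len(codes)
            encoded_row.append(codes[name])
        encoded.append(encoded_row)
    flag_values = list(codes.values())
    flag_meanings = " ".join(codes)  # blank-separated, same order as values
    return encoded, flag_values, flag_meanings


# Hypothetical string-valued data, using names quoted in this thread.
region_grid = [
    ["global_land", "atlantic_ocean"],
    ["atlantic_ocean", "pacific_ocean"],
]
encoded, flag_values, flag_meanings = encode_regions(region_grid)
```

The integer grid carries the same information as the string grid as long as the two attributes travel with it.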

> That would be fine, but I think we'll need to make this explicit.

We could certainly do that. I wouldn't restrict it to this case, but point it
out as a generally possible use of the flag attributes.

> I don't think many folks view indexes as "encodings of a strings as a
> numbers".

The difference is only that if you defined a new variable of basin index, you
would need an external table to translate its numerical values into basin
names. That would not be self-describing metadata and it would not be
CF-like, I feel.
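
A minimal sketch of the self-describing case, in which the lookup table is built only from the variable's own attributes: the flag values and meanings are those quoted for the CMIP5 basin variable in this thread, with flag_meanings written as a single blank-separated string as in Section 3.5. The function names and the small data grid are hypothetical.

```python
# Attributes as they might be read from the "basin" variable itself.
flag_values = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
flag_meanings = (
    "global_land southern_ocean atlantic_ocean pacific_ocean "
    "arctic_ocean indian_ocean mediterranean_sea black_sea "
    "hudson_bay baltic_sea red_sea"
)


def build_decoder(values, meanings):
    """Map each flag value to its meaning using only the file's attributes."""
    names = meanings.split()
    if len(names) != len(values):
        raise ValueError("flag_values and flag_meanings differ in length")
    return dict(zip(values, names))


def decode(grid, decoder):
    """Translate a 2-D list of integer flags into region names."""
    return [[decoder[value] for value in row] for row in grid]


decoder = build_decoder(flag_values, flag_meanings)

# Hypothetical integer data on a tiny grid.
basin = [[0, 2, 2], [0, 3, 1]]
regions = decode(basin, decoder)
```

No table outside the file is consulted at any point, which is the distinction being drawn with a bare basin index.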

Best wishes

Jonathan

----- Forwarded message from Karl Taylor <taylor13 at llnl.gov> -----

> Date: Sat, 21 May 2016 10:35:41 -0700
> From: Karl Taylor <taylor13 at llnl.gov>
> To: cf-metadata at cgd.ucar.edu
> Subject: Re: [CF-metadata] Use of CF standard name region
> User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:38.0)
> Gecko/20100101 Thunderbird/38.7.2
>
> Hi Jonathan and Martin,
>
> I think the issue pertains to the following variable and metadata (I
> *think* this is how we did it in CMIP5):
>
> int basin(lon, lat)
> basin: standard_name = "region";
> basin: flag_values = 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10;
> basin: flag_meanings = "global_land", "southern_ocean",
> "atlantic_ocean", "pacific_ocean", "arctic_ocean", "indian_ocean",
> "mediterranean_sea", "black_sea", "hudson_bay", "baltic_sea",
> "red_sea";
> [and there were additional attributes]
>
> The construct is fine, I think, but according to the standard name
> table, "region" is supposed to be reserved for string variables.
> Here it is attached to the "basin" variable, which is an integer
> index (or I guess we could call it a "flag").
>
> This is why I suggested defining a new name modifier, "index". We
> could then write:
>
> basin: standard_name = "region index"
>
> alternatively we could just define a new standard name:
> standard_name="region_index"
>
> You suggest that we should simply allow the standard name "region"
> be used for both string variables or for integer variables when they
> are associated with strings with the flag_meanings attribute. That
> would be fine, but I think we'll need to make this explicit. I
> don't think many folks view indexes as "encodings of a strings as a
> numbers".
>
> So I think we have a few options. Perhaps others might weigh in.
>
> best regards,
> Karl
>
>
>
>
> On 5/21/16 2:05 AM, Jonathan Gregory wrote:
> >Dear Martin and Karl
> >
> >Actually I think the way it's done in CMIP5 is consistent with the convention.
> >It is correct that region is the standard name for a string-valued variable,
> >and flag_values and flag_meanings supply a method to encode the strings as
> >numbers. This is very much like Example 3.3 in Section 3.5, where string-valued
> >status flags are encoded as numbers. On this email list we have advised people
> >from time to time to use flag_values and flag_meanings in this way to encode
> >strings as numbers.
> >
> >You could argue that it is a bit different in principle. The intention of Sect
> >3.5 is to supply a way to decode numbers in a data variable into strings. That
> >is arguably not identical with an intention of providing a way to encode
> >strings as numbers in a data variable, but since the process is reversible the
> >mechanism works both ways! If you think that this use of the convention is not
> >obvious as it stands, then I would propose that we insert an extra sentence in
> >Sect 3.5 pointing out the use of this mechanism to encode strings. We could
> >include the CMIP5 basins as an example of it.
> >
> >Best wishes
> >
> >Jonathan
> >
> >----- Forwarded message from Karl Taylor <taylor13 at llnl.gov> -----
> >
> >>Date: Fri, 20 May 2016 15:16:23 -0700
> >>From: Karl Taylor <taylor13 at llnl.gov>
> >>To: cf-metadata at cgd.ucar.edu
> >>Subject: Re: [CF-metadata] Use of CF standard name region
> >>User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:38.0)
> >> Gecko/20100101 Thunderbird/38.7.2
> >>
> >>Hi all,
> >>
> >>Perhaps we should define a new standard_name: e.g., basin_index (or
> >>region_index) to replace the misused "region" standard_name.
> >>
> >>I would note that in the conventions document in example 3.3 there
> >>is a standard name: "sea_water_speed status_flag"
> >>
> >>"status_flag" is a standard "name modifier" (see appendix C).
> >>
> >>So, if we want to modify the convention, we could define a new name
> >>modifier (say "index") and explicitly indicate that flag_values can
> >>be used as indexes (when they are integers).
> >>
> >>regards,
> >>Karl
> >>
> >>
> >>On 5/20/16 12:44 PM, martin.juckes at stfc.ac.uk wrote:
> >>>Hello All,
> >>>
> >>>In CMIP5 the variable "basin" was used as a fixed spatial field with integer values and the CF Standard Name "region", which has the definition "A variable with the standard name of region contains strings which indicate geographical regions. These strings must be chosen from the standard region list."
> >>>
> >>>The integer valued CMIP5 variable is clearly not consistent with this definition. The CMIP5 variable was defined with flag_values and flag_meanings, such that the flag_meanings were from the CF standard region list.
> >>>
> >>>The question is, should we redefine the CMIP5 variable somehow, or would it be acceptable to adjust the CF Standard Name definition for region to accept this usage, which appears clear enough and is presumably much easier for plotting packages to handle than a spatial array of string values?
> >>>
> >>>regards,
> >>>Martin
> >>>
> >>>
> >>>_______________________________________________
> >>>CF-metadata mailing list
> >>>CF-metadata at cgd.ucar.edu
> >>>http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
> >
> >----- End forwarded message -----

----- End forwarded message -----
Received on Thu May 26 2016 - 03:05:54 BST

