
[CF-metadata] New standard_name of quality_flag for corresponding quality control variables

From: Martin Juckes - UKRI STFC <martin.juckes>
Date: Tue, 23 Jul 2019 12:50:27 +0000

Dear Ken,


thanks for your response to me below.


Would it be fair to suggest that "status" should, as far as possible, reflect a generic objective classification, with terms such as "sensor_nonfunctional" which have a comparable meaning for all datasets, while "quality" is a subjective *measure* with a meaning that may vary from dataset to dataset? E.g. if dataset A has a maximum "quality" of 11 and dataset B only goes up to 10, it doesn't necessarily imply that dataset A is in any sense better than B.


If you want to use it in weighted means, perhaps it should be "quality_measure" rather than "quality_flag"? With "status_flag" the order of integer values does not have any meaning, but with quality perhaps it would make more sense to have some concept of a sequence of quality settings (so that, for example, "1" always indicates a quality between "0" and "2" within a dataset, but could have different meanings in different datasets). Could the quality also be expressed as a floating point number without any flag meanings?
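
As an illustration of the weighted-mean use mentioned above, here is a minimal Python sketch. It assumes a hypothetical per-dataset mapping from ordered quality values to weights; the function name, the quality values and the weights are invented for illustration and are not defined by CF.

```python
def quality_weighted_mean(values, quality, weights_by_quality):
    """Mean of `values`, each weighted by a weight looked up from its quality value."""
    total = 0.0
    weight_sum = 0.0
    for value, q in zip(values, quality):
        weight = weights_by_quality[q]
        total += weight * value
        weight_sum += weight
    if weight_sum == 0:
        raise ValueError("all samples have zero weight")
    return total / weight_sum


# Hypothetical dataset: 0 = bad, 1 = suspect, 2 = good.
# The ordering and the weights are assumptions specific to this example.
weights = {0: 0.0, 1: 0.5, 2: 1.0}
result = quality_weighted_mean([10.0, 20.0, 40.0], [2, 1, 0], weights)
```

The mapping from quality value to weight has to be supplied per dataset, which is the point made above: the same integer may mean different things in different datasets.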


Responding to a point Barna raised: it is certainly possible to have more than one "status_flag" variable, but I don't think it is ideal: if information needs to be split across multiple variables we generally like to describe the difference between the variables in the standard name or in other metadata. In this case, I think there is a good case for using a new standard name.
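
To make the point about distinguishing variables by standard name concrete, here is a minimal Python sketch of how a tool might pick out the ancillary variables of one kind. The variable names and the dictionary layout are hypothetical, and "quality_flag" is used only as the name proposed in this thread, not as an accepted standard name.

```python
def ancillary_by_standard_name(variables, data_var, standard_name):
    """Return the ancillary variables of `data_var` that carry `standard_name`.

    `variables` maps variable names to dictionaries of their attributes.
    The ancillary_variables attribute is a blank-separated list of names.
    """
    names = variables[data_var].get("ancillary_variables", "").split()
    return [
        name
        for name in names
        if variables.get(name, {}).get("standard_name") == standard_name
    ]


# Hypothetical file contents, reduced to variable attributes only.
variables = {
    "temperature": {
        "standard_name": "air_temperature",
        "ancillary_variables": "temperature_qc instrument_state",
    },
    "temperature_qc": {"standard_name": "quality_flag"},
    "instrument_state": {"standard_name": "status_flag"},
}

qc_vars = ancillary_by_standard_name(variables, "temperature", "quality_flag")
```

If both ancillary variables used "status_flag", this lookup could not separate them without inspecting other metadata.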


regards,

Martin




________________________________
From: CF-metadata <cf-metadata-bounces at cgd.ucar.edu> on behalf of Andrew Barna <abarna at ucsd.edu>
Sent: 23 July 2019 00:23
To: Kehoe, Kenneth E.
Cc: cf-metadata at cgd.ucar.edu
Subject: Re: [CF-metadata] New standard_name of quality_flag for corresponding quality control variables

Ken,

I guess I don't see this proposed change as necessary, since the
distinction between the terms "quality" and "status" is really made in
the "flag_meanings" attribute, which is basically free-form/uncontrolled.
These attributes would still need to be used with the new name.

Let me rephrase my suggestion/question:
If this proposal is not adopted, but an example of how to use a
variable, with the standard name of "status_flag", to only indicate
data quality is included in the document, would that help?
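
A minimal sketch of what such an example could look like from the reader's side, written in Python. It assumes a "status_flag" variable whose flag_meanings describe only data quality; the meanings, values and masks below are hypothetical, while the decoding follows the flag_values / flag_masks rules of the CF flags section.

```python
def decode_flag(value, flag_meanings, flag_values=None, flag_masks=None):
    """Return the flag meanings that apply to one integer flag value."""
    meanings = flag_meanings.split()  # blank-separated list of meanings
    if flag_values is not None and flag_masks is not None:
        # Both present: mask the value, then compare with the flag value.
        return [
            m
            for m, fv, mask in zip(meanings, flag_values, flag_masks)
            if (value & mask) == fv
        ]
    if flag_masks is not None:
        # Independent bit fields: a non-zero AND means the condition is set.
        return [m for m, mask in zip(meanings, flag_masks) if value & mask]
    if flag_values is not None:
        # Mutually exclusive states: exact match on the value.
        return [m for m, fv in zip(meanings, flag_values) if value == fv]
    raise ValueError("need flag_values, flag_masks, or both")


# Hypothetical quality-only attributes on a status_flag variable.
exclusive = decode_flag(1, "good suspect bad", flag_values=[0, 1, 2])
tests_failed = decode_flag(5, "below_min above_max spike", flag_masks=[1, 2, 4])
```

Here `exclusive` holds the single state for a flag_values encoding, and `tests_failed` lists every quality test whose bit is set in a flag_masks encoding.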

-Barna

On Mon, Jul 22, 2019 at 1:22 PM Kehoe, Kenneth E. <kkehoe at ou.edu> wrote:
>
> Barna,
>
> Yes an update to the CF document should follow after the new
> standard_name is implemented. I think multiple examples are needed since
> status_flag covers many different types of state variables.
>
> Ken
>
>
>
> On 2019-7-22 10:35, Andrew Barna wrote:
> > Hi Martin, Ken,
> >
> > Is there anything wrong with including multiple "status_flag"
> > variables to capture all the separate states you wish? The CF document
> > unfortunately only includes an example of how to encode the status of
> > a sensor, but the actual meanings of the flag values are entirely up
> > to you, and this will not change with this proposal. Perhaps the CF
> > document would benefit from additional examples (e.g. one that only
> > shows data quality flags).
> >
> > -Barna
> >
> >
> > On Mon, Jul 22, 2019 at 9:04 AM Kehoe, Kenneth E. <kkehoe at ou.edu> wrote:
> >> Hi Martin,
> >>
> >> I see status encompassing multiple metadata pieces of information. For
> >> example it could be a state of the instrument as it cycles through a
> >> pre-programmed routine (Look at calibration target, look at sky, look at
> >> ground, look at second calibration target, repeat...). Or the sources of
> >> the inputs for a model where the availability or some other reason could
> >> require making a decision on what source(s) to use. For provenance this
> >> source information is important to report on a time step basis. Or the
> >> status could be a data provider's method to provide uncertainty
> >> information (I see this as incorrect but some people do see it this
> >> way). Each of these is important metadata, but the method of use is
> >> different than a strictly quality variable. A quality variable provides
> >> information indicating if the data should be used or possibly could be
> >> used in a weighted mean method to favor high quality data over low
> >> quality data. The way the metadata is used is different depending on the
> >> metadata type. A state of the instrument would be used for sub-setting
> >> calibration vs. data. There is no ambiguity in this as data from a
> >> calibration target is not used in a weather research analysis. But
> >> quality is more subjective and is decided by the data user. If the
> >> quality variable has 20 different quality tests the user would need to
> >> decide if all 20 test results should be used or only a subset. Also,
> >> the code for applying the quality is different than the state of the
> >> instrument view (in my example above).
> >>
> >> It is possible to have a quality test result from the state of the
> >> instrument, but not the other way around (typically). So I need a way to
> >> distinguish the two for automated or semi-automated tools. Hence my
> >> point of quality_flag essentially being a subset of status_flag.
> >>
> >> Ken
> >>
> >>
> >>
> >> On 2019-7-22 02:57, Martin Juckes - UKRI STFC wrote:
> >>> Dear Ken,
> >>>
> >>>
> >>> Can you expand on the distinction between "quality" and "status"? I understand that they are different in principle, but, in order to support this new standard name I think we need a clear objective statement of how we would want to distinguish between them in CF.
> >>>
> >>> The conventions section on flags (3.5) mixes the two up (http://cfconventions.org/cf-conventions/cf-conventions.html#flags ), so some re-wording of the document would also be needed,
> >>>
> >>> regards,
> >>> Martin
> >>>
> >>> ________________________________
> >>> From: CF-metadata <cf-metadata-bounces at cgd.ucar.edu> on behalf of Kehoe, Kenneth E. <kkehoe at ou.edu>
> >>> Sent: 19 July 2019 06:42
> >>> To: cf-metadata at cgd.ucar.edu
> >>> Subject: [CF-metadata] New standard_name of quality_flag for corresponding quality control variables
> >>>
> >>> Dear CF,
> >>>
> >>> I am proposing a new standard name of "quality_flag" to indicate a variable is purely a quality control variable. A quality control variable would use flag_values or flag_masks along with flag_meanings to allow declaring levels of quality or results from quality indicating tests of the data variable. This variable would be a subset of the more general "status_flag" standard name. Currently the definition of "status_flag" is:
> >>>
> >>> - A variable with the standard name of status_flag contains an indication of quality or other status of another data variable. The linkage between the data variable and the variable with the standard_name of status_flag is achieved using the ancillary_variables attribute.
> >>>
> >>> This definition covers any variable that describes the state or other status of a data variable, so a quality control variable cannot be distinguished by standard name alone from an instrument state, processing decision, source information, other needed metadata about the data variable, or any other type of ancillary variable. Since there is no other way to define a purely quality control variable, "status_flag" is too general for strictly quality control variables. With a method to define a variable as strictly quality control, a software tool can apply the results of quality control tests to the data based on requests by the user. This would not affect current datasets that use "status_flag", nor require a change to its definition beyond noting that the "quality_flag" standard name is available and is the better choice for pure quality control variables.
> >>>
> >>> Proposed addition:
> >>>
> >>> quality_flag = A variable with the standard name of quality_flag contains an indication of quality information of another data variable. The linkage between the data variable and the variable or variables with the standard_name of quality_flag is achieved using the ancillary_variables attribute.
> >>>
> >>> Proposed change:
> >>>
> >>> status_flag = A variable with the standard name of status_flag contains an indication of status of another data variable. The linkage between the data variable and the variable with the standard_name of status_flag is achieved using the ancillary_variables attribute. For data quality information use quality_flag.
> >>>
> >>> Thanks,
> >>>
> >>> Ken
> >>>
> >>>
> >>>
> >>> --
> >>> Kenneth E. Kehoe
> >>> Research Associate - University of Oklahoma
> >>> Cooperative Institute for Mesoscale Meteorological Studies
> >>> ARM Climate Research Facility - Data Quality Office
> >>> e-mail: kkehoe at ou.edu | Office: 303-497-4754
> >> _______________________________________________
> >> CF-metadata mailing list
> >> CF-metadata at cgd.ucar.edu
> >> http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
>
Received on Tue Jul 23 2019 - 06:50:27 BST
