[CF-metadata] Encoding Errors on variables in CF from Stephens, A on 2003-03-21 (Archive of CF discussions from 2002 to 2019 on the cf-metadata mailing list)

From: Stephens, A <A.Stephens>
Date: Fri, 21 Mar 2003 12:51:32 -0000

Dear all,

Here is an example from one of our datasets (which is not yet in NetCDF).

The data comes from the DOAS (Differential Optical Absorption Spectrometer)
instrument which records "NO2 mixing ratio" (ppbv). For each value there is
also an "NO2 error limit" (ppbv) and an "NO2 detection limit" (ppbv). The
error limit is quoted for 2 sigma random errors plus systematic
uncertainties derived from cross-sectional fits. The detection limit is
quoted for 2 sigma.

So, how could we encode this in CF? Here are two possible ways:

-----------------------------------------------
1. No frills, just three separate variables.

        float no2(time) ;
                no2:long_name = "Nitrogen Dioxide Volume Mixing Ratio" ;
                no2:units = "1e-9" ;
                no2:comment = "Units are given in parts per billion by volume." ;

        float no2_error_limit(time) ;
                no2_error_limit:long_name = "Nitrogen Dioxide Error Limit" ;
                no2_error_limit:units = "1e-9" ;
                no2_error_limit:comment = "Units are given in parts per billion by volume. The error limit is quoted for 2 sigma random errors plus systematic uncertainties derived from cross-sectional fits." ;

        float no2_detection_limit(time) ;
                no2_detection_limit:long_name = "Nitrogen Dioxide Detection Limit" ;
                no2_detection_limit:units = "1e-9" ;
                no2_detection_limit:comment = "Units are given in parts per billion by volume. The detection limit is quoted for 2 standard deviations." ;

-----------------------------------------------
2. Trying to link them and introduce standard_name linking...

        float no2(time) ;
                no2:standard_name = "no2_mixing_ratio" ; // NOT OFFICIAL AT PRESENT
                no2:long_name = "Nitrogen Dioxide Volume Mixing Ratio" ;
                no2:units = "1e-9" ;
                no2:error_var = "no2_error_limit" ;
                no2:comment = "Units are given in parts per billion by volume." ;

        float no2_error_limit(time) ;
                no2_error_limit:standard_name = "uncertainty_on_no2_mixing_ratio" ; // NOT OFFICIAL AT PRESENT
                no2_error_limit:long_name = "Nitrogen Dioxide Error Limit" ;
                no2_error_limit:units = "1e-9" ;
                no2_error_limit:root_var = "no2" ;
                no2_error_limit:comment = "Units are given in parts per billion by volume. The error limit is quoted for 2 sigma random errors plus systematic uncertainties derived from cross-sectional fits." ;

        float no2_detection_limit(time) ;
                no2_detection_limit:standard_name = "lod_on_no2_mixing_ratio" ; // NOT OFFICIAL AT PRESENT (lod = limit of detection)
                no2_detection_limit:long_name = "Nitrogen Dioxide Detection Limit" ;
                no2_detection_limit:units = "1e-9" ;
                no2_detection_limit:comment = "Units are given in parts per billion by volume. The detection limit is quoted for 2 standard deviations." ;

-----------------------------------------------
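As a rough illustration of how reading software might follow the links in
option 2, here is a minimal Python sketch. It assumes the attributes have
already been read into plain dicts (no netCDF library is used), and
find_linked is a hypothetical helper, not part of any existing tool. The
error_var and root_var attributes are the unofficial ones proposed above.

```python
# Attributes of each variable, modelled as plain dicts (assumption: they
# have already been read from the file by some other means).
variables = {
    "no2": {"units": "1e-9", "error_var": "no2_error_limit"},
    "no2_error_limit": {"units": "1e-9", "root_var": "no2"},
    "no2_detection_limit": {"units": "1e-9"},
}


def find_linked(variables, name, link_attr):
    """Return the name of the variable that `name` points to via `link_attr`.

    Returns None if the attribute is absent or names a variable that is
    not in the file, so a broken link never raises an error.
    """
    target = variables.get(name, {}).get(link_attr)
    if target is None or target not in variables:
        return None
    return target


error_name = find_linked(variables, "no2", "error_var")
root_name = find_linked(variables, "no2_error_limit", "root_var")
unlinked = find_linked(variables, "no2_detection_limit", "root_var")
```

A missing or dangling link simply yields None, so a reader that does not
find the error variable can carry on with the data variable alone.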

Detection Limit introduces yet another twist, as we might want a
standard_name prefix for that as well. However, many users may prefer that
any values below the detection limit are excluded (i.e. encoded as
_FillValue), but I suspect the actual value should still be recorded for
reference.
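
One way to have both, sketched in Python under some assumptions: the file
keeps the measured values, and the masking is applied only when the data is
read. The sample numbers and the fill value of -999.0 are made up for this
sketch, and mask_below_detection is a hypothetical helper.

```python
FILL_VALUE = -999.0  # hypothetical _FillValue chosen for this sketch

no2 = [12.4, 0.3, 5.1, 0.8]                  # made-up mixing ratios, ppbv
no2_detection_limit = [1.0, 1.0, 1.0, 1.0]   # made-up detection limits, ppbv


def mask_below_detection(values, limits, fill=FILL_VALUE):
    """Return a new list in which each value below its limit becomes `fill`."""
    return [v if v >= lim else fill for v, lim in zip(values, limits)]


# The stored values in `no2` are left untouched; only the copy is masked.
masked = mask_below_detection(no2, no2_detection_limit)
```

Users who want the sub-limit values can read no2 directly, and those who do
not can use the masked copy.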

Any thoughts?

Ag

-----Original Message-----
From: John Evans [mailto:johnevans at acm.org]
Sent: 20 March 2003 14:56
To: Bryan Lawrence
Cc: cf-metadata at cgd.ucar.edu
Subject: Re: [CF-metadata] Encoding Errors on variables in CF

I'd like to jump into this (and hopefully not expose my ignorance).
I have a similar situation where it's not error that I'm tracking but
data quality, and I guess I've been thinking along Bryan's lines here.
I'm recording realtime buoy observations (wind speed, direction, wave
height, etc.), and each observation is assigned a quality flag. Right now
I do this by keeping another variable attached to the parameter in
question. For example, here's an excerpt from a current meter file:

        float current_speed(time, depth, lat, lon) ;
                current_speed:long_name = "Current Speed" ;
                current_speed:standard_name = "sea_water_speed" ;
                current_speed:short_name = "CSPD" ;
                current_speed:scale_factor = 1. ;
                current_speed:add_offset = 0. ;
                current_speed:_FillValue = -999.f ;
                current_speed:units = "m/s" ;
                current_speed:valid_range = 0.f, 10.f ;
                current_speed:calibration_coeffs = 0., 0.2933, 0., 0. ;
                current_speed:epic_code = 300 ;
        byte current_speed_qc(time, depth, lat, lon) ;
                current_speed_qc:long_name = "Current Speed Quality" ;
                current_speed_qc:short_name = "CSPDQ" ;
                current_speed_qc:scale_factor = 1. ;
                current_speed_qc:add_offset = 0. ;
                current_speed_qc:_FillValue = -128b ;
                current_speed_qc:units = "none" ;
                current_speed_qc:valid_range = -127b, 127b ;
                current_speed_qc:quality_good = 0b ;
                current_speed_qc:sensor_nonfunctional = 1b ;
                current_speed_qc:outside_valid_range = 2b ;

If the data were only bad when it was outside the valid range, there would
be no need for the quality variable, but there are cases where the data is
within the valid range and we still believe it to be bad. So right now I
have a quality variable for each geophysical parameter that I track. I've
been wondering for a while how exactly this would fit into CF. I'm just
tacking the letters "_qc" onto the name of each quality variable, but a CF
attribute formally identifying it as a quality variable would certainly be
easy to add. I'm open to other suggestions, and am currently in the middle
of a push to move my data into total CF compliance, so if there's a better
way for me to do this, I'm all ears.
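
To make the role of the quality variable concrete, here is a small Python
sketch of how a reader might use it. The flag values are the ones in the
current_speed_qc attributes above; the sample speeds and flags are made up,
and good_values and describe are hypothetical helpers.

```python
# Flag values taken from the current_speed_qc attributes in the listing.
QC_MEANINGS = {
    0: "quality_good",
    1: "sensor_nonfunctional",
    2: "outside_valid_range",
}

current_speed = [0.42, 0.51, 11.3, 0.47]  # made-up speeds, m/s
current_speed_qc = [0, 1, 2, 0]           # made-up flags, one per speed


def good_values(values, flags, good_flag=0):
    """Keep only the values whose quality flag equals `good_flag`."""
    return [v for v, f in zip(values, flags) if f == good_flag]


def describe(flags):
    """Translate numeric flags into their meanings ('unknown' if unmapped)."""
    return [QC_MEANINGS.get(f, "unknown") for f in flags]


trusted = good_values(current_speed, current_speed_qc)
labels = describe(current_speed_qc)
```

Note that the second sample (0.51 m/s) is inside the valid range of 0 to 10
but is still dropped, which is the case the valid range alone cannot express.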

On Thu, 20 Mar 2003, Bryan Lawrence wrote:

>Hi Jonathan
>
>> variables. However, I am not convinced about introducing a linking
>> attribute, because I don't think the relation is so close that it needs
>> to be given this special status.
>
>ok, I can accept that argument but ... see below
>
>> In this particular case, you can look for another variable with the same
>> spatiotemporal metadata and a standard_name which contains the original
>> one as a pattern. That would be very general.
>
>... but I had got the thought from the previous emails that we didn't want
>to enforce "error" into standard names ... and I think it would introduce a
>new problem:
>
>My radiosonde example is still a good one, and it's real:
>
>For each measurement in a profile I have a number which is an error (which
>might be a physical error quantity but is more likely a percentage). So,
>for an individual radiosonde profile, say of humidity, I have the
>measurement and an error (let's say a percentage, which increases with
>height). So, I can see that I could have two variables in the file which
>are the humidity and the error (percentage).
>
>I could then produce a climatology from a year's measurements, and now the
>error of interest is the standard error associated with the variance
>between measurements within the month, and it has rather different units
>than the error did in the previous case, still with the same names for the
>variables if we were trying to introduce an error standard name for every
>standard name ... (have I misunderstood this point?).
>
>The key point here is that different communities have rather different
>ideas of what we mean by an error ... even with the same source data ...
>albeit at different stages of maturity.
>
>I would have thought it would have been cleaner to have an optional link
>between two variables ... software could simply fail gracefully if the link
>was broken. This is nothing more than a "data hyperlink" ...
>
>... but I don't feel strongly about this, as long as we can find a sensible
>way of doing it ...
>
>Bryan
>
>
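
For comparison, the pattern-matching lookup Jonathan describes in the quoted
text could look roughly like the Python sketch below. It assumes variable
metadata held in plain dicts; find_related is a hypothetical helper, and
"relative_humidity_error" is an invented name used only for illustration,
not an official standard_name.

```python
variables = {
    "humidity": {
        "dims": ("time", "height"),
        "standard_name": "relative_humidity",
    },
    "humidity_error": {
        "dims": ("time", "height"),
        "standard_name": "relative_humidity_error",  # hypothetical name
    },
    "humidity_clim": {
        "dims": ("month", "height"),
        "standard_name": "relative_humidity",
    },
    "temperature": {
        "dims": ("time", "height"),
        "standard_name": "air_temperature",
    },
}


def find_related(variables, name):
    """Find variables on the same dimensions as `name` whose standard_name
    contains the standard_name of `name` as a substring."""
    base = variables[name]
    return sorted(
        other
        for other, attrs in variables.items()
        if other != name
        and attrs.get("dims") == base.get("dims")
        and base["standard_name"] in attrs.get("standard_name", "")
    )


related = find_related(variables, "humidity")
```

This needs no linking attribute, but it does depend on the error variable
carrying a standard_name built from the original one, which is the point
Bryan questions.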

-- 
John Evans           
johnevans at acm.org   

Received on Fri Mar 21 2003 - 05:51:32 GMT
