Hello Jim et al.
I have been considering your example and its implications some more; it has been most helpful.
I still find some attraction in the proposal I have put forward in
https://cf-pcmdi.llnl.gov/trac/ticket/98
Additionally, I wonder whether the umbrella variables, proposed in
https://cf-pcmdi.llnl.gov/trac/ticket/79 as a method to provide collections of data variables as part of a larger entity, may bring functionality to meet your use case rather neatly.
That said, I have also been considering how to meet your requirement, as stated, for use of ancillary data whilst addressing potential issues with the current specification.
I am interested in providing the required functionality for users of ancillary variables, but I have concerns that the current loose definition essentially allows anything to be defined as ancillary data for anything else and this seems wrong to me; it brings significant complications for downstream applications that I am not convinced are worth addressing.
I wonder if I can use the following terms as I define them here:
- Domain: the physical and theoretical phenomena that the data is defined with respect to;
- Sampling: the regime of identified cells within the domain where data and metadata are defined;
then apply this to CF as follows:
The definitions of the collection of Coordinates and Auxiliary Coordinates define a domain which the Data Variable is sampled from: 'definition' in this case refers to the attributes of the coord including standard_name, units, coordinate_reference_system etc.
The dimensionality, size and values of the Coordinate and Auxiliary Coordinate arrays define the sampling regime from this domain.
Thus a CF data variable defines a phenomenon and a domain and sampling upon which data values for that phenomenon are provided.
Such a terminology set would allow a constraint to be placed on the ancillary data that it must be from the domain or a subset of the domain of the referencing data variable, although the sampling may differ.
I think this would allow the case you have presented, where the domain is shared but the sampling differs, while limiting some of the potentially impractical cases from being encoded.
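As a rough illustration of the proposed constraint (a sketch only, not part of the CF conventions): assume a variable's domain can be summarised as the set of (standard_name, units) pairs of its coordinates, with the coordinate values describing the sampling. The dictionary layout and the names `domain_of` and `ancillary_domain_ok` are hypothetical, chosen for this example.

```python
def domain_of(variable):
    """Summarise a variable's domain as a set of coordinate definitions.

    Only the definition (standard_name, units) is used; the coordinate
    values, which describe the sampling, are deliberately ignored.
    """
    return {(c["standard_name"], c["units"]) for c in variable["coordinates"]}


def ancillary_domain_ok(data_var, ancillary_var):
    """True if the ancillary domain is the data variable's domain or a subset of it."""
    return domain_of(ancillary_var) <= domain_of(data_var)


# Hypothetical coordinates: lat and lat_fine share a definition but differ in sampling.
lat = {"standard_name": "latitude", "units": "degrees_north", "values": [0.0, 10.0]}
lat_fine = {"standard_name": "latitude", "units": "degrees_north", "values": [0.0, 5.0, 10.0]}
lon = {"standard_name": "longitude", "units": "degrees_east", "values": [0.0, 10.0]}
time = {"standard_name": "time", "units": "days since 2000-01-01", "values": [0.0, 1.0]}

data_var = {"coordinates": [lat, lon]}
same_domain_other_sampling = {"coordinates": [lat_fine, lon]}
time_only = {"coordinates": [time]}

print(ancillary_domain_ok(data_var, same_domain_other_sampling))  # True
print(ancillary_domain_ok(data_var, time_only))                   # False
```

Comparing definitions rather than values is what lets the sampling differ while the domain is still required to match.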
I think the statement in the conventions:
'The nature of the relationship between variables associated via ancillary_variables must be determined by other attributes.'
is important; I think there should be enough guidance to enable this to occur. I would like to see some text detailing what the allowable 'relationships' should be and a statement that there should be a relationship.
I am keen to mandate that a data variable defined with respect to latitude and longitude only cannot have an ancillary variable defined with respect to time only; no relationship can be determined between these, but the conventions do not say that this is not allowed.
What do you think?
many thanks for your input on this topic
mark
________________________________
From: Jim Biard [jim.biard at noaa.gov]
Sent: 15 March 2013 12:58
To: Hedley, Mark
Cc: cf-metadata at cgd.ucar.edu
Subject: Re: [CF-metadata] Ancillary Data
Mark,
I'm looking at this from the standpoint of the use of an ancillary variable to provide "metadata about the individual values of another data variable". The way I think of it, the atmosphere profile used to produce a given <energy deposited> value falls under the category of metadata for the <energy deposited> value. I don't see any need to restrict this metadata to be a scalar value per primary variable value. So, in my example, there is a model atmosphere density profile that can be associated with each value of the <energy deposited> variable. I feel that the profile provides information about the <energy deposited> value. In fact, we already do this sort of thing with bounds variables, which to my mind are specializations of ancillary variables.
If there is no driving reason to restrict the definition of ancillary variables, I'd encourage folks to leave it as is. I think that with proper attention to use of names and coordinates, an ancillary variable that has multiple values for each primary variable can be made easily understandable.
Grace and peace,
Jim
Jim Biard
Research Scholar
Cooperative Institute for Climate and Satellites
Remote Sensing and Applications Division
National Climatic Data Center
151 Patton Ave, Asheville, NC 28801-5001
jim.biard at noaa.gov
828-271-4900
On Mar 15, 2013, at 7:37 AM, "Hedley, Mark" <mark.hedley at metoffice.gov.uk> wrote:
Hello Jim
you present an interesting case, thank you
I can see how I can relate
<energy deposited>(x,y,z,t)
to an ancillary variable of
<model atmosphere density profile>(x,y,z)
In this case, for any value in my <energy deposited> data array I can retrieve an appropriate value for my ancillary variable, <model atmosphere density profile>.
I am assuming that I can aggregate (integrate?) my <energy deposited> data over z, to give the total energy deposited as a function of x, y and t.
The atmosphere density profile was an input to this calculation, I believe.
If I try to store this as an ancillary variable for my result, I am no longer able to reference an appropriate value in the <model atmosphere density profile> data array for a given value in the <energy deposited> data array.
The conventions state
'The nature of the relationship between variables associated via ancillary_variables must be determined by other attributes.'
(http://cf-pcmdi.llnl.gov/documents/cf-conventions/1.6/ch03s04.html)
I am worried that in this case I can no longer understand this association. The ancillary variable is defined with respect to a different domain from the data variable.
I might happily choose to ship two data variables in the same file, but do I really mean that the <model atmosphere density profile> is an intrinsic part of my <energy deposited> dataset?
what do you think?
mark
________________________________
From: CF-metadata [cf-metadata-bounces at cgd.ucar.edu] on behalf of Jim Biard [jim.biard at noaa.gov]
Sent: 14 March 2013 13:51
To: cf-metadata at cgd.ucar.edu
Subject: Re: [CF-metadata] Ancillary Data
Mark,
I've been following this thread silently, but your statement in your last posting has prompted me to speak up. I can easily imagine a situation in which I might desire to have one variable reference another variable in which not all dimensions are shared. As an example, let's say that I have a variable a[X,Y,T], which contains the total energy deposited per day into the atmosphere by charged particles (from the solar wind & Van Allen belts) on a lon/lat grid as a function of time. (Something I used to work on many years ago.) Let's then say that I have another variable, b[X,Y,Z], which represents the model atmosphere density profile on the same lon/lat grid as a function of altitude. This density profile was used in the calculation of the energy deposition values stored in variable a. Variable b is clearly valid as an ancillary variable for a, isn't it?
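One way to picture this relationship is sketched below, with made-up sizes and placeholder values; the nested-list layout and the function name `profile_for` are assumptions for illustration only. Because a and b share X and Y, each value a[x][y][t] can be paired with the whole density profile b[x][y], a vector along Z rather than a single value.

```python
# Hypothetical sizes for the X, Y, T and Z dimensions.
NX, NY, NT, NZ = 2, 2, 3, 4

# a[x][y][t]: total energy deposited per day (placeholder values).
a = [[[float(100 * x + 10 * y + t) for t in range(NT)]
      for y in range(NY)] for x in range(NX)]

# b[x][y][z]: model atmosphere density profile (placeholder values).
b = [[[float(x + y) / (z + 1) for z in range(NZ)]
      for y in range(NY)] for x in range(NX)]


def profile_for(x, y, t):
    """Return the density profile associated with a[x][y][t].

    The T index is not needed to locate the profile because b does not
    vary with time; the unshared Z dimension is returned in full.
    """
    return b[x][y]


value = a[1][0][2]
profile = profile_for(1, 0, 2)
print(value, len(profile))  # 102.0 4
```

The lookup relies only on the shared dimensions; the dimension that is not shared is carried along whole.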
Grace and peace,
Jim
Jim Biard
Research Scholar
Cooperative Institute for Climate and Satellites
Remote Sensing and Applications Division
National Climatic Data Center
151 Patton Ave, Asheville, NC 28801-5001
jim.biard at noaa.gov
828-271-4900
On Mar 14, 2013, at 7:53 AM, "Hedley, Mark" <mark.hedley at metoffice.gov.uk> wrote:
Thank you for the responses on this topic.
So far I have not found an example of ancillary variable use where
the ancillary variable references file dimensions which are not referenced by the data variable with the referencing ancillary_variables attribute.
I do not think this should be valid, as it is not practical to relate the ancillary variable data to the data variable. However, I do not think the Conventions are clear enough in banning this.
I have raised a trac ticket:
https://cf-pcmdi.llnl.gov/trac/ticket/98,
which requires a moderator; please may someone volunteer to moderate this ticket?
thank you
mark
________________________________________
From: CF-metadata [cf-metadata-bounces at cgd.ucar.edu] on behalf of Hedley, Mark [mark.hedley at metoffice.gov.uk]
Sent: 18 February 2013 10:45
To: andrew walsh
Cc: cf-metadata at cgd.ucar.edu
Subject: Re: [CF-metadata] Ancillary Data
Hello Andrew
thank you for the response, my reading of your cases is that they are all 'a' or 'b' in the terms I stated.
There are no ancillary variables which reference a dimension of the netCDF file not referenced by the data variable with the ancillary_variables attribute.
I think these are the types of case I am particularly concerned to find examples of, or, preferably, discount explicitly.
much obliged
mark
On 2/13/13 12:15 PM, Hedley, Mark wrote:
a. reference the same file dimensions as the data variable with the
ancillary_variables attribute
b. reference a subset of the file dimensions referenced by the data
variable with the ancillary_variables attribute
c. reference file dimensions which are not referenced by the data variable
with the ancillary_variables attribute
-----Original Message-----
From: andrew walsh [mailto:awalsh at metoc.gov.au]
Sent: Fri 2/15/2013 3:20 AM
To: Hedley, Mark
Cc: ngalbraith at whoi.edu; cf-metadata at cgd.ucar.edu
Subject: Re: [CF-metadata] Ancillary Data
Hi Mark,
We are using the ancillary_variables attribute in a real world case for CTD profile data (1 netCDF per CTD profile). Not sure if our use case fits with your examples a,b,c but here is an abbreviated CDL version:-
dimensions:
    pressure = UNLIMITED ; // (9 currently)
variables:
    double time ;
        time:standard_name = "time" ;
        ....
    byte time_qc_flag ;
        time_qc_flag:long_name = "quality control flag for time (Level 1 flag)" ;
        ....
    double latitude ;
        latitude:standard_name = "latitude" ;
        ...
    double longitude ;
        longitude:standard_name = "longitude" ;
        ...
    byte position_qc_flag ;
        position_qc_flag:long_name = "quality control flag for position (Level 1 flag)" ;
        ....
    double pressure(pressure) ;
        pressure:standard_name = "sea_water_pressure" ;
        ...
    double temperature(pressure) ;
        temperature:_FillValue = -99.99 ;
        temperature:standard_name = "sea_water_temperature" ;
        temperature:units = "degrees_C" ;
        temperature:valid_min = -2 ;
        temperature:valid_max = 40 ;
        temperature:ancillary_variables = "temperature_whole_profile_flag temperature_qc_flag temperature_sd_test" ;
        temperature:coordinates = "time latitude longitude pressure" ;
    byte temperature_whole_profile_flag ;
        temperature_whole_profile_flag:long_name = "qc flag for whole temperature profile (primary L1 flag)" ;
        temperature_whole_profile_flag:quality_control_convention = "Proposed IODE qc scheme March 2012" ;
        temperature_whole_profile_flag:valid_min = 1 ;
        temperature_whole_profile_flag:valid_max = 9 ;
        temperature_whole_profile_flag:flag_values = 1b, 2b, 3b, 4b, 9b ;
        temperature_whole_profile_flag:flag_meanings = "good not_evaluated_or_unknown suspect bad missing" ;
    byte temperature_qc_flag(pressure) ;
        temperature_qc_flag:long_name = "quality control flag for temperature (primary Level 1 flag)" ;
        temperature_qc_flag:standard_name = "sea_water_temperature status_flag" ;
        temperature_qc_flag:quality_control_convention = "Proposed IODE qc scheme March 2012" ;
        temperature_qc_flag:valid_min = 1 ;
        temperature_qc_flag:valid_max = 9 ;
        temperature_qc_flag:flag_values = 1b, 2b, 3b, 4b, 9b ;
        temperature_qc_flag:flag_meanings = "good not_evaluated_or_unknown suspect bad missing" ;
        temperature_qc_flag:coordinates = "time latitude longitude pressure" ;
    byte temperature_sd_test(pressure) ;
        temperature_sd_test:long_name = "qc flag for monthly temperature standard deviation test (secondary L2 flag)" ;
        temperature_sd_test:quality_control_convention = "Proposed IODE qc scheme March 2012" ;
        temperature_sd_test:valid_min = 0 ;
        temperature_sd_test:valid_max = 2 ;
        temperature_sd_test:flag_values = 0b, 1b, 2b ;
        temperature_sd_test:flag_meanings = "passed failed unknown" ;
        temperature_sd_test:coordinates = "time latitude longitude pressure" ;
    double salinity(pressure) ;
        salinity:_FillValue = -99.99 ;
        salinity:standard_name = "sea_water_practical_salinity" ;
        salinity:units = "psu" ;
        salinity:valid_min = 0 ;
        salinity:valid_max = 45 ;
        salinity:ancillary_variables = "salinity_whole_profile_flag salinity_qc_flag salinity_sd_test" ;
        salinity:coordinates = "time latitude longitude pressure" ;
    byte salinity_whole_profile_flag ;
        salinity_whole_profile_flag:long_name = "qc flag for whole salinity profile (primary L1 flag)" ;
        salinity_whole_profile_flag:quality_control_convention = "Proposed IODE qc scheme March 2012" ;
        salinity_whole_profile_flag:valid_min = 1 ;
        salinity_whole_profile_flag:valid_max = 9 ;
        salinity_whole_profile_flag:flag_values = 1b, 2b, 3b, 4b, 9b ;
        salinity_whole_profile_flag:flag_meanings = "good not_evaluated_or_unknown suspect bad missing" ;
    byte salinity_qc_flag(pressure) ;
        salinity_qc_flag:long_name = "quality control flag for salinity (primary Level 1 flag)" ;
        salinity_qc_flag:standard_name = "sea_water_practical_salinity status_flag" ;
        salinity_qc_flag:quality_control_convention = "Proposed IODE qc scheme March 2012" ;
        salinity_qc_flag:valid_min = 1 ;
        salinity_qc_flag:valid_max = 9 ;
        salinity_qc_flag:flag_values = 1b, 2b, 3b, 4b, 9b ;
        salinity_qc_flag:flag_meanings = "good not_evaluated_or_unknown suspect bad missing" ;
        salinity_qc_flag:coordinates = "time latitude longitude pressure" ;
    byte salinity_sd_test(pressure) ;
        salinity_sd_test:long_name = "qc flag for monthly salinity standard deviation test (secondary L2 flag)" ;
        salinity_sd_test:quality_control_convention = "Proposed IODE qc scheme March 2012" ;
        salinity_sd_test:valid_min = 0 ;
        salinity_sd_test:valid_max = 2 ;
        salinity_sd_test:flag_values = 0b, 1b, 2b ;
        salinity_sd_test:flag_meanings = "passed failed unknown" ;
        salinity_sd_test:coordinates = "time latitude longitude pressure" ;
    int profile ; // Unique integer to identify each profile
        profile:long_name = "profile identifier" ;
        ....
Andrew Walsh
----- Original Message -----
From: "Hedley, Mark" <mark.hedley at metoffice.gov.uk>
To: <ngalbraith at whoi.edu>; <cf-metadata at cgd.ucar.edu>
Sent: Thursday, February 14, 2013 10:40 PM
Subject: Re: [CF-metadata] Ancillary Data
Hi Nan
I think I understand your approach; it seems logical and helpful to me.
It feels like all your examples are in my category b:
b. reference a subset of the file dimensions referenced by the data variable
with the ancillary_variables attribute
and thus relate to the data variable in the same way auxiliary_coordinates do,
just with different inferred semantics.
many thanks for the feedback
mark
-----Original Message-----
From: CF-metadata on behalf of Nan Galbraith
Sent: Wed 13/02/2013 19:55
To: cf-metadata at cgd.ucar.edu
Subject: Re: [CF-metadata] Ancillary Data
I use a singleton variable to record the magnetic correction that's been applied to my wind and/or current variables; it's dim(1) and is listed as an ancillary to any variables that have been affected by the rotation.
It could be an attribute of each of those variables, but making it a free-standing variable lets me give it units and attributes. That's important because I record the model name and version date, the URL of the NGDC calculator site, the inputs to the model (date and location for which the calculation was done) and the estimated rate of change.
Maybe this is just a technicality - but I also use an empty 'container' variable for instruments, which has ancillary variables with a depth dimension for manufacturer, model, serial number, and reference URL. The instrument variable is tied to 'obs data' variables via NODC's 'instrument' attribute, but has no dimensions of its own; the individual component variables (like serial number) have a dimension that matches the depth dimension of the obs data variables.
- Nan
On 2/13/13 12:15 PM, Hedley, Mark wrote:
Hello CF community
I have been perusing the CF conventions again, particularly the section on Ancillary Data:
http://cf-pcmdi.llnl.gov/documents/cf-conventions/1.6/ch03s04.html
The conventions make the statement:
The nature of the relationship between variables associated via
ancillary_variables must be determined by other attributes.
The example given provides data variables which reference the same dimensions in the file; thus their data arrays are the same size. However, the conventions do not seem to mandate this.
I am interested in the use of the ancillary_variables attribute in real world datasets.
Do people have examples they can share of ancillary datasets which:
a. reference the same file dimensions as the data variable with the
ancillary_variables attribute
b. reference a subset of the file dimensions referenced by the data
variable with the ancillary_variables attribute
c. reference file dimensions which are not referenced by the data variable
with the ancillary_variables attribute
I am particularly interested in examples of case c, as I feel this is
markedly different from cases a and b and requires a different kind of
support if it is in use.
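For concreteness, the three cases can be told apart by comparing dimension names alone. The following is a minimal sketch, assuming each variable is described only by the names of the file dimensions it references; the function name `classify_ancillary` and the dimension names are hypothetical.

```python
def classify_ancillary(data_dims, ancillary_dims):
    """Classify an ancillary variable as case 'a', 'b' or 'c'.

    a: references the same file dimensions as the data variable
    b: references a proper subset of the data variable's dimensions
       (a dimensionless ancillary variable falls here too)
    c: references at least one dimension the data variable does not
    """
    data, ancillary = set(data_dims), set(ancillary_dims)
    if ancillary == data:
        return "a"
    if ancillary < data:
        return "b"
    return "c"


print(classify_ancillary(("time", "lat", "lon"), ("time", "lat", "lon")))  # a
print(classify_ancillary(("time", "lat", "lon"), ("lat", "lon")))          # b
print(classify_ancillary(("time", "lat", "lon"), ("lat", "lon", "depth"))) # c
```

Note that dimension order is ignored here; a check that also cares about array shape would need to compare ordered tuples instead of sets.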
many thanks
mark
_______________________________________________
CF-metadata mailing list
CF-metadata at cgd.ucar.edu
http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
--
*******************************************************
* Nan Galbraith Information Systems Specialist *
* Upper Ocean Processes Group Mail Stop 29 *
* Woods Hole Oceanographic Institution *
* Woods Hole, MA 02543 (508) 289-2444 *
*******************************************************
Received on Thu Apr 04 2013 - 03:10:30 BST