
[CF-metadata] Usage of histogram_of_X_over_Z

From: martin.juckes at stfc.ac.uk <martin.juckes>
Date: Wed, 12 Oct 2016 18:05:06 +0000

Hello,

There are two standard names of the form histogram_of_..... in the CF Standard Name list (at version 36): histogram_of_backscattering_ratio_over_height_above_reference_ellipsoid and histogram_of_equivalent_reflectivity_factor_over_height_above_reference_ellipsoid. Both of these were used in CMIP5 and are set to be used in CMIP6, but the usage does not appear to match the standard name descriptions.

The possible confusion is over the role of different coordinates. The CF definitions say: '"histogram_of_X[_over_Z]" means histogram (i.e. number of counts for each range of X) of variations (over Z) of X.' This implies to me that you start with a function of Z and possibly other coordinates and end up with a function of X and the other coordinates. For example, if the source data is X(lat,lon,Z), then the histogram data will be of the form frequency(lat,lon,X).
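To illustrate this reading of the definition, here is a minimal NumPy sketch. The array sizes, the bin edges and the helper name histogram_over_last_axis are assumptions made for the example only; they do not come from the CF conventions or from any CMIP tool. It starts from X(lat,lon,Z) and ends with frequency(lat,lon,X): the Z axis is consumed and replaced by an axis of X bins.

```python
import numpy as np

# Synthetic source data X(lat, lon, Z); the sizes are arbitrary.
rng = np.random.default_rng(0)
nlat, nlon, nz = 2, 3, 40
x = rng.uniform(0.0, 10.0, size=(nlat, nlon, nz))

# Bin edges for X (assumed for the example): 5 bins covering 0..10.
bin_edges = np.linspace(0.0, 10.0, 6)


def histogram_over_last_axis(data, edges):
    """Count values per bin along the last axis (the 'over Z' axis)."""
    return np.apply_along_axis(
        lambda column: np.histogram(column, bins=edges)[0], -1, data
    )


# frequency(lat, lon, X): the Z axis has been replaced by X bins.
frequency = histogram_over_last_axis(x, bin_edges)
print(x.shape, "->", frequency.shape)
```

Under this interpretation Z no longer appears as a coordinate of the result, which is the point of contrast with the CMIP5/CMIP6 variables discussed next.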

In the two CMIP5/CMIP6 draft variables (cfadLidarsr532, cfadDbze94) using these standard names the "Z" coordinate which is included in the standard name ("height_above_reference_ellipsoid") is one of the coordinates of the histogram data variable. Both these variables appear to be joint distributions (frequency of X and Y values) over sub-grid variability as a function of latitude, longitude and time.

I've been reviewing these existing definitions in some detail because there are some new distribution variables in the request and I'd like to make sure that we have a consistent approach.

If we need to describe a variable which carries a joint distribution of X and Y, then the variable will have to use X and Y as coordinates, so perhaps we can simplify the process by leaving them out of the standard name. Similarly, the "over_Z" part of the name would be better expressed as a cell_methods construct. This line of reasoning suggests using a new standard name such as "frequency_distribution" (units "1"). The only difficulty is that the frequency distribution might be a function of the quantities X and Y (scattering ratio and cloud top height for cfadLidarsr532) and also of latitude, longitude and time. There should be some way of distinguishing the different roles of these 5 coordinates: it is the distribution of X and Y as a function of latitude, longitude and time. I think this could be done conveniently by introducing a single new attribute, e.g. "bin_coords: X Y".
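As a rough sketch of how such a variable might look, the following builds a joint distribution of X and Y for each (time, lat, lon) cell from synthetic sub-grid samples. The plain dictionary standing in for a netCDF variable, the sample and bin sizes, and the helper split_coordinate_roles are all assumptions for illustration; "bin_coords" and "frequency_distribution" are only the proposals made in this message, not existing CF features.

```python
import numpy as np

# Synthetic sub-grid samples of X and Y for every (time, lat, lon) cell.
rng = np.random.default_rng(1)
ntime, nlat, nlon, nsub = 1, 2, 2, 50
x = rng.uniform(0.0, 1.0, size=(ntime, nlat, nlon, nsub))
y = rng.uniform(0.0, 1.0, size=(ntime, nlat, nlon, nsub))

# Assumed bin edges: 3 bins for X and 4 bins for Y.
x_edges = np.linspace(0.0, 1.0, 4)
y_edges = np.linspace(0.0, 1.0, 5)

# Joint frequency of (X, Y) as a function of time, lat and lon.
freq = np.empty((ntime, nlat, nlon, len(x_edges) - 1, len(y_edges) - 1))
for cell in np.ndindex(ntime, nlat, nlon):
    counts, _, _ = np.histogram2d(x[cell], y[cell], bins=[x_edges, y_edges])
    freq[cell] = counts / nsub  # fraction of samples, units "1"

# Dictionary stand-in for the variable's metadata (hypothetical layout).
variable = {
    "standard_name": "frequency_distribution",  # proposed name, not in CF
    "units": "1",
    "dimensions": ("time", "lat", "lon", "X", "Y"),
    "bin_coords": "X Y",  # proposed attribute, not in CF
    "data": freq,
}


def split_coordinate_roles(var):
    """Separate the bin coordinates from the coordinates the distribution varies over."""
    bins = var["bin_coords"].split()
    others = [dim for dim in var["dimensions"] if dim not in bins]
    return bins, others


bins, others = split_coordinate_roles(variable)
print("distribution of", bins, "as a function of", others)
```

The single attribute is enough for a reader of the file to tell which of the five coordinates are binned quantities and which are the ordinary axes of variation.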

"frequency_distribution" could be used for single or joint distributions.

My questions to the list are:
(1) am I missing something in my interpretation of the existing histogram_of_... names?
(2) if not, is the adoption of a "frequency_distribution" standard name an appropriate way forward?

regards,
Martin

Received on Wed Oct 12 2016 - 12:05:06 BST

