This page covers many of the most common questions asked about the Climate and Forecast conventions (and Standard Names). If you have a question that isn’t on this list, please ask it of the CF-metadata mail list, so that the CF community can respond. We will use that list as the basis for additional content for this set of questions.
Note that many links in this FAQ point to the CF 1.6 specification (the currently released version). However, others point to the in-progress CF 1.7 specification, which may provide better explanations or context, or more advanced capabilities. The CF 1.7 specification is still in development, but since the newer text generally is not in conflict with the CF 1.6 specification, it is a good source of best practices to follow.
The questions are organized by topic. Click on any question to go to its answer.
This section includes general background about the CF conventions.
Learning about and changing the CF convention.
The detailed and big picture concepts in CF.
General and specific information about the purpose and mechanisms of standard names.
These questions are not strictly part of CF, but CF depends on this understanding.
This section is about the meta-question of procedures involved to update CF standards documentation.
The conventions for CF (Climate and Forecast) metadata are designed to promote the processing and sharing of files created with the NetCDF API. The conventions define metadata that provide a definitive description of what the data in each variable represents, and the spatial and temporal properties of the data. This enables users of data from different sources to decide which quantities are comparable, and facilitates building applications with powerful extraction, regridding, and display capabilities.
Principles of CF include self-describing data (no external tables needed for understanding); metadata equally readable by humans and software; minimum redundancy and maximum simplicity; and a development process focusing on existing needs.
The CF conventions are maintained by volunteers, led by a Governance Panel and assisted by the Conventions Committee and Standard Names Committee. (See CF Governance.) Changes to the conventions are proposed and settled by the community, using the CF-Metadata Mailing List and CF Metadata Trac site. Many of the principles of CF operations follow the proposals at these rules for CF conventions changes.
Work began on CF in 2001 and Version 1.0 was released in October 2003. Now at Version 1.6, it has been used for tens of thousands of distinct netCDF products, has an active discussion list with hundreds of participants, and is a mature technical specification. Because it is community-supported and community-driven, turnaround on questions and changes can take a little time, but both are generally thoroughly considered.
CF is a convention built on top of the netCDF standard, and it generalizes and extends the netCDF COARDS conventions. Whereas the netCDF specification is designed to be domain-agnostic, the CF conventions were developed specifically to target climatology and weather forecasting domains. Since then, the CF conventions have targeted earth science domains more broadly, and expanded from a focus on models to include observational data.
The conventions of netCDF and COARDS are assumed and upheld by CF. Where COARDS is adequate, CF does not provide an alternative, while all of CF’s extensions to COARDS are optional and provide new functionality.
A motivation for developing CF was the need for extra features not found in netCDF or COARDS. These include conventions for grid-cell boundaries, horizontal grids other than latitude-longitude, recording common statistical operations, standardised identification of physical quantities, non-spatiotemporal axes, climatological statistics and data compression. These needs were driven by the original user community developing the CF conventions, the climatology and weather forecasting science community.
Not entirely; because CF is a netCDF convention, it assumes the netCDF standard is being followed. And it relies on the UDUNITS system of specifying units (see CF and COARDS units below). CF does not replicate the information from these other documents, so to adhere to CF you may need to become familiar with the other specifications as well, particularly the netCDF User’s Guide.
The CF Conventions were originally based on the netCDF convention called the COARDS conventions (named for their sponsor, the Cooperative Ocean/Atmosphere Research Data Service), developed in 1995. While there may be a few things in that document that are not documented in CF, working with the CF conventions does not require previous understanding of COARDS.
Aside from those references, a CF principle is to be self-contained. So for example the CF Standard Names attempt to be as general and well-defined as possible, so the reader does not have to access outside sources to understand the terms.
The two main ways to research CF questions are checking this FAQ, and visiting the mail archives, to see if your question has already been asked.
You can use the pipermail search window to see when your topic may have been raised over the years. To follow a particular subject thread, go to the year in which the discussion took place, click on the ‘by thread’ sorting option, and choose the first mail with that subject. The ‘next message’ link will then progress through each of the threads in order.
First, please see whether your question has already been answered (see question above). Questions about the CF Convention, including its Standard Names list, may be asked at the CF-Metadata Mailing List. CF community members usually respond within a day to simple questions, but allow more time if you have an advanced technical topic.
Changes to the CF standard and the Standard Names are generally proposed first on the CF-Metadata Mailing List. See How do I ask for a new standard name? to learn more about changes to the Standard Names list.
A change to the CF standard itself may be brought up on the mailing list, but must be presented and agreed to in detail on the CF Metadata Trac site, where the explicit change being requested can be refined.
The community discusses requests for changes via the mail list and trac site, and may ask questions or recommend changes. If no one raises objections or concerns about the change (modified as needed to address any issues) for the period of time required for that document, it is considered accepted. The moderators of the list typically make a final statement of acceptance once that stage has been reached.
More detailed information can be found in the Rules for CF Conventions Changes.
Errors have a simpler workflow, but still use a community process, as described in the Rules for Correcting Errors in CF Documents.
Changes to the CF Convention itself are grouped into major releases. Because the proposed changes are visible to the community pending the final release of the convention, major releases may take as long as a year or more to finalize, but users sometimes choose to follow the proposed changes in advance of their release date.
Compliance is determined by the version number you declare in the Conventions attribute within each file. If your file complies with the specifications of the CF version in that attribute, it stays in compliance with CF even as newer versions of the CF Conventions are released. As a general rule, tools that work with files following the CF Convention should support all versions of the convention.
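As a rough illustration of how a tool might read the declared version, the sketch below extracts CF version numbers from the text of a Conventions attribute. The function name `cf_versions` is hypothetical, and the sketch assumes the attribute may list more than one convention, separated by blanks or commas.

```python
import re


def cf_versions(conventions):
    """Return the CF versions named in a Conventions string as (major, minor) tuples."""
    versions = []
    # Assumption: several conventions may be listed, separated by blanks or commas.
    for token in re.split(r"[,\s]+", conventions.strip()):
        match = re.fullmatch(r"CF-(\d+)\.(\d+)", token)
        if match:
            versions.append((int(match.group(1)), int(match.group(2))))
    return versions


print(cf_versions("CF-1.6"))
print(cf_versions("CF-1.6, ACDD-1.3"))
```

A tool could compare the returned tuple against the versions it knows how to interpret, rather than assuming the latest release.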
If your vertical coordinate is some form of pressure, you won’t have to worry about the positive attribute: increasing pressure is always ‘down’ (closer to the center of the earth).
If your vertical coordinate is anything else, you must provide a positive attribute. This takes a value of ‘up’ or ‘down’, indicating whether more positive values are further away from earth center (up), or toward earth center. Many standard names which could be used for vertical coordinates state the convention for positive in their definition. For example, height is defined to have positive direction up, while depth has positive direction down (depth > 0 is below sea level). However, in some data sets (particularly oceanographic ones) depth values take the opposite sign. If you specify a coordinate standard name of depth, and a positive attribute value of up, the variable should be interpreted as having an inverted depth direction. However, this is not recommended; it would be better to use a standard name of height instead.
Note that a standard name attribute is not required for the vertical coordinate, but the positive attribute is required if the standard name is not ‘pressure’.
Reference: Trac ticket #109
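The rule for the positive attribute can be sketched as a small check. This is only an illustration: the function name `vertical_direction` is hypothetical, and the set of pressure unit strings is a short, assumed subset; a real tool would ask a UDUNITS-aware library whether the units are convertible to pressure.

```python
# Assumed, illustrative subset of pressure unit strings (not an exhaustive list).
PRESSURE_UNITS = {"bar", "millibar", "decibar", "atmosphere", "atm", "pascal", "Pa", "hPa"}


def vertical_direction(units, positive=None):
    """Return 'up' or 'down' for a vertical coordinate."""
    if positive is not None:
        value = positive.lower()
        if value not in ("up", "down"):
            raise ValueError(f"positive must be 'up' or 'down', got {positive!r}")
        return value
    if units in PRESSURE_UNITS:
        # Increasing pressure is always toward the center of the earth.
        return "down"
    raise ValueError("a non-pressure vertical coordinate needs a positive attribute")


print(vertical_direction("hPa"))
print(vertical_direction("m", positive="up"))
```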
There are just a few names in CF that are dedicated to specific coordinate directions. Beyond those special cases, many CF parameters have directional components (up/down, east/west, clockwise/counterclockwise, etc.). To indicate the positive direction of these parameters’ values, CF can include the direction in the standard_name attribute for the variable. These directional standard names are added only as each direction is requested, so you may see many ‘eastward’ standard names, but no ‘westward’ ones, for example. Because CF does not want to be prescriptive about how data is filtered, it will generally accept requests to add names ‘in the opposite direction’.
While it would be possible to separate the directionality of the values from the standard_name (and put it in a ‘direction’-style attribute like positive for vertical coordinates), this has been avoided, both to simplify compliance and to make interpretation of the values easier for the user.
A list of typical directional components of standard names follows. These lists are not complete, but provide illustrations of the most common terms that are used.
Components of standard names that are implicitly signed (note that often there is no standard name for the opposing direction):
Some directional components are not necessarily signed, and so may not be specifying a positive direction per se. For example, horizontal is indicating a plane rather than a direction, while bidirectional indicates a directional mode.
Often data values in an enumerated list are given as string codes (“UP”, “GOOD”, “Warning”), yet it is more useful to encode these values as integers. CF’s flag_values mechanism lets you store such codes as numbers in a data variable, with flag_meanings mapping each number to its meaning. The flag_values and flag_meanings attributes (and, if necessary, the flag_masks attribute) describe a status flag consisting of mutually exclusive coded values. The flag_values attribute is the same type as the variable to which it is attached, and contains a list of the possible flag values. The flag_meanings attribute is a string whose value is a blank-separated list of descriptive words or phrases, one for each flag value.
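A minimal sketch of how a reader might decode such a variable follows. The function name `decode_flags` and the sample values (0, 1, 2 meaning good, suspect, bad) are made up for illustration; flag_masks is not handled here.

```python
def decode_flags(values, flag_values, flag_meanings):
    """Map numeric flag values to their meanings; unknown values map to None."""
    meanings = flag_meanings.split()  # blank-separated list, one entry per flag value
    if len(meanings) != len(flag_values):
        raise ValueError("flag_values and flag_meanings must have the same length")
    lookup = dict(zip(flag_values, meanings))
    return [lookup.get(value) for value in values]


# Hypothetical attribute values and data.
print(decode_flags([0, 2, 1, 7], flag_values=[0, 1, 2], flag_meanings="good suspect bad"))
```

Because each meaning must be a single blank-free token, multi-word meanings are usually joined with underscores.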
In NetCDF, a coordinate variable is a one-dimensional variable with the same name as its dimension [e.g., time(time)]; is a numeric data type; has values that are ordered monotonically (always going in one direction); and has no missing values. If you have a variable that contains coordinate values but does not meet these criteria, in CF you can still indicate that it has coordinate values by naming it as an auxiliary coordinate variable.
The rules for creating and using auxiliary coordinate variables are described in the Coordinate Systems section of the Convention.
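The four criteria for a coordinate variable can be expressed as a simple check. This is a sketch under stated assumptions: the function name `is_coordinate_variable` is hypothetical, missing values are represented as `None` or NaN, and "monotonic" is read as strictly increasing or strictly decreasing.

```python
import math
from numbers import Real


def is_coordinate_variable(name, dimensions, values):
    """Check the netCDF coordinate-variable criteria on plain Python data."""
    values = list(values)
    # One dimension, with the same name as the variable, e.g. time(time).
    if tuple(dimensions) != (name,):
        return False
    # Numeric data type (None, used here for a missing value, fails this test).
    if not all(isinstance(v, Real) and not isinstance(v, bool) for v in values):
        return False
    # No missing values (NaN is treated as missing in this sketch).
    if any(math.isnan(v) for v in values):
        return False
    # Values always go in one direction.
    increasing = all(a < b for a, b in zip(values, values[1:]))
    decreasing = all(a > b for a, b in zip(values, values[1:]))
    return increasing or decreasing


print(is_coordinate_variable("time", ("time",), [0, 6, 12, 18]))
print(is_coordinate_variable("lat", ("y", "x"), [10.0, 20.0]))
```

A variable that fails this check but still holds coordinate values is a candidate for an auxiliary coordinate variable.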
CF allows coordinate variables to be used for any quantity that you might regard as an independent variable on which your data variable depends.
CF offers a rich set of options for specifying coordinate axes. Here is a short list of possibilities; others may be appropriate.
Degree-day integrals are described as integral_of_air_temperature_deficit_wrt_time or integral_of_air_temperature_excess_wrt_time, with a coordinate of air_temperature_threshold.
There are several ways that multiple time coordinates may be handled; you may wish to review the details in this list message.
CF’s standard name for the valid or forecast time is time (also used for the time of an observation). CF also has a standard name for the time the analysis was performed (its ‘run time’): forecast_reference_time. Very briefly, values in either or both of these axes may vary (a single run may have multiple forecast periods, or multiple runs may target a single period, or multiple runs may target multiple periods). If either axis contains just a single value, they are both specified as coordinates. If both are multi-valued, then they are each defined as one-dimensional auxiliary coordinate variables, with a common index dimension.
CF section 5.7 has an example of the first case, with a scalar coordinate variable for forecast_reference_time and a multivalued time axis for the valid time.
CF ticket #117 has an example of the second case, drawn from the email referenced above.
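To make the second case concrete, the sketch below flattens several runs, each with several forecast periods, into two one-dimensional lists that share a common index. The run times, lead times, and the function name `flatten_forecasts` are all hypothetical.

```python
from datetime import datetime, timedelta


def flatten_forecasts(run_times, lead_times):
    """Return (forecast_reference_time, time) lists sharing one index."""
    reference, valid = [], []
    for run in run_times:
        for lead in lead_times:
            reference.append(run)      # value for forecast_reference_time
            valid.append(run + lead)   # value for time (the valid time)
    return reference, valid


# Hypothetical runs at 00Z and 12Z, each with 6-hour and 12-hour forecasts.
runs = [datetime(2014, 7, 16, 0), datetime(2014, 7, 16, 12)]
leads = [timedelta(hours=6), timedelta(hours=12)]
reference, valid = flatten_forecasts(runs, leads)
for index, (ref, val) in enumerate(zip(reference, valid)):
    print(index, ref.isoformat(), val.isoformat())
```

Both lists have the same length, so each can be stored as an auxiliary coordinate variable on the same index dimension.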
Discrete Sampling Geometries, addressed in Section 9 of the CF Conventions, were added to offer greater efficiency and clarity for storing a collection of ‘features’ in a single file. Here we define a feature by example: it can be a point, a time series, a trajectory, a profile, a time series (of) profile(s), or a trajectory (of) profile(s). All of these can be stored in CF-compliant netCDF files, but there was no consistent way to do so and people and programs could not leverage the features in the files.
You don’t have to worry about Discrete Sampling Geometries, or DSGs, in order to be CF-compliant. If you have data that correspond to one of these feature types, you can read the Discrete Sampling Geometry section to learn how to represent those data so that others can fully leverage them. (Note: The feature_type attribute is reserved for files that represent a Discrete Sampling Geometry.)
For example, if you have a rainfall accumulation value for a 24-hour period from 20140716 0600 to 20140717 0600, it’s obvious these should be the time bounds, but what time coordinate should be used? The answer calls for judgment, and depends on the data’s context. (The time coordinate might be used for plotting, and also for differentiating in time.) If the data are simple observations, using the midpoint is reasonable. (Of course if sensors have a measurement or reporting lag, this should be adjusted for in representing the time of the observation.) But if the calculation is performed in the context of a model, and the value is used to trigger calculations based on values at the end boundary, it makes more sense to use the endpoint as the time coordinate.
When there is no basis for setting the time to a particular point in the interval, the majority of posters seem to favor the midpoint.
The situation is complicated in the case of a climatology, where the total range of times might include discontinuities. For instance, specifying 19601201 to 19620301 in climatological bounds defines the northern hemisphere winters (DJF) 1960-1961 and 1961-1962. The middle of the bounds is the middle of July 1961, which would be a silly coordinate for plotting a winter statistic. Instead it should be the middle of the first time interval to which the climatological statistic applies, making it mid-January 1961. (Or, if the statistic is an accumulation over multiple years, perhaps the middle of the last time interval.) Use your good judgment!
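The midpoint arithmetic in the examples above can be checked with a few lines. This is only a sketch using Python's standard calendar; the function name `midpoint` is hypothetical.

```python
from datetime import datetime


def midpoint(start, end):
    """Return the instant halfway between two time bounds."""
    return start + (end - start) / 2


# 24-hour rainfall accumulation from the example above.
print(midpoint(datetime(2014, 7, 16, 6), datetime(2014, 7, 17, 6)))

# Middle of the full climatological bounds: falls in July 1961.
print(midpoint(datetime(1960, 12, 1), datetime(1962, 3, 1)))

# Middle of the first DJF interval: mid-January 1961.
print(midpoint(datetime(1960, 12, 1), datetime(1961, 3, 1)))
```

The second and third calls show why the climatological case needs the first interval rather than the overall bounds.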
Terms from this vocabulary may be used as specified in the CF Convention section 7.3.3 Statistics applying to portions of cells. However, it is also possible to describe a data variable by using a named quantity as a coordinate variable, and the area_type is often needed for such a purpose. The area_type can be attached as a dimensioned coordinate variable, or as a scalar coordinate.
If the area_type you need is not in the list, request a new area_type name just as you would a standard name (no units required).
This example adds the area_type as a dimensioned coordinate variable:
    x=12;
    y=15;
    time=UNLIMITED;
    ntypes=3;
    maxlen=40;  # holds any current attribute; can be smaller if your names are shorter

    lat(y,x);
    lon(y,x);

    # This is a coordinate variable of size 3 (ntypes) for surface type (values are in the `data` section):
    surface_type(ntypes,maxlen);
      surface_type:standard_name="area_type";

    surface_temperature(time,ntypes,y,x);
      surface_temperature:coordinates = "lat lon surface_type";

    data:
    # Values for surface_type are specified here
    surface_type="crops","natural_grasses","trees";
Alternatively, this example specifies a single surface_type for your variable, by using a scalar coordinate variable:
    x=12;
    y=15;
    time=UNLIMITED;
    ntypes=3;
    maxlen=40;

    lat(y,x);
    lon(y,x);

    # This specifies a scalar coordinate variable for surface_type
    surface_type(maxlen);
      surface_type:standard_name="area_type";
    surface_type="trees";

    surface_temperature(time,y,x);
      surface_temperature:coordinates = "lat lon surface_type";
The CF site contains the official list of CF standard names. The XML document pointed to from that page is the primary reference, but the HTML and PDF documents are produced automatically from the XML, and should contain the same information.
Several other sites represent alternative views of knowledge artifacts of the standard names. See the Standard Names Tools section for more details.
The purpose of the standard_name attribute is to provide a succinct and distinguishing description of a variable, in a way that encourages interoperability. (In this document we often use the phrase ‘standard name’ to refer to this attribute or its value.)
The standard name is useful for listing and discussing the contents of a file, providing the kind of answer an expert might give to the non-expert’s question “What is in that file?” This helps users share files across disciplines and over time.
The standard name also makes it possible for a computer to assess whether a variable is likely to be comparable to another, mathematically and semantically. This increases interoperability by enabling automated discovery. Variables with different standard names are presumably not directly comparable. (Variables with different (that is, incompatible) canonical units are not mathematically comparable, and so are required to have different standard names.) Of course users must review the details of variables, particularly their source attributes, to assess whether they are truly interoperable.
To find standard names that describe your data, open up the latest Standard Name table (as HTML or XML) and search through it for words typically used for your data. (Because standard names contain no blanks, you may want to search for one word at a time, or even part of a word, rather than a full phrase like “air temperature”.) If you cannot find any matches, you can browse the table to see the kinds of names that exist; names strongly lean toward environmental modeling and observation data, especially in atmosphere and ocean science.
If you can’t find any matches, send an email to the CF-Metadata list describing your variables. (See the question on asking for a new standard name.)
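The word-at-a-time search suggested above can be sketched as follows. The function name `search_names` is hypothetical, and the short list stands in for the full Standard Name table, which you would load from the XML or HTML document.

```python
def search_names(names, phrase):
    """Return the names that contain every word (or word fragment) in phrase."""
    words = phrase.lower().split()
    return [name for name in names if all(word in name for word in words)]


# A handful of standard names standing in for the full table.
SAMPLE_NAMES = [
    "air_temperature",
    "air_pressure",
    "sea_water_temperature",
    "sea_surface_temperature",
    "wind_speed",
    "eastward_wind",
]

print(search_names(SAMPLE_NAMES, "air temperature"))
print(search_names(SAMPLE_NAMES, "temp"))
```

Splitting the phrase into words avoids the problem that a phrase with blanks never matches a name whose words are joined by underscores.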
You ask for a new standard name by sending an email to the CF-Metadata Mailing List. You should sign up to the mailing list before sending your email, so you can follow the discussion of your request.
In the email specify the following for each standard name you want to request:
This depends on the application – there can be standard names for very narrowly defined quantities, and standard names for broad concepts. The appropriate choice depends on which distinctions need to be made to decide whether another quantity is comparable to the one being defined.
Of course, this broad guideline could result in extraordinarily detailed standard names that will rarely be useful to others. Because the goal of standard names is to encourage interoperability, there are several qualifier types that are actively discouraged.
A CF standard name is a unique text string, which is associated in the CF Standard Names table to other attributes. The text string is made up of two parts: the name (from the CF Standard Names table), and optionally, following the name and one or more blanks, a standard name modifier. The name contains no white space (underscores separate the words, in practice) and identifies the physical quantity. The modifier is used to describe a quantity which is related to another variable with the modified standard name. Details are provided in the convention section on Standard Name, and examples of modifiers are given in Appendix C.
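The two-part structure of the text string can be illustrated with a small parser. The function name `split_standard_name` is hypothetical, and the sketch does not validate the name against the Standard Names table or the modifier against Appendix C.

```python
def split_standard_name(value):
    """Split a standard_name string into (name, modifier); modifier may be None."""
    parts = value.split()  # the name and modifier are separated by one or more blanks
    if not parts or len(parts) > 2:
        raise ValueError(f"expected 'name' or 'name modifier', got {value!r}")
    name = parts[0]
    modifier = parts[1] if len(parts) == 2 else None
    return name, modifier


print(split_standard_name("air_temperature"))
print(split_standard_name("air_temperature standard_error"))
```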
Several attributes are required for every standard name: the canonical units, which are typical units of the physical quantity, and the description, which clarifies related quantities and meanings of the standard name (but is not strictly a definition per se). Older standard names may not have a description.
A good standard name will typically include several characteristics that, together, characterize your variable. Common characteristics, or facets, include (with examples in parentheses):
The order is not rule-based; the goal is to make the name as clear and natural as possible. An example standard name with most of the above is mole_concentration_of_atomic_nitrogen_in_air (quantity-transformation-state-substance-medium).
Several structural analyses have been performed on standard names. For more information, check out What can be described in a standard name?.
Most of the descriptive terms central to the nature of a substance or concept, including its relation to environmental context, can be described in the CF standard name. During the review process, the community attempts to normalize the terms to achieve the readability and interoperability goals of the vocabulary.
For an example, one list of the existing standard name facets, based on Raskin SWEET mapping and subsequent re-analysis by Graybeal, is as follows:
| expressed as (Substance or Property) | Fraction | Salinity | Temperature |
| Quantity | with respect to | defined by | ratio of |
| ratio to | (product) and | Process | Model |
| difference from | difference to | Angle | |
| at (Surface or Condition) | in (Substance or Realm) | into | out of |
| Condition | assuming (Condition) | due to | excluding |
| for | by | reported on | Artifact State |
The standard name should not include:
In many cases the standard name is qualified by a specific detail, for example area_type, whose value may change from one set of observations to another or one observation to another. In these cases the definition for the standard name references one or more attributes or variables where the additional qualifying information may be found. (Standard name modifiers and cell_methods may also be used for this purpose.) In this way the divergence of the standard names is minimized, and interoperability increased.
Yes, there are phrases and patterns that reappear in different names. If you have to build a lot of standard names for different types of variables, some existing analyses may be helpful; send a note to the CF-Metadata list for guidance. If you are creating just a few standard names, it will be easiest to send an initial request using your best guess for the names; the list members will perform the needed comparison to existing usage.
There is no adopted grammar for the standard names. Many investigations or partial forays into a standard grammar have been made. Among these efforts:
This answer needs further development, to confirm these details and provide reference links. Please feel free to offer improvements.
Yes, perhaps most important of these is a mapping within the CF standard names vocabulary performed by a team at BODC. This provides SKOS-based relationships among CF terms, for example broader and narrower relations.
The CF standard names also have been mapped to the Global Change Master Directory science keywords, and to terms from the SWEET Ontology.
As of 2014, none of these mappings are regularly updated as new versions of the CF standard names are released.
In addition to the results mentioned in the mappings, other tools include:
These have been derived from the original XML, and as of this writing (2014) are being updated quickly whenever the original XML is changed. In fact, the NERC Vocabulary Server is updated simultaneously with the publication of the original XML document.
Standard names can be ‘deprecated’ to indicate they are no longer recommended for use. Existing uses of the name will not cause an error, but new applications should not use a deprecated name.
Standard names are deprecated when their use becomes ambiguous or confusing, or to say it another way, when they are replaced by one or more terms that are more appropriate (as determined by the standard names community).
The technical process involving deprecation of a standard name is that it is turned into an alias in the standard names XML file. The alias includes a pointer to the standard name most closely replacing the deprecated name. The alias is not shown in the HTML table of standard names. (As of August 2014, vocabulary servers typically do not show deprecated standard names in their term list, though the NERC Vocabulary Server has a separate list of the deprecated terms.)
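A reader of the XML file might resolve a deprecated name through its alias roughly as sketched below. The element and attribute names used here (`entry`, `alias`, `id`, `entry_id`) are assumptions about the file layout and should be checked against the published table; the two names in the sample are invented, and `resolve_name` is a hypothetical function.

```python
import xml.etree.ElementTree as ET

# Invented sample; the element names are assumptions about the real file layout.
SAMPLE_TABLE = """<standard_name_table>
  <entry id="new_example_name">
    <canonical_units>K</canonical_units>
    <description>Hypothetical entry used only for this sketch.</description>
  </entry>
  <alias id="old_example_name">
    <entry_id>new_example_name</entry_id>
  </alias>
</standard_name_table>"""


def resolve_name(table_xml, name):
    """Return (current_name, was_deprecated); current_name is None if unknown."""
    root = ET.fromstring(table_xml)
    for alias in root.findall("alias"):
        if alias.get("id") == name:
            # The alias points to the name that most closely replaces it.
            return alias.findtext("entry_id"), True
    for entry in root.findall("entry"):
        if entry.get("id") == name:
            return name, False
    return None, False


print(resolve_name(SAMPLE_TABLE, "old_example_name"))
print(resolve_name(SAMPLE_TABLE, "new_example_name"))
```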
UDUNITS was specified in the original COARDS convention (“Where possible the units attribute should be formatted as per the recommendations in the Unidata udunits package”), and is a widely used standard with many tools and libraries. The package contains an extensive unit database, which is in XML format and user-extensible (though the extensions will not be compliant with CF).
There are a few units CF allows that do not appear in UDUNITS; see the related FAQ question.
Note that CF depends on UDUNITS as a standard for formatting the units string, but not as a software package.
No, not exactly. If you have a variable with a standard name, its units must be compatible with (that is, convertible to) the canonical units of the standard name; but your variable’s units do not have to be the same as the canonical units. For example, a variable with standard name wind_speed could have units miles/hour, since those can be converted to the canonical units of meters/second.
If the units of the variable are not convertible to the standard name’s canonical units, this indicates the variable is not really of the same type as the standard name.
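The idea of compatible but not identical units can be shown with a toy conversion table. This is not UDUNITS: the table, the dimension labels, and the function name `convert` are assumptions made for illustration, and a real application would use a UDUNITS-aware library.

```python
# Toy table: unit string -> (dimension label, factor to the reference unit).
UNITS = {
    "meters/second": ("speed", 1.0),
    "miles/hour": ("speed", 1609.344 / 3600.0),
    "K": ("temperature", 1.0),
}


def convert(value, from_units, to_units):
    """Convert value between units of the same dimension, else raise ValueError."""
    from_dimension, from_factor = UNITS[from_units]
    to_dimension, to_factor = UNITS[to_units]
    if from_dimension != to_dimension:
        raise ValueError(f"{from_units!r} is not convertible to {to_units!r}")
    return value * from_factor / to_factor


# A wind_speed variable in miles/hour is compatible with the canonical units.
print(convert(10.0, "miles/hour", "meters/second"))
```

Trying `convert(280.0, "K", "meters/second")` raises an error, which mirrors the point that a variable in kelvin cannot carry the wind_speed standard name.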
UDUNITS has a small set of base units and another set of ‘common’ aliases that can be used as base units. It also specifies a large set of prefixes that can be prepended to the base units. (All of the components can be specified by their full strings, or by their ‘symbol’ abbreviations.)
These combinations can be combined as follows in CF:
You can review basic examples in the UDUNITS documentation.
More complicated examples of units can be found in the CF Standard Names table, which lists the canonical units for each standard name.
UDUNITS terms can be found in XML on the UDUNITS github pages, specifically in the files udunits2-*.xml under the lib path. The terms can be easily viewed in the MMI ORR repository referenced in the UDUNITS resources question.
Most time units in CF are specified as being of the form ‘time-unit since timestamp’, where time-unit is often ‘seconds’, and the most often used timestamp is ‘1970-01-01T00:00:00’. The prefixes specified for UDUNITS may be applied to the time-unit; for example, milliseconds since 1970-01-01T00:00:00 is a valid unit of time.
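A units string of this form can be decoded roughly as sketched below. The sketch makes several assumptions: it handles only a few time units and prefixes, it uses Python's standard calendar and ignores any CF calendar attribute, and the names `unit_seconds` and `decode_time` are hypothetical.

```python
from datetime import datetime, timedelta

# Assumed, partial tables; UDUNITS defines many more units and prefixes.
SECONDS_PER_UNIT = {"seconds": 1, "minutes": 60, "hours": 3600, "days": 86400}
PREFIXES = {"milli": 1e-3, "micro": 1e-6, "kilo": 1e3}


def unit_seconds(time_unit):
    """Return the length of time_unit in seconds, allowing one prefix."""
    if time_unit in SECONDS_PER_UNIT:
        return float(SECONDS_PER_UNIT[time_unit])
    for prefix, factor in PREFIXES.items():
        base = time_unit[len(prefix):]
        if time_unit.startswith(prefix) and base in SECONDS_PER_UNIT:
            return factor * SECONDS_PER_UNIT[base]
    raise ValueError(f"unsupported time unit: {time_unit!r}")


def decode_time(value, units):
    """Convert a number plus 'time-unit since timestamp' into a datetime."""
    time_unit, separator, timestamp = units.partition(" since ")
    if not separator:
        raise ValueError(f"expected 'time-unit since timestamp', got {units!r}")
    epoch = datetime.fromisoformat(timestamp.strip())
    return epoch + timedelta(seconds=value * unit_seconds(time_unit.strip()))


print(decode_time(1500, "milliseconds since 1970-01-01T00:00:00"))
print(decode_time(1, "days since 1970-01-01T00:00:00"))
```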
This answer may need updating to reflect current content of UDUNITS.
There are two units acceptable to CF that are not in the UDUNITS library:
dB. These have been requested for inclusion in future versions of the UDUNITS library.
practical salinity unit was used for salinity terms in CF, but is no longer used; rather, this is considered a dimensionless quantity (unit of 1).
Details of the CF units not in UDUNITS:
The UDUNITS-2 GitHub repository contains working code and documentation.
The API-Guide contains some detailed information, but it is oriented entirely for developers.
A units conversion page on the ERDDAP site lets you try different unit strings, and provides additional context on UDUNITS (and UCUM units) further down the page.
The strings (names) corresponding to accepted UDUNITS can be found in the UDUNITS vocabulary entries at the MMI Ontology Registry and Repository:
The repository also contains codes for each of the defined units in UDUNITS, which can be used if a unique identifier is needed to refer to a specific UDUNITS unit.
Alison Pamment of the Science and Technology Facilities Council maintains the CF Standard Names documentation.
A team at Lawrence Livermore National Lab maintains documents and content on the CF web site; Matthew Harris is the primary updater of that site. As the site is maintained in a GitHub repository (see this item), other members of the community may contribute modifications for inclusion on the site.
The documentation is stored on this GitHub repository, and its format is converted using Jekyll to present it on the CF web site.
Yes, the repository is public and can be forked. We suggest you contact CF via the CF-metadata mail list before making pull requests, however. There are various maintenance processes going on behind the scenes to update the various CF content, so changing files directly may not produce the desired results.
Once you understand the procedure by which your suggested changes should be approved (e.g., email approval on the CF-metadata list, a trac ticket, or some other arrangement), you can submit suggested changes as a pull request on the appropriate content. However, as noted above, this should first be agreed with the person overseeing that content.