The attributes described in this section are used to provide a description of the content and the units of measurement for each variable.
We continue to support the use of the units
and long_name
attributes as defined in COARDS.
We extend COARDS by adding the optional standard_name
attribute which is used to provide unique identifiers for variables.
This is important for data exchange since one cannot necessarily identify a particular variable based on the name assigned to it by the institution that provided the data.
The standard_name
attribute can be used to identify variables that contain coordinate data.
But since it is an optional attribute, applications that implement these standards must continue to be able to identify coordinate types based on the COARDS conventions.
The units
attribute is required for all variables that represent dimensional quantities (except for boundary variables defined in [cell-boundaries] and climatology variables defined in [climatological-statistics]).
The units
attribute is permitted but not required for dimensionless quantities (see Section 3.1.1, "Dimensionless units").
The value of the units
attribute is a string that can be recognized by the UDUNITS package [UDUNITS], with the exceptions that are given in Section 3.1.1, "Dimensionless units" and Section 3.1.3, "Scale factors and offsets".
Note that case is significant in the units
strings.
Note also that CF depends on UDUNITS only for the definition of legal units
strings.
CF does not assume or require that the UDUNITS software will be used for units
conversion.
In most units
conversions, the sole operation on the data is multiplication by a scale factor.
Special treatment is required in converting the units
of variables that involve temperature (Section 3.1.2, "Temperature units") and the units
of time coordinate variables ([time-coordinate]).
The COARDS convention prohibits the unit degrees
altogether, but this unit is not forbidden by the CF convention because it may in fact be appropriate for a variable containing, say, solar zenith angle.
The unit degrees
is also allowed on coordinate variables such as the latitude and longitude coordinates of a transformed grid.
In this case the coordinate values are not true latitudes and longitudes, which must always be identified using the more specific forms of degrees
as described in [latitude-coordinate] and [longitude-coordinate].
A variable with no units
attribute is assumed to be dimensionless.
However, a units
attribute specifying a dimensionless unit may optionally be included.
The canonical unit (see also Section 3.3, "Standard Name") for dimensionless quantities that represent fractions, or parts of a whole, is 1
.
The UDUNITS package defines a few dimensionless units, such as percent
, ppm
(parts per million, 1e-6), and ppb
(parts per billion, 1e-9).
As an alternative to the canonical units
of 1
or some other unitless number, the units
for a dimensionless quantity may be given as a ratio of dimensional units, for instance mg kg-1
for a mass ratio of 1e-6, or microlitre litre-1
for a volume ratio of 1e-6. Data-producers are invited to consider whether this alternative would be more helpful to the users of their data.
The CF convention supports dimensionless units that are UDUNITS compatible, with one exception, concerning the dimensionless units defined by UDUNITS for volume ratios, such as ppmv
and ppbv
.
These units are allowed in the units
attribute by CF only if the data variable has no standard_name
.
These units are prohibited by CF if there is a standard_name
, because the standard_name
defines whether the quantity is a volume ratio, so the units
are needed only to indicate a dimensionless number.
Information describing a dimensionless physical quantity itself (e.g.
"area fraction" or "probability") does not belong in the units
attribute, but should be given in the long_name
or standard_name
attributes (see Section 3.2, "Long Name" and Section 3.3, "Standard Name"), in the same way as for physical quantities with dimensional units.
As an exception, to maintain backwards compatibility with COARDS, the text strings level
, layer
, and sigma_level
are allowed in the units
attribute, in order to indicate dimensionless vertical coordinates.
This use of units
is not compatible with UDUNITS, and is deprecated by this standard because conventions for more precisely identifying dimensionless vertical coordinates are available (see [dimensionless-vertical-coordinate]).
The UDUNITS syntax that allows scale factors and offsets to be applied to a unit is not supported by this standard, except for case of specifying reference time, see section [time-coordinate].
The application of any scale factors or offsets to data should be indicated by the scale_factor
and add_offset
attributes.
Use of these attributes for data packing, which is their most important application, is discussed in detail in [packed-data].
The units
of temperature imply an origin (i.e. zero point) for the associated measurement scale.
When the temperature value is the degree of warmth with respect to the origin of the measurement scale, we call it an on-scale temperature.
When units
of on-scale temperature are converted, the data may require the addition of an offset as well as multiplication by a scale factor, because the physical meaning of a numerical value of zero for an on-scale temperature depends on the unit of measurement.
On-scale temperature is unique among quantities in the respect that the origin and the unit of measurement are both defined by the units
and therefore cannot be chosen independently.
For all other quantities, the origin and the unit of measurement are independent.
Converting the unit of measurement alone, without changing the origin, does not change the meaning of zero.
For example (using bold to indicate a numerical data value), 0 kilogram
is the same mass as 0 pound
, and 0 seconds since 1970-1-1
means the same as 0 days since 1970-1-1
, but 0 degC
is not the same temperature as 0 degF
(= -17.8 degC
), because these two temperature units
implicitly refer to measurement scales which have different origins.
On the other hand, when the temperature value is a temperature difference, which compares two on-scale temperatures with the same origin, the value of that origin is irrelevant as it cancels out when taking the difference.
Therefore to convert the units
of a temperature difference requires only multiplication by a scale factor, without the addition of an offset.
The units
attribute does not distinguish between on-scale temperatures and temperature differences.
This ambiguity also affects units of temperature raised to some power e.g. K^2
or multiplied by other units e.g. W m-2 K-1
, degF/foot
or degC m s-1
.
A standard_name
(Section 3.3, "Standard Name") or standard_name
modifier ([standard-name-modifiers]) may clarify the intention, but they are optional.
Some statistical operations described by the cell_methods
attribute ([cell-methods]; [appendix-cell-methods]) imply that temperature must be interpreted as temperature difference, but this attribute is optional too.
In order to convert the units
correctly, it is essential to know whether a temperature is on-scale or a difference.
Therefore this standard strongly recommends that any variable whose units
involve a temperature unit should also have a units_metadata
attribute to make the distinction.
This attribute must have one of the following three values: temperature: on_scale
, temperature: difference
, temperature: unknown
.
The units_metadata
attribute, standard_name
modifier ([standard-name-modifiers]) and cell_methods
attribute ([appendix-cell-methods]) must be consistent if present.
A variable must not have a units_metadata
attribute if it has no units
attribute or if its units
do not involve a temperature unit.
units_metadata
to distinguish temperature quantitiesvariables: float Tonscale; Tonscale:long_name="global-mean surface temperature"; Tonscale:standard_name="surface_temperature"; Tonscale:units="degC"; Tonscale:units_metadata="temperature: on_scale"; Tonscale:cell_methods="area: mean"; float Tdifference; Tdifference:long_name="change in global-mean surface temperature relative to pre-industrial"; Tdifference:standard_name="surface_temperature"; Tdifference:units="degC"; Tdifference:units_metadata="temperature: difference"; Tdifference:cell_methods="area: mean";
With temperature: unknown
, correct conversion of the units
cannot be guaranteed.
This value of units_metadata
indicates that the data-writer does not know whether the temperature is on-scale or a difference.
If the units_metadata
attribute is not present, the data-reader should assume temperature: unknown
.
The units_metadata
attribute was introduced in CF 1.11.
In data written according to versions before 1.11, temperature: unknown
should be assumed for all units
involving temperature, if it cannot be deduced from other metadata.
We note (for guidance only regarding temperature: unknown
, not as a CF convention) that the UDUNITS software assumes temperature: on_scale
for units
strings containing only a unit of temperature, and temperature: difference
for units
strings in which a unit of temperature is raised to any power other than unity, or multiplied or divided by any other unit.
With temperature: on_scale
, correct conversion can be guaranteed only for pure temperature units
.
If the quantity is an on-scale temperature multiplied by some other quantity, it is not possible to convert the data from the units
given to any other units
that involve a temperature with a different origin, given only the units
.
For instance, when temperature is on-scale, a value in kg degree_C m-2
can be converted to a value in kg K m-2
only if we know separately the values in degree_C
and kg m-2
of which it is the product.
UDUNITS recognises the SI prefixes shown in Prefixes for decimal multiples and submultiples of units for decimal multiples and submultiples of units, and allows them to be applied to non-SI units as well.
UDUNITS offers a syntax for indicating arbitrary scale factors and offsets to be applied to a unit.
(Note that this is different from the scale factors and offsets used for converting between units
, as discussed for temperature in Section 3.1.2, "Temperature units".)
This UDUNITS syntax for arbitrary transformation of units
is not supported by the CF standard, except for the case of specifying reference time ([time-coordinate]).
The application of any scale factors or offsets to data should be indicated by the scale_factor
and add_offset
attributes.
Use of these attributes for data packing, which is their most important application, is discussed in detail in [packed-data].
Factor | Prefix | Abbreviation | Factor | Prefix | Abbreviation | |
---|---|---|---|---|---|---|
1e1 |
deca,deka |
da |
1e-1 |
deci |
d |
|
1e2 |
hecto |
h |
1e-2 |
centi |
c |
|
1e3 |
kilo |
k |
1e-3 |
milli |
m |
|
1e6 |
mega |
M |
1e-6 |
micro |
u |
|
1e9 |
giga |
G |
1e-9 |
nano |
n |
|
1e12 |
tera |
T |
1e-12 |
pico |
p |
|
1e15 |
peta |
P |
1e-15 |
femto |
f |
|
1e18 |
exa |
E |
1e-18 |
atto |
a |
|
1e21 |
zetta |
Z |
1e-21 |
zepto |
z |
|
1e24 |
yotta |
Y |
1e-24 |
yocto |
y |
The long_name
attribute is defined by the NUG to contain a long descriptive name which may, for example, be used for labeling plots.
For backwards compatibility with COARDS this attribute is optional.
But it is highly recommended that either this or the standard_name
attribute defined in the next section be provided for all data variables and variables containing coordinate data, in order to make the file self-describing.
If a variable has no long_name
attribute then an application may use, as a default, the standard_name
if it exists, or the variable name itself.
A fundamental requirement for exchange of scientific data is the ability to describe precisely the physical quantities being represented.
To some extent this is the role of the long_name
attribute as defined in the NUG.
However, usage of long_name
is completely ad-hoc.
For many applications it is desirable to have a more definitive description of the quantity, which allows users of data from different sources (some of which might be models and others observational) to determine whether quantities are in fact comparable.
For this reason each variable may optionally be given a "standard name", whose meaning is defined by this convention.
There may be several variables in a dataset with any given standard name, and these may be distinguished by other metadata, such as coordinates ([coordinate-types]) and cell_methods
([cell-methods]).
A standard name is associated with a variable via the attribute standard_name
which takes a string value comprised of a standard name optionally followed by one or more blanks and a standard name modifier (a string value from [standard-name-modifiers]).
The set of permissible standard names is contained in the standard name table. The table entry for each standard name contains the following:
- standard name
-
The name used to identify the physical quantity. A standard name contains no whitespace and is case sensitive.
- canonical units
-
Representative units of the physical quantity. Unless it is dimensionless, a variable with a
standard_name
attribute must have units which are physically equivalent (not necessarily identical) to the canonical units, possibly modified by an operation specified by the standard name modifier (see below and [standard-name-modifiers]) or by thecell_methods
attribute (see [cell-methods] and [appendix-cell-methods]) or both.
Units of time coordinates ([time-coordinate]), whose units
attribute includes the word since
, are not physically equivalent to time units that do not include since
in the units
.
To mark this distinction, the canonical unit given for quantities used for time coordinates is s since 1958-1-1
.
The reference datetime in the canonical unit (the beginning of the day i.e. midnight on 1st January 1958 at 0 degrees_east
) is not restrictive; the time coordinate variable’s own units
may contain any reference datetime (after since
) that is valid in its calendar.
(We use 1958-1-1
because it is the beginning of International Atomic Time, and a valid datetime in all CF calendars; see also [leap-seconds].)
In both kinds of time units
attribute (with or without since
), any unit for measuring time can be used i.e. any unit which is physically equivalent to the SI base unit of time, namely the second.
- description
-
The description is meant to clarify the qualifiers of the fundamental quantities such as which surface a quantity is defined on or what the flux sign conventions are. We don’t attempt to provide precise definitions of fundumental physical quantities (e.g., temperature) which may be found in the literature. The description may define rules on the variable type, attributes and coordinates which must be complied with by any variable carrying that standard name (such as in Example 3.5).
When appropriate, the table entry also contains the corresponding GRIB parameter code(s) (from ECMWF and NCEP) and AMIP identifiers.
The standard name table is located at
https://cfconventions.org/Data/cf-standard-names/current/src/cf-standard-name-table.xml,
written in compliance with the XML format, as described in [standard-name-table-format].
Knowledge of the XML format is only necessary for application writers who plan to directly access the table.
A formatted text version of the table is provided at
https://cfconventions.org/Data/cf-standard-names/current/build/cf-standard-name-table.html,
and this table may be consulted in order to find the standard name that should be assigned to a variable.
Some standard names (e.g. region
and area_type
) are used to indicate quantities which are permitted to take only certain standard values.
This is indicated in the definition of the quantity in the standard name table, accompanied by a list or a link to a list of the permitted values.
Standard names by themselves are not always sufficient to describe a quantity.
For example, a variable may contain data to which spatial or temporal operations have been applied.
Or the data may represent an uncertainty in the measurement of a quantity.
These quantity attributes are expressed as modifiers of the standard name.
Modifications due to common statistical operations are expressed via the cell_methods
attribute (see [cell-methods] and [appendix-cell-methods]).
Other types of quantity modifiers are expressed using the optional modifier part of the standard_name
attribute.
The permissible values of these modifiers are given in [standard-name-modifiers].
standard_name
float psl(lat,lon) ; psl:long_name = "mean sea level pressure" ; psl:units = "hPa" ; psl:standard_name = "air_pressure_at_sea_level" ;
The description in the standard name table entry for air_pressure_at_sea_level
clarifies that "sea level" refers to the mean sea level, which is close to the geoid in sea areas.
When one data variable provides metadata about the individual values of another data variable it may be desirable to express this association by providing a link between the variables.
For example, instrument data may have associated measures of uncertainty.
The attribute ancillary_variables
is used to express these types of relationships.
It is a string attribute whose value is a blank separated list of variable names.
The nature of the relationship between variables associated via ancillary_variables
must be determined by other attributes.
The variables listed by the ancillary_variables
attribute will often have the standard name of the variable which points to them including a modifier ([standard-name-modifiers]) to indicate the relationship.
The dimensions of an ancillary variable must be the same as or a subset of the dimensions of the variable to which it is related, but their order is not restricted, and with one exception:
If an ancillary variable of a data variable that has been compressed by gathering ([compression-by-gathering]) does not span the compressed dimension, then its dimensions may be any subset of the data variable’s uncompressed dimensions, i.e. any of the dimensions of the data variable except the compressed dimension, and any of the dimensions listed by the compress
attribute of the compressed coordinate variable.
float q(time) ; q:standard_name = "specific_humidity" ; q:units = "g/g" ; q:ancillary_variables = "q_error_limit q_detection_limit" ; float q_error_limit(time) q_error_limit:standard_name = "specific_humidity standard_error" ; q_error_limit:units = "g/g" ; float q_detection_limit(time) q_detection_limit:standard_name = "specific_humidity detection_minimum" ; q_detection_limit:units = "g/g" ;
Alternatively, ancillary_variables
may be used as status flags indicating the operational status of an instrument producing the data or as quality flags indicating the results of a quality control test, or some other quantitative quality assessment, performed against the measurements contained in the source variable.
In these cases, the flag variable will include a standard name that differs from that of the source variable and indicates the specific type of flag the variable represents.
The standard names table includes many names intended to be used in this situation, both general names meant to be used to flexibly represent any type of status or quality assessment, as well as names for specific quality control tests commonly applied to geophysical phenomena timeseries data. Several examples are listed below:
-
status_flag
andquality_flag
: general flag categories for instrument status or quality assessment -
climatology_test_quality_flag
,flat_line_test_quality_flag
,gap_test_quality_flag
,spike_test_quality_flag
: a subset of standard name flags used to indicate the results of commonly-used geophysical timeseries data quality control tests (consult the standard names table for a full list of published flags) -
aggregate_quality_flag
: flag indicating an aggregate summary of all quality tests performed on the data variable, both automated and manual (i.e. a master quality flag for a particular variable)
The following example illustrates the use of three of these flags to represent two independent quality control tests and an aggregate flag that combines the results of the two tests.
float salinity(time, z); salinity:units = "1"; salinity:long_name = "Salinity"; salinity:standard_name = "sea_water_practical_salinity"; salinity:ancillary_variables = "salinity_qc_generic salinity_qc_flat_line_test salinity_qc_agg"; int salinity_qc_generic(time, z); salinity_qc_generic:long_name = "Salinity Generic QC Process Flag"; salinity_qc_generic:standard_name = "quality_flag"; int salinity_qc_flat_line_test(time, z); salinity_qc_flat_line_test:long_name = "Salinity Flat Line Test Flag"; salinity_qc_flat_line_test:standard_name = "flat_line_test_quality_flag"; int salinity_qc_agg(time, z); salinity_qc_agg:long_name = "Salinity Aggregate Flag"; salinity_qc_agg:standard_name = "aggregate_quality_flag";
Note that the ancillary variables in this example are simplified to exclude flag_values
, flag_masks
and flag_meanings
attributes described in Section 3.5, "Flags" that they would ordinarily require
The attributes flag_values
, flag_masks
and flag_meanings
are intended to make variables that contain flag values self describing.
Status codes and Boolean (binary) condition flags may be expressed with different combinations of flag_values
and flag_masks
attribute definitions.
The flag_values
and flag_meanings
attributes describe a status flag consisting of mutually exclusive coded values.
The flag_values
attribute is the same type as the variable to which it is attached, and contains a list of the possible flag values.
The flag_meanings
attribute is a string whose value is a blank separated list of descriptive words or phrases, one for each flag value.
Each word or phrase should consist of characters from the alphanumeric set and the following five: '_', '-', '.', '+', '@'.
If multi-word phrases are used to describe the flag values, then the words within a phrase should be connected with underscores.
The following example illustrates the use of flag values to express a speed quality with an enumerated status code.
flag_values
byte current_speed_qc(time, depth, lat, lon) ; current_speed_qc:long_name = "Current Speed Quality" ; current_speed_qc:standard_name = "status_flag" ; current_speed_qc:_FillValue = -128b ; current_speed_qc:valid_range = 0b, 2b ; current_speed_qc:flag_values = 0b, 1b, 2b ; current_speed_qc:flag_meanings = "quality_good sensor_nonfunctional outside_valid_range" ;
Note that the data variable containing current speed has an ancillary_variables attribute with a value containing current_speed_qc.
The flag_masks and flag_meanings attributes describe a number of independent Boolean conditions using bit field notation by setting unique bits in each flag_masks value. The flag_masks attribute is the same type as the variable to which it is attached, and contains a list of values matching unique bit fields. The flag_meanings attribute is defined as above, one for each flag_masks value. A flagged condition is identified by performing a bitwise AND of the variable value and each flag_masks value; a non-zero result indicates a true condition. Thus, any or all of the flagged conditions may be true, depending on the variable bit settings. The following example illustrates the use of flag_masks to express six sensor status conditions.
flag_masks
byte sensor_status_qc(time, depth, lat, lon) ; sensor_status_qc:long_name = "Sensor Status" ; sensor_status_qc:standard_name = "status_flag" ; sensor_status_qc:_FillValue = 0b ; sensor_status_qc:valid_range = 1b, 63b ; sensor_status_qc:flag_masks = 1b, 2b, 4b, 8b, 16b, 32b ; sensor_status_qc:flag_meanings = "low_battery processor_fault memory_fault disk_fault software_fault maintenance_required" ;
A variable with standard name of region
, area_type
or any other standard name which requires string-valued values from a defined list may use flags together with flag_values
and flag_meanings
attributes to record the translation to the string values.
The following example illustrates this using integer flag values for a variable with standard name region
and flag_values
selected from the standardized region names (see section 6.1.1).
flag_values
int basin(lat, lon); standard_name: region; flag_values: 1, 2, 3; flag_meanings:"atlantic_arctic_ocean indo_pacific_ocean global_ocean"; data: basin: 1, 1, 1, 1, 2, ..... ;
The flag_masks
, flag_values
and flag_meanings
attributes, used together, describe a blend of independent Boolean conditions and enumerated status codes.
The flag_masks
and flag_values
attributes are both the same type as the variable to which they are attached.
A flagged condition is identified by a bitwise AND of the variable value and each flag_masks
value; a result that matches the flag_values
value indicates a true
condition.
Repeated flag_masks
define a bit field mask that identifies a number of status conditions with different flag_values
.
The flag_meanings
attribute is defined as above, one for each flag_masks
bit field and flag_values
definition.
Each flag_values
and flag_masks
value must coincide with a flag_meanings
value.
The following example illustrates the use of flag_masks
and flag_values
to express two sensor status conditions and one enumerated status code.
flag_masks
and flag_values
byte sensor_status_qc(time, depth, lat, lon) ; sensor_status_qc:long_name = "Sensor Status" ; sensor_status_qc:standard_name = "status_flag" ; sensor_status_qc:_FillValue = 0b ; sensor_status_qc:valid_range = 1b, 15b ; sensor_status_qc:flag_masks = 1b, 2b, 12b, 12b, 12b ; sensor_status_qc:flag_values = 1b, 2b, 4b, 8b, 12b ; sensor_status_qc:flag_meanings = "low_battery hardware_fault offline_mode calibration_mode maintenance_mode" ;
In this case, mutually exclusive values are blended with Boolean values to maximize use of the available bits in a flag value.
The table below represents the four binary digits (bits) expressed by the sensor_status_qc
variable in the previous example.
Bit 0 and Bit 1 are Boolean values indicating a low battery condition and a hardware fault, respectively. The next two bits (Bit 2 and Bit 3) express an enumeration indicating abnormal sensor operating modes. Thus, if Bit 0 is set, the battery is low and if Bit 1 is set, there is a hardware fault - independent of the current sensor operating mode.
Bit 3 (MSB) | Bit 2 | Bit 1 | Bit 0 (LSB) |
---|---|---|---|
H/W Fault |
Low Batt |
The remaining bits (Bit 2 and Bit 3) are decoded as follows:
Bit 3 | Bit 2 | Mode |
---|---|---|
0 |
1 |
offline_mode |
1 |
0 |
calibration_mode |
1 |
1 |
maintenance_mode |
The "12b" flag mask is repeated in the sensor_status_qc
flag_masks
definition to explicitly declare the recommended bit field masks to repeatedly AND with the variable value while searching for matching enumerated values.
An application determines if any of the conditions declared in the flag_meanings
list are true
by simply iterating through each of the flag_masks
and AND’ing them with the variable.
When a result is equal to the corresponding flag_values
element, that condition is true
.
The repeated flag_masks
enable a simple mechanism for clients to detect all possible conditions.