Background:
As you all know, the vast majority of standard_names are for numeric
variables and have an associated "Canonical Units".
However, there are some existing standard_names for string variables
(e.g., area_type, institution, land_cover, land_cover_lccs,
platform_id, platform_name, region, sensor_band_identifier, source,
and surface_cover). They do not have associated Canonical Units.
If the "data_type=string" and "charset" attributes are accepted by CF
and thus we can clearly identify String variables and know their
character encoding, I would like to propose that we add several
additional standard_names that identify/describe the String variables
in the same way that other standard_names describe numeric variables.
I want to get your comments and suggestions before I formally propose them.
Here are the possible additional String standard_names,
with definitions and [comments in brackets]. I have current
needs/uses for almost all of these (most exceptions are noted below),
e.g., I have a tabular dataset where each row has information
about a different project at NOAA.
doi
Each String specifies a single Digital Object Identifier.
email_address
Each String specifies a single email address.
phone_number
Each String specifies a single,
voice (thus not including fax numbers),
international (thus starting with +countryCode)
phone number.
The E.164 format is required:
+countryCode subscriberNumberIncludingAreaCode
e.g., "+1 202 456 1111" (The White House!)
https://en.wikipedia.org/wiki/E.164
Spaces between the country code, the area code, the prefix,
and the number are strongly encouraged but not required.
Parentheses and dashes are discouraged.
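As an illustration only (not part of the proposal), here is a minimal Python sketch of how a client might normalize such a String. The name normalize_phone is hypothetical. It assumes that the discouraged separators (spaces, parentheses, dashes) can simply be dropped, and it checks only the shape (a leading "+" and at most 15 digits, the E.164 limit), not whether the country code exists.

```python
import re

# Hypothetical helper for illustration; not part of the proposal.
# After separators are removed: "+", a non-zero digit, then 1 to 14 more digits.
_E164_RE = re.compile(r"\+[1-9][0-9]{1,14}")

# Translation table that deletes spaces, parentheses, and dashes.
_SEPARATORS = str.maketrans("", "", " ()-")

def normalize_phone(text):
    """Return the compact form (+digits), or None if the shape is wrong."""
    compact = text.translate(_SEPARATORS)
    if _E164_RE.fullmatch(compact):
        return compact
    return None
```

With this sketch, normalize_phone("+1 202 456 1111") returns "+12024561111", and a number without the leading +countryCode returns None.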
uri
Each String specifies a single URI.
url
Each String specifies a complete, single URL.
It must start with a "scheme" (http:// , https:// , ftp:// , etc.).
[It would be possible in the future to add related
standard_names by appending a specific subtype,
e.g., url_project_webpage, url_iso19115_2, url_image
if there is a need and if people think it's a good idea.]
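For illustration, a minimal Python sketch of the "must start with a scheme" check, using the standard urllib.parse module. The function name has_scheme_and_host is hypothetical, and requiring a host part in addition to the scheme is an assumption of this sketch, not something the definition above states.

```python
from urllib.parse import urlsplit

def has_scheme_and_host(text):
    """Return True if text has both a scheme (e.g., https) and a host part."""
    try:
        parts = urlsplit(text)
    except ValueError:
        # urlsplit raises ValueError for some malformed inputs.
        return False
    return bool(parts.scheme) and bool(parts.netloc)
```

Under this check, "https://www.noaa.gov/" passes, while "www.noaa.gov/index.html" (no scheme) does not.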
html_document
Each String specifies a complete HTML document.
[I am not sure about this one.
I admit I don't have a current use case, but I think it is
important to distinguish a complete HTML document from a snippet.]
html_description
Each String is a snippet of text using HTML markup
which describes something [e.g., a project, a buoy,
the condition of a beached whale, ...]
html_snippet
Each String is a snippet of text using HTML markup
tags that isn't a complete HTML document.
This is to be used for html snippets whenever there isn't
a suitable, more specific variant,
e.g., html_description
[I'm open to words other than "snippet".]
json
Each String is JSON-text: a JSON object, array, number, string,
or one of the following three literal names: false, null, true.
See
http://www.rfc-editor.org/rfc/rfc7159.txt
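As a rough sketch of how software could test this definition, the following Python uses the standard json module. The name is_json_text is hypothetical. By default Python's parser also accepts NaN and Infinity, which are not JSON, so the sketch rejects them via parse_constant.

```python
import json

def _reject_constant(name):
    # Called by json.loads for NaN, Infinity, and -Infinity,
    # which the JSON grammar does not allow.
    raise ValueError("not valid JSON: " + name)

def is_json_text(text):
    """Return True if text parses as a JSON object, array, number, string,
    or one of the literals false, null, true."""
    try:
        json.loads(text, parse_constant=_reject_constant)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False
    return True
```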
json_geojson
Each String is GeoJSON, as specified by
https://tools.ietf.org/html/rfc7946
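A quick, shallow check for GeoJSON could look like the Python sketch below. It only confirms that the String is a JSON object whose "type" member is one of the nine types named in RFC 7946; it does not validate coordinates or nesting. The name looks_like_geojson is hypothetical.

```python
import json

# The nine GeoJSON object types defined by RFC 7946.
GEOJSON_TYPES = {
    "Point", "MultiPoint", "LineString", "MultiLineString",
    "Polygon", "MultiPolygon", "GeometryCollection",
    "Feature", "FeatureCollection",
}

def looks_like_geojson(text):
    """Shallow test: a JSON object with a recognized "type" member."""
    try:
        obj = json.loads(text)
    except ValueError:
        return False
    if not isinstance(obj, dict):
        return False
    kind = obj.get("type")
    return isinstance(kind, str) and kind in GEOJSON_TYPES
```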
wkt_geometry
Each String specifies a complete WKT geometry
as specified in the ISO/IEC 13249-3:2016 standard,
"Information technology - Database languages - SQL multimedia
and application packages - Part 3: Spatial" (SQL/MM).
[If additional variants need to be specified in the future,
we can append _*subtype*, e.g., wkt_geometry_iso13249_3_2016.
NOTE that the use of wkt_geometry with a String variable
(a multidimensional char with a charset attribute) doesn't
preclude other methods of storing geometries.]
wkt_crs
Each String specifies a WKT CRS as specified by
ISO 19162:2015, "Geographic information - Well-known text
representation of coordinate reference systems".
xml_document
Each String specifies a complete XML document.
Use this only if there isn't a suitable, more specific variant,
e.g., xml_iso19115_2.
[I am not sure about this one.
I admit I don't have a current use case, but I think it is
important to distinguish a complete XML document from a snippet.]
xml_iso19115_2
Each String specifies a complete ISO 19115-2 / ISO 19139 XML
document.
[Ted Habermann: does this make your day? :-) ]
xml_iso19115_1
Each String specifies a complete ISO 19115-1 XML document.
[My need for this is not immediate, but I know it is coming.]
Additional Comments
These are somewhat different than the current standard_names.
Here is the reasoning behind them:
As with existing standard_names, the goal was short, human-readable
names which follow the CF naming convention.
syntax_meaning -
Although MIME types are too general for our purposes and only
apply to entire documents, I like their use of type/subtype
(although I used '_' as the separator instead of '/')
and I like that the "type" prefix can serve a software-related function
(e.g., all standard_names above that start with "xml" indicate
that the content can be parsed with an XML parser).
So when relevant, the proposed standard_names specify syntax
and meaning, using the format *syntax_meaning*, e.g., xml_iso19115_2.
Interestingly, I think the actual ISO 19115-2 document just
specifies the meaning/content, while ISO 19139 specifies the XML
representation of that content, so it is a good example of the
need for *syntax_meaning* notation.
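To make the software-related role of the prefix concrete, here is a minimal Python sketch that picks a parser from the part of the standard_name before the first '_'. The names split_standard_name, PARSERS, and parse_by_prefix are hypothetical, and only the two prefixes with parsers in the Python standard library are handled. Strings with any other prefix (including existing names like region) are returned unchanged.

```python
import json
import xml.etree.ElementTree as ET

def split_standard_name(name):
    """Split 'xml_iso19115_2' into ('xml', 'iso19115_2') at the first '_'."""
    syntax, _, meaning = name.partition("_")
    return syntax, meaning

# Hypothetical mapping from syntax prefix to a parser function.
# Note: ElementTree is not hardened against maliciously constructed XML.
PARSERS = {
    "json": json.loads,
    "xml": ET.fromstring,
}

def parse_by_prefix(standard_name, text):
    """Parse text with the parser implied by the prefix, if there is one."""
    syntax, _meaning = split_standard_name(standard_name)
    parser = PARSERS.get(syntax)
    if parser is None:
        return text  # no known syntax prefix: treat as plain text
    return parser(text)
```

For example, parse_by_prefix("xml_iso19115_2", s) and parse_by_prefix("xml_document", s) both go through the XML parser, without the software needing to know anything about the meaning part of the name.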
text_plain -
I didn't include anything like "text_plain" because that is,
in a practical sense, the default for Strings, and because it
is implied by more specific standard_names like existing
platform_name, region, source.
single vs. plural -
For many standard_names, I specified that each String specify
a single item. I'm open to allowing multiple values if the
separator is specified in the standard_name's definition.
Thank you for considering these names.
--
Sincerely,
Bob Simons
IT Specialist
Environmental Research Division
NOAA Southwest Fisheries Science Center
99 Pacific St., Suite 255A (New!)
Monterey, CA 93940 (New!)
Phone: (831)333-9878 (New!)
Fax: (831)648-8440
Email: bob.simons at noaa.gov
The contents of this message are mine personally and
do not necessarily reflect any position of the
Government or the National Oceanic and Atmospheric Administration.
<>< <>< <>< <>< <>< <>< <>< <>< <><
Received on Fri Feb 10 2017 - 13:18:10 GMT