[CF-metadata] [CF Metadata] #94: Proposal for a CF String Syntax (CFSS)

From: Cecelia DeLuca <cecelia.deluca>
Date: Fri, 02 Nov 2012 11:29:11 -0600

Hi Chris,

This is still very much a proposal.

Below are some of the topics that were raised in a sanity-check
pre-ticket review and could use discussion here:
- additional use cases/applications for such metadata aggregations
(besides semantic mediation during model run-time)
- specifics of the syntax: readability, possible ambiguities
- extensibility of the syntax as a grammar
- representation in JSON

For our application in semantic mediation, the separators chosen don't
matter much, as long as the syntax is unambiguous, extensible, and
backward compatible with standard names of data.
The proposed syntax favors economy and a familiar CF style, as
advised by Jonathan. (You did split it up right :-) )

- Cecelia

On 11/2/2012 10:08 AM, Chris Barker wrote:
> I know I should be commenting in TRAC, but I don't think I have a login...
>
> First -- if this is already well established, and simply being
> codified here, then "never mind" but if there is still room for
> discussion:
>
>> CFSS strings are structured as follows:
>>
>> {{{
>> <standard name of data 1>, <standard name of data 2>, ... ,<standard name
>> of data n>
>> <standard name of coordinate or cell method 1>: <value 1> [<unit 1>]
>> <standard name of coordinate or cell method 2>: <value 2> [<unit 2>] ...
>> <standard name of coordinate or cell method m>: <value m> [<unit m>]
>> }}}
>> Examples of compliant strings using CFSS are:
>>
>> x_wind
>>
>> x_wind height: 10 m
>>
>> x_wind height: 10 m time: mean region: atlantic_ocean
>>
>> x_wind, y_wind height: 10 m time: mean region: atlantic_ocean
>>
>> height: 10 m time: mean region: atlantic_ocean
> These strike me as being a pain to parse. For example:
>
> "x_wind, y_wind height: 10 m time: mean region: atlantic_ocean"
>
> there are three delimiters there, commas, colons and whitespace. But
> whitespace can also be used to separate the units from the value.
> Also, can there be white space in any of the values (probably not
> names)? To parse this, I guess I would:
>
> look for colons
> look for whitespace before the colon, what's between is the cell name?
> look at what's at the beginning, before the whitespace, before the cell name.
> split that on commas, giving me the variable names.
> look for the next colon, then look before that for whitespace.
> between the whitespace and the colon is the next cell name.
> between the previous colon and the cell name is the value and units.
> ...
>
> I'm sure there is a smarter way to write that code, but I even find it
> hard to parse with my eyes.
>
> So I suggest another delimiter -- does netcdf allow line feeds?
>
> {{{
> x_wind, y_wind
> height: 10 m
> time: mean
> region: atlantic_ocean
> }}}
>
> or maybe semi-colons?
>
> {{{
> "x_wind, y_wind; height: 10 m; time: mean; region: atlantic_ocean"
> }}}
>
> If I've split that example up wrong, then it really proves my point!
>
> Just my $0.02
>
>
> -Chris
>
>
>
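
For anyone who wants to experiment with the parsing question raised above,
here is a rough Python sketch of both approaches. It is only an illustration,
not part of the proposal: the function names (parse_cfss, parse_semicolon) and
the (value, unit) tuple layout are made up for this example. It assumes that
names and keys contain only letters, digits and underscores, and that values
never contain an identifier followed by a colon.

```python
import re

# Assumption: a key is an identifier immediately followed by a colon ("height:").
KEY = re.compile(r"([A-Za-z_][A-Za-z0-9_]*):")


def _value_and_unit(text):
    """Split '10 m' into ('10', 'm'); the unit is None when absent."""
    parts = text.split()
    value = parts[0] if parts else None
    unit = " ".join(parts[1:]) or None
    return value, unit


def _names(head):
    """Split the comma-separated list of data standard names."""
    return [n.strip() for n in head.split(",") if n.strip()]


def parse_cfss(text):
    """Parse the whitespace-delimited form by locating each 'key:' token."""
    matches = list(KEY.finditer(text))
    # Everything before the first key is the list of data names.
    head = text[:matches[0].start()] if matches else text
    constraints = {}
    for i, m in enumerate(matches):
        # A value runs from the end of its key to the start of the next key.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        constraints[m.group(1)] = _value_and_unit(text[m.end():end])
    return _names(head), constraints


def parse_semicolon(text):
    """Parse the semicolon-delimited alternative suggested in the quoted mail."""
    fields = [f.strip() for f in text.split(";") if f.strip()]
    names = []
    # The first field holds data names only if it has no colon.
    if fields and ":" not in fields[0]:
        names = _names(fields[0])
        fields = fields[1:]
    constraints = {}
    for field in fields:
        key, _, rest = field.partition(":")
        constraints[key.strip()] = _value_and_unit(rest)
    return names, constraints


if __name__ == "__main__":
    print(parse_cfss("x_wind, y_wind height: 10 m time: mean region: atlantic_ocean"))
    print(parse_semicolon("x_wind, y_wind; height: 10 m; time: mean; region: atlantic_ocean"))
```

Under these assumptions both functions give the same names and constraints for
the example string, so the whitespace form can be parsed without ambiguity.
The semicolon form needs no regular expression, though, and would still work if
a value itself contained a colon.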

-- 
===================================================================
Cecelia DeLuca
NESII/CIRES/NOAA Earth System Research Laboratory
325 Broadway, Boulder 80305-337
Email: cecelia.deluca at noaa.gov
Phone: 303-497-3604
Received on Fri Nov 02 2012 - 11:29:11 GMT
