[CF-metadata] Swath observational data from Bryan Lawrence on 2009-11-20 (Archive of CF discussions from 2002 to 2019 on the cf-metadata mailing list)

From: Bryan Lawrence <bryan.lawrence>
Date: Fri, 20 Nov 2009 06:23:10 +0000

On Thursday 19 November 2009 19:40:08 Jonathan Gregory wrote:
> > ... In some cases, referencing attributes such as
> > "coordinates" and "ancillary_variables" would, ideally, point to a
> > variable in a different dataset.
>
> This is a general problem to which CF doesn't have a solution because it was
> conceived as a convention for single netCDF files. However we need a solution
> as often several files should be treated as a single dataset.
>
> If the files don't overlap i.e. their contents are complementary, I think it
> should be satisfactory to allow variables in one file to be pointed to by name
> from another file, with no other mechanism being required within the file. I
> don't like the idea of naming one file within another file, as that would be
> very fragile. Instead, I think the file aggregation should be implied by
> simply defining the group of files which are to be treated as one file e.g.
> by putting them in one directory.

It's the old ones that are the best ones :-) This issue keeps coming back ... and we keep trying to ignore it ...

I think we agree that an actual physical filename including path is useless. We need both a relative link which relies on the preservation of a group of files in a particular arrangement ... AND an internal identifier so more robust linking mechanisms can be used when (if) the data ends up in a managed environment.

I think it's crucial in this situation to ensure that each file has a unique identifier within it (created, for example, with uuid). All solutions which rely on packaging are fragile (SAFE is probably better than most), and the bottom line is that users move files around ... so we need some way of ensuring that we/they can validate that the links in place are the ones that were originally intended.

So a relative link would include the identifier of the intended target as well as the relative path, expressed in operating-system-agnostic terms.

That identifier can be used in two ways: a) to validate the link (my software can always check that the variable I just opened by following a link from another one is the one that was expected, by checking the container identifier), and b) to build an identifier resolver service for the situation where the packaging has had to be broken (which might occur for performance reasons or ...)
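
To make the two uses concrete, here is a minimal Python sketch. It is an illustration only, not anything CF defines: the `Link` record, the `resolve` function, the `read_id` callback (standing in for "open the file and read its identifier attribute") and the file names are all hypothetical, and a plain dictionary stands in for both the file system and the resolver service.

```python
import posixpath
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    """Hypothetical cross-file reference (not part of CF)."""
    relative_path: str  # POSIX-style path, relative to the referring file
    target_id: str      # identifier stored inside the intended target file
    variable: str       # name of the variable in the target file


def new_file_id() -> str:
    """Create a unique identifier to store inside a file when it is written."""
    return str(uuid.uuid4())


def resolve(link, referring_path, read_id, resolver=None):
    """Return the path of the file that really carries link.target_id.

    read_id(path) is assumed to return the identifier stored in the file at
    path, or None if there is no such file. resolver is an optional mapping
    from identifier to current path, standing in for a resolver service.
    """
    # Use a): follow the relative path, then validate the container identifier.
    candidate = posixpath.normpath(
        posixpath.join(posixpath.dirname(referring_path), link.relative_path)
    )
    if read_id(candidate) == link.target_id:
        return candidate

    # Use b): the packaging was broken, so ask the resolver where the file went.
    if resolver is not None:
        relocated = resolver.get(link.target_id)
        if relocated is not None and read_id(relocated) == link.target_id:
            return relocated

    raise LookupError(f"no file with identifier {link.target_id} found")


# Demonstration with made-up file names; each dict maps path -> stored identifier.
geo_id = new_file_id()
link = Link(relative_path="../geo/latlon.nc", target_id=geo_id, variable="lat")

packaged = {"swath/geo/latlon.nc": geo_id}   # files still in their original layout
moved = {"archive/0001.nc": geo_id}          # same file after being moved
registry = {geo_id: "archive/0001.nc"}       # what a resolver service would know

intact_path = resolve(link, "swath/data/radiance.nc", packaged.get)
moved_path = resolve(link, "swath/data/radiance.nc", moved.get, resolver=registry)
```

In the first call the relative path still works and the identifier check confirms it; in the second the relative path finds nothing, and the identifier alone is enough to locate the file. If a different file has been dropped at the expected path, the identifier check fails and `resolve` raises `LookupError` rather than silently returning the wrong variable.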

CF could recommend something like this ...

Bryan

-- 
Bryan Lawrence
Director of Environmental Archival and Associated Research
(NCAS/British Atmospheric Data Centre and NCEO/NERC NEODC)
STFC, Rutherford Appleton Laboratory
Phone +44 1235 445012; Fax ... 5848; 
Web: home.badc.rl.ac.uk/lawrence

Received on Thu Nov 19 2009 - 23:23:10 GMT
