---
Dr Stephen Emsley    Tel: +44 (0)1752 764 289
ARGANS Limited       Mobile: +44 (0)7912 515 418

-----Original Message-----
From: cf-metadata-bounces at cgd.ucar.edu [mailto:cf-metadata-bounces at cgd.ucar.edu] On Behalf Of John Caron
Sent: 20 November 2009 12:30
To: cf-metadata at cgd.ucar.edu
Subject: [CF-metadata] Multiple file datasets (was: Swath observational data)

This topic deserves its own heading, so here it is.

Perhaps we should gather current practices and ideas. I think Balaji's
gridspec has a proposal about this. Can anyone summarize what SAFE does?

I'm imagining how this is actually used, e.g.:

  float data(y,x);
    data:coordinates = "lat at file1 lon at file2";

????

John Graybeal wrote:
> I like Bryan's recommendation for a UUID or similar.
>
> Now I'm going to be annoying and suggest the UUID *could* be a URI, or
> these days, an IRI (International ..).
>
> And I think the way of 'locating' the file should be neither in
> packaging nor in local resolution; it should be in global namespace
> resolution. This is the way of the future, and is already more
> 'permanent' than either packaging or local resolution, IMHO.
>
> There is one form of URI in particular that is already resolvable: a
> URL. OK, that's an old song, but I'm gonna stick to it for a while
> longer. That form meets all the other requirements: it can be
> registered in a resolver, it can be guaranteed unique (to the same
> authority level as a UUID, anyway), and it is a unique string that can
> be used to validate the link. And it has the obvious benefit of being
> resolvable right now, for as long as the domain is held and properly
> maintained (Good URLs don't die).
>
> Since the last paragraph risks starting another unique identifier war, I
> promise not to re-engage unless someone asks me to.
> Meanwhile, I like
>
> John
>
>
> On Nov 19, 2009, at 22:23, Bryan Lawrence wrote:
>
>> On Thursday 19 November 2009 19:40:08 Jonathan Gregory wrote:
>>>> ... In some cases, referencing attributes such as
>>>> "coordinates" and "ancillary_variables" would, ideally, point to a
>>>> variable in a different dataset.
>>>
>>> This is a general problem to which CF doesn't have a solution
>>> because it was conceived as a convention for single netCDF files.
>>> However we need a solution as often several files should be treated
>>> as a single dataset.
>>>
>>> If the files don't overlap, i.e. their contents are complementary, I
>>> think it should be satisfactory to allow variables in one file to be
>>> pointed to by name from another file, with no other mechanism being
>>> required within the file. I don't like the idea of naming one file
>>> within another file, as that would be very fragile. Instead, I think
>>> the file aggregation should be implied by simply defining the group
>>> of files which are to be treated as one file, e.g. by putting them
>>> in one directory.
>>
>> It's the old ones that are the best ones :-) :-) this issue keeps on
>> coming back ... :-) :-) and we keep trying to ignore it ...
>>
>> I think we agree that an actual physical filename including path is
>> useless. We need both a relative link which relies on the
>> preservation of a group of files in a particular arrangement ... AND
>> an internal identifier so more robust linking mechanisms can be used
>> when (if) the data ends up in a managed environment.
>>
>> I think it's crucial in this situation to ensure that each file has a
>> unique identifier within it (created, for example, with uuid), because
>> all solutions which rely on packaging are fragile (SAFE is probably
>> better than most), but the bottom line is that users move files around
>> ... and we need some way of ensuring that we/they can validate that
>> the links in place are the ones that were originally intended.
>>
>> So relative links would also include the identifier of the intended
>> target as well as the relative path in operating system agnostic terms.
>>
>> That identifier can be used in two ways: a) to validate the link (my
>> software can always check that the variable that I just opened
>> following a link from another one is the one that was expected by
>> checking the container identifier), and b) to produce an identifier
>> resolver service for the situation where the packaging has had to be
>> broken (which might occur for performance reasons or ...)
>>
>> CF could recommend something like this ...
>>
>> Bryan
>>
>> --
>> Bryan Lawrence
>> Director of Environmental Archival and Associated Research
>> (NCAS/British Atmospheric Data Centre and NCEO/NERC NEODC)
>> STFC, Rutherford Appleton Laboratory
>> Phone +44 1235 445012; Fax ... 5848;
>> Web: home.badc.rl.ac.uk/lawrence
>> _______________________________________________
>> CF-metadata mailing list
>> CF-metadata at cgd.ucar.edu
>> http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
>
>
> --------------
> I have my new work email address: jgraybeal at ucsd.edu
> --------------
>
> John Graybeal <mailto:jgraybeal at ucsd.edu>
> phone: 858-534-2162
> Development Manager
> Ocean Observatories Initiative Cyberinfrastructure Project:
> http://ci.oceanobservatories.org
> Marine Metadata Interoperability Project: http://marinemetadata.org

Received on Fri Nov 20 2009 - 06:12:09 GMT
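
The linking scheme Bryan Lawrence describes in the quoted text (a relative
path plus the identifier of the intended target, used both to validate a
link and to fall back on a resolver) could be sketched roughly as below.
This is only an illustration under stated assumptions, not anything CF
defines: the "variable|relative/path|uuid" link syntax, the "uuid" global
attribute name, and the ExternalLink, parse_link, validate_link and resolve
names are all hypothetical, and plain dictionaries stand in for opened
netCDF files and for a resolver service.

```python
import uuid
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import Optional


@dataclass(frozen=True)
class ExternalLink:
    """A reference to a variable held in another file (hypothetical)."""
    variable: str        # name of the target variable, e.g. "lat"
    relative_path: str   # POSIX-style path, relative to the referring file
    target_uuid: str     # identifier expected inside the target file


def parse_link(text: str) -> ExternalLink:
    """Parse the assumed 'variable|relative/path|uuid' link syntax."""
    variable, relative_path, target_uuid = text.split("|")
    uuid.UUID(target_uuid)  # raises ValueError if not a well-formed UUID
    if PurePosixPath(relative_path).is_absolute():
        raise ValueError("link paths must be relative")
    return ExternalLink(variable, relative_path, target_uuid)


def validate_link(link: ExternalLink, target_attrs: dict) -> bool:
    """Use (a): check that the opened file is the one the link intended."""
    return target_attrs.get("uuid") == link.target_uuid


def resolve(link: ExternalLink, packaged: dict, resolver: dict) -> Optional[dict]:
    """Try the packaged relative path first; if the file is missing or
    carries a different identifier, fall back to use (b), a lookup by
    identifier. Both arguments are dict stand-ins for real I/O."""
    attrs = packaged.get(link.relative_path)
    if attrs is not None and validate_link(link, attrs):
        return attrs
    return resolver.get(link.target_uuid)
```

In this sketch a file that has been moved or replaced fails validation
instead of being read silently, which is the point of carrying the
identifier alongside the relative path.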