Well said, John!
There are many good reasons for storing discovery and provenance metadata within our netCDF files. You can make an argument for storing such metadata only elsewhere, but I think this is only justified if you have a publicly accessible database system in place to serve the information, can rely on it to be there for the long haul, and you register every file you generate with that system. Even if you have such a system, I think the ability to recover the database by crawling your netCDF files is valuable.
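As a rough illustration of John's point below about transform utilities, here is a minimal Python sketch of a subsetting step that recomputes content-derived attributes instead of copying them blindly. Everything in it is an assumption made for illustration: the in-memory dict layout, the `subset` function, the `COPY_AS_IS` list, and the attribute names (`time_coverage_start`, `time_coverage_end`, `actual_min`, `actual_max`) are hypothetical stand-ins, not any particular library's API or an agreed convention.

```python
# Attributes assumed (for this sketch only) not to depend on which records are kept.
COPY_AS_IS = {"institution", "Conventions"}


def subset(dataset, start, stop, tool="subset-sketch"):
    """Return records [start:stop] with metadata rebuilt from the kept data."""
    times = dataset["time"][start:stop]
    values = dataset["values"][start:stop]
    if not values:
        raise ValueError("subset is empty; nothing to describe")

    old = dataset["attrs"]
    # Copy only attributes we understand and know to be content-independent.
    # Anything else (publisher, license, ...) is dropped rather than guessed.
    attrs = {k: v for k, v in old.items() if k in COPY_AS_IS}

    # Recompute attributes that are a function of the content.
    attrs["time_coverage_start"] = times[0]
    attrs["time_coverage_end"] = times[-1]
    attrs["actual_min"] = min(values)
    attrs["actual_max"] = max(values)

    # Extend the provenance trail instead of overwriting it.
    entry = f"{tool}: kept records [{start}:{stop}]"
    prior = old.get("history", "")
    attrs["history"] = f"{prior}\n{entry}" if prior else entry

    return {"attrs": attrs, "time": times, "values": values}


source = {
    "attrs": {
        "institution": "Example Lab",
        "publisher": "Example Archive",
        "time_coverage_start": "2013-01-01",
        "time_coverage_end": "2013-01-05",
        "actual_min": -2.0,
        "actual_max": 9.5,
        "history": "created",
    },
    "time": ["2013-01-01", "2013-01-02", "2013-01-03", "2013-01-04", "2013-01-05"],
    "values": [-2.0, 1.0, 9.5, 3.0, 4.0],
}

result = subset(source, 3, 5)
```

The design choice here is a whitelist: an attribute the utility does not recognize is left out of the output, on the theory that missing metadata is easier to notice and repair than metadata that is silently wrong.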
Grace and peace,
Jim
Jim Biard
Research Scholar
Cooperative Institute for Climate and Satellites
Remote Sensing and Applications Division
National Climatic Data Center
151 Patton Ave, Asheville, NC 28801-5001
jim.biard at noaa.gov
828-271-4900
On May 23, 2013, at 12:00 PM, John Graybeal <graybeal at marinemetadata.org> wrote:
> +1 Martin. I am bugged (not the technical term) by the conclusions here, which seem to be: Because people design systems badly, I must constrain my own system to accommodate their failures.
>
> The use cases for storing the summary information with the file are: (A) It's faster to access, which in some circumstances affects a user (or the cost of computer cycles), whether due to large files or lots of files. (B) In some circumstances (I don't have a netCDF file mangler app sitting in hand), it's the only reasonable way to access it.
>
> If someone is writing a subsetting or aggregating utility, and that utility is blindly copying over every metadata item it sees, then a whole lot of metadata is going to be wrong (Publisher, Provenance, Last Updated, Time and/or Geospatial Range, Min/Max Values, Licensing Permission, to name a few). This metadata isn't fragile, it's a function of the content. The person who writes the transform utility must either create all new metadata, or understand the kind of metadata they are copying over and make any necessary changes.
>
> John
>
> On May 23, 2013, at 08:10, "Schultz, Martin" <m.schultz at fz-juelich.de> wrote:
>
>>>> ... but computing min & max on the fly can also be very expensive.
>>>> We have aggregated model output datasets where each variable is more
>>>> than 1TB!
>>
>>> Sure, I can see that that's useful metadata about the dataset, and that
>>> there's value in caching it somewhere. I just don't think it belongs with
>>> the metadata inside the netcdf file. What's the use case for storing it
>>> there?
>>
>> Dear all,
>>
>> that may be an issue of "style", or more technically speaking the way you set up your system(s). I do think there is use for this as soon as you take a file out of an interoperable context. However, it's a very good and valid point to say that this information can (very) easily get corrupted. Thus it may be good to define some way of marking "fragile" metadata (i.e. metadata that can be corrupted by slicing or aggregating data from a file, maybe even from several files). In fact this is related to the issue of tracking metadata information in the data model, which has been brought up in the track ticket but was referred to the implementation...
>>
>> Cheers,
>>
>> Martin
>>
>> _______________________________________________
>> CF-metadata mailing list
>> CF-metadata at cgd.ucar.edu
>> http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
>>
>
>
> ---------------
> John Graybeal
> Marine Metadata Interoperability Project: http://marinemetadata.org
> graybeal at marinemetadata.org
>
Received on Thu May 23 2013 - 10:31:49 BST