
[CF-metadata] Provide input on draft of STAC "Datacube" Extension

From: Sean Arms <sarms>
Date: Tue, 22 Jan 2019 08:57:05 -0700

Greetings Ryan!

I'll be the first to admit that we do not do a good job delineating the
different pieces covered by the THREDDS umbrella (e.g. Siphon, the catalog
spec, netCDF-java, the THREDDS Data Server (TDS), etc.), many of which are
in one monolithic repository (we will be tackling this soon), and I totally
understand thinking "Java" when hearing THREDDS. The TDS does have a RESTful
catalog API, which allows for linking catalogs, performing catalog
subsetting, and generating client catalogs and their associated HTML views
for ease of browsing. If useful, I could see implementing the STAC catalog
API as well to help bridge communities. The TDS is currently capable of
generating THREDDS client catalogs for collections stored in more
cloud-friendly environments as well, such as S3, and we are looking at Zarr
support at the netCDF-java level. That said, I'd like to hold off a bit to
see if STAC begins to align better with schema.org (I see there is an issue
open about this, and it looks promising); I've recently been looking at
extending the THREDDS metadata elements to better align with the schema.org
Dataset and DataCatalog objects so that we can produce a richer mapping.
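
As a rough illustration of the kind of mapping mentioned above, here is a
minimal sketch that reads a THREDDS client catalog and emits schema.org
Dataset records as JSON-LD. It is not how the TDS does (or will do) this. The
catalog content, the example.org base URL, and the function name
thredds_to_schema_org are all made up for illustration, and building the
access URL by simple concatenation is a simplifying assumption.

```python
import json
import xml.etree.ElementTree as ET

# XML namespace used by THREDDS InvCatalog 1.0 documents.
THREDDS_NS = "http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"

# Hypothetical catalog, invented for this example.
CATALOG_XML = f"""<catalog name="Example catalog" xmlns="{THREDDS_NS}">
  <service name="http" serviceType="HTTPServer" base="/thredds/fileServer/"/>
  <dataset name="Example SST analysis" ID="example/sst" urlPath="example/sst.nc"/>
  <dataset name="Example wind analysis" ID="example/wind" urlPath="example/wind.nc"/>
</catalog>
"""


def thredds_to_schema_org(catalog_xml, base_url):
    """Map each accessible THREDDS dataset to a schema.org Dataset dict.

    base_url is assumed to already include the server and service base,
    so the access URL is simply base_url + urlPath.
    """
    root = ET.fromstring(catalog_xml)
    catalog_name = root.get("name", "")
    records = []
    for ds in root.iter(f"{{{THREDDS_NS}}}dataset"):
        url_path = ds.get("urlPath")
        if url_path is None:
            continue  # skip container datasets with no direct access URL
        records.append({
            "@context": "https://schema.org/",
            "@type": "Dataset",
            "name": ds.get("name"),
            "identifier": ds.get("ID"),
            "url": base_url + url_path,
            "includedInDataCatalog": {
                "@type": "DataCatalog",
                "name": catalog_name,
            },
        })
    return records


records = thredds_to_schema_org(
    CATALOG_XML, "https://example.org/thredds/fileServer/"
)
print(json.dumps(records, indent=2))
```

Only name, identifier, url, and includedInDataCatalog are filled in here; a
richer mapping would need metadata that a bare dataset element does not
carry.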

I'm happy to take this discussion to another venue, as I don't want to take
away from the topic at hand. Perhaps we could meet up at the EarthCube All
Hands meeting if you will be in town.

Cheers!

Sean


On Mon, Jan 21, 2019 at 4:16 AM Ryan Abernathey <ryan.abernathey at gmail.com>
wrote:

> Ryan,
>
> That's a very good question. I'm not sure there IS a definitive advantage
> of STAC over THREDDS. My main goal here is just to have some discussion,
> since STAC is growing quickly in the adjacent world of geospatial imagery.
> The conclusion may very well be that we don't need STAC.
>
> Within Pangeo, our interest in STAC is in the context of putting netCDF
> (or more accurately, netCDF-like Zarr) data into cloud storage as static
> assets. We are looking for a way to catalog those assets, also in a static
> file, and STAC is one thing we are considering. TBH, it never even occurred
> to me that we could generate a static THREDDS catalog.xml file and use this
> to describe our assets. In my mind, THREDDS catalogs are intimately bound
> to OPeNDAP servers, and the THREDDS server in particular. But after reading
> the THREDDS spec document you sent, I see that I am wrong about that.
>
> Comparing the STAC vs. THREDDS specs, the main difference to me is the
> relatively high complexity of the THREDDS spec. I could pretty easily
> figure out how to write a STAC catalog for my data in a text editor. I
> can't say the same for THREDDS. But that is not a dealbreaker.
>
> Beyond that, the STAC principles
> <https://github.com/radiantearth/stac-spec/blob/master/principles.md> are
> pretty appealing:
>
> - Creation and evolution of specs in Github, using Open Source principles
> - JSON + REST + HTTP at the core
> - Small Reusable Pieces Loosely Coupled
> - Specify in OpenAPI 3.0 (formerly known as Swagger) specification
> - Focus on the developer. Specifications should aim for implementability
> - Working code required. Proposed changes should be accompanied by working
> code (ideally with a link to an online service running the code)
> - Design for scale
>
> The first one--evolution of specs in GitHub--is particularly important as
> we explore new cloud-native approaches to data sharing. Perhaps the THREDDS
> spec is indeed managed in such a way, but I couldn't find the repo. For the
> second (JSON vs. XML), the THREDDS stack to me feels pretty Java-centric in
> general, based on an older generation of web technology. I don't want to
> have to learn Java. But again, I recognize it is possible to just use the
> spec and not necessarily the software around it.
>
> Best,
> Ryan
>
> On Sun, Jan 20, 2019 at 11:40 PM Ryan May <rmay at ucar.edu> wrote:
>
>> Ryan,
>>
>> Can you enumerate some of the advances that STAC brings over the existing
>> THREDDS catalog spec?
>> https://www.unidata.ucar.edu/software/thredds/v4.6/tds/catalog/InvCatalogSpec.html
>>
>> From what I can tell, a lot of the concepts are similar, and the existing
>> spec is already supported by tools beyond THREDDS, like Hyrax and ERDDAP. I
>> think JSON is great and all, but I'm curious what adding a new standard
>> brings us rather than extending an existing, already-supported standard?
>>
>> Ryan
>>
>> On Fri, Jan 18, 2019 at 7:23 AM Ryan Abernathey <
>> ryan.abernathey at gmail.com> wrote:
>>
>>> Dear CF Conventions People,
>>>
>>> Some of you may be aware of the Spatio-Temporal Asset Catalog (STAC)
>>> project.
>>> https://github.com/radiantearth/stac-spec/
>>> STAC is basically a JSON-based specification which aims to standardize the
>>> way geospatial assets are exposed online and queried. It originated from
>>> the geospatial imaging community, and is described in this blog post by
>>> Chris Holmes:
>>>
>>> https://medium.com/radiant-earth-insights/announcing-the-spatiotemporal-asset-catalog-stac-specification-1db58820b9cf
>>>
>>> There is currently some discussion on the STAC repo about how STAC could
>>> be useful for netCDF-type data.
>>> https://github.com/radiantearth/stac-spec/issues/366
>>> This is a technology space that is currently occupied by THREDDS and
>>> OPeNDAP. But something like STAC could be very useful for our community,
>>> especially as more netCDF-style data moves into the cloud.
>>>
>>> In particular, there is a proposed extension to STAC to describe "data
>>> cubes," which is roughly what geospatial imaging people call netCDF-type
>>> gridded datasets:
>>> https://github.com/radiantearth/stac-spec/pull/361
>>>
>>> I wanted to email this mailing list to see if someone from the CF
>>> community could weigh in on this proposed standard. More generally, if the
>>> CF / netCDF community wanted to engage with the STAC people more broadly, I
>>> think there is quite a bit of potential.
>>>
>>> Cheers,
>>> Ryan Abernathey
>>>
>>> _______________________________________________
>>> CF-metadata mailing list
>>> CF-metadata at cgd.ucar.edu
>>> http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
>>>
>>
>>
>> --
>> Ryan May, Ph.D.
>> Software Engineer
>> UCAR/Unidata
>> Boulder, CO
>>
Received on Tue Jan 22 2019 - 08:57:05 GMT