
[CF-metadata] Provide input on draft of STAC "Datacube" Extension

From: Ryan Abernathey <ryan.abernathey>
Date: Sun, 27 Jan 2019 11:02:53 +0100

Hi Sean et al,

Glad to hear that you folks are interested in continuing the discussion. I
personally think there is a lot of value in breaking these catalog /
metadata standards apart from the language-specific implementations. As you
said, this may already be done, but the online docs don't always make this
clear.

I propose we continue the discussion in
https://github.com/radiantearth/stac-spec/issues/366#issuecomment-457875806
in order to avoid further traffic on the cf-metadata mailing list.

In that thread, Chris Holmes said the following:

> Hyrax looks pretty awesome and the creators of it definitely sound like a
> group we should be talking to. Any chance of introducing us? That's awesome
> they have JSON-LD markup, as it's one of our main next goals, see #378. I
> think we'd be interested in learning from them, to help make STAC better.

So they are really looking for input. If you or anyone else wants to chime
in, you would be very welcome. We are currently discussing scheduling a
call for early February to address THREDDS / OpenDAP / STAC integration.

Cheers,
Ryan


On Tue, Jan 22, 2019 at 4:57 PM Sean Arms <sarms at ucar.edu> wrote:

> Greetings Ryan!
>
> I'll be the first to admit that we do not do a good job delineating the
> different pieces covered by the THREDDS umbrella (e.g. Siphon, the catalog
> spec, netCDF-java, the THREDDS Data Server (TDS), etc.), many of which are
> in one monolithic repository (will be tackling this soon), and I totally
> understand thinking "java" when I hear THREDDS. The TDS does have a RESTful
> catalog API, which allows for linking catalogs, performing catalog
> subsetting, and generating client catalogs and their associated html views
> for ease of browsing. If useful, I could see implementing the STAC catalog
> API as well to help bridge communities. The TDS is currently capable of
> generating THREDDS client catalogs for collections stored in more cloud
> friendly environments as well, such as S3, and we are looking at zarr
> support at the netCDF-java level. That said, I'd like to hold off a bit to
> see if STAC begins to align better with schema.org (I see there is an
> issue open about this, and it looks promising); I've recently been looking
> at extending the THREDDS metadata elements to better align with the
> schema.org Dataset and DataCatalog objects so that we can produce a
> richer mapping.
>
> I'm happy to take this discussion to another venue, as I don't want to
> take away from the topic at hand. Perhaps we could meet up at the EarthCube
> All Hands meeting if you will be in town.
>
> Cheers!
>
> Sean
>
>
> On Mon, Jan 21, 2019 at 4:16 AM Ryan Abernathey <ryan.abernathey at gmail.com>
> wrote:
>
>> Ryan,
>>
>> That's a very good question. I'm not sure there IS a definitive advantage
>> of STAC over THREDDS. My main goal here is just to have some discussion,
>> since STAC is growing quickly in the adjacent world of geospatial imagery.
>> The conclusion may very well be, we don't need STAC.
>>
>> Within Pangeo, our interest in STAC is in the context of putting netCDF
>> (or more accurately, netCDF-like Zarr) data into cloud storage as static
>> assets. We are looking for a way to catalog those assets, also in a static
>> file, and STAC is one thing we are considering. TBH, it never even occurred
>> to me that we could generate a static THREDDS catalog.xml file and use this
>> to describe our assets. In my mind, THREDDS catalogs are intimately bound
>> to opendap servers, and the THREDDS server in particular. But after reading
>> the THREDDS spec document you sent, I see that I am wrong about that.
>>
>> Comparing the STAC vs. THREDDS specs, the main difference to me is the
>> relatively high complexity of the THREDDS spec. I could pretty easily figure
>> out how to write a STAC catalog for my data in a text editor. I can't say
>> the same for THREDDS. But that is not a dealbreaker.
>>
>> Beyond that, the STAC principles
>> <https://github.com/radiantearth/stac-spec/blob/master/principles.md>
>> are pretty appealing:
>>
>> - Creation and evolution of specs in Github, using Open Source principles
>> - JSON + REST + HTTP at the core
>> - Small Reusable Pieces Loosely Coupled
>> - Specify in OpenAPI 3.0 (formerly known as Swagger) specification
>> - Focus on the developer. Specifications should aim for implementability
>> - Working code required. Proposed changes should be accompanied by
>> working code (ideally with a link to an online service running the code)
>> - Design for scale
>>
>> The first one--evolution of specs in Github--is particularly important as
>> we explore new cloud-native approaches to data sharing. Perhaps the THREDDS
>> spec is indeed managed in such a way, but I couldn't find the repo. For the
>> second (JSON vs. XML), the THREDDS stack to me feels pretty Java-centric in
>> general, based on an older generation of web technology. I don't want to
>> have to learn Java. But again, I recognize it is possible to just use the
>> spec and not necessarily the software around it.
>>
>> Best,
>> Ryan
>>
>> On Sun, Jan 20, 2019 at 11:40 PM Ryan May <rmay at ucar.edu> wrote:
>>
>>> Ryan,
>>>
>>> Can you enumerate some of the advances that STAC brings over the
>>> existing THREDDS catalog spec?
>>> https://www.unidata.ucar.edu/software/thredds/v4.6/tds/catalog/InvCatalogSpec.html
>>>
>>> From what I can tell, a lot of the concepts are similar, and the
>>> existing spec is already supported by tools beyond THREDDS, like Hyrax and
>>> ERDDAP. I think JSON is great and all, but I'm curious what adding a new
>>> standard brings us rather than extending an existing, already-supported
>>> standard?
>>>
>>> Ryan
>>>
>>> On Fri, Jan 18, 2019 at 7:23 AM Ryan Abernathey <
>>> ryan.abernathey at gmail.com> wrote:
>>>
>>>> Dear CF Conventions People,
>>>>
>>>> Some of you may be aware of the Spatio-Temporal Asset Catalog (STAC)
>>>> project.
>>>> https://github.com/radiantearth/stac-spec/
>>>> STAC is basically a JSON specification which aims to standardize the
>>>> way geospatial assets are exposed online and queried. It originated from
>>>> the geospatial imaging community, and is described in this blog post by
>>>> Chris Holmes:
>>>>
>>>> https://medium.com/radiant-earth-insights/announcing-the-spatiotemporal-asset-catalog-stac-specification-1db58820b9cf
>>>>
>>>> There is currently some discussion on the STAC repo about how STAC
>>>> could be useful for netCDF-type data.
>>>> https://github.com/radiantearth/stac-spec/issues/366
>>>> This is a technology space that is currently occupied by THREDDS and
>>>> OpenDAP. But something like STAC could be very useful for our community,
>>>> especially as more netCDF-style data moves into the cloud.
>>>>
>>>> In particular, there is a proposed extension to STAC to describe "data
>>>> cubes," which is roughly what geospatial imaging people call netCDF-type
>>>> gridded datasets:
>>>> https://github.com/radiantearth/stac-spec/pull/361
>>>>
>>>> I wanted to email this mailing list to see if someone from the CF
>>>> community could weigh in on this proposed standard. More generally, if the
>>>> CF / netCDF community wanted to engage with the STAC people more broadly, I
>>>> think there is quite a bit of potential.
>>>>
>>>> Cheers,
>>>> Ryan Abernathey
>>>>
>>>> _______________________________________________
>>>> CF-metadata mailing list
>>>> CF-metadata at cgd.ucar.edu
>>>> http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
>>>>
>>>
>>>
>>> --
>>> Ryan May, Ph.D.
>>> Software Engineer
>>> UCAR/Unidata
>>> Boulder, CO
>>>
>>
>
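
The thread discusses cataloging netCDF-like Zarr data in cloud storage with a static, hand-writable JSON file. As a rough illustration only, the sketch below builds such a file in Python. The field names (`stac_version`, `id`, `description`, `links`) follow the STAC catalog spec as commonly described and should be checked against the spec version in use. The version string, catalog id, and item paths are hypothetical placeholders, and a real catalog may need further links (for example `self` or `root`) depending on the spec version.

```python
import json


def make_catalog(catalog_id, description, item_hrefs, stac_version="0.6.0"):
    """Build a minimal STAC-style catalog as a plain dict.

    Assumption: the default version string is only a placeholder; use the
    version of the spec you are actually targeting.
    """
    # One "item" link per item description file (hypothetical relative paths).
    links = [{"rel": "item", "href": href} for href in item_hrefs]
    return {
        "stac_version": stac_version,
        "id": catalog_id,
        "description": description,
        "links": links,
    }


# Hypothetical example: two Zarr-backed datasets, each described by its own
# item file stored next to the catalog.
catalog = make_catalog(
    "example-zarr-catalog",
    "Hypothetical catalog of netCDF-like Zarr stores in cloud storage",
    ["sst/item.json", "ssh/item.json"],
)

# Serialize to the text one would save as a static catalog.json file.
catalog_json = json.dumps(catalog, indent=2)
print(catalog_json)
```

The result is plain JSON with no server behind it, so it could sit in the same bucket as the data it describes.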
Received on Sun Jan 27 2019 - 03:02:53 GMT
