[CF-metadata] Provide input on draft of STAC "Datacube" Extension

From: Roy Mendelssohn - NOAA Federal <roy.mendelssohn>
Date: Mon, 28 Jan 2019 08:08:11 -0800

Before taking this off-line (now that I am back at work and can use my work email), can I just add that ERDDAP also has many of these features. In my opinion, a lot of the new efforts are reinventing the wheel. Perhaps taking time to see what is out there, what they do, and why they are doing things in certain ways, would be a worthwhile effort.

-Roy


> On Jan 27, 2019, at 2:02 AM, Ryan Abernathey <ryan.abernathey at gmail.com> wrote:
>
> Hi Sean et al,
>
> Glad to hear that you folks are interested in continuing the discussion. I personally think there is a lot of value in breaking these catalog / metadata standards apart from the language-specific implementations. As you said, this may already be done, but the online docs don't always make this clear.
>
> I propose we continue the discussion in
> https://github.com/radiantearth/stac-spec/issues/366#issuecomment-457875806
> in order to avoid further traffic on the cf-metadata mailing list.
>
> In that thread, Chris Holmes said the following:
>
> > Hyrax looks pretty awesome and the creators of it definitely sound like a group we should be talking to. Any chance of introducing us? That's awesome they have JSON-LD markup, as it's one of our main next goals, see #378. I think we'd be interested in learning from them, to help make STAC better.
>
> So they are really looking for input. If you or anyone else wants to chime in, you would be very welcome. We are currently discussing scheduling a call for early February to address THREDDS / OpenDAP / STAC integration.
>
> Cheers,
> Ryan
>
>
> On Tue, Jan 22, 2019 at 4:57 PM Sean Arms <sarms at ucar.edu> wrote:
> Greetings Ryan!
>
> I'll be the first to admit that we do not do a good job delineating the different pieces covered by the THREDDS umbrella (e.g. Siphon, the catalog spec, netCDF-java, the THREDDS Data Server (TDS), etc.), many of which are in one monolithic repository (will be tackling this soon), and I totally understand thinking "java" when I hear THREDDS. The TDS does have a RESTful catalog API, which allows for linking catalogs, performing catalog subsetting, and generating client catalogs and their associated html views for ease of browsing. If useful, I could see implementing the STAC catalog API as well to help bridge communities. The TDS is currently capable of generating THREDDS client catalogs for collections stored in more cloud friendly environments as well, such as S3, and we are looking at zarr support at the netCDF-java level. That said, I'd like to hold off a bit to see if STAC begins to align better with schema.org (I see there is an issue open about this, and it looks promising); I've recently been looking at extending the THREDDS metadata elements to better align with the schema.org Dataset and DataCatalog objects so that we can produce a richer mapping.
>
> I'm happy to take this discussion to another venue, as I don't want to take away from the topic at hand. Perhaps we could meet up at the EarthCube All Hands meeting if you will be in town.
>
> Cheers!
>
> Sean
>
>
> On Mon, Jan 21, 2019 at 4:16 AM Ryan Abernathey <ryan.abernathey at gmail.com> wrote:
> Ryan,
>
> That's a very good question. I'm not sure there IS a definitive advantage of STAC over THREDDS. My main goal here is just to have some discussion, since STAC is growing quickly in the adjacent world of geospatial imagery. The conclusion may very well be, we don't need STAC.
>
> Within Pangeo, our interest in STAC is in the context of putting netCDF (or more accurately, netCDF-like Zarr) data into cloud storage as static assets. We are looking for a way to catalog those assets, also in a static file, and STAC is one thing we are considering. TBH, it never even occurred to me that we could generate a static THREDDS catalog.xml file and use this to describe our assets. In my mind, THREDDS catalogs are intimately bound to opendap servers, and the THREDDS server in particular. But after reading the THREDDS spec document you sent, I see that I am wrong about that.
>
> Comparing the STAC vs. THREDDS specs, the main difference to me is the relatively high complexity of the THREDDS spec. I could pretty easily figure out how to write a STAC catalog for my data in a text editor. I can't say the same for THREDDS. But that is not a dealbreaker.
>
> Beyond that, the STAC principles are pretty appealing:
>
> - Creation and evolution of specs in Github, using Open Source principles
> - JSON + REST + HTTP at the core
> - Small Reusable Pieces Loosely Coupled
> - Specify in OpenAPI 3.0 (formerly known as Swagger) specification
> - Focus on the developer. Specifications should aim for implementability
> - Working code required. Proposed changes should be accompanied by working code (ideally with a link to an online service running the code)
> - Design for scale
>
> The first one--evolution of specs in Github--is particularly important as we explore new cloud-native approaches to data sharing. Perhaps the THREDDS spec is indeed managed in such a way, but I couldn't find the repo. For the second (JSON vs. XML), the THREDDS stack to me feels pretty java-centric in general, based on an older generation of web technology. I don't want to have to learn java. But again, I recognize it is possible to just use the spec and not necessarily the software around it.
>
> Best,
> Ryan
>
> On Sun, Jan 20, 2019 at 11:40 PM Ryan May <rmay at ucar.edu> wrote:
> Ryan,
>
> Can you enumerate some of the advances that STAC brings over the existing THREDDS catalog spec? https://www.unidata.ucar.edu/software/thredds/v4.6/tds/catalog/InvCatalogSpec.html
>
> From what I can tell, a lot of the concepts are similar, and the existing spec is already supported by tools beyond THREDDS, like Hyrax and ERDDAP. I think JSON is great and all, but I'm curious what adding a new standard brings us rather than extending an existing, already-supported standard?
>
> Ryan
>
> On Fri, Jan 18, 2019 at 7:23 AM Ryan Abernathey <ryan.abernathey at gmail.com> wrote:
> Dear CF Conventions People,
>
> Some of you may be aware of the Spatio-Temporal Asset Catalog (STAC) project.
> https://github.com/radiantearth/stac-spec/
> STAC is basically a .json specification which aims to standardize the way geospatial assets are exposed online and queried. It originated from the geospatial imaging community, and is described in this blog post by Chris Holmes:
> https://medium.com/radiant-earth-insights/announcing-the-spatiotemporal-asset-catalog-stac-specification-1db58820b9cf
>
> There is currently some discussion on the STAC repo about how STAC could be useful for netCDF-type data.
> https://github.com/radiantearth/stac-spec/issues/366
> This is a technology space that is currently occupied by THREDDS and OpenDAP. But something like STAC could be very useful for our community, especially as more netCDF-style data moves into the cloud.
>
> In particular, there is a proposed extension to STAC to describe "data cubes," which is roughly what geospatial imaging people call netCDF-type gridded datasets:
> https://github.com/radiantearth/stac-spec/pull/361
>
> I wanted to email this mailing list to see if someone from the CF community could weigh in on this proposed standard. More generally, if the CF / netCDF community wanted to engage with the STAC people more broadly, I think there is quite a bit of potential.
>
> Cheers,
> Ryan Abernathey
>
> _______________________________________________
> CF-metadata mailing list
> CF-metadata at cgd.ucar.edu
> http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
>
>
> --
> Ryan May, Ph.D.
> Software Engineer
> UCAR/Unidata
> Boulder, CO

**********************
"The contents of this message do not reflect any position of the U.S. Government or NOAA."
**********************
Roy Mendelssohn
Supervisory Operations Research Analyst
NOAA/NMFS
Environmental Research Division
Southwest Fisheries Science Center
***Note new street address***
110 McAllister Way
Santa Cruz, CA 95060
Phone: (831)-420-3666
Fax: (831) 420-3980
e-mail: Roy.Mendelssohn at noaa.gov www: http://www.pfeg.noaa.gov/

"Old age and treachery will overcome youth and skill."
"From those who have been given much, much will be expected"
"the arc of the moral universe is long, but it bends toward justice" -MLK Jr.
Received on Mon Jan 28 2019 - 09:08:11 GMT

This archive was generated by hypermail 2.3.0 : Tue Sep 13 2022 - 23:02:43 BST
