Awesome Google Earth Engine Community Catalog

Samapriya Roy
5 min read · Oct 7, 2022

Samapriya Roy, Valerie Pasquarella and Erin Dawn Trochim

About two years ago, a simple movement toward a data commons, and toward making it easier to publish open datasets, was born out of a data request. The idea was simple: what if Google Earth Engine users, community actors, researchers, and domain experts could help meet the ever-growing need for more data, all while building a searchable data catalog for their own needs? The awesome GEE community catalog lives and serves alongside the Google Earth Engine data catalog and houses datasets that are often requested by the community, under a variety of open licenses. It still draws inspiration from the core idea of how commons across different societies are managed by those who build and are a part of them. You can read more about how we started here.

I like to believe that “Communities are what communities build together”. Thank you for all the support. If you find this work useful, we hope you will ⭐ the GitHub repository; it really supports us and lets others know that this is a well-used resource in the community.

Getting down to Numbers

Over the last two years, the community catalog has grown to over 952 datasets and more than 100 TB of community-contributed and available data. We have redesigned the catalog to make it more searchable and usable for most users, drawing on a lot of feedback and learning from the community. This article focuses on the current structure of the community catalog, improvements, navigation, and getting to where you need to be. We also dive deeper into how you can contribute to the catalog in more than one way and help it grow.

Community Catalog Stats from 2022–10–03

Finding a home

We have a new website. Behind the scenes, most of the data crunching, cleanup, and analysis needed to get the data ready for the community catalog happens on the National Science Foundation-funded Jetstream 2 project. More behind-the-scenes details will follow in a later article.

Awesome GEE Community Catalog logo

For now, you can head over to the community catalog at

https://gee-community-catalog.org/

Getting Started

For months we have focused on streamlining how users can contribute to and navigate the catalog. We have made improvements based on this work, and we hope the Getting started section lets you focus on navigation and provides direct links to the templates we use for interaction. We have new templates for everything from adding new datasets and reporting bugs to helping us build a series of examples fitting these datasets. Find the templates and more in the Getting started section.

Sample Template to add new datasets to the community catalog

The Catalog Examples Repo in GEE

An obvious benefit of learning from great examples is seeing how the world expects you to build what works. We learned from the Examples folder in GEE and wanted to push ourselves to make all the datasets available through examples. Now you can add the GEE repository and have it updated automatically as we add new datasets.

The repository should appear in the Reader section of your repo list; you might have to refresh once if you don’t see it. Use this link to add the examples repo from the community catalog to your GEE repos.

Adding and exploring the GEE community catalog repo

Finding the right fit: Searching the Catalog

The datasets are grouped under domains; you can expand them to get to an individual dataset of interest.

Using domains and groups to look for data

You can also search using the search bar with keywords, tags, and even the names of authors from the papers and sources of these datasets.

Using the search tool to look for keywords, tags, authors & more
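
To make the idea of searching by keywords, tags, and author names concrete, here is a minimal Python sketch over a small in-memory list. The record fields (`title`, `domain`, `tags`, `authors`), the sample entries, and the `search` function are all hypothetical illustrations; this is not how the catalog website implements its search.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetEntry:
    """Hypothetical record; field names are assumptions, not the catalog's schema."""
    title: str
    domain: str
    tags: list[str] = field(default_factory=list)
    authors: list[str] = field(default_factory=list)


def search(entries, query):
    """Return entries whose title, tags, or authors contain every query word."""
    words = query.lower().split()
    results = []
    for entry in entries:
        # Build one lowercase string to match against (simple substring matching).
        haystack = " ".join([entry.title, *entry.tags, *entry.authors]).lower()
        if all(word in haystack for word in words):
            results.append(entry)
    return results


# Made-up sample entries for illustration only.
CATALOG = [
    DatasetEntry("ASTER GDEM v3", "Elevation", ["dem", "elevation"], ["Example Author A"]),
    DatasetEntry("Example Land Cover Map", "Land Use", ["landcover"], ["Example Author B"]),
]

if __name__ == "__main__":
    for hit in search(CATALOG, "elevation"):
        print(hit.title)
```

Substring matching keeps the sketch short; a real search tool would more likely tokenize and rank results.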

Anatomy of a dataset page

The dataset pages in the catalog have a general outline. Usually, a page is split into

  • Description: Includes a description of the dataset(s) and links to the paper and the source files for the datasets.
  • Dataset citation: We love what community members produce, so citations are crucial. This can be a paper citation as well as a dataset citation, where one is provided.
  • Earth Engine snippets: This is designed to introduce you to all assets that are part of the dataset. This could be all collections, tables, images, and so on.
  • Sample Script: It was essential to add a way for users to get started with these datasets, so the sample scripts are designed to help with just that.
  • License: There are hundreds of grants available across different datasets, and it is important to highlight which datasets are covered by which specific license before use.
  • Keywords & Curation: While some users have ingested their datasets into GEE, curation information and who curates the datasets are essential in case their access is modified or things change.

Anatomy of a single dataset (ASTER GDEM v3)
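
If you are drafting a page for a new dataset, the outline above can double as a checklist. The following is a minimal Python sketch that reports which sections a draft is still missing. The section names come from the outline above, but the `missing_sections` helper, the sample draft, and the assumption that each section appears as its own heading line (optionally prefixed with `#`) are hypothetical and not part of the catalog's tooling.

```python
# Section names taken from the outline above; the checker is a hypothetical helper.
EXPECTED_SECTIONS = [
    "Description",
    "Dataset citation",
    "Earth Engine snippets",
    "Sample Script",
    "License",
    "Keywords & Curation",
]


def missing_sections(page_text):
    """Return expected section names that do not appear as a line in the page."""
    # Normalize each line: drop surrounding whitespace and leading '#' markers.
    present = {
        line.strip().lstrip("#").strip().lower()
        for line in page_text.splitlines()
    }
    return [name for name in EXPECTED_SECTIONS if name.lower() not in present]


# Made-up draft page for illustration only.
draft = """# Example dataset
## Description
A short summary of the dataset.
## Dataset citation
Author (year). Title.
## License
CC-BY-4.0
"""

if __name__ == "__main__":
    print(missing_sections(draft))
```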

Release & Citation

While we keep adding data continuously, we hope to create a release once every month, which will tie to our catalog citations. We have had many questions on how to cite our catalog, and we hope this gives users like you a quick way to do just that, in addition to using the data citations listed and highlighted on the dataset pages.

Samapriya Roy, Erin Trochim, Alec L. Robitaille, & Valerie Pasquarella. (2022). samapriya/awesome-gee-community-datasets: Community Catalog (1.0.0). Zenodo. https://doi.org/10.5281/zenodo.7144934

This has been a labor of love, and we hope to continue building and enriching the community catalog. There will be lessons we learn, improvements we make, and, more importantly, people we meet and friends we make, just like at the Geo for Good conference this year, 2022. Look out for the Geo for Good session video on this and more to be posted here. Also, look out for the upcoming Medium series on how to get your data into Google Earth Engine, with lessons from the community catalog.

Samapriya Roy

Remote sensing applications, large scale data processing and management, API applications along with network analysis and geostatistical methods