How ClimateEngine.org and Awesome GEE Community Catalog are Expanding Open Geospatial Commons

Samapriya Roy
Oct 5, 2023

In conversation with Justin Huntington and Eric Jensen

What are we if not storytellers with data? There are more narratives hidden in a time series than in collective memories, and far more than just raw, cold data.

No one knows how to tell stories better with data than the Desert Research Institute (DRI). As Justin Huntington, a Research Professor at DRI and Climate Engine project lead, once explained, “If you’re trying to study climate and environmental change, nothing beats the combination of maps and time series.”

Climate Engine Banner from ClimateEngine.org

ClimateEngine.org was founded in 2014 on the belief that open access to Earth observation data can empower discoveries that improve lives. Its public platform makes decades of satellite imagery and climate data available for use without the need to code. Now, the ClimateEngine.org partnership between the Desert Research Institute (DRI) and the University of California, Merced (UC-Merced) is expanding its contribution to the open data commons by making its internal Google Earth Engine datasets publicly available through the Awesome GEE Community Catalog.

When asked about Climate Engine’s origin story, Justin recalls, “In 2014 we got a Google Faculty grant to bring drought and evapotranspiration data into Earth Engine. This allowed agencies to run algorithms at scale instead of moving data locally.” Justin adds, “Everyone was looking to scale what they were doing. It drove rapid adoption beyond just drought and ET.”

The Awesome GEE Community Catalog was started in 2020 by Samapriya Roy to promote the openness and accessibility of geospatial data. This unfunded open-source project has grown rapidly to over 320TB of data and 1350+ datasets across diverse themes.

Awesome GEE Community Catalog

Just as shared norms create value in offline commons, principles of openness reduce friction for downstream users of digital commons. The Awesome GEE Community Catalog is a manifestation of the belief that, as I have always said,

“Communities are what communities build together.”

Recently, DRI researchers Justin Huntington and Eric Jensen reached out to collaborate with me on bringing many of the datasets at ClimateEngine.org over to the Awesome Google Earth Engine (GEE) Community Catalog, resulting in the addition of 29 new datasets. By contributing national and global datasets on precipitation, fire, evapotranspiration, subseasonal to seasonal climate and drought forecasts, and more, we are all hoping to unlock new potential uses of these resources. The next few sections cover the questions I asked Eric and Justin, and the thoughts they shared, during our conversations. To find all the datasets that were contributed, check out the changelog.

Dataset subsets from ClimateEngine.org contributions to the Awesome GEE Community Catalog

Sam: Earth Engine is typically effective at aggregating datasets. However, agencies often face challenges in recruiting and training staff to use it. ClimateEngine.org seems to address this issue by serving as a bridge. What is your perspective on this?

Eric: We receive funding from Federal agencies, each with unique requirements and objectives for their datasets. Many of these agencies have sponsored various projects that have produced datasets, but the current approach of providing multiple tools and websites for agency staff to access these datasets isn’t the most efficient. This is where ClimateEngine.org plays a crucial role by providing a unified platform for performing analysis. This platform not only offers a centralized location to access common analysis tools and curated datasets but also simplifies the work involved in applying these resources, including creating derivative products such as automated reports.

Adding operational vegetation cover datasets like the Rangeland Analysis Platform (RAP) and the Rangeland Condition Monitoring Assessment and Projection (RCMAP) to Climate Engine reduces friction for resource managers and researchers who use these datasets to inform decisions and make findings. This map shows tree cover reduction after a sagebrush restoration project to support sage grouse habitat in Idaho, US.

According to Eric, a key motivation was giving users worldwide easy access to publicly funded data through a common platform. This reduced barriers for researchers and decision-makers alike. Climate Engine’s tools and datasets are now relied upon by thousands of users around the world each month.

Sam: What do you envision as the next steps? It’s evident that there’s a drive to incorporate more data and foster more collaborations. What developments are you observing?

Justin: “As we incorporated additional datasets, we recognized the potential to perform numerous calculations within the app, resulting in variables or datasets that could stand on their own. One example was producing maps of anomalies of the normalized difference vegetation index (NDVI) from Landsat or MODIS, which blew our minds at the time. We realized we could make those calculations available easily within the app. Our aim is to modularize and provide code and variables that users can easily access. The next phase in this gradual evolution is linking data to methodologies and providing code and notebook tutorials. As a hydrologist, I never thought I’d see papers citing precipitation and NDVI data created with the Climate Engine app for urban poverty and human health studies, but it’s happening more and more. There are just so many needs, and now with all these multidisciplinary data and calculation options at our fingertips, the important stories to tell are endless.”

The Climate Engine team discovered while developing the app that they could make mapping NDVI anomalies easy for users. When paired with real-time drought information, also in the app, it unlocks the ability to perform powerful analysis of vegetation sensitivity to drought stress.
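To make the idea of an NDVI anomaly concrete, here is a minimal sketch in plain Python. It is not Climate Engine's implementation: it assumes the simplest definition of an anomaly (the current value minus the mean of the same period in baseline years), and the reflectance and baseline values are made up for illustration.

```python
from statistics import mean


def ndvi(nir, red):
    """Normalized difference vegetation index from NIR and red reflectance."""
    return (nir - red) / (nir + red)


def ndvi_anomaly(current, baseline):
    """Current NDVI minus the mean NDVI of the baseline years."""
    return current - mean(baseline)


# Hypothetical July NDVI values for a single pixel (illustration only).
baseline_july = [0.42, 0.45, 0.40, 0.44, 0.43]

# Hypothetical surface reflectance for the current July.
current_july = ndvi(nir=0.30, red=0.15)

anomaly = ndvi_anomaly(current_july, baseline_july)
print(f"NDVI anomaly: {anomaly:+.3f}")
```

A negative anomaly means the pixel is less green than usual for that time of year, which is the kind of signal that becomes interesting when paired with drought information.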

Eric: “We were also surprised to find 70% of our website traffic comes from outside the US, with significant traffic coming from some of the most biologically rich nations in the world, including India, Colombia, Indonesia, Mexico, Brazil, and Argentina. It validates the tremendous value of global open data.”

Sam: Let’s delve into the data aspect. I see various themes like Evapotranspiration, Drought, Climate Grids, and more. To kick things off, could you tell me about the Drought outlook dataset, especially considering my role in maintaining the United States Drought Monitor weekly data?

Justin: “Certainly, the Seasonal Drought Outlook dataset extends large-scale drought trends guided by statistical and dynamical forecasts. It provides a forward-looking view for the next three months, offering an outlook map that predicts whether drought conditions are expected to improve, worsen, or remain the same. Using this outlook in combination with more detailed current condition maps and downscaled forecast indices available in Climate Engine, such as gridMET-CFS evaporative demand forecast anomalies, is super powerful for drought early warning.”

Sam: With your extensive list of approximately 29 datasets, can you highlight some of your personal favorites?

Justin: “One dataset I find particularly interesting is the US Drought Monitor, along with the advanced calculations that you can now perform easily with a few clicks, like median, trends, and probabilities over the period of record. Other noteworthy datasets are those that include potential or reference evapotranspiration, such as MERRA2, AgERA5, and the Forecast Reference Crop Evapotranspiration (FRET) dataset. It’s fascinating because it combines multiple input variables like solar radiation, air temperature, wind speed, and humidity. The real power of these datasets shines when we can create visual narratives within seconds using maps and time series that combine land surface observations, like MODIS land surface temperature, with climate anomalies and more, to hypothesize and describe complex land surface and atmospheric feedback.”
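For readers curious how those input variables come together, below is a rough sketch of the widely used FAO-56 Penman-Monteith reference evapotranspiration equation. Whether MERRA2, AgERA5, or FRET use exactly this formulation is an assumption on my part, and the sketch simplifies things by using a single mean temperature and taking net radiation (which in practice is derived from solar radiation) as a direct input. The function names and the sample values are hypothetical.

```python
import math


def saturation_vapor_pressure(temp_c):
    """Saturation vapor pressure (kPa) at air temperature temp_c (deg C)."""
    return 0.6108 * math.exp(17.27 * temp_c / (temp_c + 237.3))


def reference_et(temp_c, rel_humidity, wind_2m, net_radiation,
                 soil_heat_flux=0.0, pressure_kpa=101.3):
    """Daily grass reference ET (mm/day), FAO-56 Penman-Monteith form.

    temp_c        mean air temperature at 2 m (deg C)
    rel_humidity  mean relative humidity (%)
    wind_2m       wind speed at 2 m (m/s)
    net_radiation net radiation at the surface (MJ/m^2/day)
    """
    es = saturation_vapor_pressure(temp_c)       # saturation vapor pressure
    ea = es * rel_humidity / 100.0               # actual vapor pressure
    delta = 4098.0 * es / (temp_c + 237.3) ** 2  # slope of the vapor pressure curve
    gamma = 0.000665 * pressure_kpa              # psychrometric constant
    numerator = (0.408 * delta * (net_radiation - soil_heat_flux)
                 + gamma * 900.0 / (temp_c + 273.0) * wind_2m * (es - ea))
    denominator = delta + gamma * (1.0 + 0.34 * wind_2m)
    return numerator / denominator


# Hypothetical warm, moderately dry summer day (illustration only).
eto = reference_et(temp_c=25.0, rel_humidity=50.0, wind_2m=2.0,
                   net_radiation=15.0)
print(f"Reference ET: {eto:.2f} mm/day")
```

Each input pushes the result in an intuitive direction: more radiation, more wind, or drier air all raise the evaporative demand.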

Eric: “We’ve been operationally ingesting and working with various climate and hydrology datasets, some of which were requested by NOAA, such as the NOAA U.S. Climate Gridded Dataset (NClimGrid), Applied Climate Information System (ACIS), and SNOw Data Assimilation System (SNODAS). Many of these datasets involve automated processes that run as Google Cloud Functions jobs. The Cloud Functions retrieve data from public data repositories and websites and write them to ClimateEngine.org buckets, which we then ingest as Earth Engine datasets for use in the ClimateEngine.org app and, now, as publicly available endpoints through the catalog.”

SNODAS provides daily estimates of snow depth and snow water equivalent (SWE) at 1 km resolution, which can be applied to use cases such as near real-time snowpack monitoring or long-term mapping and analysis of trends in the annual maximum snowpack (see EE script for producing trend map here).
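The trend analysis mentioned above can be sketched for a single pixel in plain Python. This is not the Earth Engine script referenced in the caption: it assumes the trend is estimated with a Theil-Sen (median of pairwise slopes) estimator over water years starting on October 1, and the SWE values are made up for illustration.

```python
from collections import defaultdict
from datetime import date
from statistics import median


def water_year(d):
    """Water year of a date, assuming the year starts on October 1."""
    return d.year + 1 if d.month >= 10 else d.year


def annual_max(daily):
    """Maximum SWE per water year from (date, swe) pairs."""
    peaks = defaultdict(float)  # SWE is non-negative, so 0.0 is a safe start
    for d, swe in daily:
        wy = water_year(d)
        peaks[wy] = max(peaks[wy], swe)
    return dict(sorted(peaks.items()))


def theil_sen_slope(series):
    """Median of all pairwise slopes of a {year: value} series."""
    years = list(series)
    slopes = [(series[b] - series[a]) / (b - a)
              for i, a in enumerate(years) for b in years[i + 1:]]
    return median(slopes)


# Hypothetical SWE observations (mm) for one pixel (illustration only).
daily = [
    (date(2003, 12, 15), 120.0), (date(2004, 3, 1), 300.0),
    (date(2004, 12, 15), 110.0), (date(2005, 3, 1), 280.0),
    (date(2005, 11, 20), 90.0), (date(2006, 3, 1), 290.0),
    (date(2007, 3, 1), 250.0), (date(2008, 3, 1), 240.0),
]

peaks = annual_max(daily)
slope = theil_sen_slope(peaks)
print(f"Trend in annual maximum SWE: {slope:+.1f} mm per year")
```

A median-based slope is a common choice for snowpack records because one unusually big or small snow year does not dominate the estimate; an ordinary least-squares fit would be a reasonable alternative.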

Justin: “Lastly, having land surface model fluxes in Climate Engine has truly been eye-opening. Currently, we are thinking about how we can ingest and add the Western Land Data Assimilation System (WLDAS). It provides information on soil moisture at various depths, snowpack data, and components of the entire water cycle at a 1 km resolution. These models offer a comprehensive view of Earth’s surface processes.”

Thank you both for contributing to the community catalog and for taking the time to chat with me today.

The collaborations between organizations like DRI, UC-Merced, and ClimateEngine.org and grassroots efforts like the Awesome GEE Community Catalog demonstrate the power of an open data community coming together for the greater good. When researchers and advocates join forces to make scientific datasets freely accessible, it unlocks new potential for discovery and innovation.

Open-source ecosystems thrive through these kinds of mutually beneficial partnerships. Contributors of open data benefit from seeing how others build on their work to advance science. Open access also garners more citations and recognition for the researchers who produced the datasets. For those using open data, the ability to leverage existing research accelerates innovation and discovery.
