Shantar Islands and part of Uda Bay in the western Sea of Okhotsk (Landsat 8 Courtesy NASA)

Open Ocean Data: with Aqualink & pyaqua

Measuring near-real-time data over open oceans has remained challenging over the last couple of decades. Over 80% of our oceans are unexplored, and as the impetus to understand these cyclically connected systems grows, getting to data is going to be critical. While overland systems are more easily observable, ocean dynamics have proven complex and have required substantial investment in both hardware and software architecture.

These include modes of measurement that are expected to survive some of the harshest environments on our planet. By measuring parameters like wave height and ocean temperature, open ocean systems provide a much-needed understanding of, and capability to predict, massive shifts and changes to marine ecosystems.

Introducing Aqualink

Over the last summer, I was introduced to Aqualink, a philanthropic engineering organization building ocean conservation technology and working with partners like Sofar Ocean to build an open ocean data network. You can read their own Medium post, Introducing Aqualink, here. The best part: they decided to keep the data completely open and available for everyone to use.

Getting started with Aqualink Map

Aqualink provides direct access to a hybrid dashboard and map. You can sign up for a free account, but the map can be used without signing up at all.

Aqualink Signup

With or without signing up, click View Map to see a quick representation of the current sites of interest. Note the slider button that lets you show only sites with actively deployed buoys.

Aqualink map and deployed buoys only filter

Deployed buoys relay near-real-time temperatures, providing readings at the sea surface as well as at depth.

pyaqua: Simple Command Line Interface (CLI) for Aqualink org

pyaqua is an unofficial command-line tool, written in Python, that lets users interact with the map and is designed to work with sites and locations that have buoys. The motivation for building it was to let users interact programmatically with the Aqualink data dashboard and map, and to explore and filter data as needed for further analysis.

This blog walks through a side-by-side example of both the map and the command-line tool for the user to follow.

pyaqua command-line tool

This tool is useful if you have more than one site of interest, and it includes some useful tools for interacting with and exporting data. Since it is not built on any available or standard API, some things might break over time; I will try my best to maintain and improve it. You can find the GitHub page for the project here and the most up-to-date version of the readme on the readme page here. You can star and follow the project to keep up as updates and releases are made.

Installing pyaqua

This assumes that you have native Python and pip installed on your system. You can test this by going to the terminal (or Windows command prompt) and trying

python and then pip list

pyaqua only supports Python v3.7 or higher
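
A quick way to confirm that the interpreter meets this requirement before installing is sketched below. It uses only the standard library; the helper name meets_minimum is a hypothetical name chosen for illustration and is not part of pyaqua.

```python
import sys

# Minimum interpreter version stated above for pyaqua.
MIN_VERSION = (3, 7)


def meets_minimum(version=None, minimum=MIN_VERSION):
    """Return True if the (major, minor) version is at least `minimum`.

    `meets_minimum` is a hypothetical helper for illustration only.
    """
    if version is None:
        version = sys.version_info
    return tuple(version[:2]) >= tuple(minimum)


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "OK" if meets_minimum() else "too old for pyaqua"
    print(f"Python {current}: {status}")
```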

You can install using two methods.

pip install pyaqua

or you can also try

git clone
cd pyaqua
python setup.py install

For Linux use sudo or try pip install pyaqua --user.

I recommend installation within a virtual environment. Find more information on creating virtual environments here.
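
As one possible way to set up such an environment, the sketch below uses Python's built-in venv module. The function name create_env and the folder name in the usage comment are hypothetical; other tools such as virtualenv or conda would work just as well.

```python
import venv
from pathlib import Path


def create_env(path, with_pip=True):
    """Create a virtual environment at `path` and return its location.

    `create_env` is a hypothetical helper; it only wraps the standard
    library call venv.create().
    """
    env_dir = Path(path)
    venv.create(env_dir, with_pip=with_pip)
    return env_dir


# Example usage (the folder name is an arbitrary choice):
# create_env("pyaqua-env")
```

After activating the new environment in your shell, pip install pyaqua installs the tool into it rather than into the system Python.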

Accessing site level data on aqualink map

Site-level data are available for multiple sites. Here again we select from the list of deployed buoys so that we have both remote sensing estimates and in situ measurements of ocean conditions at each location.

site level information from the aqualink map

Accessing site level data using pyaqua (Site list, Info & Live tools)

The site-list and site-info tools are designed to filter for and find sites with buoys as well as provide information about a specific site.

pyaqua site list

The site-list tool returns a site name and site id pair. The rest of the tools rely on this site id to perform operations, so keep it handy. The site-info tool can then use the site id to get information about a specific site, including historical information as well as site admin data.
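
Once you have a set of name and id pairs, a small lookup helper makes it easier to find the id you need. This is a minimal sketch: the site names and ids below are made up for illustration, and find_site_ids is a hypothetical helper, not a pyaqua function.

```python
# Hypothetical (name, id) pairs standing in for site-list output.
SITES = [
    ("Example Reef North", 101),
    ("Example Reef South", 102),
    ("Sample Lagoon", 205),
]


def find_site_ids(sites, query):
    """Return {name: site_id} for names containing `query`, ignoring case."""
    needle = query.strip().lower()
    return {name: site_id for name, site_id in sites if needle in name.lower()}


matches = find_site_ids(SITES, "reef")
for name, site_id in matches.items():
    print(f"{site_id}\t{name}")
```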

pyaqua site info for a given site

The site-live tool simply generates a quick snapshot for your site and returns it as JSON output.
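
If you save that JSON output, nested values can be flattened into simple key and value pairs for quick inspection. The payload below is invented for illustration; the actual field names returned for a site may differ, and flatten is a hypothetical helper.

```python
import json

# Hypothetical snapshot; real site-live output may use different fields.
SNAPSHOT_TEXT = """
{
  "site_id": 101,
  "sea_surface_temperature": {"value": 27.4, "unit": "C"},
  "wind": {"speed": 5.2, "direction": 140}
}
"""


def flatten(record, prefix=""):
    """Flatten nested dictionaries into a single dict with dotted keys."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


snapshot = flatten(json.loads(SNAPSHOT_TEXT))
for key, value in snapshot.items():
    print(f"{key}: {value}")
```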

pyaqua site-live tool

Exporting site level time-series data using aqualink

One of the best features of the overall setup is that the user can choose a time range and then export the dataset from these sites as a CSV file. For now, the export is limited to Sea Surface Temperature (SST), and for sites with spotters or buoys that measure temperature at two depths, you get both BottomTemp and Top Temp.

Exporting time series data for a given site with a year range of data

You can modify the date range to get longer query results, depending on when the spotter was deployed and on any removals for maintenance or similar purposes.
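
An exported CSV can be summarized with a few lines of standard-library Python. The sketch below computes daily means for one temperature column. The column names (timestamp, top_temp, bottom_temp) and the sample values are assumptions for illustration; check the header of your own export and adjust accordingly.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Hypothetical export; real column names and values will differ.
SAMPLE_CSV = """timestamp,top_temp,bottom_temp
2021-06-01T00:00:00Z,27.0,26.0
2021-06-01T12:00:00Z,28.0,26.4
2021-06-02T00:00:00Z,27.5,
"""


def daily_means(csv_text, column):
    """Return {YYYY-MM-DD: mean value} for `column`, skipping blank cells."""
    by_day = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        raw = (row.get(column) or "").strip()
        if not raw:
            continue  # no reading in this row for the requested column
        day = row["timestamp"][:10]  # date part of an ISO 8601 timestamp
        by_day[day].append(float(raw))
    return {day: round(mean(values), 2) for day, values in sorted(by_day.items())}


print(daily_means(SAMPLE_CSV, "top_temp"))
print(daily_means(SAMPLE_CSV, "bottom_temp"))
```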

Exporting site level time-series data using pyaqua

A site-timeseries tool is included as part of pyaqua and is designed to let users export time-series datasets for varying time periods. The results are written as CSV files to a given folder, split across parameter types. The tool lets you export not just temperature but also wind, wave, and other data types available for the given site. You can also choose to export only specific data type subsets like wind/wave/temp/sat_temp and so on.
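
Because the export produces several CSV files in one folder, it can help to load them together. This is a minimal sketch that assumes nothing about pyaqua's file naming beyond the .csv extension; load_exports and the file names in the usage comment are hypothetical.

```python
import csv
from pathlib import Path


def load_exports(folder, wanted=None):
    """Read every CSV in `folder` into {file_stem: [row_dict, ...]}.

    If `wanted` is given, keep only files whose name contains one of
    those strings, for example ["wind", "temp"].
    """
    tables = {}
    for path in sorted(Path(folder).glob("*.csv")):
        if wanted and not any(tag in path.stem for tag in wanted):
            continue
        with path.open(newline="", encoding="utf-8") as handle:
            tables[path.stem] = list(csv.DictReader(handle))
    return tables


# Example usage (the folder name is an arbitrary choice):
# tables = load_exports("exports", wanted=["temp"])
# for name, rows in tables.items():
#     print(name, len(rows), "rows")
```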

pyaqua time series tool

Accessing Heat Stress & site alerts in aqualink map

One of the best features of Aqualink is the Heat Stress alert dashboard, which gives you a glimpse of all sites categorized by heat stress and alert levels. You can go to the heat stress dashboard here. If you are signed in, you can create your own collections; instructions can be found here.

Heat stress tracker @

The dashboard covers sites both with and without buoys, based on satellite estimation of heat stress for all sites. Sites can be sorted, and additional information per site is available for users who want to dig into a specific location.

Accessing Heat Stress & site alerts using pyaqua

The pyaqua tool connects to the heat stress dashboard and selects only those sites with buoys. The tool can be extended to include all sites in the future, but for now this maintains consistency, since all of pyaqua is designed only for sites with buoys.

pyaqua site alert tool

This can be used as a helper alongside the site-list tool, for example to list only those sites that have an alert level greater than a specific value on a given day.
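
The threshold filter described above can be sketched in a few lines. The records and field names here (site_id, name, alert_level) are made up for illustration and may not match the actual output of the tool; sites_above is a hypothetical helper.

```python
# Hypothetical alert records; real output fields may differ.
ALERTS = [
    {"site_id": 101, "name": "Example Reef North", "alert_level": 0},
    {"site_id": 102, "name": "Example Reef South", "alert_level": 2},
    {"site_id": 205, "name": "Sample Lagoon", "alert_level": 3},
]


def sites_above(alerts, threshold):
    """Return records with alert_level strictly above `threshold`, highest first."""
    flagged = [record for record in alerts if record["alert_level"] > threshold]
    return sorted(flagged, key=lambda record: record["alert_level"], reverse=True)


for record in sites_above(ALERTS, 1):
    print(record["site_id"], record["name"], record["alert_level"])
```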

Over the last couple of months, I have tried to take deeper dives into understanding open oceans better: these massive and complex systems shape everything from climate patterns to the formation and degradation of landmasses and ecosystems. This follows my first article on another tool called pycoral to gather data from Allen Coral Atlas (you can read about Chasing Coral data here) and pyspotter for SofarOcean spotters. You can read about my open oceans project here. Building a better understanding of oceans is critical, and it depends on measuring challenging amounts of data from harsh environments.

Access isn’t the same as accessibility

There is always more than one way for users to access, read and then follow up on the data. Asking the right questions, tinkering on solutions, and building as a community are what allow growth in different domains. As a bonus, the time series export of these datasets can be brought directly into platforms like Google Earth Engine to mix and match with other datasets.
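
One way to prepare an exported time series for a mapping platform is to attach the buoy's coordinates to every row, so the file can be treated as a point table. This is a sketch under assumptions: the column names, the sample reading, and the coordinates are all hypothetical, and add_coordinates is not part of pyaqua. Check the upload requirements of your target platform before relying on this layout.

```python
import csv
import io


def add_coordinates(csv_text, longitude, latitude):
    """Return CSV text with longitude and latitude columns appended to each row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    fields = list(reader.fieldnames or []) + ["longitude", "latitude"]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row["longitude"] = longitude
        row["latitude"] = latitude
        writer.writerow(row)
    return out.getvalue()


# Hypothetical single-row export and made-up coordinates.
sample = "timestamp,top_temp\n2021-06-01T00:00:00Z,27.0\n"
print(add_coordinates(sample, 137.5, 55.0))
```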

pyaqua exported Temperature data from a single buoy in Google Earth Engine (GEE)

Efforts like the Aqualink project are creating an opportunity for users to be part of the network. It is helping bridge the gap between data-rich and data-poor geographies and, hopefully, allowing for a more sustainable model. I can only hope that many others will follow the same approach. For now, my open ocean toolkit grows as I contribute to the pyaqua tool and many more.




Remote sensing applications, large scale data processing and management, API applications along with network analysis and geostatistical methods

Samapriya Roy
