Jupyter Notebooks for Capacity Development Webinar
Posted on July 30, 2021 (Last modified on November 8, 2023)
CEDA represents UKSA/NCEO on the Committee on Earth Observation Satellites (CEOS), which hosted a virtual event, the Jupyter Notebooks for Capacity Development Webinar, in July 2021.
Further details about the event are below.
You can see a recording from the webinar here.
The CEOS Working Group on Capacity Building and Data Democracy and the Working Group on Information Systems and Services ran a joint webinar on Jupyter notebooks for Capacity Development. The webinar aimed to introduce space agencies and environmental organisations worldwide to Jupyter Notebooks and to take a tour of emerging services from CEOS agencies and their applications. We illustrated how notebooks can be used to support capacity development and the exploitation of Earth Observation data by a broad range of users. There were two sessions via Zoom to allow for global attendance.
What are Jupyter Notebooks, and why do they have the potential to support capacity development?
We will provide an overview of what a Jupyter Notebook is and the benefits of using one. We will demonstrate this with a simple example of plotting sea surface temperature data, navigating an archive and adjusting the colour scale. We continue by describing the ways notebooks are supported by open-source resources and by different types of platform/environment. We then discuss how collaborative research and activities such as international hackathons can also be supported.
Jupyter Hub and Notebooks on Data Analysis Platforms:
We take a look at two examples from the UK’s JASMIN Jupyter notebook service, which can access over 20 petabytes of data on the CEDA archive. First, we look at the Sentinel-5P global archive of data and demonstrate how, using a very basic notebook, we can explore questions such as how pollution levels changed in large cities during the pandemic. We then look at a smaller-scale specialist example: regional NCEO biomass maps. We demonstrate how, in addition to obtaining domain-specific information from data, we can also train users on some technical aspects such as libraries, modules and shapefiles.
Open Data Cube and Google Earth Engine – A Jupyter Notebook Sandbox Demonstration
The Open Data Cube (ODC) Google Sandbox is a free and open programming interface that connects users to Google Earth Engine datasets. The open-source tool allows users to run Python application algorithms using Google’s Colab notebook environment. This demonstration will look at two example Landsat applications focused on scene-based cloud statistics and historic water extent. Basic operation of the tool will support unlimited users for small-scale analyses and training, but it can also be scaled in size and scope with Google Cloud resources to support enhanced user needs.
ESA PDGS Data Cube and Time Series Data
The ESA PDGS Data Cube is a pixel-based access service that enables human and machine-to-machine interfaces for Heritage Missions (HM), Third-Party Missions (TPM) and Earth Explorer (EE) datasets. The pixel-based access service provides users with advanced retrieval capabilities such as time series extraction, data subsetting, mosaicking, band combinations and index generation (e.g. NDVI, anomalies, …) directly from the EO-SIP packages, with no need for data duplication or data preparation.
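As an illustration of the kind of index generation mentioned above, here is a minimal sketch of an NDVI calculation, using the standard definition NDVI = (NIR - Red) / (NIR + Red). It does not use the ESA PDGS Data Cube interface, which is not described here; the function names and the small reflectance grids are hypothetical and chosen only for the example.

```python
def ndvi(nir, red):
    """Return NDVI for one pixel, or None when both bands are zero."""
    total = nir + red
    if total == 0:
        return None  # index is undefined when there is no signal in either band
    return (nir - red) / total


def ndvi_grid(nir_band, red_band):
    """Apply ndvi() pixel by pixel to two equally sized 2D grids."""
    return [
        [ndvi(n, r) for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir_band, red_band)
    ]


# Hypothetical reflectance values (not real satellite data).
nir_band = [[0.5, 0.4], [0.3, 0.0]]
red_band = [[0.1, 0.2], [0.3, 0.0]]

result = ndvi_grid(nir_band, red_band)
```

Pixels where both bands are zero are returned as None rather than raising an error, so that missing data can be masked out later.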
In addition to the Explorer web-based graphical user interface, the ESA PDGS Data Cube service also provides a Jupyter processing environment that allows users to import, write and execute code that runs close to the data. This demonstration will showcase how to retrieve soil moisture time series using the Jupyter environment in order to generate thematic maps (monthly anomaly maps) over an area of interest. The benefit of using the pixel-based service over traditional access services in terms of resource usage will also be highlighted.
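A monthly anomaly is commonly computed as the difference between an observation and the long-term mean for the same calendar month. The sketch below shows that calculation for a single location, assuming this definition; the webinar's own method and the service's API are not shown here, and the function name and soil moisture values are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def monthly_anomalies(series):
    """Return (year, month, anomaly) for a list of (year, month, value) records.

    The anomaly is the value minus the mean of all values
    recorded for the same calendar month.
    """
    by_month = defaultdict(list)
    for _, month, value in series:
        by_month[month].append(value)
    climatology = {month: mean(values) for month, values in by_month.items()}
    return [(year, month, value - climatology[month]) for year, month, value in series]


# Hypothetical soil moisture values for one pixel (not real data).
soil_moisture = [
    (2019, 1, 0.30), (2019, 7, 0.10),
    (2020, 1, 0.34), (2020, 7, 0.14),
    (2021, 1, 0.26), (2021, 7, 0.12),
]

anomalies = monthly_anomalies(soil_moisture)
```

Applying the same function to every pixel in an area of interest would give the values needed for an anomaly map for each month.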
Earth Analytics and Interoperability Lab – Big Data Processing
The CEOS Earth Analytics Interoperability Lab (EAIL) is a platform for CEOS projects to test interoperability in a live EO ecosystem. EAIL is hosted on Amazon Web Services and includes facilities for Jupyter notebooks, scalable compute infrastructure for integrated analysis, and data pipelines that can connect to new and existing CEOS data discovery and access services. This demonstration will show how we use Jupyter notebooks with the Python Dask library to efficiently compute and perform large-scale analyses (tens of GB) with interactive plotting and scalable compute resources in EAIL.
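Libraries such as Dask handle datasets of this size by splitting them into chunks, computing a partial result for each chunk, and then combining the partial results. The following is a plain-Python sketch of that chunked pattern for a mean; it is not Dask itself and not the EAIL demonstration code, and the function names and data are hypothetical.

```python
def iter_chunks(values, chunk_size):
    """Yield consecutive slices of at most chunk_size items."""
    for start in range(0, len(values), chunk_size):
        yield values[start:start + chunk_size]


def chunked_mean(values, chunk_size):
    """Compute a mean from per-chunk (sum, count) partial results."""
    if not values:
        raise ValueError("cannot take the mean of an empty sequence")
    # Each chunk is reduced independently; in a parallel framework
    # these reductions could run on separate workers.
    partials = [(sum(chunk), len(chunk)) for chunk in iter_chunks(values, chunk_size)]
    total = sum(partial_sum for partial_sum, _ in partials)
    count = sum(partial_count for _, partial_count in partials)
    return total / count


# Hypothetical stand-in for a large array: the integers 1 to 1000.
data = list(range(1, 1001))
result = chunked_mean(data, chunk_size=128)
```

Because each chunk only needs its own sum and count, no step has to hold more than one chunk's worth of working data, which is what makes the pattern scale.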
Capacity Development Panel and Discussion
Kenton Ross from NASA’s Capacity Building and Applied Sciences Program will lead discussions with a panel of international experts. We will explore the needs of EO data users worldwide and how these can be supported by CEOS agencies.
Speakers: Kenton Ross (NASA), Yousuke Ikehata (JAXA), Esther Conway (NCEO/UKSA), Brian Killough (SEO/NASA), Giuseppe Troina (ESA), Matt Paget (CSIRO)