Updated 24/11/2023

Scaling New Heights with Pangeo & OpenEO: PANGEO@EOSC meets Earth Observation Experts at BiDS 2023

A Data Adventure from Space to Workshop Space

Guest post by Anne Fouilloux and Tina Odaka (PANGEO)

Big Data from Space 2023 (BiDS), now in its 6th edition, focuses on impactful themes, including the intersection of big data with society's grand challenges like climate change and key policies such as the EU Green Deal and UN 2030 Sustainable Development Goals. More information can be found on the BiDS’23 website.

Pangeo & OpenEO Tutorial at BiDS’23

This year, BiDS’23 kicked off with a dedicated day of Satellite Events on Monday, November 6th, where Pangeo and openEO offered a full-day training on the Pangeo and openEO ecosystems for developing efficient Big Earth science data pipelines. The training material was compiled collaboratively from earlier training material into a single course, available under the CC-BY-4.0 license.

Emphasis was put on the complementarities of the two ecosystems with the goal of teaching attendees how to fully exploit both frameworks to run complex data workflows. Attendees learned about open, reproducible, and scalable Earth science. The workshop was designed to help anyone interested in starting their journey with Pangeo and OpenEO while avoiding common pitfalls.

The full-day training was organised in three parts:

  • Part-1: Introduction to Pangeo
  • Part-2: Introduction to OpenEO
  • Part-3: Unlocking the Power of Space Data with Pangeo & OpenEO

The complete timeline can be found here.

Challenges of Teaching Big Data to 30+ Enthusiastic Learners in the Era of Scale and Speed

Collaboratively developing training material between two communities was the first challenge we faced. While OpenEO and Pangeo are complementary in their software stacks and approaches, it is still difficult to agree on what should be taught and how it should be presented. However, we believe we successfully compiled relevant content for participants.

The second challenge we faced was to develop training material that participants could re-use on many existing platforms. In preparation for the workshop, we tried three existing platforms: the Copernicus Data Space Ecosystem, the OpenEO platform, and Pangeo@EOSC.

Most of the Pangeo & openEO training materials can be executed on the 3 different platforms. However, the Copernicus Data Space Ecosystem is very limited in terms of compute & storage (for instance, there is no Dask Gateway), and the OpenEO platform is not free of charge (only a 30-day trial or a sponsored license). Therefore, to show how to scale with the Dask Gateway, Pangeo@EOSC became the most cost-effective and relevant platform to use for teaching.
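As a rough illustration of what scaling with a Dask Gateway looks like on a platform that provides one, here is a minimal Python sketch. It assumes the `dask-gateway` client package is installed and that the platform preconfigures the gateway address and authentication, so that `Gateway()` needs no arguments. The helper name `start_gateway_cluster`, its `gateway` parameter and the worker count are hypothetical, chosen for illustration, and are not taken from the workshop material.

```python
def start_gateway_cluster(n_workers=4, gateway=None):
    """Request a Dask cluster from a gateway and return (cluster, client).

    `gateway` is an illustrative hook: pass an existing gateway object,
    or leave it as None to build one from the platform's configuration.
    """
    if gateway is None:
        # Imported lazily so the sketch can be read and imported on
        # machines where dask-gateway is not installed.
        from dask_gateway import Gateway

        # Assumption: address and authentication come from the platform config.
        gateway = Gateway()

    cluster = gateway.new_cluster()  # ask the gateway for a new cluster
    cluster.scale(n_workers)         # request a fixed number of workers
    client = cluster.get_client()    # client that sends work to that cluster
    return cluster, client
```

On a platform with a gateway, `cluster, client = start_gateway_cluster(4)` would be followed by the usual Dask or Xarray computations, and `cluster.shutdown()` releases the workers afterwards.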

In the era of scale and speed, it seems easy to deliver training to 30+ participants in a conference. However, it is important to have an infrastructure that can serve a sufficient number of users who will all do the same exercises and access the same data at the same time.

Pangeo@EOSC can provide the resources needed for delivering training on demand. We call this “Pangeo Training Infrastructure as a Service (PTIaaS)”, a dedicated infrastructure for conducting training sessions and workshops on the Pangeo ecosystem. More information on the PANGEO@EOSC platform and PTIaaS can be found in the BiDS’23 proceedings.

During the workshop, the bottleneck was not Pangeo@EOSC itself but the low internet speed provided by the conference venue. Some exercises took longer than expected, but overall all 30+ participants managed to try out the exercises, and we had lively discussions.

Impact

The training was well attended, with 35 participants. At the end of the training, we asked participants to fill in a feedback form to help us improve the development and delivery of such training. Below are some statistics about the training (20 of the 35 participants responded):

Likely: 12 responses – 60%

Very likely: 5 responses – 25%

Neither likely nor unlikely: 2 responses – 10%

Very unlikely: 1 response – 5%

Yes: 12 responses – 60%

Partially: 7 responses – 35%

No: 1 response – 5%

Somewhat satisfied: 10 responses – 50%

Very satisfied: 9 responses – 45%

Somewhat dissatisfied: 1 response – 5%
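Each percentage is an answer's count divided by the 20 returned forms. The short Python sketch below recomputes the shares for the three groups of answers and the overall response rate. The function name `percentages` is a hypothetical helper, and the groups are left unlabelled because the question wording is not reproduced here.

```python
def percentages(counts):
    """Return each answer's share of all responses as a whole-number percentage."""
    total = sum(counts.values())
    return {answer: round(100 * n / total) for answer, n in counts.items()}


# Counts copied from the three groups of answers listed above.
groups = [
    {"Likely": 12, "Very likely": 5, "Neither likely nor unlikely": 2, "Very unlikely": 1},
    {"Yes": 12, "Partially": 7, "No": 1},
    {"Somewhat satisfied": 10, "Very satisfied": 9, "Somewhat dissatisfied": 1},
]
for counts in groups:
    print(percentages(counts))

# Share of the 35 participants who returned the form.
response_rate = round(100 * 20 / 35)
print(f"Response rate: {response_rate}%")  # Response rate: 57%
```

Within each group the shares add up to 100%, which is a quick way to spot a mistyped figure.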

Services provided by EGI

The Pangeo & OpenEO communities have successfully run a workshop using PANGEO@EOSC, which relies on the following EGI services:

EGI Federation member: CESNET

EGI Federation member: CESNET

EGI Federation member: GRNET
