Sandro Fiore, Donatello Elia, Fabrizio Antonio, CMCC and Tobias Weigel, DKRZ
Ignacio Blanquer, Miguel Caballer, UPV
Giuseppe La Rocca, EGI
The latest release of the Elastic Cloud Computing Cluster (EC3) platform, tailored to support the EGI Applications on Demand (AoD) service, offers a new cluster configuration type for researchers interested in deploying an ECAS cluster on the EGI Infrastructure.
The ENES Climate Analytics Service (ECAS), set up in the context of the EOSC-Hub project by CMCC and DKRZ, enables scientific end users to perform data analysis experiments on large volumes of multidimensional data (e.g. in the NetCDF format) by exploiting a server-side, in-memory, and parallel approach.
The service aims to provide a paradigm shift for the ENES community, with a strong focus on data-intensive analysis, provenance management, and server-side approaches, as opposed to the current ones, which are mostly client-based and sequential and offer limited end-to-end analytics capabilities. ECAS consists of multiple integrated components centered around the Ophidia High Performance Data Analytics (HPDA) framework, which has been integrated with B2DROP, ESGF, IAM, Onedata (DataHub), EGI Federated Cloud, JupyterHub, and the ECAS-Lab web portal.
Thanks to the EC3 platform, operated by the Polytechnic University of Valencia (UPV), researchers will be able to use the EGI Cloud Compute service to deploy ECAS clusters on demand without worrying about the complexity of the underlying infrastructure.
With ECAS integrated into the cloud resources of the EGI Federation, users can easily deploy a full elastic ECAS cluster (composed of multiple nodes) customized to their requirements. The EC3 service takes care of automatically installing and configuring the whole ECAS environment stack, including services and tools such as JupyterHub, PyOphidia, a rich set of data science Python libraries, the Ophidia HPDA framework, and a comprehensive set of Jupyter Notebooks for training.
The ECAS cluster allows scientists to:
Furthermore, through an Ansible recipe, EC3 can elastically scale up/down the ECAS cluster size according to the current user workload. It is possible to configure and deploy an ECAS Virtual Elastic Cluster using EC3aaS from the EC3 platform front page.
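The idea of workload-driven elasticity can be illustrated with a minimal sketch. This is not EC3's actual scaling policy, which the text does not describe; the function name `target_worker_count` and the parameters `jobs_per_node`, `min_nodes`, and `max_nodes` are hypothetical and chosen only for illustration.

```python
import math


def target_worker_count(pending_jobs, running_jobs,
                        jobs_per_node=1, min_nodes=0, max_nodes=4):
    """Toy elasticity rule (hypothetical, not EC3's real policy).

    Requests enough worker nodes to host every pending and running job,
    then clamps the result to the [min_nodes, max_nodes] range.
    """
    total_jobs = pending_jobs + running_jobs
    needed = math.ceil(total_jobs / jobs_per_node)
    return max(min_nodes, min(max_nodes, needed))


if __name__ == "__main__":
    # An idle cluster shrinks to the minimum size; a busy one grows up to the cap.
    print(target_worker_count(0, 0))
    print(target_worker_count(3, 2, jobs_per_node=2))
    print(target_worker_count(10, 0))
```

Under these assumptions, the cluster shrinks to `min_nodes` when no jobs are queued and never grows beyond `max_nodes`, which is the general behaviour an elastic cluster is expected to show.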
An ECAS virtual appliance is also available from the EGI AppDB as a single, ready-to-use, Virtual Machine Image (VMI). The VMI has been assigned to a set of trusted VOs in order to be deployed on the EGI FedCloud.
In this way, a user can easily deploy a pre-built, self-contained ECAS VMI and access it via SSH to start their own data analysis activities.
Finally, it is worth mentioning the relevance of these features in the EGI FedCloud for training activities, as well as the added value and scientific environment they can deliver to the wider climate science community.
Do you want to deploy a virtual cluster with the ECAS environment in EGI?