14 EGI-ACE use cases currently supported, submit yours now!

As you may have noticed, the EGI-ACE project continuously runs open calls to offer access to infrastructure and platform services, as well as dedicated user support and training. Since the first open call (cut-off date April 15, 2021), the project has received a total of 20 use cases. Of these submitted applications, 14 have already been accepted and are being supported.

Click on the name of a use case to learn more about its objective, how the EGI-ACE project supports it, and who the targeted beneficiaries are.

AiiDAlab

Objective

Running advanced scientific software requires expert knowledge, which brings several challenges. To mitigate these, the AiiDAlab web platform has been created, allowing any interested researcher to work with advanced simulation tools. AiiDAlab provides an infrastructure for computational scientists to develop, execute, and share computational workflows as intuitive web services based on Jupyter Notebooks.

 

How EGI-ACE supports

EGI-ACE will help procure the resources needed to rapidly scale up this demonstration service so that it can be comfortably advertised to the general scientific community. The expectation is to support at least 100 concurrently active user sessions at peak times and on the order of 500 user accounts in total (a ten-fold increase). The EGI Notebooks service will also be expanded to add AiiDAlab as a workflow layer.

 

Beneficiaries

Currently, AiiDAlab applications are focused on Materials Science. Researchers from this domain will therefore benefit most from the freely available service.

AMBER-based modelling of SARS-CoV-2 Spike protein (CNR)

Objective

The SARS-CoV-2 Spike protein is a glycoprotein that plays a key role in receptor recognition, viral attachment, and entry into host cells. The AMBER-based modelling study evaluates the occupancy of the Spike protein and will identify glycan holes, which provide opportunities for inactivators to bind.

 

How EGI-ACE supports

Implementation of the proposed use case depends on access to modern and powerful GPU devices as well as suitable storage and data transfer services. In particular, the use case needs 10 virtual machines (VMs) provided by the EGI cloud federation. Of these VMs, 7 will be dedicated to running dynamic simulations on known drug-protein complexes, whereas the other 3 will simulate the dynamics of polyanionic polymer-protein complexes.

 

Beneficiaries

The first beneficiaries of the use case will be researchers from a consortium of three Italian research teams, working respectively at the Institute of Biomedical Technologies of the National Research Council (CNR ITB), the University of Brescia (UniBs), and the University of Genoa (UniGe). Results obtained through the proposed use case will form the basis for specific research studies aimed at identifying antiviral drugs and anionic polymers targeted to inactivate the Spike protein.

ARTICONF project (Agilia)

Objective

The objective of the use case is to create an AI-driven mobile app that mixes car sharing and carpooling models, allowing cost sharing and safe travel through direct, trustworthy transactions among passengers.

 

How EGI-ACE supports

For the blockchain network, the availability of services needs to be tested along with the maximum throughput achieved in the network. A total of 5 VMs will be needed.

 

Beneficiaries

Agilia Center SL will be the main beneficiary, along with the end users of the service. The resources will be used to test the use case with the market through end-user testing campaigns. The end users are mainly people who need to drive a car for a period of time or want to share a ride with a driver. A new market will be targeted (the peer-to-peer car sharing and carpooling market), with an expected growth this year of about 11%.

eHoney platform (University of Bologna)

Objective

The main scientific objective of the use case is to implement an innovative computational procedure to indirectly describe the effect of climate change on biodiversity using environmental DNA (eDNA) information (data and metadata), taking advantage of peculiar archived eDNA collector specimens, i.e. honey samples produced over the last decade.

 

How EGI-ACE supports

The locally available datasets include about 8-10 TB of raw metagenomics data that will be processed. Including the output results, an estimated total of 30 TB will be required by the project. Considering that each analysis can require up to 40 GB of RAM and 8 CPUs, the EGI Cloud Compute or the EGI High-Throughput Compute service would be among the best options. For data transfer, the EGI Data Transfer service could provide support.

 

Beneficiaries

The beneficiaries of the use case will be the Animal and Food Genomics Group of the Department of Agricultural and Food Sciences of the University of Bologna (Bologna, Italy). The use case will open a new avenue for analysing the effect of climate change on environmental biodiversity using retrospectively archived honey samples. This approach is highly innovative and will contribute to Open Access and FAIR through the unique large next-generation sequencing datasets from honey environmental DNA, the developed pipelines, and the constructed databases that the use case will generate.

Large sample testing of high-resolution distributed hydrological model

Objective

The objective is to perform large sample testing of high-resolution distributed hydrological models to identify where the process descriptions or the datasets used (forcing, geofabric, etc.) fall short and need improvement. This is done with a view to obtaining a detailed understanding of how seasonal-scale forecasts and/or climate change scenarios impact the hydrological response, its interactions with the atmosphere, and the occurrence of floods and droughts.

 

How EGI-ACE supports

EGI-ACE support is needed to convert the Azure pipeline to the EOSC so that the workflows can be executed in parallel and independently. This will require compute time, fast data storage, and access for multiple people, including MSc and PhD students.

 

Beneficiaries

The setup will be fully reproducible, and results will be made available according to the FAIR principles (including access to the dockerized model and workflow). The aim is to involve students from various universities in the research activities, thereby training the next generation of computational hydrologists.

OGC SensorThings API for Citizen Science (Cos4Cloud project)

Objective

In Citizen Science, many different APIs and data models are in use, which hinders the easy use of data across different operators. In Cos4Cloud, we strive to create a federation of Citizen Science portals and an Expert Portal that exchanges citizens’ observations for the purpose of quality assurance. The scientific objective is to study whether the developed extension to the OGC SensorThings API for Citizen Science is fit for purpose for large data sets coming from different domains such as environment and biodiversity.

 

How EGI-ACE supports

The project will offer support by providing a load balancer with Kubernetes- or Docker-enabled nodes, plus high-speed storage (data buckets) and the Postgres database system (master-slave). High ingress and egress bandwidth is needed to support large-scale data loading and performance tests.

 

Beneficiaries

The beneficiaries will be the international research and academic community, as the international exchange of citizen science data supports global research on use cases of common interest.

OpenBioMaps

Objective

The basic goal of OpenBioMaps is to build relationships and remove technical barriers between communities that produce and use biodiversity data. It provides free services and open-source software for researchers and conservationists.

 

How EGI-ACE supports

Three virtual servers have been requested: an OpenBioMaps database server and two computational servers.

 

Beneficiaries

The new OpenBioMaps database and computational nodes will target new research teams and conservation institutes from Croatia, Slovakia, Poland, Greece, the United Kingdom and Germany. In Hungary, OpenBioMaps is used in 9 National Parks, which have many international nature conservation and scientific connections with foreign nature conservation institutions and research sites.

Perovskite material studies

Objective

The scientific objective concerns the characterisation of halide perovskite materials, interfaces and defects using molecular dynamics (MD). These materials are very promising for solar cells, but they degrade quickly. This study will lead to a clearer understanding of the stability and the degradation mechanisms of these materials. Ion migration is of particular interest and will be the main focus of the MD simulations.

 

How EGI-ACE supports

To process multiple instances with MD calculations, the research requires an extension of its current resources, which the EGI-ACE project can offer.

 

Beneficiaries

The beneficiaries of the MD calculations will be the group at Reykjavik University (the Nanophysics Center), the group from DFCTI/IFIN-HH, Romania, and the group from the National Institute of Material Physics, Romania, within the SEE project “Towards perovskite large area photovoltaics”.

Protein pKa and isoelectric point calculations

Objective

The objective is to establish an easy-to-use cloud service that allows for fast pKa and isoelectric point calculations using user-provided protein structures or those obtained from the Protein Data Bank. The goal is to make PypKa the go-to solution for these calculations, building upon its high accuracy and computational speed. Additionally, the research aims to build a large dataset of pKa values and isoelectric points, which will be pivotal for training machine-learning algorithms.

 

How EGI-ACE supports

A request has been made for access to 200 virtual cores to be shared between the PypKa server and the pKPDB database. In addition, 5 TB of disk storage is needed, mostly to accommodate the database.

 

Beneficiaries

The PypKa publicly available cloud service will benefit a variety of scientific researchers world-wide, with an emphasis on structural biologists and bioinformaticians. Since the tool can estimate the isoelectric point of proteins, it is also useful to experimentalists working in the field of protein sciences.

Scalable Jupyter backends to complement data holdings in the CS3mesh4EOSC mesh

Objective

Starting from scientific data already held in the node of the CS3mesh4EOSC consortium, there is a need to ensure that users can select the correct VREs and scientific workflows to operate on this data, seamlessly. Given that EGI is the eInfra partner for the operation of the VRE instance of many ESFRIs, it stands to reason to try to connect the CS3mesh4EOSC data holdings with a generic test case for EGI’s method of running VREs.

 

How EGI-ACE supports

The project has been requested to provide access to:

  • a Jupyter notebooks cluster with at least 200 cores;
  • a minimum of 5 co-located VMs (4 CPUs, 8 GB minimum) that can run interoperability middleware (to be experimented with between EGI and CS3mesh4EOSC);
  • federated authentication harmonised between CS3mesh4EOSC and EGI identities;
  • a sandboxed part of the namespace on the EGI CVMFS stratum-0 (for defining VRE environments and distributing them to Jupyter); and
  • DODAS / Apache Spark, to experiment with jobs started from a Jupyter instance at site A and using Spark parallelism at site B (as a means of compute scale-out).

 

Beneficiaries

The compute platform has the potential to be rolled out to the entire CS3mesh4EOSC constituency – currently the entire constituencies of SURF, DeIC, CESNET, PSNC, CERN, SWITCH, AARNet and the European Commission’s JRC.

Fermi-LAT data analysis and interface with Italian mirror data archive at SSDC

Objective

The analysis of scientific data taken with a generic gamma-ray telescope is dominated by low statistics: each photon must be treated as a single particle with a definite energy and incoming direction. This makes gamma-ray data analysis very time-consuming, especially when the integrated observation time spans years and therefore billions of photons, as is the case for the Large Area Telescope onboard the Fermi Gamma-ray Space Telescope (Fermi-LAT). Resources devoted to reducing the data analysis time are therefore fundamental for Fermi-LAT science. In this particular science case, the new resources will make it easy to manage the integration of years of data and to interface directly with the official Italian mirror archive of Fermi-LAT data hosted at the Space Science Data Center (SSDC) in Rome.

 

How EGI-ACE supports

The beneficiaries of the use case will be the Italian group of the Fermi-LAT collaboration, composed of INFN and university researchers. Moreover, the proposed use case should be beneficial both for training new students and for establishing a reusable analysis model for further activities, such as those related to new satellite-borne gamma-ray observatories (e.g. AMEGO). These are strategic assets because the team is proposing a very flexible solution to improve the data processing of Fermi-LAT, which will allow the pipeline to be easily extended to new paradigms based on Machine Learning. More broadly, this use case can be extended to the whole international community interested in astrophysical data analysis. Furthermore, new gamma-ray source catalogues, together with a dedicated database of science-ready Fermi-LAT products, will be made publicly available.

 

Beneficiaries

The beneficiaries of the use case will be the Italian group of the Fermi-LAT collaboration, composed of INFN and university researchers. Moreover, the proposed use case should be beneficial both for training new students and for establishing a reusable analysis model for further activities, such as those related to new satellite-borne gamma-ray observatories (e.g. AMEGO). These are strategic assets because the team is proposing a very flexible solution to improve the data processing of Fermi-LAT, which will allow the pipeline to be easily extended to new paradigms based on Machine Learning.

MATRYCS

Objective

MATRYCS aims to develop a data-driven Reference Architecture for AI-based scalable big data management & analytics in smart energy-efficient buildings.

 

How EGI-ACE supports

EGI-ACE will contribute by offering advanced solutions (e.g. the EGI Cloud Compute and EGI Container Cloud Compute services) and allocating computing resources (25-30 VMs, 100-130 vCPU cores, 200-230 GB of RAM, 0.5 TB of local disk storage, and 4-5 TB of block storage) to facilitate the development of the data-driven reference architecture.

 

Beneficiaries

More than 160 organisations have been listed as potential supporters, promoters and even advocates of MATRYCS. These have been categorised into the following target groups:

  • Policy Makers & Standardisation Organisations
  • Energy sector
  • Buildings and construction sector
  • Financial sector
  • Industry Associations & Technology Clusters
  • IT industry players and SMEs, and
  • Researchers and Academia

MINKE project

Objective

The H2020 MINKE project integrates key European marine metrology research infrastructures, to coordinate their use and development and propose an innovative framework of “quality of oceanographic data” for the different European actors in charge of monitoring and managing the marine ecosystems. MINKE proposes a new vision in the design of marine monitoring networks considering two dimensions of data quality, accuracy and completeness, as the driving components of quality in data acquisition.

 

How EGI-ACE supports

The following baseline resources on cloud-based infrastructure are requested; however, additional services such as authentication and authorization infrastructure and cloud-based notebooks will be investigated for integration.

  • Resources in two geo-distributed data centres, with at least one of them delivering GPU-based resources (for AI-based workloads).
  • Approximately 16 virtual machines in each data centre delivering 96 vCPUs, 192GB RAM, totalling 192 vCPUs and 384GB RAM. The baseline configuration for most of the VMs is expected to be 8 vCPUs/16GB RAM.
  • 2TB of initial storage per data centre, totalling 4TB. The storage capacity may need to be increased but it is not expected to go beyond 10TB. 
  • At least two VMs will be required to be on GPU-enabled resources.
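
The totals in the list above follow from doubling the per-data-centre figures. As a minimal sketch of that arithmetic (the constant and function names are hypothetical, chosen only for illustration; the numbers are the ones quoted in the request):

```python
# Per-data-centre baseline figures as quoted in the resource request.
# All names below are illustrative assumptions, not part of any EGI tooling.
DATA_CENTRES = 2
VCPUS_PER_CENTRE = 96
RAM_GB_PER_CENTRE = 192
STORAGE_TB_PER_CENTRE = 2
STORAGE_CEILING_TB = 10  # storage is not expected to grow beyond this


def baseline_totals(centres=DATA_CENTRES):
    """Multiply the per-centre figures by the number of data centres."""
    return {
        "vcpus": centres * VCPUS_PER_CENTRE,
        "ram_gb": centres * RAM_GB_PER_CENTRE,
        "storage_tb": centres * STORAGE_TB_PER_CENTRE,
    }


totals = baseline_totals()
print(totals)
```

Across the two data centres this gives 192 vCPUs, 384 GB of RAM and 4 TB of initial storage, which leaves room below the 10 TB storage ceiling.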

 

Beneficiaries

The research infrastructures integrated by MINKE (combining metrological and participatory systems) are of European and worldwide interest. Their involvement in many national, European and international projects or ERICs with academic and private partners, and the support provided by their high-level scientists, guarantee their attractiveness to significant numbers of users from the wider scientific communities in Europe and elsewhere, where research on improving the sustainability of marine observation is limited by the lack of appropriate research infrastructures.

PLOCAN

Objective

The objective is to collect acoustic recordings from Atlantic stations, study noise pollution, and produce a noise-monitoring platform. This platform will be offered to end users to produce noise maps and risk assessments of the ocean.

 

How EGI-ACE supports

The EGI-ACE project will contribute by provisioning cloud compute resources and providing technical support for hosting the noise-monitoring platform. More specifically, this platform will be composed of a Kubernetes cluster and several JupyterHub notebooks for processing the data collected from the different stations and displaying the noise map. Access to this noise-monitoring platform will be secured via the EGI AAI Check-In service.

 

Beneficiaries

The main beneficiaries of this application are the stakeholders of the JONAS project including the scientific community and the EU government agencies.

 

Next cut-off date

Interested in submitting your use case? The next cut-off date is the 15th of December 2021. Please note that cut-off dates are set every two months and the call will remain open for the entire duration of the project.