Promoting Open Science through the operation of a truly innovative Data Infrastructure service

D4Science is an organisation that has offered a data infrastructure service since 2014, backed by more than 3,650 CPU cores. It connects more than 18,000 scientists from 50 countries and integrates more than 50 heterogeneous data providers. On this basis, D4Science hosts more than 175 Virtual Research Environments (VREs), which serve the biological, ecological, environmental, social mining, cultural heritage, and statistical communities worldwide.

D4Science is hosted by the Istituto di Scienze e Tecnologie dell’Informazione (ISTI) of the National Research Council (Italy).

The Challenge

The communities targeted by the D4Science Infrastructure comprise a number of research initiatives and innovative projects, such as iMarine, SoBigData, Blue-Cloud, OpenAIRE, and EOSC-Pillar, which perform data analysis across a wide variety of disciplines, including biological, social, geothermal, satellite and environmental fields. These initiatives have very diverse requirements, but they share common computing needs. Through collaboration with EGI, the communities supported by D4Science benefit from a Europe-based Authentication and Authorisation Infrastructure (AAI) that is integrated with a large number of research and academic institutions and verifies the digital identities of their users. This establishes a common AAI framework for authorising resource use. EGI also provides D4Science researchers with specialised data analysis solutions for advanced science through EGI Notebooks, which are managed and provided from European infrastructures that are close to the source datasets.

The Solution

Authentication and Authorisation Framework

D4Science has used EGI Check-in to establish an Authentication and Authorisation framework. As part of this collaboration, D4Science has been integrated with EGI Check-in as a “Community AAI”, enabling the use of D4Science credentials to access other systems, such as the EOSC portal.

EGI Notebooks

The collaboration with D4Science has also helped EGI gather valuable input from the scientific communities and has driven the technological evolution of the EGI Notebooks service. Thanks to this mutual collaboration, EGI Notebooks is today one of the EGI services most in demand among the scientific community, and it provides advanced data analysis capabilities.

Virtual appliances

D4Science has also used EGI Cloud compute and storage resources to deploy specific D4Science Virtual Appliances (Virtual Machine images with pre-configured software). These appliances are hosted in the EGI AppDB, EGI’s software repository.

Services provided by EGI

Collaboration Projects

As a hybrid data infrastructure capable of dynamically deploying Virtual Research Environments (VREs), D4Science provides user-friendly application environments for user communities. By integrating EGI services, D4Science is able to provide more advanced service solutions to meet scientific communities’ needs.


AGINFRA+ (2017-2019) was an H2020 project to overcome interoperability problems in the fields of Food and Agriculture. It addressed the drawbacks of having heterogeneous research data scattered across disparate data repositories, tools relying on local computing environments, and non-unified workflows, by implementing Virtual Research Environments (VREs). These VREs are Web-based working environments, tailored to research communities with similar interests and goals, and supported by computing infrastructures that provide data services, computing power and hosting machines, which were offered by EGI and D4Science.



SoBigData++ is a four-year project (2020-2024) that brings together 31 partners from across Europe to provide a multi-disciplinary research infrastructure for data mining and big data analytics in the context of social data.

SoBigData++ aims to deliver concrete tools that operationalise ethics with value-sensitive design, incorporating values and norms for privacy protection, fairness, transparency and pluralism.

The role of EGI is to support the operational activities of the SoBigData++ e-infrastructure, integrating new and existing services and enhancing them quantitatively and qualitatively.


140 users per month

From the Netherlands, Italy, Spain and Germany

1.2 million (Cloud) CPU hours

Over the last six years. Check metrics on the EGI Accounting Portal

"Digital infrastructures are increasingly complex and interconnected to offer high quality of service with an affordable cost of operation. The collaboration between EGI and D4Science is a perfect example of this type of collaboration. The continuous growth of users recorded by D4Science is also the result of this collaboration which allows users to log in with their institutional credentials and then use services operated with high reliability. This is the case of Jupyter Notebooks, which is operated on resources provided by D4Science with technology and expertise provided by EGI. A long collaboration that will expand in the near future with other EGI services that will be integrated and made accessible by D4Science." - Pasquale Pagano, D4Science Technical Director

