What we learned from the EOSCpilot Science Demonstrators

Giuseppe La Rocca summarizes the recommendations for EOSC

The EOSCpilot project was set up to support the first phase in the development of the European Open Science Cloud. As part of this ambition, the project selected 15 Science Demonstrators from different domains to pilot actual implementations of the EOSC Service Portfolio.

The demonstrators were chosen to provide insight on technical and policy needs and prioritise the integration of the EOSC services to meet requirements.

The Science Demonstrator activity involved many experts from 17 institutions working together to analyse requirements, link research communities to compute and storage providers, and manage the technical infrastructures needed for co-design and piloting.

The activity brought many benefits for the communities involved. For example, the EPOS-VERCE and ViSIVO Science Demonstrators shared best practices for connecting their frameworks with the EGI Federated Cloud Infrastructure, allowing users to run scientific workflows on the cloud computing infrastructure using their federated credentials. Dedicated Virtual Research Environments (VREs) were set up to help members of the Social Science Communities share and visualize media files on the web and implement the semantic enrichment of text sources.

The EOSC cloud infrastructure was adopted to scale up the execution of scientific workflows and pipelines. In particular, it helped set up a cloud-based workflow system for genomics analysis that can ultimately help improve patient health care, and it enabled users in the fusion community to reproduce science efficiently.

Many Science Demonstrators focused on the implementation of the FAIR principles.

  • CryoEM extended the Scipion framework to support reproducibility by sharing detailed information on cryo-electron microscopy image processing workflows.
  • EGA Life Science Datasets managed to reproduce and remaster biological pipelines by combining Docker container solutions with Nextflow, an emerging language intended to ease the interpretation of scientific workflows.

Recommendations for EOSC

The impact of the Science Demonstrators goes beyond the positive effect they had on their communities of practice: what we, the data, services and e-Infrastructure providers, learned from the activity is of great value to the implementation efforts of the EOSC.

From this work, we can draw the following recommendations:

  • Implement a distributed data and compute infrastructure providing high performance access to data transfer, mirroring and caching, supported by high speed network connectivity.
  • Procure and offer EOSC as a federated infrastructure that integrates existing community resources (data, applications, software, storage & computing) and provides additional adequate capacity to scale up existing in-house IT infrastructures. Procure EOSC as a high-capacity system that meets the demands of data intensive science.
  • Provide easy-to-use environments, such as scientific gateways and Virtual Research Environments, as managed services that offer integration as a turn-key solution; make service descriptions discoverable; offer ready-to-use integrated bundles of services with low-barrier procurement processes.
  • Provide a federated Authentication & Authorisation solution to allow users to access services and resources from different providers with the same credentials.
  • Provide and sustain human networks through competency centres of experts working with scientific application developers in close cooperation.
  • Provide support for running standards-based workflows.
  • Promote tight integration between services and providers.
  • Extend the FAIR concepts currently applied to data to IT services. Propose a set of recommendations for making services FAIR, or to further enable services to make data FAIR.
  • Include analysis of network requirements, specifically when designing the interoperability of services across sites and organisations.

Although EOSC is still very much a work in progress, we can already see some of these recommendations coming into play. For example, federated log-ins are available on the EOSC Marketplace, and many pilot services developed during the EOSCpilot project have enabled federated authentication for their end-users and have started to offer EOSC services to their scientific communities. As a result of the EOSCpilot project, many pilots have begun registering in the EOSC Marketplace as service providers.

We hope to see more following soon.

Science Demonstrators moving to pre-production

Towards the end of the project, the following demonstrators were selected to continue their work plans for three additional months in order to move their pilot services into a pre-production phase:

  • TextCrowd (Social Sciences): a text mining solution to semantically enrich text sources and make them available on the EOSC.
  • Photon & Neutron (Physics): aims to create a virtual platform where data and analysis tools can be made available to scientists all over the world.
  • Prominence (Energy Research): provides access to HPC class nodes for the Fusion Research community through a cloud interface.
  • LOFAR (Astronomy): provides easy access to LOFAR data and knowledge extraction through the Open Science Cloud.
  • VisualMedia (Social Sciences & Humanities): a service for sharing and visualizing media files on the web.

More information:

Giuseppe La Rocca is part of the EGI Foundation User Support team and led the EOSCpilot Science Demonstrators activity.

EOSCpilot project

The Science Demonstrators
