EOSCpilot Science Demonstrators: one year later

Gergely Sipos outlines the status of the science demonstrators at the end of the first project year

The EOSCpilot project has recently passed its mid-term review in Brussels, receiving positive feedback from the panel of experts. One of the key activities of the project was the support of science demonstrators that act as early adopters of ‘to-be EOSC’ services. By the end of the first project year, 15 science demonstrators had been selected from diverse scientific domains. Each received advice on available services, along with technical support and training to integrate its scientific applications with services that fit its purpose. The demonstrators provided feedback to the project and promoted their work to various audiences beyond the consortium.

During this period, the EGI Foundation and several members of the EGI federation were involved in the selection and support of science demonstrators. Service and technology providers were engaged both within Europe and overseas, for example Compute Canada.

Here is the status of the first 10 science demonstrators:

  • TextCrowd (social sciences): Set up a virtual research environment for the Natural Language Processing (NLP) encoding and metadata enrichment of textual archaeological reports. The service allows researchers to store textual documents in a cloud folder, perform NLP operations, trigger the semantic enrichment of the text and get information in RDF format.
  • PanCancer (life sciences): Ported the Butler application for large-scale processing of cancer genomes onto Compute Canada and selected sites of the EGI Federated Cloud.
  • Photon/Neutron (physics): Two applications (OnDA and CrystFEL) have been containerized and tested on HPC and cloud platforms at DESY.
  • DPHEP (high-energy physics): Assessed whether CERN’s own preservation system could be replaced with ‘out-of-the-box’ services combining CVMFS, B2SAFE and a Trustworthy Digital Repository. Data ingestion/replication of small datasets from CERN to CINES was put into operation with an average speed of 800 Mbit/s.
    Data retrieved from CERN was ingested into CINES and replicated to CINECA for open access.
  • ERFI (environmental & earth science): Used the EGI Open Data Platform to develop a data integration framework for sharing datasets between the ICOS and the IS-ENES research infrastructures.
  • EPOS/VERCE (earth sciences): Integrated three cloud providers of the EGI Federation with a Virtual Research Environment to compute realistic earthquake shaking scenarios using misfit calculations.
  • PROMINENCE (energy research): Deployed SLURM clusters on the EGI Federated Cloud to run containerized MPI-based applications on a hybrid EGI-commercial cloud platform.
  • LOFAR (physical sciences): Enhanced three pipelines using container technology. Used the Common Workflow Language (CWL) to link the different pipelines into a workflow. Provided access to the SURFsara supercomputer and HPC-cloud resources, and intends to expand the tests to FZJ/PSNC sites in the remaining months.
  • CryoEM (life sciences): Extended a workflow editor environment with workflow import-export capabilities, helping researchers work more closely in line with the FAIR principles.
  • EGA Datasets (genome research): Provided access to the B2FIND instance to deposit metadata produced by genomics pipelines.
More information

Gergely Sipos is the Customer and Technical Outreach Manager of the EGI Foundation. He is involved in the WP4 (Science Demonstrators), WP5 (Services) and WP7 (Skills) work packages of the EOSCpilot project.