The EGI Central Operations Team has defined its high-level objectives for 2019. These objectives are designed to ensure that high-quality services continue to be delivered to the different user communities and that we keep improving our service delivery. In this article, we give an overview of how we are doing this by looking at some of this year's high-level objectives.
This year, we have introduced a new way of ensuring that EGI Operational Tools (e.g. monitoring, accounting, VMops Dashboard) meet user requirements. This is being done by the tool developers in cooperation with the EGI Operations Management Board where the National Grid Initiatives (NGIs) are represented. By providing a forum where development roadmaps and priorities are discussed, the EGI Operations Team will help to ensure that these tools remain fit for purpose for the EGI community into the future.
Continuity of service remains a top priority for the EGI Operations Team, especially when software reaches its end of life. In light of the recent announcement that the CREAM-CE service will cease to be maintained at the end of the EOSC-hub project, the EGI Operations Team, in coordination with the CERN WLCG Operations Team, is working hard to ensure that user communities still using CREAM-CE can migrate to alternatives as easily as possible. ARC-CE is already well supported within EGI, and during 2019 we are working to ensure that HTCondorCE will be equally well supported.
In any complex e-Infrastructure, knowing when things go wrong is essential.
An effective testing framework typically needs to be constantly tweaked and maintained to ensure that the right people are alerted when failures happen. However, it is often challenging to understand abnormal patterns of behaviour when different user communities are using the e-Infrastructure in many different ways.
The easiest way to deal with this is to use common tests that monitor the basic functionality of services, an activity that already exists within the EGI Federation (e.g. there are over a hundred different tests running in ARGO). A smarter way of monitoring, however, is to understand the specific work of the different communities better and to introduce new tests that raise alerts when things go wrong. This is why the EGI Operations Team is now spending time introducing specialised monitoring.
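To make the idea of a community-specific test more concrete, here is a minimal sketch in Python. It is not taken from ARGO or from any EGI probe. It assumes the widely used Nagios-style convention of four result states (OK, WARNING, CRITICAL, UNKNOWN), and the workflow step, the function names and the thresholds are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Nagios-style result states (a common plugin convention, assumed here).
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
STATUS_NAMES = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


@dataclass
class ProbeResult:
    status: int
    message: str


def evaluate_workflow(
    elapsed_seconds: Optional[float],
    succeeded: bool,
    warn_after: float = 60.0,   # hypothetical threshold
    crit_after: float = 300.0,  # hypothetical threshold
) -> ProbeResult:
    """Classify one run of a hypothetical community workflow step."""
    if elapsed_seconds is None:
        # Nothing was measured, so the state cannot be determined.
        return ProbeResult(UNKNOWN, "no measurement available")
    if not succeeded:
        return ProbeResult(CRITICAL, "workflow step failed")
    if elapsed_seconds >= crit_after:
        return ProbeResult(CRITICAL, f"workflow step took {elapsed_seconds:.0f}s")
    if elapsed_seconds >= warn_after:
        return ProbeResult(WARNING, f"workflow step took {elapsed_seconds:.0f}s")
    return ProbeResult(OK, f"workflow step took {elapsed_seconds:.0f}s")


def format_result(result: ProbeResult) -> str:
    """Render the single status line a monitoring system would display."""
    return f"{STATUS_NAMES[result.status]} - {result.message}"


if __name__ == "__main__":
    print(format_result(evaluate_workflow(90.0, succeeded=True)))
```

The point of the sketch is that the thresholds and the notion of success come from the community's own workload rather than from a generic service check, so the same service can be healthy for one community and degraded for another.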
Internal services delivered by EGI are covered by Operations Level Agreements (OLAs) between the EGI Foundation and service providers. These agreements enable EGI to match customers’ requirements with the services delivered as part of the Federation. The EGI Central Operations Team is reviewing the OLAs to ensure that they are aligned with user expectations and will be looking for areas that can be improved. The team is also reviewing procedures to ensure that any failure to meet operational level targets is adequately dealt with.