Call for requirements, a SLURM survey and more on the EGI operations community's agenda
The Operations Management Board (OMB) met on the 29th of November. The main topics discussed were the collection of new middleware requirements for 2012, the extension of the set of tests that generate alarms in the Operations Dashboard, and the NGI Availability/Reliability reports.
A call for operational requirements, addressed to site administrators and NGI operations teams, is now open until the 15th of January. Requirements can be submitted in RT [instructions]. The call gives all NGIs and site administrators an opportunity to contribute to the evolution of our infrastructure. Several products have already been improved thanks to requirements submitted during 2011, so this is a valuable chance to provide feedback to our technology providers.
The Service Availability Monitoring (SAM) tool is constantly being improved. A number of new tests were recently integrated into SAM (Update 7 and Update 11). The OMB proposed integrating them into the Operations Dashboard to give a more comprehensive and detailed view of issues affecting the deployed software. With the Operations Dashboard, NGI operators can proactively support site administrators in identifying and solving problems. Site administrators can also access a customized view of the Operations Dashboard, which displays the current Availability and Reliability of their own site and the status of the related grid services.
A proposal for the computation of Availability and Reliability performance statistics for NGIs was discussed. There is a growing need for better availability of the grid services operated by NGIs. This proposal, whose discussion is due to conclude in December, is a step towards improving the quality of service delivered to end-users.
Last but not least, site administrators are invited to participate in a survey about the Simple Linux Utility for Resource Management (SLURM). The survey is open until the 14th of December. If you have experience with SLURM, we are interested in your feedback; you can also reply to this blog post to share your experience with the operations community.
For more information see the OMB meeting agenda.