Standard Operating Procedures for SAGrid.

Coordination

Announcements and Communication

Communication is fundamental to the proper functioning of the grid services and to the user experience. To properly channel information to the right place at the right time, please follow the communications SOP below.

Site intervention and downtime announcements

Downtimes or interventions on any service on your site have to be announced with the relevant notice period.
Incident                Channel (1)               Minimum notice period
service failure         AfricaROC                 immediately
site service restart    SAGrid Ops mailing list   30 minutes
planned site downtime   AfricaGrid ROC            1-2 days
service upgrade         SAGrid Ops mailing list   1 week
new VO SLA              GGUS, AfricaGrid ROC      1 month
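
As a rough illustration, the notice periods above can be encoded as a lookup and used to check whether an announcement leaves enough time before an intervention. This is a minimal sketch and not part of the SOP itself: the function name `notice_is_sufficient` is hypothetical, and it assumes that "immediately" means zero notice, that "1-2 days" is checked against its lower bound of 1 day, and that "1 month" means 30 days.

```python
from datetime import datetime, timedelta

# Minimum notice periods taken from the table above.
# Assumptions: "immediately" is modelled as zero notice, "1-2 days" as its
# lower bound (1 day), and "1 month" as 30 days.
MINIMUM_NOTICE = {
    "service failure": timedelta(0),
    "site service restart": timedelta(minutes=30),
    "planned site downtime": timedelta(days=1),
    "service upgrade": timedelta(weeks=1),
    "new VO SLA": timedelta(days=30),
}


def notice_is_sufficient(incident, announced_at, starts_at):
    """Return True if the announcement leaves at least the minimum notice."""
    return starts_at - announced_at >= MINIMUM_NOTICE[incident]


# Example: an announcement made on 3 October at 09:00.
announced = datetime(2011, 10, 3, 9, 0)
upgrade_ok = notice_is_sufficient(
    "service upgrade", announced, datetime(2011, 10, 11, 9, 0))
downtime_ok = notice_is_sufficient(
    "planned site downtime", announced, datetime(2011, 10, 3, 17, 0))
print(upgrade_ok)   # True: 8 days of notice for a 1-week minimum
print(downtime_ok)  # False: 8 hours of notice for a 1-day minimum
```

An unknown incident type raises `KeyError`, which seems preferable to silently accepting an announcement that matches no row of the table.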

Site Admin availability or replacement

Meeting scheduling

Procedure to schedule site operations meetings.

Issue reporting, routing and management

Procedure to report issues, and to route and assign them to the responsible support unit.

Checklists

Site Operator On Shift Checklist

Core Services Operator On Shift Checklist (to do)

  • StartOfCoreServicesShiftChecklist
  • Service Checklists
    • WMSChecklist
    • BDIIChecklist
    • LFCChecklist
    • AMGAChecklist
  • EndOfShiftChecklist

User Support On Shift Checklist (to do)

Application Support On Shift Checklist (to do)

Site Deployment and integration

Site deployment is usually done only once and should follow, as far as possible, the best practices of the federation. The outline for deployment is given in SiteDeploymentSOP.

Follow the EMI middleware deployment guide.

Site service testing (to do)

Site BDII

Compute element

Storage element

Site integration (to do)

  1. Add a line for the site to bdii.conf.
  2. Check whether the site appears in the top-BDII.
  3. Add the site to the GOCDB.
  4. Add the services to the site entry in the GOCDB.
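
The check in step 2 could be sketched as below. This is an illustration under stated assumptions, not a SAGrid procedure: it assumes the top-BDII publishes GLUE 1.3 data on the conventional port 2170 under the base `mds-vo-name=local,o=grid`, that the OpenLDAP `ldapsearch` client is installed, and it uses the hypothetical names `top-bdii.example.org` and `EXAMPLE-SITE`. The helper functions `top_bdii_query` and `site_in_ldif` are also made up for this example.

```python
def top_bdii_query(top_bdii_host, port=2170):
    """Build an ldapsearch command listing the site IDs a top-BDII publishes.

    Assumes GLUE 1.3 (objectClass GlueSite) and the conventional
    top-BDII base "mds-vo-name=local,o=grid".
    """
    return [
        "ldapsearch", "-x", "-LLL",
        "-H", f"ldap://{top_bdii_host}:{port}",
        "-b", "mds-vo-name=local,o=grid",
        "(objectClass=GlueSite)", "GlueSiteUniqueID",
    ]


def site_in_ldif(ldif_text, site_name):
    """Return True if the LDIF text has a GlueSiteUniqueID equal to site_name."""
    for line in ldif_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "gluesiteuniqueid":
            if value.strip().lower() == site_name.lower():
                return True
    return False


# Hypothetical LDIF output, shaped like what ldapsearch -LLL would return.
sample_ldif = (
    "dn: GlueSiteUniqueID=EXAMPLE-SITE,Mds-Vo-name=EXAMPLE-SITE,"
    "Mds-Vo-name=local,o=grid\n"
    "GlueSiteUniqueID: EXAMPLE-SITE\n"
)

command = top_bdii_query("top-bdii.example.org")  # hypothetical host
found = site_in_ldif(sample_ldif, "EXAMPLE-SITE")
missing = site_in_ldif(sample_ldif, "OTHER-SITE")
print(" ".join(command))
print(found, missing)
```

To run the query for real, the command list can be passed to `subprocess.run(command, capture_output=True, text=True)` and the resulting `stdout` given to `site_in_ldif`. Keeping the parsing separate from the network call makes the check easy to test without a live BDII.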

Site Upgrade (to do)

SAGrid Operations Meetings

As a member of the SAGrid Operations team, you are required to participate in the weekly meetings and to provide input on the issues at your site.

Reference Documentation and Manuals

How to use these references

It's a big, complicated world and things can seem confusing sometimes. The documentation linked below is under permanent construction as the usage and technology of the grid change. It is not uncommon to find blatant conflicts between what one set of documentation or reference suggests and what another suggests. Here are some guiding principles to help you through the confusion.

Federation and Interoperability : Site Priorities

The shared and collaborative nature of the grid means that your site and the services that it provides are often used by many different groups of people, which can result in conflicting requests. You should take decisions and apply procedures based on the priorities of your site. These are generally hierarchical:

  • Your site should be configured and maintained as closely as possible to the way the other sites in SAGrid are.

Grid Infrastructure Reference Documentation

  • EGEE Operational Procedures for ROCs and Sites
  • EGEE Operational Procedures for Regional Operator on Duty
  • EGI Operations Manuals
  • EGI Operations Best Practices
  • EUMedGrid Site Deployment Guide

Middleware and Service Reference Documentation

-- BruceBecker - 01 Aug 2011

Notes

1 : The channel is the support platform where the announcement should be made. The channels are:

  1. GGUS - central instance of GGUS - https://ggus.eu
  2. AfricaGrid ROC - AfricaGrid Regional Operations Centre - https://roc.africa-grid.org
  3. AfricaGrid GOCDB - regional Operations Centre Database for AfricaGrid


Topic revision: r7 - 03 Oct 2011 - 08:42:10 - BruceBecker
 