Welcome to the Middleware section
gLite 3.1
Available Information
This document is aimed at the South African research community. It therefore contains the configuration information needed to connect to the South African services of the grid.
In the context of this document, an institute will be referred to as a site.
South African mirror of gLite
The gLite middleware repositories hosted at CNAF have been mirrored using mrepo and are available to South African sites at
http://glite.mirror.ac.za
Introduction
The gLite distribution is an integrated set of components designed to enable resource sharing. In other words, gLite is the middleware used for building a grid.
The gLite middleware is produced by the EGEE project. In addition to code developed within the project, the gLite distribution pulls together contributions from many other projects, including LCG and VDT. The distribution model is to construct different services ('node-types') from these components and then ensure easy installation and configuration on the chosen platforms (currently Scientific Linux versions 3, 4 and 5).
gLite middleware is currently deployed on hundreds of sites as part of the EGEE project and enables global science in a number of disciplines, notably serving the LCG project.
Components Needed
To connect to the grid, a site needs to install and configure at least the following components locally:
- Berkeley Database Information Index (BDII)
- Computing Element (CE)
- Storage Element (SE)
- Worker Nodes (WN)
- User Interface (UI) - Optional
For an overview of the gLite installation instructions please see the gLite installation Overview page.
Hardware Requirements
The BDII, CE, SE and UI are usually not hardware intensive, so these services can be virtualized and run on a single host.
For each of the services (BDII, CE, SE and UI) the following minimum hardware configuration is recommended:
- 1 x 1.86 GHz single-core CPU
- 1 GB Random Access Memory (RAM)
- 1 x 1 Gb/s Network Interface Card
- 30 GB free space (for the OS partition; additional space may be required for specific Virtual Organization (VO) software etc.)
If you choose to install the central site services on a single host, that host will require the sum of the above hardware configurations.
For instance, a host that runs virtual machines for the BDII, CE, SE and UI will require a quad-core (or four-CPU) machine with 4 GB RAM and about 120 GB free space.
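The sizing rule above is plain multiplication. A minimal Python sketch, in which `HostSpec`, `SERVICE_MINIMUM` and `combined_spec` are illustrative names rather than part of gLite, reproduces the figures for a host carrying all four central services:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HostSpec:
    """Hardware needed by a host (illustrative type, not a gLite object)."""
    cores: int
    ram_gb: int
    disk_gb: int


# Recommended minimum for ONE central service, taken from the list above.
SERVICE_MINIMUM = HostSpec(cores=1, ram_gb=1, disk_gb=30)


def combined_spec(services, per_service=SERVICE_MINIMUM):
    """Sum the per-service minimum over all services sharing one host."""
    count = len(services)
    return HostSpec(
        cores=per_service.cores * count,
        ram_gb=per_service.ram_gb * count,
        disk_gb=per_service.disk_gb * count,
    )


if __name__ == "__main__":
    print(combined_spec(["BDII", "CE", "SE", "UI"]))
```

These are minimum figures; VO software or local policy may call for more disk space or memory.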
The Worker Node (WN), on the other hand, is more resource intensive. These are the machines that execute the jobs, so more resources should be allocated to them.
A typical configuration outline:
- 2 x CPU (single- or multi-core)
- 2 GB RAM per core (thus an 8-core machine will have 16 GB RAM or more)
- 2 x 72 GB Serial Attached SCSI (SAS) hard drives with striping (RAID 0)
- 2 x 1 Gb/s Network Interface Cards
The specifications above are guidelines only; adapt them to the workload expected at your site.
Berkeley Database Information Index
The Berkeley Database Information Index (BDII) holds a Lightweight Directory Access Protocol (LDAP) database.
The BDII is mainly used to:
- store information regarding the available resources of a site
- store information regarding the services provided by a site
Main Communication Paths:
- SE(BDII) & CE(BDII) <--> Site-BDII <--> Top-BDII <--> WMS
Every site needs a Site-BDII, which holds records for the entire site's resources. A site may have more than one BDII, but only one Site-BDII is recommended per site.
A BDII synchronizes its available resources with the Site-BDII, and the Site-BDII (local to the site) then synchronizes with the Top-BDII (national/international).
A Top-BDII is installed on a national level. The Top-BDII holds records of all the available resources of the Site-BDIIs that are connected to the Top-BDII.
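Because every BDII is an LDAP server, its contents can be inspected with a standard LDAP client. The sketch below only builds an OpenLDAP `ldapsearch` command line; it does not contact any server. It assumes the conventional BDII port 2170 and the usual `mds-vo-name=...,o=grid` base DNs. The host names and the site name are placeholders, so substitute the values for your own site and for the Top-BDII you use.

```python
import shlex

BDII_PORT = 2170  # assumption: the port conventionally used by BDII services
TOP_BDII_BASE = "mds-vo-name=local,o=grid"  # assumption: usual Top-BDII base DN


def site_bdii_base(site_name):
    """Base DN under which a Site-BDII conventionally publishes its records."""
    return "mds-vo-name=%s,o=grid" % site_name


def bdii_query_command(host, base_dn, ldap_filter="(objectClass=*)", port=BDII_PORT):
    """Return an ldapsearch argument list for an anonymous BDII query."""
    return [
        "ldapsearch",
        "-x",                                # simple (anonymous) bind
        "-LLL",                              # plain LDIF output without comments
        "-H", "ldap://%s:%d" % (host, port),
        "-b", base_dn,
        ldap_filter,
    ]


if __name__ == "__main__":
    # Placeholder hosts and site name; replace them with real values.
    site_cmd = bdii_query_command("site-bdii.example.ac.za",
                                  site_bdii_base("EXAMPLE-SITE"))
    top_cmd = bdii_query_command("top-bdii.example.ac.za", TOP_BDII_BASE,
                                 "(objectClass=GlueCE)")
    for cmd in (site_cmd, top_cmd):
        print(" ".join(shlex.quote(part) for part in cmd))
```

Running the first command against a Site-BDII and the second against a Top-BDII is a quick way to check that a site's records have propagated up the hierarchy.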
A site does not need to install a Top-BDII; it only needs to define which Top-BDII should be used. When a Workload Management System (WMS) requests resources for a job, it queries the Top-BDII to see which CE has the resources required for the job to run.
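This matchmaking step can be pictured as a filter over the records held by the Top-BDII. The following toy model is not the WMS algorithm: the record fields, CE identifiers and VO name are invented for illustration, and a real WMS evaluates far richer requirements and ranks the candidates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CERecord:
    """Simplified view of what a Top-BDII might hold about one CE (hypothetical)."""
    ce_id: str
    free_cpus: int
    max_wallclock_min: int
    supported_vos: frozenset


@dataclass(frozen=True)
class JobRequirements:
    """What a job asks for (hypothetical fields)."""
    vo: str
    cpus: int
    wallclock_min: int


def matching_ces(records, job):
    """Return the IDs of all CEs whose published resources satisfy the job."""
    return [
        record.ce_id
        for record in records
        if job.vo in record.supported_vos
        and record.free_cpus >= job.cpus
        and record.max_wallclock_min >= job.wallclock_min
    ]


if __name__ == "__main__":
    records = [
        CERecord("ce.site-a.example", 4, 60, frozenset({"examplevo"})),
        CERecord("ce.site-b.example", 16, 2880, frozenset({"examplevo", "othervo"})),
        CERecord("ce.site-c.example", 0, 2880, frozenset({"examplevo"})),
    ]
    job = JobRequirements(vo="examplevo", cpus=8, wallclock_min=120)
    print(matching_ces(records, job))
```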
It is the norm to have the Site-BDII installed on the same host as the CE, but this is not a requirement. The Site-BDII may reside on a different host.
Compute Element
The Computing Element (CE) is part of the central services of a site.
Its main functions are:
- manage jobs (job submission to, and job control on, the worker nodes)
- report the status of the jobs to the WMS
- publish all site information (site location, queues, CPU status) via LDAP (BDII services)
The CE can run with several kinds of batch systems:
- Torque + MAUI
- LSF
- SGE
- Condor
Main Communication Paths:
- CE(Job) <--> WMS <--> CE(Job can Run) <--> WN(Job Execution)
Storage Element
The Storage Element (SE) is part of the central services of a site.
The SE is mainly used to:
- store the software needed by the VOs
- store user input/output data
Three types of SE can be installed:
- SE Disk only (a Storage Element that only stores files)
- SE with MySQL support (space reservations and pending file copies are stored in a MySQL database)
- SE with Oracle support (space reservations and pending file copies are stored in an Oracle database)
Main Communication Paths:
- UI/WN(Upload File) <--> SE (Store File) <--> LFC (Update File Location) <--> SE/UI (Download File)
The SE runs its own BDII, which publishes the availability of the SE's resources to the Site-BDII.
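The communication path above can be modelled in a few lines. This is a toy in-memory model, not the gLite data-management API: `StorageElement`, `FileCatalogue`, `upload` and `download` are hypothetical names, and the logical file name and host name are placeholders. It shows only the division of labour, in which the SE keeps the file contents and the LFC (the file catalogue) keeps the mapping from a logical file name to the places where copies are stored.

```python
class StorageElement:
    """Holds file contents, keyed by a path local to this SE."""

    def __init__(self, hostname):
        self.hostname = hostname
        self._files = {}

    def store(self, path, data):
        self._files[path] = data

    def fetch(self, path):
        return self._files[path]


class FileCatalogue:
    """Maps a logical file name (LFN) to the replicas that hold it."""

    def __init__(self):
        self._replicas = {}

    def register(self, lfn, hostname, path):
        self._replicas.setdefault(lfn, []).append((hostname, path))

    def replicas(self, lfn):
        return list(self._replicas.get(lfn, []))


def upload(se, catalogue, lfn, path, data):
    """Store the file on the SE, then record its location in the catalogue."""
    se.store(path, data)
    catalogue.register(lfn, se.hostname, path)


def download(storage_elements, catalogue, lfn):
    """Look the LFN up in the catalogue and fetch the first reachable replica."""
    for hostname, path in catalogue.replicas(lfn):
        se = storage_elements.get(hostname)
        if se is not None:
            return se.fetch(path)
    raise FileNotFoundError(lfn)


if __name__ == "__main__":
    se = StorageElement("se.example.ac.za")
    catalogue = FileCatalogue()
    upload(se, catalogue, "lfn:/grid/examplevo/input.dat", "/data/input.dat", b"payload")
    print(download({se.hostname: se}, catalogue, "lfn:/grid/examplevo/input.dat"))
```

Keeping locations in a catalogue rather than in the job itself means a file can be moved or replicated to another SE without users having to change the name they refer to.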
Worker Node
The Worker Node (WN) is the "cluster node" that will physically execute the jobs.
The WN is mainly used to:
- Execute the grid jobs
- Copy the results to the CE when jobs are done
Different types of installations are available, including:
- Worker Node only (neither Torque nor Andrew File System (AFS))
- Worker Node with Torque (Torque modules are included)
- Worker Node with AFS support
Main Communication Paths:
- CE (Submit Job to Node) --> WN (Execute Job) --> CE (Results) --> WMS
User Interface
From the user's perspective, a simple user interface is the first point of access to the grid and all of its services. There are several user interfaces, apart from the "real" user interface provided by the gLite repositories:
- The Virtual User Interface
- The Plug 'n' Play User Interface
- The web-based user interface (GENIUS portal)
Main Communication Paths:
- UI (Submit Job) --> WMS (Available Resources?) --> CE (Resources Satisfy Job requirements) --> WN(Job Executes) --> CE (Results) --> WMS --> UI (Retrieve Results)
The user interface is therefore the entry point into the grid. A user will have a personal account on the UI to log into. From there, the user authenticates to the grid with a personal certificate.
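A job submitted from the UI is described in a Job Description Language (JDL) file. As a rough sketch, the Python helper below (`render_jdl` is a hypothetical name, not a gLite tool) writes a minimal description using common JDL attributes; the executable and file names are examples only.

```python
def _jdl_list(items):
    """Format a Python sequence as a JDL list of quoted strings."""
    return "{" + ", ".join('"%s"' % item for item in items) + "}"


def render_jdl(executable, arguments="", input_sandbox=(), extra_outputs=()):
    """Return the text of a minimal JDL job description."""
    lines = ['Executable = "%s";' % executable]
    if arguments:
        lines.append('Arguments = "%s";' % arguments)
    lines.append('StdOutput = "std.out";')
    lines.append('StdError = "std.err";')
    if input_sandbox:
        # Files shipped from the UI to the worker node along with the job.
        lines.append("InputSandbox = %s;" % _jdl_list(input_sandbox))
    # Files brought back to the UI when the results are retrieved.
    outputs = ["std.out", "std.err"] + list(extra_outputs)
    lines.append("OutputSandbox = %s;" % _jdl_list(outputs))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    with open("job.jdl", "w") as handle:
        handle.write(render_jdl("/bin/hostname"))
```

On a gLite 3.1 UI, a file like this is typically submitted with `glite-wms-job-submit -a job.jdl` once a proxy has been created from the personal certificate with `voms-proxy-init --voms <your VO>`.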
Installation Instructions
Install and configure the following services (in this order):
For the Berkeley Database Information Index installation instructions please follow the BDII installation instructions.
For the Compute Element installation instructions please follow the CE installation instructions.
For the Storage Element installation instructions please follow the SE installation instructions.
For the Worker Node installation instructions please follow the WN installation instructions.
For the User Interface installation instructions please follow the UI installation instructions.