Welcome to the Middleware section

gLite 3.1

Available Information

This document is aimed at the South African research community and therefore contains all the configuration information needed to connect to the South African services of the grid. In this document, an institute is referred to as a site.

South African mirror of gLite

The gLite middleware repositories of the CNAF have been mirrored using mrepo and are available for South African sites at http://glite.mirror.ac.za

Introduction

The gLite distribution is an integrated set of components designed to enable resource sharing. In other words, gLite is the middleware used for building a grid. The gLite middleware is produced by the EGEE project. In addition to code developed within the project, the gLite distribution pulls together contributions from many other projects, including LCG and VDT. The distribution model is to construct different services ('node-types') from these components and then ensure easy installation and configuration on the chosen platforms (currently Scientific Linux versions 3, 4 and 5).

gLite middleware is currently deployed on hundreds of sites as part of the EGEE project and enables global science in a number of disciplines, notably serving the LCG project.

Components Needed

To connect to the grid, a site needs to install and configure at least the following components at their local institute.

  • Berkeley Database Information Index (BDII)
  • Computing Element (CE)
  • Storage Element (SE)
  • Worker Nodes (WN)
  • User Interface (UI) (optional)

Tip: For an overview of the gLite installation instructions, please see the gLite Installation Overview page.

Hardware Requirements

The BDII, CE, SE and UI are usually not very hardware intensive, so these services can be virtualized and run on a single host. For each of the services (BDII, CE, SE and UI) the following minimum hardware configuration is recommended:
  • 1 x 1.86GHz Single Core CPU
  • 1 x 1GB Random Access Memory (RAM)
  • 1 x 1Gb Network Interface Card
  • 30GB Free Space (for the OS partition; additional space may be required for specific Virtual Organization (VO) software, etc.)

If you choose to install the central site services on a single host, that host will require the sum of the above hardware configurations. For instance, a host that runs virtual machines for the BDII, CE, SE and UI will require a quad-core machine with 4GB RAM and about 120GB free space.
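The sizing arithmetic above can be sketched in a few lines of Python. This is only an illustration: the `HostSpec` type and the `combined_spec` helper are hypothetical names introduced here, and the per-service figures are the recommended minimums from the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HostSpec:
    cpu_cores: int
    ram_gb: int
    disk_gb: int


# Recommended minimum per service, taken from the list above.
PER_SERVICE = HostSpec(cpu_cores=1, ram_gb=1, disk_gb=30)


def combined_spec(services):
    """Sum the per-service minimum over every service placed on one host."""
    count = len(services)
    return HostSpec(
        cpu_cores=PER_SERVICE.cpu_cores * count,
        ram_gb=PER_SERVICE.ram_gb * count,
        disk_gb=PER_SERVICE.disk_gb * count,
    )


total = combined_spec(["BDII", "CE", "SE", "UI"])
print(total)  # HostSpec(cpu_cores=4, ram_gb=4, disk_gb=120)
```

The result covers the four services only; any overhead of the virtualization host itself is not included in this sketch.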

The Worker Node (WN), on the other hand, is more resource intensive. These are the machines that execute the jobs, so more resources should be allocated to them.

A typical configuration outline:

  • 2 x CPU (Single- or Multi- Core)
  • 1 x 2GB RAM per Core (thus an 8-core machine will have 16GB RAM or more)
  • 2 x 72GB Serial Attached SCSI (SAS) hard drives with striping (RAID0)
  • 2 x 1Gb Network Interface Cards

Tip: The specifications above are only guidelines.
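As a rough sketch of how the Worker Node guideline scales, the following Python computes the RAM for a given core count and the usable capacity of the striped disks. The function names are hypothetical; the constants come from the guideline list above.

```python
RAM_GB_PER_CORE = 2  # guideline: 2GB RAM per core
DISK_GB = 72         # guideline: 72GB SAS drives
DISK_COUNT = 2       # guideline: two drives, striped


def worker_node_ram_gb(sockets, cores_per_socket):
    """Minimum RAM for a worker node under the 2GB-per-core guideline."""
    return sockets * cores_per_socket * RAM_GB_PER_CORE


def raid0_capacity_gb(disk_count=DISK_COUNT, disk_gb=DISK_GB):
    """RAID0 stripes across all disks: full combined capacity, no redundancy."""
    return disk_count * disk_gb


print(worker_node_ram_gb(sockets=2, cores_per_socket=4))  # 16
print(raid0_capacity_gb())  # 144
```

Because RAID0 offers no redundancy, losing one drive loses the whole volume, which is usually acceptable for scratch space on a worker node.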

Berkeley Database Information Index

The Berkeley Database Information Index (BDII) holds a Lightweight Directory Access Protocol (LDAP) database. The BDII is mainly used to:
  • store information regarding the available resources of a site
  • store information regarding the services provided by a site

Main Communication Paths:

  • SE(BDII) & CE(BDII) <--> Site-BDII <--> Top-BDII <--> WMS

Every site needs a Site-BDII, which holds records for the entire site's resources. A site may have more than one BDII, but only one Site-BDII is recommended per site. A BDII synchronizes its available resources with the Site-BDII, and the Site-BDII (local to the site) then synchronizes with the Top-BDII (national/international).

A Top-BDII is installed on a national level. The Top-BDII holds records of all the available resources of the Site-BDIIs that are connected to it. A site does not need to install a Top-BDII, but only needs to define the Top-BDII that should be used. When a Workload Management System (WMS) requests resources for a job, it queries the Top-BDII to see which CE has the required resources for the job to run.
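The aggregation and lookup described above can be pictured with a toy model. This is not how a BDII is implemented (a real BDII serves the records over LDAP); the host names, the record fields and the `aggregate` and `match_ces` helpers are all hypothetical and exist only to show how the information flows.

```python
def aggregate(sources):
    """Merge the records published by lower-level BDIIs into one view."""
    merged = {}
    for records in sources:
        merged.update(records)
    return merged


# Hypothetical records published by the resource-level BDIIs of one site.
ce_bdii = {"ce.site-a.example": {"type": "CE", "free_cpus": 12}}
se_bdii = {"se.site-a.example": {"type": "SE", "free_gb": 500}}

# The Site-BDII collects the records of the site's own resources.
site_a = aggregate([ce_bdii, se_bdii])
site_b = {"ce.site-b.example": {"type": "CE", "free_cpus": 2}}

# The Top-BDII collects the records of every connected Site-BDII.
top_bdii = aggregate([site_a, site_b])


def match_ces(top, cpus_needed):
    """Return the CEs whose published free CPUs satisfy the request."""
    return sorted(
        host
        for host, record in top.items()
        if record["type"] == "CE" and record["free_cpus"] >= cpus_needed
    )


print(match_ces(top_bdii, cpus_needed=8))  # ['ce.site-a.example']
```

The last call plays the role of the WMS: it only ever consults the top-level view, never the individual sites.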

Tip: It is the norm to have the Site-BDII installed on the same host as the CE, but this is not a requirement. The Site-BDII may reside on a different host.

Computing Element

The Computing Element (CE) is part of the central services of a site. Its main functions are:
  • manage the jobs (job submission and job control on the worker nodes)
  • report the status of the jobs to the WMS
  • publish all site information (site location, queues, CPU status) via LDAP (BDII services)

The CE can run with several kinds of batch systems:

  • Torque + MAUI
  • LSF
  • SGE
  • Condor

Main Communication Paths:

  • CE(Job) <--> WMS <--> CE(Job can Run) <--> WN(Job Execution)

Storage Element

The Storage Element (SE) is part of the central services of a site. The SE is mainly used to:
  • store the software needed by the VOs
  • store user input/output data

Three types of SE can be installed:

  • SE Disk only (a Storage Element that only stores files)
  • SE with MySQL support (space reservations and pending file copies are stored in a MySQL database)
  • SE with Oracle support (space reservations and pending file copies are stored in an Oracle database)

Main Communication Paths:

  • UI/WN(Upload File) <--> SE (Store File) <--> LFC (Update File Location) <--> SE/UI (Download File)
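The upload and download path above can be illustrated with a small in-memory model. It is a sketch under stated assumptions, not the gLite data management API: the `StorageElement` and `FileCatalogue` classes, the `upload` and `download` helpers, the host name and the file name are all hypothetical, and the catalogue stands in for the LFC step of the path.

```python
class StorageElement:
    """Toy SE: keeps file contents in memory, keyed by file name."""

    def __init__(self, host):
        self.host = host
        self._files = {}

    def store(self, name, data):
        self._files[name] = data

    def fetch(self, name):
        return self._files[name]


class FileCatalogue:
    """Toy catalogue: records which SEs hold a copy of each file."""

    def __init__(self):
        self._locations = {}

    def register(self, name, se_host):
        self._locations.setdefault(name, []).append(se_host)

    def locate(self, name):
        return list(self._locations.get(name, []))


def upload(se, catalogue, name, data):
    """Store the file on the SE, then record its location in the catalogue."""
    se.store(name, data)
    catalogue.register(name, se.host)


def download(storage_elements, catalogue, name):
    """Ask the catalogue where the file lives, then fetch it from that SE."""
    for host in catalogue.locate(name):
        return storage_elements[host].fetch(name)
    raise FileNotFoundError(name)


se = StorageElement("se.site-a.example")
catalogue = FileCatalogue()
upload(se, catalogue, "results/run1.dat", b"42")
print(download({se.host: se}, catalogue, "results/run1.dat"))  # b'42'
```

The point of the split is that a client never needs to know in advance which SE holds a file; it asks the catalogue first.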

The SE holds its own BDII. This BDII will update the availability of resources to the Site-BDII.

Worker Node

The Worker Node (WN) is the "cluster node" that will physically execute the jobs. The WN is mainly used to:
  • Execute the grid jobs
  • Copy the results to the CE when jobs are done

Different types of installations are available, including:

  • Worker Node Only (neither Torque nor Andrew File System (AFS))
  • Worker Node with Torque (Torque modules are included)
  • Worker Node with AFS support

Main Communication Paths:

  • CE (Submit Job to Node) --> WN (Execute Job) --> CE (Results) --> WMS

User Interface

From the user's perspective, a simple user interface is the first point of access to the grid and all of its services. There are several user interfaces apart from the "real" user interface provided by the gLite repositories.

Main Communication Paths:

  • UI (Submit Job) --> WMS (Available Resources?) --> CE (Resources Satisfy Job requirements) --> WN(Job Executes) --> CE (Results) --> WMS --> UI (Retrieve Results)

The user interface is therefore the entry point into the grid. A user has his or her own user account on the UI to log into. From there, the user authenticates to the grid with a personal certificate.
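To make the round trip in the communication path concrete, the following toy simulation records each hop a job passes through. It does not use any gLite command or API: the `run_job` function, the CE names and the free-CPU figures are hypothetical, and the "job" is just a Python callable.

```python
def run_job(task, cpus_needed, free_cpus_by_ce):
    """Walk a job through the UI -> WMS -> CE -> WN -> CE -> WMS -> UI path."""
    trace = ["UI: submit job", "WMS: look up available resources"]

    # The WMS picks the first CE whose free CPUs satisfy the job.
    ce = next(
        (name for name, free in free_cpus_by_ce.items() if free >= cpus_needed),
        None,
    )
    if ce is None:
        trace.append("WMS: no CE satisfies the job requirements")
        return None, trace

    trace.append(f"CE {ce}: resources satisfy job requirements")
    result = task()  # stands in for execution on a worker node
    trace.append("WN: job executes")
    trace.append(f"CE {ce}: results")
    trace.append("WMS: results")
    trace.append("UI: retrieve results")
    return result, trace


result, trace = run_job(
    lambda: 6 * 7,
    cpus_needed=2,
    free_cpus_by_ce={"ce-small": 1, "ce-large": 8},
)
print(result)  # 42
for hop in trace:
    print(hop)
```

Note that the user only ever talks to the UI; every other hop happens on the user's behalf.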

Installation Instructions

Install and configure the following services (in this order):

For the Berkeley Database Information Index installation instructions please follow the BDII installation instructions.

For the Computing Element installation instructions please follow the CE installation instructions.

For the Storage Element installation instructions please follow the SE installation instructions.

For the Worker Node installation instructions please follow the WN installation instructions.

For the User Interface installation instructions please follow the UI installation instructions.

Topic revision: r21 - 08 Jun 2009 - 14:35:34 - VanEckA
 