SGE Sun Grid Engine

This service has been cancelled.

What is the Sun Grid Engine and who is allowed to use it?

The Sun Grid Engine (SGE) is a so-called batch system. Users submit compute jobs to the SGE, which distributes them to the machine or machines best suited for the task. It does this according to a predefined set of rules.

The SGE is usable by students and employees of the Institut für Informatik and CIS. No additional registration (aside from your CIP-Pool account) is required. Just ensure that your jobs use the project “Stud”.

There are no restrictions on when you may submit jobs or which machines you choose for them. Keep in mind that the SGE will not deploy jobs on workstations that are being used interactively, and it will suspend jobs on a workstation if a user logs in. The machines in the Amalienstr. (Luna) always behave like this. All other machines exhibit this behaviour between 8:00 and 22:00 from Monday to Saturday. This means that these machines are available to the SGE during the night, when the buildings are closed to the public, even if users are still logged in.
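The rule above can be summed up in a few lines of shell. This is only a sketch: the function name and its arguments are made up for illustration, and it assumes that 22:00 itself already lies outside the daytime window. The SGE's real configuration may differ.

```shell
#!/bin/sh
# Hypothetical helper, not part of the SGE.
# Succeeds (exit status 0) if an interactive user keeps SGE jobs off a
# workstation, following the rules described above.
#   $1  pool name, "luna" for the Amalienstr. machines
#   $2  day of the week, 1 (Monday) to 7 (Sunday)
#   $3  hour of the day, 0 to 23
interactive_user_blocks_jobs() {
    # Luna machines always give way to interactive users.
    if [ "$1" = "luna" ]; then
        return 0
    fi
    # All other machines do so Monday to Saturday from 8:00 to 22:00
    # (assumption: 22:00 itself is already outside the window).
    if [ "$2" -ge 1 ] && [ "$2" -le 6 ] && [ "$3" -ge 8 ] && [ "$3" -lt 22 ]; then
        return 0
    fi
    return 1
}
```

With these rules, a job could run on a non-Luna machine on a Sunday at noon or on a Tuesday at 23:00, even if a user is logged in.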

How does the Sun Grid Engine work?

The Sun Grid Engine is divided into five operational sections:

The Execution-Hosts are the machines that actually compute a job. Usually only the powerful machines of a network are used for this.

The Submit-Hosts are machines from which it is possible to submit new jobs to the SGE.

The Admin-Hosts allow admins and sufficiently privileged users to make changes to the SGE's configuration.

The Master-Host (himalia in our case) manages the computational resources of the SGE. The Execution-Hosts report their current load to the Master-Host every 40 seconds. Based on this information, the Master-Host distributes new jobs to free Execution-Hosts.

The Shadow-Master-Host is a backup for the Master-Host. It takes over if the Master-Host is unavailable for more than 10 minutes.
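How the Master-Host might use the load reports of the Execution-Hosts can be pictured with the following sketch. It is a simplification for illustration, not the SGE's actual scheduling algorithm: it simply picks the host with the lowest reported load. The function name, the host names and the input format (one "hostname load" pair per line) are assumptions.

```shell
#!/bin/sh
# Reads "hostname load" pairs from standard input and prints the name of
# the host with the lowest load. Hypothetical helper, not an SGE command.
pick_least_loaded_host() {
    awk '
        # Remember the first host, then any host with a strictly lower load.
        NR == 1 || $2 + 0 < min { min = $2 + 0; host = $1 }
        END { if (NR > 0) print host }
    '
}

# Example load reports from three made-up Execution-Hosts.
printf 'alpha 0.90\nbeta 0.15\ngamma 0.40\n' | pick_least_loaded_host
# prints: beta
```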

Is there a GUI?

Yes. With the command qmon you can start a graphical user interface which gives you access to an overview of configured queues, running jobs, connected computers, the cluster config and all other pertinent information. It also provides a graphical way to start new jobs and to manipulate your already submitted jobs.

Which computers are connected?

All machines in the CIP-Pool rooms with the exception of the Theresienstr. (Deneb) are connected to the SGE.

How do I submit a job?

There are two ways to submit a job to the SGE: with the command line tool qsub or through the graphical interface provided by qmon.

In both cases you have to provide a shell script which sets the required parameters.
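A job script of this kind could look like the following minimal sketch. Lines starting with #$ are read by the SGE as submit options and are ordinary comments to the shell. The job name example_job and the echoed message are placeholders chosen for illustration; -S (execution shell), -N (job name), -P (project) and -cwd (run in the current working directory) are standard qsub options.

```shell
#!/bin/bash
#$ -S /bin/bash
#$ -N example_job
#$ -P Stud
#$ -cwd

# The actual work of the job goes here; this placeholder only reports
# which machine the job was started on.
message="example_job started on $(uname -n)"
echo "$message"
```

Saved under a name such as example_job.sh (a hypothetical file name), the script would be submitted with: qsub example_job.sh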

In qmon, first select the job script you want to submit. Then decide on some parameters for the execution. Most settings are self-explanatory, but note that the checkbox “Start Job Immediately” does not mean that the job should be started now (which the SGE will try anyway) but that the job will be started now (if possible) or never (if not). The Notify job option should only be used if you have written your own signal handler for your job script. This signal handler needs to process the signals SIGUSR1 (denoting an imminent queue suspend) and SIGUSR2 (denoting an imminent queue shutdown). It is important to specify the project the job shall run under. If the RBG has not given you a special project, just use “Stud”.
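A signal handler for the Notify job option could be sketched as follows. The sketch assumes that the option corresponds to qsub's -notify flag. The handler names, the flag variables and the three-step placeholder workload are made up for illustration; a real job would save its state or clean up at the marked places.

```shell
#!/bin/bash
#$ -S /bin/bash
#$ -P Stud
#$ -notify

suspend_pending=0
shutdown_pending=0

# SIGUSR1: the queue is about to be suspended.
on_suspend_warning() {
    suspend_pending=1
    echo "received SIGUSR1, queue suspend is imminent" >&2
    # A real job could write a checkpoint here.
}

# SIGUSR2: the queue is about to be shut down.
on_shutdown_warning() {
    shutdown_pending=1
    echo "received SIGUSR2, queue shutdown is imminent" >&2
    # A real job could remove temporary files here.
}

trap on_suspend_warning USR1
trap on_shutdown_warning USR2

# Placeholder workload: each step stands for one unit of real work.
# The loop stops early once a shutdown warning has been received.
for step in 1 2 3; do
    if [ "$shutdown_pending" -eq 1 ]; then
        echo "stopping before step $step" >&2
        break
    fi
    echo "working on step $step"
done
```

The handlers only set flags and return, so the workload decides at a safe point how to react to the warning.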

The option “Current Working Directory” sets the directory qmon was called from as the CWD for the job. All files not given with a full path name will be looked for or created there. This also applies to the files the SGE uses for the standard output and standard error of the job. If these files are not explicitly named, the SGE defaults to $JOBNAME.o$JOBID for stdout and $JOBNAME.e$JOBID for stderr. If the CWD is not set, these files will be created in the user's home directory. The execution shell can be set Unix-style in the job script, as explained above. In this case a line like “#$ -S /bin/bash” should be included. This tells the SGE, in its internal syntax, to use the bash shell.
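The default naming scheme for the output files can be illustrated with a small helper. The function name and the example job name and ID are made up; on a real system the SGE creates these files itself.

```shell
#!/bin/sh
# Prints the default stdout and stderr file names for a job.
# Hypothetical helper, not an SGE command.
#   $1  job name
#   $2  job ID
default_output_files() {
    echo "$1.o$2"
    echo "$1.e$2"
}

default_output_files render_scene 4711
# prints:
# render_scene.o4711
# render_scene.e4711
```

To choose other file names, the standard qsub options -o (stdout) and -e (stderr) can be given in the job script in the same #$ syntax.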

The most important options on the Advanced Tab are the queue selection and the mail-to settings. You can configure which status changes you want to be notified of by email, and at which address. If you need to ensure comparable runtimes for your jobs, you can limit the queues your jobs may run on to those whose participating machines have identical hardware. To get the hardware specs of the machines, please contact the RBG.
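In a job script, the same settings could be expressed with the standard qsub options -m, -M and -q, as in the sketch below. The mail address and the queue name are placeholders, since the actual queue names are not listed here. In the -m option, b stands for the beginning of the job, e for its end and a for an abort.

```shell
#!/bin/bash
#$ -S /bin/bash
#$ -P Stud
#$ -m bea
#$ -M user@example.org
#$ -q example.q

# Placeholder workload; a real job would run its benchmark here.
result="benchmark run finished"
echo "$result"
```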

Additionally, it is possible to use DRMAA commands in several programming languages to submit jobs directly from a program. More information can be found here

Attention: if you have not received your own project from the RBG, you need to set the project to “Stud” before submitting a job!

How can I modify a submitted job?

What do I do with errors?

What kind of tasks is the SGE suited for?

How do I submit Blender-Jobs?

Blender jobs can be submitted through a special script which is intended to ease the use of the SGE and to demonstrate the use of DRMAA with Python.

To render an animation with the SGE using this script, run the following command in the directory containing the .blend file. Please make sure to set the output path in Blender and save the file before executing the command.

blender -b -P /home/proj/sun_grid_engine/blend_with_sungrid.py

Command Line Overview

On the command line you can perform these tasks:

Command line program overview: