Jobmanagement (`jobManagement`)¶

JobManagement provides a set of classes to execute different types of jobs on a workstation or a cluster. It needs a bash environment on the workstation and PBS (Portable Batch System) on the cluster. It consists of a central job list class which serves as the controller for the job management. A job list instance owns a graph, which represents the dependencies between the jobs in the list, and resources instances, which manage the number of processors and nodes to use. The job class is built as a factory which creates an instance of the job type that matches the parameters you pass. You can put a job list in a job list if you want to, but for correct handling of the resources you need to pass the resources instances of the root job list. For every job an output file will be created automatically if no path is given.

To work on a cluster you need an ASCII file /etc/CLUSTERNAME containing the name of the cluster or an environment variable CLUSTERNAME with the name of the cluster.
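The lookup described above could be sketched as follows. This is only an illustration, not the package's implementation: the function name read_cluster_name is hypothetical, and the order in which the two sources are consulted (environment variable first, then the file) is an assumption.

```python
import os
from pathlib import Path
from typing import Optional


def read_cluster_name(
    env_var: str = "CLUSTERNAME",
    file_path: str = "/etc/CLUSTERNAME",
) -> Optional[str]:
    """Return the cluster name, or None if neither source provides one."""
    # Assumption: the environment variable wins over the file.
    name = os.environ.get(env_var, "").strip()
    if name:
        return name
    path = Path(file_path)
    if path.is_file():
        # The file is expected to hold nothing but the cluster name.
        content = path.read_text(encoding="ascii", errors="replace").strip()
        return content or None
    return None
```

A return value of None would then mean that the code is running on a workstation rather than on a known cluster.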

Usage¶

import:

from mojo.jobManagement.jobs.jobList import JobList
from mojo.jobManagement.jobs.job import Job

Simple example with four jobs with easy dependencies:

myJobList = JobList(nNodesAvailable = 0, nProcsAvailable = 6)  # We want to compute locally and use six processors
job1 = Job("<path to a cleanup script>")
job2 = Job("<path to a TRACE executable>", nProcs = 4, args = ["<an argument>", "<another argument>"])
job3 = Job("<path to a POST executable>", args = ["<an argument>", "<another argument>"])
job4 = Job("<path to an analysis script>")
myJobList.addJob(job1)
myJobList.addJob(job2, [job1.id])   # pass a list with the parents' ids to get the dependencies right
myJobList.addJob(job3, [job2.id])
myJobList.addJob(job4, [job3.id])
myJobList.viewJobs()  # if you want to
myJobList.runJoblist()
myJobList.cleanup()  # if you want to; removes the output files of all successful jobs

If we were on a cluster we could do the same, but for job1 and job4 we would pass executeOnMaster = True so that the small scripts do not allocate a node of their own. We would also pass the number of nodes we want to use at the same time.

Another example with a job list in a job list:

myJobList = JobList(nNodesAvailable = 20, nProcsAvailable = 2, name = "myJobList")  # a name for the joblist
job1 = Job("<path to a cleanup script>")
myJobList.addJob(job1)
# passing the management
ajoblist = JobList(nodeManagement = myJobList.nodeManagement, procManagement = myJobList.procManagement)
job2 = Job(..., executeOnMaster = True)  # if you are on a cluster the job will be executed on the master
job3 = Job(print, args = ["Hello World!"])  # a MethodJob
ajoblist.addJob(job2)
ajoblist.addJob(job3, [job2.id])
myJobList.addJob(ajoblist, [job1.id])

An example executing a single job:

myJob = Job("<path to executable>", runInBackground = False)
myJob.runJob()
myJob.cleanup(False)  # removes the output file even if the job failed; the optional False is available for
                      # the joblist too

Running a MethodJob and receiving the result via an instance variable:

def myMethod(a, b):
    return a + b

myJob = Job(myMethod, args = [42, 3141], runInBackground = False)
myJob.runJob()

print(myJob.returnValue)

Attention

You will only get picklable return values!
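If you are unsure whether a return value qualifies, you can test it with the standard pickle module before handing the method to a job. The helper name is_picklable is hypothetical and not part of the job management; the restriction itself presumably exists because the result has to cross a process boundary.

```python
import pickle


def is_picklable(value) -> bool:
    """Return True if pickle can serialize the given value."""
    try:
        pickle.dumps(value)
    except (pickle.PicklingError, AttributeError, TypeError):
        # Lambdas, generators, open file handles and similar objects end up here.
        return False
    return True
```

Numbers, strings and containers built from them pass this check, while lambdas and generators do not.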

For all available job list and job options, see the module documentation below.

Attention

Set runInBackground to False only if you want to execute the job without a job list.

Modules¶

JobList¶

Class to set up a job management system for interdependent jobs. In addition, you can put a job list itself as a job into another job list. After a successful execution of the job list you can get the jobs which were successful, which failed, and which were not started because of failed parent jobs by accessing joblist.successfulJobs, joblist.failedJobs and joblist.jobs.values().

Attention

Declaring dependencies between jobs in different job lists will generate an error!
class JobList(nNodesAvailable=None, nProcsAvailable=None, nTecplotLicenses=None, abortByError=False, name='anonymousJoblist', verbosity=<Verbosity.DEFAULT: 2>, resourcesDict=None, usedResources=None, additionalProcessStatus=False, retries=0, deactivateJobGrouping=True)[source]¶
Class to set up a system of interdependent jobs. JobList uses an instance of Graph to control the dependencies between the jobs and an instance of Resources to manage the nodes available.

Parameters

nNodesAvailable (integer) – number of nodes available for the execution of the job list

nProcsAvailable (integer) – number of processors to use on the master

nTecplotLicenses (integer) – number of tecplot licenses needed by the job

abortByError (boolean) – flag if the whole list should be stopped when one job fails

name (string) – a name for the job list

resourcesDict (dictionary) – a dictionary containing resources available in the job list (key: resource, value: quantity of the resource)

verbosity (integer) – verbosity level of the job list

usedResources (dictionary) – resources the job list consumes itself when processing

retries (integer) – number of times the job is retried when failing

deactivateJobGrouping (boolean) – disable running several small jobs simultaneously on cluster nodes

Inherited from AbstractJob:
start(self)
_writeStatus(self, status, message="")
addJob(job, parents=None, weight=None)[source]¶

Adds a new job to the job list and builds the dependencies

Parameters

job (instance of a class derived from AbstractJob) – the job to add to the job list

parents (list of int) – list of the parentsIds of the job

weight (int) – weight of the job in processes * seconds

property allJobs¶

Returns all jobs in this JobList

Warning: Some jobs can be included multiple times in the returned iterator

calculatePriorities()[source]¶

Calculate the weights of the job list's dependency graph.

cleanup(force=False, **kwargs)[source]¶

Remove created scripts, output and error files

Parameters

force (bool) – flag whether cleanup is enforced

removeScript (bool) – flag whether files should be removed

nJobsOfStatus(statusList=None)[source]¶

Returns the number of jobs whose status is in the given list of statuses.

Overwrites method in AbstractJob.

Parameters

statusList (Optional[iterable]) – list of valid status or None

Returns

number of jobs of this status

Return type

int

runJob()[source]¶

Method to fulfill the interface set by AbstractJob.

runJoblist(printOverview=False)[source]¶

Executes a job list.

While the list contains jobs that have not terminated and whose parents have not failed:

when jobs are ready to start and enough nodes are available, start new jobs

print the status of all running jobs to the log file and to sys.stdout

if a job has the status done or error, free its nodes and resolve the dependencies
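The loop above can be illustrated with a small, self-contained simulation. It is a sketch under several assumptions and not the code of runJoblist: the function name run_job_list, the status values, and the synchronous execution of the jobs are all invented for the example, and only one resource (processors) is tracked.

```python
WAITING, RUNNING, DONE, ERROR, DEAD_PARENT = range(5)  # assumed values


def run_job_list(jobs, parents, n_procs):
    """jobs: id -> (procs needed, callable returning True on success);
    parents: id -> list of parent ids. Returns id -> final status."""
    status = {job_id: WAITING for job_id in jobs}
    free = n_procs
    while any(s in (WAITING, RUNNING) for s in status.values()):
        # Start every job whose parents are done and which fits into the free processors.
        running = []
        for job_id, (procs, _) in jobs.items():
            ready = status[job_id] == WAITING and all(
                status[p] == DONE for p in parents.get(job_id, [])
            )
            if ready and procs <= free:
                free -= procs
                status[job_id] = RUNNING
                running.append(job_id)
        if not running:
            break  # nothing can start any more (e.g. a job needs more than n_procs)
        # "Poll" the running jobs; here they simply run to completion.
        for job_id in running:
            procs, action = jobs[job_id]
            status[job_id] = DONE if action() else ERROR
            free += procs  # free the resources again
        # Resolve the dependencies: descendants of failed jobs will never start.
        changed = True
        while changed:
            changed = False
            for job_id in jobs:
                failed_parent = any(
                    status[p] in (ERROR, DEAD_PARENT) for p in parents.get(job_id, [])
                )
                if status[job_id] == WAITING and failed_parent:
                    status[job_id] = DEAD_PARENT
                    changed = True
    return status
```

The real job list additionally writes the statuses to the log file and polls jobs that run in their own processes instead of executing them inline.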

static sigterm_handler(_signo, _stack_frame)[source]¶

Handle a TERM signal as SystemExit exception.

stop()[source]¶

Stops all currently running jobs. Called by a parent job list or through a keyboard interrupt or system exit.

update(subject, event)[source]¶

Method that gets called on events

updateStatus()[source]¶

Iterates over all jobs, executes new jobs and reads their statuses.

static viewDependencies(currentJobList)[source]¶

Prints all dependencies of the current JobList by recursively walking through the job list.

Parameters

currentJobList (JobList) – main job list

viewJobs()[source]¶

Prints a list of the job list and the associated dependency graph.
Job¶

Job supplies a class to compute a TRACE, PREP or POST job on a workstation or a cluster. When an instance is created, a suitable class is chosen and instantiated depending on the parameters passed.

Assumptions:
Job scheduling on cluster is managed by PBS (Portable Batch System) or SLURM.

Requires:¶

for use on a cluster:
Information about the cluster has to be added in ‘mojo.jobManagement.management.clusterInfo._supportedClusters’

class Job(executable, nProcs=1, args=None, kwargs=None, nTecplotLicenses=0, queue=None, workingDirectory=None, outputFile=None, jobName=None, procsPerNode=None, threadsPerProc=None, envVars=None, runInBackground=False, nodeType='ib', executeOnMaster=False, jobType=None, outputDir=None, successStatements=None, failStatements=None, ignoreExitCode=False, resourcesDict=None, useMpirun=True, wallTime=None, weight=0, dependJobID=None, account=None, retries=0, group=None, mailAddress=None, slurmSwitches=None)[source]¶

A class to create the job you need by analyzing the given parameters

Parameters

executable (str or method) – path to the executable or a method

nProcs (int) – number of processors to use

args (list) – arguments which will be appended when executing

kwargs (dict) – a variable to pass keyword arguments to a methodJob

nTecplotLicenses (int) – number of tecplot licenses needed by the job

queue (str) – queue to put the job in

workingDirectory (str) – directory to change to before executing

outputFile (str) – path to a file for the output

jobName (str) – name for the job

procsPerNode (int) – processes per node

threadsPerProc (int) – threads to start per processor

envVars (dict) – environment variables to set

lock (Lock instance) – DEPRECATED; exists for compatibility reasons

runInBackground (bool) – if False the job, when started as single job, will stay in the foreground

nodeType (str) – type of the node to calculate on

executeOnMaster (bool) – if True the job will be executed on the master if on a cluster

jobType (str) – possibility to declare the jobType to control the result; available: TRACE, PREP, POST, gmcPlay

outputDir (str) – directory the output file will be saved in

successStatements (list) – list of regular expressions the output file will be searched for (line by line) to check for success

failStatements (list) – list of regular expressions the output file will be searched for (line by line) to check for failure

ignoreExitCode (bool) – if True the job will ignore the exit code of the executable

resourcesDict (dict) – a dictionary containing resources needed by the job (key: resource, value: quantity of the resource)

useMpirun (bool) – if False the job will not be executed with MPI but the number of processes is used

wallTime (str) – maximum wall time for a job on the cluster

account (str) – account on which computation costs are charged

retries (int) – number of times the job is retried when failing

mailAddress (str) – mail address for status information (only used for SLURM jobs!)

slurmSwitches (str) – slurm option requesting minimal number of network switches

Returns

A job; a MethodJob, ShellJob, QSubJob or SlurmJob instance, depending on the given parameters

Return type

MethodJob/ShellJob/QsubJob/SlurmJob instance
AbstractJob¶

AbstractJob offers a class to inherit from for every type of job the job management is intended to work with. It declares the abstract methods runJob() and stop(), which have to be overwritten in every derived class. AbstractJob inherits from Process and can therefore be started in its own process. Furthermore, five int constants for the statuses and a dictionary to print a string instead of the value of a constant are offered.

Required classes and methods:
from abc import ABCMeta, abstractmethod
import collections
import glob
import os
import re
import sys
import time

from jobManagement.management.idGenerator import IdGenerator
class AbstractJob(name=None, nTecplotLicenses=0, outFile=None, outputDir=None, successStatements=None, failStatements=None, workingDir=None, runInBackground=None, ignoreExitCode=None, resourcesDict=None, weight=0, retries=0, group=None)[source]¶

Class every job should inherit from. Offers several methods and properties to communicate between the processes. It gets an id from IdGenerator, a queue to send the statuses to the main thread, a name and a path to a log file (a default log file if none is given).

Parameters

name (string) – name for the new job

nTecplotLicenses – number of tecplot licenses needed by the job

outFile (string) – path to the file the output will be written in

outputDir (string) – directory the output will be written in

successStatements (list) – list of regular expressions the output file will be searched for (line by line) to check for success

failStatements (list) – list of regular expressions the output file will be searched for (line by line) to check for failure

workingDir (string) – directory the job will be processed in

runInBackground (boolean) – If true the job will run in the background, when executed as single job

ignoreExitCode (boolean) – if True the job will ignore the exit code of the executable

resourcesDict (dictionary) – a dictionary containing resources needed by the job (key: resource(valid dictionary key), value: quantity of the resource)

weight (int) – the weight (seconds to run times number of processes) of the job

Type nTecplotLicenses

integer

allocateResources()[source]¶

Allocates the resources needed for the job.

cleanup(force=False, **kwargs)[source]¶

Removes the output file of the job.

If the output file is not specified explicitly nothing is written to the file with exception of the qsub job. Therefore empty output files will be deleted.

Parameters

force (boolean) – forces the removal of the output file

freeResources()[source]¶

Frees the resources used by this job.

getJobDictForJSON(jobsListName)[source]¶

Generates a dictionary with basic parameters to launch the job.

Parameters

jobsListName (str) – name of job list

Returns

launch parameters

Return type

dict

lookForSuccess(returnCode=0)[source]¶

Controls the success of the currently finished job. If a success statement or a fail statement was defined the output file is searched for them. Sets the status of the job.

Parameters

returnCode (integer) – return code of the program called

Returns

True if the job was successful

Return type

boolean
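How such a check might combine the return code with successStatements and failStatements is sketched below. The function look_for_success is hypothetical, and the precedence rules (a matching fail statement always wins, a defined success statement must be found at least once) are assumptions rather than documented behavior.

```python
import re
from typing import Iterable, Optional, Sequence


def look_for_success(
    output_lines: Iterable[str],
    return_code: int = 0,
    success_statements: Optional[Sequence[str]] = None,
    fail_statements: Optional[Sequence[str]] = None,
    ignore_exit_code: bool = False,
) -> bool:
    """Decide whether a finished job was successful."""
    success_patterns = [re.compile(p) for p in success_statements or []]
    fail_patterns = [re.compile(p) for p in fail_statements or []]
    found_success = False
    for line in output_lines:  # the output is searched line by line
        if any(p.search(line) for p in fail_patterns):
            return False  # assumption: a fail statement always wins
        if any(p.search(line) for p in success_patterns):
            found_success = True
    if success_patterns and not found_success:
        return False  # assumption: a defined success statement must appear
    return ignore_exit_code or return_code == 0
```

Because the statements are regular expressions, literal text containing characters such as "." or "(" has to be escaped, for example with re.escape.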

nJobsOfStatus(statusList=None)[source]¶

Returns whether the job's status is in the given list of statuses, expressed as a count.

Parameters

statusList (list or None) – list of valid status or None

Returns

number of jobs of this status (0 or 1)

Return type

int

prepareOutFile()[source]¶

Generates the path to the output file and the output file itself.

resourcesAvailable()[source]¶

Tests if the resources for the job are available.

Returns

True if the resources for the job are available

Return type

boolean

abstract runJob()[source]¶

Abstract method runJob, to be overwritten in subclasses. Should initialize and start the current job.

startJob()[source]¶

Sets its own status to RUNNING and calls runJob.

abstract stop()[source]¶

Abstract method stop, to be overwritten in subclasses. Should stop the current job.

abstract updateStatus()[source]¶

Abstract method updateStatus, to be overwritten in subclasses. Updates the job status and returns true if the status has changed.

Returns

returns true if the status has changed

Return type

boolean

waitUntilFinished(pollingInterval=1, maxPollingInterval=5, scalingPollingFactor=2)[source]¶

Waits until the job is finished.

The polling interval to use in the beginning and the maximal polling interval are set in jobManagement/jobManagementData.py
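The three parameters suggest a polling loop whose interval grows by a constant factor up to a maximum. The following sketch shows that idea; the function wait_until_finished, the is_finished callback and the injectable sleep function are assumptions made for the example, not the signature of the real method.

```python
import time
from typing import Callable


def wait_until_finished(
    is_finished: Callable[[], bool],
    polling_interval: float = 1.0,
    max_polling_interval: float = 5.0,
    scaling_polling_factor: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Block until is_finished() returns True; return the number of waits."""
    waits = 0
    interval = polling_interval
    while not is_finished():
        sleep(interval)
        waits += 1
        # Grow the interval, but never beyond the configured maximum.
        interval = min(interval * scaling_polling_factor, max_polling_interval)
    return waits
```

With the defaults from the signature above, the waits would be 1, 2, 4, 5, 5, ... seconds, which keeps short jobs responsive without polling long jobs too often.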
shellJob¶

ShellJob supplies a class to execute a bash script in the current shell. You have to pass an executable and can pass arguments for the executable, a name, a file the output will be saved in, a working directory and environment variables to be set before the execution.
class ShellJob(executable, args=None, nTecplotLicenses=0, nProcs=None, name=None, outFile=None, workingDir=None, envVars=None, successStatements=None, failStatements=None, outputDir=None, runInBackground=None, ignoreExitCode=None, resourcesDict=None, useMpirun=False, weight=129600, retries=0, threadsPerProc=None)[source]¶
Class which executes a shell script in a new process. Name and output path will be generated when not given.

Parameters

executable (string) – path to the executable

args (list of string) – arguments which will be appended when executing

nTecplotLicenses – number of tecplot licenses needed by the job

name (string) – name for the job

outFile (string) – path to a file for the output

workingDir (string) – directory to change to before executing

envVars (dictionary) – environment variables to set

successStatements (list) – list of regular expressions the output file will be searched for (line by line) to check for success

failStatements (list) – list of regular expressions the output file will be searched for (line by line) to check for failure

outputDir (string) – directory the output file will be written to

runInBackground (boolean) – If true the job will run in the background, when executed as single job

ignoreExitCode – if True the job will ignore the exit code of the executable

resourcesDict (dictionary) – a dictionary containing resources needed by the job (key: resource, value: quantity of the resource)

useMpirun (boolean) – True if mpirun should be used to start the executable, False otherwise

weight (int) – the weight (seconds to run times number of processes) of the job

retries (int) – number of times the job is retried when failing

Type nTecplotLicenses

integer

Methods in use inherited from AbstractJob:
_writeStatus(status, message)
runJob()[source]¶

Executes the given executable with the given parameters in the given working directory.

stop()[source]¶

Sets a flag to abort the job in run

updateStatus()[source]¶

Updates the status of the job and returns True if the status has changed, False otherwise.

Returns

Returns True if the status has changed

Return type

boolean
MethodJob¶

MethodJob supplies a class to create a job which executes a given method in the currently running Python environment. To get the results of the method you have to pass a queue or something similar to send data between the processes.

class MethodJob(method, args=None, kwargs=None, nTecplotLicenses=0, name=None, workingDir=None, outFile=None, envVars=None, successStatements=None, outputDir=None, failStatements=None, runInBackground=None, ignoreExitCode=None, resourcesDict=None, weight=60, retries=0)[source]¶

Class to execute a given method with the given arguments.

Parameters

method (method) – method to execute

args (list) – arguments which will be appended when executing

kwargs (dictionary) – a variable to pass keyword arguments to a methodJob

nTecplotLicenses – number of tecplot licenses needed by the job

workingDirectory (string) – directory to change to before executing

outFile (string) – path to a file for the output

jobName (string) – name for the job

envVars (dictionary) – environment variables to set

nodeType (string) – type of the node to calculate on

successStatements (list) – list of regular expressions the output file will be searched for (line by line) to check for success

failStatements (list) – list of regular expressions the output file will be searched for (line by line) to check for failure

outputDir (string) – directory the output file will be written to

runInBackground (boolean) – If true the job will run in the background, when executed as single job

ignoreExitCode (boolean) – if True the job will ignore the exit code of the executable

resourcesDict (dictionary) – a dictionary containing resources needed by the job (key: resource, value: quantity of the resource)

weight (int) – the weight (seconds to run times number of processes) of the job

retries (int) – number of times the job is retried when failing

Type nTecplotLicenses

integer

runJob()[source]¶

Executes the given method with the given parameters.

stop()[source]¶

Terminates its own process and writes a comment to the output file.

updateStatus()[source]¶

Updates the status of the job and returns True if the status has changed, False otherwise.

Returns

Returns True if the status has changed

Return type

boolean

QSubJob¶

QSubJob supplies a class which inherits from ShellJob and ClusterInfo. The functionality is nearly the same as in ShellJob, but the job will be executed on cluster nodes and not on the workstation or the master.

class QSubJob(executable, nProcs=1, args=None, nTecplotLicenses=0, name=None, outFile=None, workingDir=None, envVars=None, threadsPerProc=None, procsPerNode=None, queue=None, nodeType='w', successStatements=None, outputDir=None, failStatements=None, runInBackground=None, ignoreExitCode=None, resourcesDict=None, useMpirun=False, wallTime=None, weight=129600, dependJobID=None, retries=0)[source]¶

Class to start a job with pThreads and mpi on nodes on the cluster.

Parameters

executable (str) – path to the executable

nProcs (int) – number of processors to use

args (list) – arguments which will be appended when executing

nTecplotLicenses – number of tecplot licenses needed by the job

name (str) – name for the job

outFile (str) – path to a file for the output

workingDir (str) – directory to change to before executing

envVars (dict) – environment variables to set

threadsPerProc (int) – threads to start per processor

procsPerNode (int) – processes per node

queue (str) – queue to put the job in

nodeType (str) – type of the node to calculate on

successStatements (list) – list of regular expressions the output file will be searched for (line by line) to check for success

failStatements (list) – list of regular expressions the output file will be searched for (line by line) to check for failure

outputDir (str) – directory the output file will be written to

runInBackground (bool) – If true the job will run in the background, when executed as single job

ignoreExitCode (bool) – if True the job will ignore the exit code of the executable

resourcesDict (dict) – a dictionary containing resources needed by the job (key: resource, value: quantity of the resource)

wallTime (str) – maximum wall time for a job on the cluster

weight (int) – the weight (seconds to run times number of processes) of the job

retries (int) – number of times the job is retried when failing

Type nTecplotLicenses

int

cleanup(force=False, **kwargs)[source]¶

Remove created scripts, output and error files

Parameters

force (bool) – flag whether cleanup is enforced

removeScript (bool) – flag whether files should be removed

runJob()[source]¶

Starts the job using qsub and pThreads when needed.

stop()[source]¶

Kills the current job via process qdel.

updateStatus()[source]¶
Graph¶

The class Graph serves as root for a graph which shows the dependencies between several jobs. In addition to the methods which are offered by node.py the user gains the ability to add jobs to the graph, mark jobs as done or failed and get the jobs with no more dependencies sorted by priority. The jobs are saved as id only. In addition you can view the graph as simple console-output.
Usage¶

create a graph: myGraph = Graph()

add jobs: myGraph.addJob(1, [])
myGraph.addJob(2, [1], 2) …

get jobs without dependencies: myGraph.getReadyJobs()

mark a job as done or failed: myGraph.jobDone(1, DONE)
myGraph.jobDone(2, ERROR)

print the graph: myGraph.printGraph()

Required classes and methods:
from job_management.graph.node import Node, DONE, ERROR, WAITING, reprStatus
from copy import deepcopy
class Graph(nodeId=-1)[source]¶
Serves as root for a graph describing the dependencies in a job list. Holds every node in the graph as a child to allow easy access to every node.

Inherited from Node:
addParent(self, parent)
markParentProcessed(self, parent)
addChild(self, newChild)
calculatePriority(self)
addGraph(graph, parentIds)[source]¶

Adds a new graph, and therefore a new node to the graph.

Parameters

graph (Graph) – the graph to add

parentIds (list of ints) – a list of the parents' ids

addJob(jobId, parentsIds, weight=1)[source]¶

Adds a new job, and therefore a new node to the graph.

Parameters

jobId (int) – the id of the job to add

parentsIds (list of ints) – a list of the parents' ids

weight (int) – the weight of the job

buildGraph()[source]¶

Adds the associated children to every node.

Returns

the current Graph-instance

Return type

Graph()

getReadyJobs()[source]¶

Returns a list of the ids of the jobs with no further dependencies, ready to start. They are sorted by weight from big to small.

Returns

list of the ids of the jobs ready to start

Return type

list of int

jobDone(jobId, status)[source]¶

Marks a job in the graph as DONE or ERROR. If the status is ERROR all children and grandchildren will get the status DEAD_PARENT. This happens in the setStatus method of the nodes.

Parameters

jobId (int) – id of the job with the status to change

status (int-constant) – new status
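The interplay of addJob, getReadyJobs and jobDone can be pictured with a stripped-down stand-in. The class TinyGraph below is hypothetical and much simpler than Graph and Node: the status values are plain strings, ready jobs are sorted by their own weight rather than by a calculated priority, and no sanity checks are made.

```python
WAITING, DONE, ERROR, DEAD_PARENT = "WAITING", "DONE", "ERROR", "DEAD_PARENT"


class TinyGraph:
    """Hypothetical miniature of a dependency graph; jobs are stored by id only."""

    def __init__(self):
        self.parents = {}   # job id -> set of parent ids that are not done yet
        self.children = {}  # job id -> list of child ids
        self.weight = {}
        self.status = {}

    def add_job(self, job_id, parent_ids, weight=1):
        self.parents[job_id] = set(parent_ids)
        self.children.setdefault(job_id, [])
        for parent in parent_ids:
            self.children.setdefault(parent, []).append(job_id)
        self.weight[job_id] = weight
        self.status[job_id] = WAITING

    def get_ready_jobs(self):
        """Ids of waiting jobs without open dependencies, heaviest first."""
        ready = [j for j, s in self.status.items()
                 if s == WAITING and not self.parents[j]]
        return sorted(ready, key=lambda j: self.weight[j], reverse=True)

    def job_done(self, job_id, status):
        self.status[job_id] = status
        for child in self.children[job_id]:
            if status == DONE:
                self.parents[child].discard(job_id)  # dependency resolved
            else:
                self.job_done(child, DEAD_PARENT)    # failure propagates downwards
```

Starting the heaviest ready job first is a common heuristic to shorten the overall run time, because long chains of work get onto the processors as early as possible.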

printGraph()[source]¶

Prints a list with every node, its weight and its parents and children to the console

reset()[source]¶

Allows this graph to be processed again.

sanityTest()[source]¶

Tests if all jobs in the graph can be reached by performing a test run on the graph.
Node¶

The class Node serves as a node in the dependency graph. A node owns an id, a list of its parents, a list of its children, its weight, its status and a flag indicating whether the weight was already calculated. It offers a method to add a child to the list of children and a method to delete a parent from the list of parents. calculatePriority recursively calculates the weight of the node. Furthermore, five int constants for the statuses and a dictionary to print a string instead of the value of a constant are offered.

Attention

Node is not intended to be used without graph!

class Node(nodeId, parents, priority=1)[source]¶

Node is a class which serves as a node in a graph of dependencies between jobs. As instance variables it stores its own id, its parents which are not done, its children, its weight and calculated weight, and its status.

Parameters

id (int) – id of the node

parents (list of int) – list of the parentsIds

weight (int) – weight of the node

addChild(newChild)[source]¶

Adds a child to the list of children

Parameters

newChild (Node) – child to add

addParent(newParent)[source]¶

Appends a new parent to the list of parents

Parameters

newParent (Node) – parent to append

calculatePriority()[source]¶

Calculates the weight of the node and all of its children

Returns

the own calculated weight

Return type

int

property calculatedPriority¶

Weight of critical path
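One common reading of "weight of critical path" is the node's own weight plus the largest calculated weight among its children, evaluated recursively and cached. The sketch below follows that reading; it is an assumption about the calculation, and the function critical_path_weights with its plain dictionaries is invented for the example.

```python
from typing import Dict, List


def critical_path_weights(
    weights: Dict[int, int], children: Dict[int, List[int]]
) -> Dict[int, int]:
    """Map every node id to the weight of the heaviest path starting at it."""
    calculated: Dict[int, int] = {}

    def visit(node_id: int) -> int:
        # The cache plays the role of the "weight already calculated" flag.
        if node_id not in calculated:
            below = [visit(child) for child in children.get(node_id, [])]
            calculated[node_id] = weights[node_id] + max(below, default=0)
        return calculated[node_id]

    for node_id in weights:
        visit(node_id)
    return calculated
```

The recursion assumes an acyclic graph, which holds for a dependency graph in which every job can eventually be reached.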

markParentProcessed(parent)[source]¶

Marks the parent as processed

Parameters

parent (Node) – parent to mark

setStatus(newStatus)[source]¶

Sets a new status. The children will get the status DEAD_PARENT if the status is ERROR or DEAD_PARENT. If the status is DONE the node will delete itself from the children list of parents.

Parameters

newStatus (int-const) – new status to set

IdGenerator¶

IdGenerator is a class built using the Borg pattern: every instance shares the same state. Therefore getId generates globally unique ids in every thread. The first id is one. After calling reset the ids start at one again.

class IdGenerator[source]¶

Class to create unique ids. Can be reset.

getId()[source]¶

Increases the current id and returns it.

Returns

a new unique id

Return type

int

reset()[source]¶

Set the current id to zero again.
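For readers unfamiliar with the Borg pattern, a minimal sketch of an id generator built that way follows. The class name BorgIdGenerator, the snake_case method names and the lock are assumptions for the example; the real IdGenerator may differ in all of them.

```python
import threading


class BorgIdGenerator:
    """Every instance shares one state dictionary (Borg pattern)."""

    _shared_state = {"current_id": 0}
    _lock = threading.Lock()  # assumption: protects the counter across threads

    def __init__(self):
        # All instances use the very same attribute dictionary.
        self.__dict__ = self._shared_state

    def get_id(self) -> int:
        """Increase the current id and return it."""
        with self._lock:
            self.current_id += 1
            return self.current_id

    def reset(self) -> None:
        """Set the current id to zero, so the next id is one again."""
        with self._lock:
            self.current_id = 0
```

Unlike a singleton, the pattern allows any number of instances; they merely behave identically because they share their state.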

Resources¶

Resources manages free nodes/processors on the cluster or a workstation. When you initialize an instance of Resources you have to pass the number of nodes you want to use. You can ask for the number of free nodes, allocate nodes, or free nodes when your calculation is done.

The __call__, __enter__ and __exit__ methods allow an instance of Resources to be used with the with statement.
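How __call__, __enter__ and __exit__ can work together is sketched below with a hypothetical TinyResources class. It is not the implementation of Resources; in particular, passing the requested resources through the call and raising RuntimeError when they are not available are assumptions.

```python
from typing import Dict, List, Optional


class TinyResources:
    """Hypothetical stand-in that tracks free quantities per resource name."""

    def __init__(self, resources_dict: Optional[Dict[str, int]] = None):
        self._free = dict(resources_dict or {})
        self._request: Dict[str, int] = {}
        self._active: List[Dict[str, int]] = []

    def resources_available(self, request: Dict[str, int]) -> bool:
        return all(self._free.get(name, 0) >= amount
                   for name, amount in request.items())

    def allocate_resources(self, request: Dict[str, int]) -> None:
        if not self.resources_available(request):
            raise RuntimeError(f"not enough resources for {request}")
        for name, amount in request.items():
            self._free[name] = self._free.get(name, 0) - amount

    def release_resources(self, request: Dict[str, int]) -> None:
        for name, amount in request.items():
            self._free[name] = self._free.get(name, 0) + amount

    def __call__(self, request: Dict[str, int]) -> "TinyResources":
        self._request = dict(request)  # remember what the with block asks for
        return self

    def __enter__(self) -> "TinyResources":
        self.allocate_resources(self._request)
        self._active.append(self._request)
        return self

    def __exit__(self, exc_type, exc_value, traceback) -> bool:
        self.release_resources(self._active.pop())  # always give the resources back
        return False  # do not swallow exceptions
```

Used as "with resources({"nodes": 3}): ...", the resources are returned even if the body of the with block raises an exception.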

class Resources(resourcesDict=None)[source]¶

Class to manage the resources available for e.g. a joblist

Parameters

nRes (int) – number of available nodes

allocateResources(resourcesDict)[source]¶

Allocates the resources asked for.

Parameters

resourcesDict (dictionary) – dictionary with the resources to allocate

classmethod fromParameters(nProcs=None, nNodes=None, nTecplotLicenses=None)[source]¶

Generates a Resource instance from parameters.

Parameters

cls (class) – class

nProcs (Optional[int]) – number of processors available

nNodes (Optional[int]) – number of nodes available

nTecplotLicenses (Optional[int]) – number of tecplot licenses available

Returns

a new instance of Resources

Return type

Resources instance

releaseResources(resourcesDict)[source]¶

Frees resources that were allocated earlier

Parameters

resourcesDict (dictionary) – dictionary containing the resources to free

resourcesAvailable(resourcesDict)[source]¶

Returns True if the resource management has the resources in resourcesDict available.

Parameters

resourcesDict (dictionary) – a dictionary mapping resources to the needed number

Returns

True if the resources are available in this environment

Return type

boolean

property resourcesDict¶

Returns a copy of the resources dictionary held by this manager.

Returns

copy of the resources dictionary held by this manager

Return type

dictionary

update(resourcesDict)[source]¶

Fills the resource dictionary using the resource fabric class. Replaces -1 with sys.maxint for easier handling in the other methods of this class.

Parameters

resourcesDict (dictionary) – a dictionary containing names and number of resources
CpuInfo¶

CpuInfo supplies a class to hold the information about the CPU(s) of a workstation. It owns methods to read the information from /proc/cpuinfo and to calculate further information from it.

Required classes and methods:
import re
class CPUInfo(numLogicalCPUs: int, numCoresPerSocket: int, numSiblingsPerSocket: int)[source]¶

Wraps properties of /proc/cpuinfo, such as the number of physical/logical CPU cores and the hyper-threading status. After initialization those properties are accessible as object attributes. This class works only for _symmetric_ multiprocessor systems running Linux.

Instance variables:
numLogicalCPUs - the number of cores that Linux takes for the number of CPUs
numCoresPerSocket - how many physical cores are on each socket
numSiblingsPerSocket - how many logical cores (in total) are on each socket
numSiblingsPerCore - how many logical cores are on each physical core
hyperThreadingEnabled - determines whether hyper-threading is activated on the system
numPhysicalCores - how many real (physical) cores are available on the system
numCPUSockets - how many sockets (slots) the CPUs are plugged into
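The derived instance variables can plausibly be computed from the three constructor arguments alone. The dataclass CpuTopology below is a hypothetical sketch of that arithmetic for a symmetric system; the formulas are assumptions and are not taken from the CPUInfo source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CpuTopology:
    numLogicalCPUs: int
    numCoresPerSocket: int
    numSiblingsPerSocket: int

    @property
    def numSiblingsPerCore(self) -> int:
        # Logical cores that share one physical core.
        return self.numSiblingsPerSocket // self.numCoresPerSocket

    @property
    def hyperThreadingEnabled(self) -> bool:
        return self.numSiblingsPerCore > 1

    @property
    def numCPUSockets(self) -> int:
        return self.numLogicalCPUs // self.numSiblingsPerSocket

    @property
    def numPhysicalCores(self) -> int:
        return self.numCPUSockets * self.numCoresPerSocket
```

For example, a machine reporting 32 logical CPUs with 8 cores and 16 siblings per socket would be interpreted as two sockets with 16 physical cores in total and hyper-threading enabled.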
ClusterInfo¶

ClusterInfo supplies a class to hold information about a cluster, and the nodes one wants to use. The class method ‘ClusterInfo.from_environment’ returns an instance corresponding to the current environment by reading ssh fingerprints.

class ClusterInfo(name: str, ssh_fingerprints: Set[str], default_partition_name: str, partitions: Dict[str, mojo.jobManagement.management.clusterInfo.Partition], queuing_system: str = 'SLURM', account_env_var_name: str = 'AT_NUM_DEFAULT_ACCOUNT', extra_startup_minutes: int = 10, nodes_per_switch: int = 40, mail_relay: str = 'smtprelay.dlr.de')[source]¶

Container for information about a known cluster environment.

For a new cluster add a new instance to ‘_supportedClusters’ down below.

property cpuInfo¶

Get information about the cluster's default partition.

If you need information about another partition you can use ‘instance.partitions[partition_name].cpu_info’.

classmethod from_environment(default_partition: Optional[str] = None) → Optional[mojo.jobManagement.management.clusterInfo.ClusterInfo][source]¶

Get a new instance corresponding to the current environment.

Parameters

default_partition – Select a different default partition (node type)

classmethod onValidCluster() → bool [source]¶

Method to check if this instance runs on a known cluster environment.
