edu.isi.pegasus.planner.code.gridstart
Class PegasusLite

java.lang.Object
  extended by edu.isi.pegasus.planner.code.gridstart.PegasusLite
All Implemented Interfaces:
GridStart

public class PegasusLite
extends Object
implements GridStart

This class launches all jobs using Pegasus Lite, a shell-script-based wrapper. The Pegasus Lite shell script for the compute jobs contains the commands to

 1) create directory on worker node
 2) fetch input data files
 3) execute the job
 4) transfer the output data files
 5) clean up the directory
 
To disable the staging of the SLS files via the first-level staging jobs, set the following property to false
 pegasus.transfer.stage.sls.file     false
 
To enable this implementation at runtime, set the following property
 pegasus.gridstart PegasusLite
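
As a minimal sketch of how the two properties above might be supplied, the following parses them from properties-file text with the standard java.util.Properties class. The class name PegasusLitePropertiesSketch and its helper are hypothetical and not part of the Pegasus API; only the property keys and values are taken from the description above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical illustration; not part of the Pegasus code base.
final class PegasusLitePropertiesSketch {

    // Properties-file text using the two keys named in the class description.
    static final String SAMPLE =
            "pegasus.gridstart PegasusLite\n"
          + "pegasus.transfer.stage.sls.file false\n";

    // Parses properties-file text; whitespace separates key and value.
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        Properties props = parse(SAMPLE);
        System.out.println("gridstart = " + props.getProperty("pegasus.gridstart"));
        System.out.println("stage sls = "
                + Boolean.parseBoolean(props.getProperty("pegasus.transfer.stage.sls.file")));
    }
}
```

How the planner itself reads these values (for example through PegasusProperties) is not shown here.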
 

Version:
$Revision: 4781 $
Author:
Karan Vahi

Field Summary
static String CLASSNAME
          The basename of the class that is implementing this.
private  PegasusBag mBag
           
private  Map<String,String> mChmodOnExecutionSiteMap
          A map indexed by the execution site and value is the path to chmod on that site.
private  ADag mDAG
           
protected  boolean mEnablingPartOfAggregatedJob
          An instance variable to track if enabling is happening as part of a clustered job.
protected  boolean mGenerateLOF
          A boolean indicating whether to generate lof files or not.
private  Kickstart mKickstartGridStartImpl
          Handle to kickstart GridStart implementation.
protected  String mLocalPathToPegasusLiteCommon
          The local path on the submit host to pegasus-lite-common.sh
protected  LogManager mLogger
          The LogManager object which is used to log all the messages.
private  String mMajorVersionLevel
          Stores the major version of the planner.
private  String mMinorVersionLevel
          Stores the minor version of the planner.
private  String mPatchVersionLevel
          Stores the patch version of the planner.
protected  PlannerOptions mPOptions
          The options passed to the planner.
protected  PegasusProperties mProps
          The object holding all the properties pertaining to Pegasus.
protected  SiteStore mSiteStore
          Handle to the site catalog store.
protected  SLS mSLS
          The handle to the SLS implementor
protected  boolean mStageSLSFile
          Boolean to track whether to stage sls file or not
protected  String mSubmitDir
          The submit directory where the submit files are being generated for the workflow.
private  TransformationCatalog mTCHandle
          Handle to Transformation Catalog.
protected  boolean mTransferWorkerPackage
          Boolean indicating whether worker package transfer is enabled or not
protected  boolean mWorkerNodeExecution
          A boolean indicating whether to have worker node execution or not.
(package private)  Map<String,String> mWorkerPackageMap
          A map indexed by execution site and the corresponding worker package location in the submit directory
static String PEGASUS_LITE_COMMON_FILE_BASENAME
          The basename of the pegasus lite common shell functions file.
static String SHORT_NAME
          The SHORTNAME for this implementation.
static String XBIT_DERIVATION_NS
          The derivation namespace for the setXBit jobs.
static String XBIT_DERIVATION_VERSION
          The version number for the derivations for setXBit jobs.
static String XBIT_EXECUTABLE_BASENAME
          The basename of the pegasus dirmanager executable.
static String XBIT_TRANSFORMATION
          The logical name of the transformation that creates directories on the remote execution pools.
static String XBIT_TRANSFORMATION_NS
          The transformation namespace for the setXBit jobs.
static String XBIT_TRANSFORMATION_VERSION
          The version number for the transformations for setXBit jobs.
 
Fields inherited from interface edu.isi.pegasus.planner.code.GridStart
mSeparator, VERSION
 
Constructor Summary
PegasusLite()
           
 
Method Summary
private  void associateCredentials(Job job, Collection<FileTransfer> files)
          Associates credentials with the job corresponding to the files that are being transferred.
 boolean canSetXBit()
          Indicates whether the enabling mechanism can set the X bit on the executable on the remote grid site, in addition to launching it on the remote grid site.
private  void construct(Job job, String key, String value)
          Constructs a condor variable in the condor profile namespace associated with the job.
protected  StringBuffer convertToTransferInputFormat(Collection<FileTransfer> files)
          Converts the collection of files into an input format suitable for the transfer executable.
 String defaultPOSTScript()
          Returns the SHORT_NAME for the POSTScript implementation that is used by default with this GridStart implementation.
 boolean enable(AggregatedJob job, boolean isGlobusJob)
          Enables a job to run on the grid.
 boolean enable(Job job, boolean isGlobusJob)
          Enables a job to run on the grid by launching it directly.
private  void enableForWorkerNodeExecution(Job job, boolean isGlobusJob)
          Enables jobs for worker node execution.
 String generateListofFilenamesFile(Set files, String basename)
          Writes out the list of filenames file for the job.
private  String getDirectoryKey(Job job)
          Returns the directory that is associated with the job to specify the directory in which the job needs to run
protected  String getPathToChmodExecutable(String site)
          Returns the path to the chmod executable for a particular execution site by looking up the transformation executable.
protected  String getSubmitHostPathToPegasusLiteCommon()
          Determines the path to common shell functions file that Pegasus Lite wrapped jobs use.
 String getVDSKeyValue()
          Returns the value of the vds profile with key as Pegasus.GRIDSTART_KEY, that would result in the loading of this particular implementation.
 String getWorkerNodeDirectory(Job job)
          Returns the directory in which the job executes on the worker node.
 void initialize(PegasusBag bag, ADag dag)
          Initializes the GridStart implementation.
private  boolean removeDirectoryKey(Job job)
          Returns a boolean indicating whether to remove remote directory information or not from the job.
protected  boolean setXBitOnFile(String file)
          Sets the xbit on the file.
 String shortDescribe()
          Returns a short textual description in the form of the name of the class.
protected  StringBuffer slurpInFile(String directory, String file)
          Convenience method to slurp in contents of a file into memory.
 void useFullPathToGridStarts(boolean fullPath)
          Setter method to control whether a full path to Gridstart should be returned while wrapping a job or not.
protected  File wrapJobWithPegasusLite(Job job, boolean isGlobusJob)
          Generates a seqexec input file for the job.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

mBag

private PegasusBag mBag

mDAG

private ADag mDAG

CLASSNAME

public static final String CLASSNAME
The basename of the class that is implementing this. Could have been determined by reflection.

See Also:
Constant Field Values

SHORT_NAME

public static final String SHORT_NAME
The SHORTNAME for this implementation.

See Also:
Constant Field Values

PEGASUS_LITE_COMMON_FILE_BASENAME

public static final String PEGASUS_LITE_COMMON_FILE_BASENAME
The basename of the pegasus lite common shell functions file.

See Also:
Constant Field Values

XBIT_TRANSFORMATION

public static final String XBIT_TRANSFORMATION
The logical name of the transformation that creates directories on the remote execution pools.

See Also:
Constant Field Values

XBIT_EXECUTABLE_BASENAME

public static final String XBIT_EXECUTABLE_BASENAME
The basename of the pegasus dirmanager executable.

See Also:
Constant Field Values

XBIT_TRANSFORMATION_NS

public static final String XBIT_TRANSFORMATION_NS
The transformation namespace for the setXBit jobs.

See Also:
Constant Field Values

XBIT_TRANSFORMATION_VERSION

public static final String XBIT_TRANSFORMATION_VERSION
The version number for the transformations for setXBit jobs.


XBIT_DERIVATION_NS

public static final String XBIT_DERIVATION_NS
The derivation namespace for the setXBit jobs.

See Also:
Constant Field Values

XBIT_DERIVATION_VERSION

public static final String XBIT_DERIVATION_VERSION
The version number for the derivations for setXBit jobs.


mMajorVersionLevel

private String mMajorVersionLevel
Stores the major version of the planner.


mMinorVersionLevel

private String mMinorVersionLevel
Stores the minor version of the planner.


mPatchVersionLevel

private String mPatchVersionLevel
Stores the patch version of the planner.


mLogger

protected LogManager mLogger
The LogManager object which is used to log all the messages.


mProps

protected PegasusProperties mProps
The object holding all the properties pertaining to Pegasus.


mSubmitDir

protected String mSubmitDir
The submit directory where the submit files are being generated for the workflow.


mGenerateLOF

protected boolean mGenerateLOF
A boolean indicating whether to generate lof files or not.


mWorkerNodeExecution

protected boolean mWorkerNodeExecution
A boolean indicating whether to have worker node execution or not.


mSLS

protected SLS mSLS
The handle to the SLS implementor


mPOptions

protected PlannerOptions mPOptions
The options passed to the planner.


mSiteStore

protected SiteStore mSiteStore
Handle to the site catalog store.


mEnablingPartOfAggregatedJob

protected boolean mEnablingPartOfAggregatedJob
An instance variable to track if enabling is happening as part of a clustered job. See Bug 21 comments on Pegasus Bugzilla


mKickstartGridStartImpl

private Kickstart mKickstartGridStartImpl
Handle to kickstart GridStart implementation.


mTCHandle

private TransformationCatalog mTCHandle
Handle to Transformation Catalog.


mStageSLSFile

protected boolean mStageSLSFile
Boolean to track whether to stage sls file or not


mLocalPathToPegasusLiteCommon

protected String mLocalPathToPegasusLiteCommon
The local path on the submit host to pegasus-lite-common.sh


mTransferWorkerPackage

protected boolean mTransferWorkerPackage
Boolean indicating whether worker package transfer is enabled or not


mWorkerPackageMap

Map<String,String> mWorkerPackageMap
A map indexed by execution site and the corresponding worker package location in the submit directory


mChmodOnExecutionSiteMap

private Map<String,String> mChmodOnExecutionSiteMap
A map indexed by the execution site and value is the path to chmod on that site.

Constructor Detail

PegasusLite

public PegasusLite()
Method Detail

initialize

public void initialize(PegasusBag bag,
                       ADag dag)
Initializes the GridStart implementation.

Specified by:
initialize in interface GridStart
Parameters:
bag - the bag of objects that is used for initialization.
dag - the concrete dag so far.

enable

public boolean enable(AggregatedJob job,
                      boolean isGlobusJob)
Enables a job to run on the grid. This also determines how the stdin, stderr and stdout of the job are to be propagated. To grid-enable a job, the job may need to be wrapped into another job that actually launches it. This usually results in the job description passed in being modified.

Specified by:
enable in interface GridStart
Parameters:
job - the Job object containing the job description of the job that has to be enabled on the grid.
isGlobusJob - true if the job generated a line universe = globus and thus runs remotely; false if the job runs on the submit host in any way.
Returns:
boolean true if enabling was successful, else false.

enable

public boolean enable(Job job,
                      boolean isGlobusJob)
Enables a job to run on the grid by launching it directly. It ends up running the executable directly without going through any intermediate launcher executable. It connects the stdio and stderr to the underlying Condor mechanisms so that they are transported back to the submit host.

Specified by:
enable in interface GridStart
Parameters:
job - the Job object containing the job description of the job that has to be enabled on the grid.
isGlobusJob - true if the job generated a line universe = globus and thus runs remotely; false if the job runs on the submit host in any way.
Returns:
boolean true if enabling was successful, else false when the path to kickstart could not be determined on the site where the job is scheduled.

enableForWorkerNodeExecution

private void enableForWorkerNodeExecution(Job job,
                                          boolean isGlobusJob)
Enables jobs for worker node execution.

Parameters:
job - the job to be enabled.
isGlobusJob - true if the job generated a line universe = globus and thus runs remotely; false if the job runs on the submit host in any way.

canSetXBit

public boolean canSetXBit()
Indicates whether the enabling mechanism can set the X bit on the executable on the remote grid site, in addition to launching it on the remote grid site.

Specified by:
canSetXBit in interface GridStart
Returns:
false, as no wrapper executable is being used.

getVDSKeyValue

public String getVDSKeyValue()
Returns the value of the vds profile with key as Pegasus.GRIDSTART_KEY, that would result in the loading of this particular implementation. It is usually the name of the implementing class without the package name.

Specified by:
getVDSKeyValue in interface GridStart
Returns:
the value of the profile key.
See Also:
org.griphyn.cPlanner.namespace.Pegasus#GRIDSTART_KEY

shortDescribe

public String shortDescribe()
Returns a short textual description in the form of the name of the class.

Specified by:
shortDescribe in interface GridStart
Returns:
short textual description.

defaultPOSTScript

public String defaultPOSTScript()
Returns the SHORT_NAME for the POSTScript implementation that is used by default with this GridStart implementation.

Specified by:
defaultPOSTScript in interface GridStart
Returns:
the identifier for the default POSTScript implementation for kickstart gridstart module.
See Also:
Kickstart.defaultPOSTScript()

getDirectoryKey

private String getDirectoryKey(Job job)
Returns the directory that is associated with the job to specify the directory in which the job needs to run

Parameters:
job - the job
Returns:
the condor key; can be initialdir or remote_initialdir.

removeDirectoryKey

private boolean removeDirectoryKey(Job job)
Returns a boolean indicating whether to remove remote directory information or not from the job. This is determined on the basis of the style key that is associated with the job.

Parameters:
job - the job in question.
Returns:
boolean

construct

private void construct(Job job,
                       String key,
                       String value)
Constructs a condor variable in the condor profile namespace associated with the job. Overrides any preexisting key values.

Parameters:
job - contains the job description.
key - the key of the profile.
value - the associated value.

generateListofFilenamesFile

public String generateListofFilenamesFile(Set files,
                                          String basename)
Writes out the list of filenames file for the job.

Parameters:
files - the set of PegasusFile objects for the files whose stat information is required.
basename - the basename of the file that is to be created
Returns:
the full path to lof file created, else null if no file is written out.
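
The contract described above (write the list, return the full path, or null if nothing is written) can be sketched as follows. This is an illustration only: it uses plain String names rather than PegasusFile objects, takes the target directory as an explicit parameter, and assumes that an empty set is the case in which no file is written. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration; not the Pegasus implementation.
final class ListOfFilenamesSketch {

    // Writes one filename per line (sorted) to dir/basename.
    // Returns the full path of the file, or null when there is nothing to write.
    static String writeListOfFilenames(Set<String> names, Path dir, String basename) {
        if (names == null || names.isEmpty()) {
            return null; // assumption: no file is created for an empty set
        }
        Path target = dir.resolve(basename);
        try {
            Files.write(target, new TreeSet<>(names), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return target.toAbsolutePath().toString();
    }

    public static void main(String[] args) {
        Path tmp = Paths.get(System.getProperty("java.io.tmpdir"));
        Set<String> names = new HashSet<>(Arrays.asList("f.b", "f.a"));
        System.out.println(writeListOfFilenames(names, tmp, "job.in.lof"));
    }
}
```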

getWorkerNodeDirectory

public String getWorkerNodeDirectory(Job job)
Returns the directory in which the job executes on the worker node.

Specified by:
getWorkerNodeDirectory in interface GridStart
Parameters:
job - the job whose worker node directory is required.
Returns:
the full path to the directory where the job executes

wrapJobWithPegasusLite

protected File wrapJobWithPegasusLite(Job job,
                                      boolean isGlobusJob)
Generates a seqexec input file for the job. The function first enables the job via kickstart module for worker node execution and then retrieves the commands to put in the input file from the environment variables specified for kickstart. It creates a single input file for the seqexec invocation. The input file contains commands to
 1) create directory on worker node
 2) fetch input data files
 3) execute the job
 4) transfer the output data files
 5) clean up the directory
 

Parameters:
job - the job to be enabled.
isGlobusJob - true if the job generated a line universe = globus and thus runs remotely; false if the job runs on the submit host in any way.
Returns:
the file handle to the seqexec input file
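
The five steps listed above can be sketched as a small generator that emits a shell script in that order. This is a rough illustration under stated assumptions, not the script Pegasus Lite actually produces: the class name, the method, and every shell command in it (mkdir, echo placeholders, rm) are stand-ins chosen for the example, and the real wrapper relies on functions from pegasus-lite-common.sh that are not reproduced here.

```java
// Hypothetical illustration; the generated commands are placeholders.
final class FiveStepWrapperSketch {

    // Builds script text that follows the five steps in order.
    // workDir is assumed not to contain single quotes.
    static String buildScript(String workDir, String jobCommand) {
        StringBuilder sb = new StringBuilder();
        sb.append("#!/bin/sh\n");
        sb.append("# 1) create directory on worker node\n");
        sb.append("mkdir -p '").append(workDir).append("'\n");
        sb.append("cd '").append(workDir).append("'\n");
        sb.append("# 2) fetch input data files (placeholder)\n");
        sb.append("echo 'stage-in would run here'\n");
        sb.append("# 3) execute the job\n");
        sb.append(jobCommand).append("\n");
        sb.append("# 4) transfer the output data files (placeholder)\n");
        sb.append("echo 'stage-out would run here'\n");
        sb.append("# 5) clean up the directory\n");
        sb.append("cd /\n");
        sb.append("rm -rf '").append(workDir).append("'\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(buildScript("/tmp/pegasus-sketch", "echo hello"));
    }
}
```

The sketch omits error handling; a real wrapper would also need to decide whether cleanup still runs when the job step fails.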

convertToTransferInputFormat

protected StringBuffer convertToTransferInputFormat(Collection<FileTransfer> files)
Converts the collection of files into an input format suitable for the transfer executable.

Parameters:
files - Collection of FileTransfer objects.
Returns:
the blurb containing the files in the input format for the transfer executable

slurpInFile

protected StringBuffer slurpInFile(String directory,
                                   String file)
                            throws IOException
Convenience method to slurp in contents of a file into memory.

Parameters:
directory - the directory where the file resides
file - the file to be slurped in.
Returns:
StringBuffer containing the contents
Throws:
IOException
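
A method with this signature could be written along the following lines. This is a sketch rather than the actual implementation: it assumes the file is read line by line in the platform default charset and that each line is re-terminated with a newline, neither of which is stated above. The enclosing class name is hypothetical.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;

// Hypothetical illustration; not the Pegasus implementation.
final class SlurpSketch {

    // Reads directory/file fully into memory, one line at a time.
    static StringBuffer slurpInFile(String directory, String file) throws IOException {
        StringBuffer result = new StringBuffer();
        try (BufferedReader reader =
                     new BufferedReader(new FileReader(new File(directory, file)))) {
            String line;
            while ((line = reader.readLine()) != null) {
                // assumption: every line is terminated with '\n' in the result
                result.append(line).append('\n');
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        File tmp = File.createTempFile("slurp", ".txt");
        tmp.deleteOnExit();
        System.out.print(slurpInFile(tmp.getParent(), tmp.getName()));
    }
}
```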

getPathToChmodExecutable

protected String getPathToChmodExecutable(String site)
Returns the path to the chmod executable for a particular execution site by looking up the transformation executable.

Parameters:
site - the execution site.
Returns:
the path to chmod executable
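
Given the mChmodOnExecutionSiteMap field, one plausible reading is that the per-site path is looked up once and then cached. The sketch below illustrates that pattern only. The lookup function stands in for the transformation catalog query, and the /bin/chmod fallback is an assumption made for the example; neither is confirmed by this page.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical illustration; not the Pegasus implementation.
final class ChmodPathCacheSketch {

    // Cache keyed by execution site; value is the path to chmod on that site.
    private final Map<String, String> chmodOnExecutionSite = new HashMap<>();

    // Stand-in for a transformation catalog lookup; may return null.
    private final Function<String, String> catalogLookup;

    ChmodPathCacheSketch(Function<String, String> catalogLookup) {
        this.catalogLookup = catalogLookup;
    }

    // Returns the cached path, consulting the lookup only on first use per site.
    String getPathToChmodExecutable(String site) {
        return chmodOnExecutionSite.computeIfAbsent(site, s -> {
            String path = catalogLookup.apply(s);
            return path != null ? path : "/bin/chmod"; // assumed default
        });
    }

    public static void main(String[] args) {
        ChmodPathCacheSketch cache = new ChmodPathCacheSketch(
                site -> "siteA".equals(site) ? "/usr/bin/chmod" : null);
        System.out.println(cache.getPathToChmodExecutable("siteA"));
        System.out.println(cache.getPathToChmodExecutable("siteB"));
    }
}
```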

setXBitOnFile

protected boolean setXBitOnFile(String file)
Sets the xbit on the file.

Parameters:
file - the file for which the xbit is to be set
Returns:
boolean indicating whether xbit was set or not.
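
For a file on the local file system, the same contract can be approximated with the standard java.io.File.setExecutable call, as in the sketch below. Whether the actual method uses this call or invokes a chmod executable is not stated here, so treat this as a swapped-in technique; the class name is hypothetical.

```java
import java.io.File;

// Hypothetical illustration; not the Pegasus implementation.
final class SetXBitSketch {

    // Sets the owner's execute bit on a regular file.
    // Returns false if the file does not exist or the change fails.
    static boolean setXBitOnFile(String file) {
        File f = new File(file);
        if (!f.isFile()) {
            return false;
        }
        return f.setExecutable(true);
    }

    public static void main(String[] args) throws Exception {
        File tmp = File.createTempFile("xbit", ".sh");
        tmp.deleteOnExit();
        System.out.println(setXBitOnFile(tmp.getPath()));
    }
}
```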

getSubmitHostPathToPegasusLiteCommon

protected String getSubmitHostPathToPegasusLiteCommon()
Determines the path to common shell functions file that Pegasus Lite wrapped jobs use.

Returns:
the path on the submit host.

useFullPathToGridStarts

public void useFullPathToGridStarts(boolean fullPath)
Description copied from interface: GridStart
Setter method to control whether a full path to Gridstart should be returned while wrapping a job or not.

Specified by:
useFullPathToGridStarts in interface GridStart
Parameters:
fullPath - if set to true, indicates that full path would be used.

associateCredentials

private void associateCredentials(Job job,
                                  Collection<FileTransfer> files)
Associates credentials with the job corresponding to the files that are being transferred.

Parameters:
job - the job for which credentials need to be added.
files - the files that are being transferred.


Copyright © 2011 The University of Southern California. All Rights Reserved.