edu.isi.pegasus.planner.code.generator.condor
Class SUBDAXGenerator

java.lang.Object
  extended by edu.isi.pegasus.planner.code.generator.condor.SUBDAXGenerator

public class SUBDAXGenerator
extends Object

The class that takes in a dax job specified in the DAX and renders it into a SUBDAG with pegasus-plan as the appropriate prescript.

Version:
$Revision: 4507 $
Author:
Karan Vahi

Field Summary
private static String CACHE_FILE_SUFFIX
          Suffix to be applied for cache file generation.
static String CONDOR_DAGMAN_LOGICAL_NAME
          The logical name with which to query the transformation catalog for the condor_dagman executable, that ends up running the mini dag as one job.
static String CONDOR_DAGMAN_NAMESPACE
          The namespace to use for condor dagman.
static String CPLANNER_LOGICAL_NAME
          The logical name with which to query the transformation catalog for cPlanner executable.
static String[][] DAGMAN_KNOBS
          The dagman knobs controlled through property.
static String DEFAULT_SUBDAX_CATEGORY_KEY
          The default category for the sub dax jobs.
static boolean GENERATE_SUBDAG_KEYWORD
          Whether to generate the SUBDAG keyword or not.
private  PegasusBag mBag
          Bag of Pegasus objects
private  PegasusProperties.CLEANUP_SCOPE mCleanupScope
          The cleanup scope for the workflows.
private  long mCondorVersion
          The long value of condor version.
private  ADag mDAG
           
private  PrintWriter mDAGWriter
          The print writer handle to DAG file being written out.
private  Map<String,String> mDAXJobIDToSubmitDirectoryCacheFile
          Maps a sub dax job id to it's submit directory.
private  LogManager mLogger
          Handle to the logging manager.
private  NumberFormat mNumFormatter
          The number formatter to format the run submit dir entries.
private  PlannerOptions mPegasusPlanOptions
          The object containing all the options passed to the Concrete Planner.
private  PegasusProperties mProps
          The handle to Pegasus Properties.
private  TransformationCatalog mTCHandle
          The handle to the transformation catalog
private  String mUser
          The username of the user running the program.
private  Graph mWorkflow
           
static String NAMESPACE
          The namespace to which the job in the MEGA DAG being created refer to.
static String RETRY_LOGICAL_NAME
          The planner utility that needs to be called as a prescript.
 
Constructor Summary
SUBDAXGenerator()
          The default constructor.
 
Method Summary
protected  Job constructDAGJob(Job subdaxJob, File directory, File subdaxDirectory, String basenamePrefix)
          Constructs a job that plans and submits the partitioned workflow, referred to by a Partition.
 String constructDAGManKnobs(Job job)
          Constructs Any extra arguments that need to be passed to dagman, as determined from the properties file.
 String constructPegasusPlanPrescript(Job job, PlannerOptions options, String rootUUID, String properties, String log)
          Constructs the pegasus plan prescript for the subdax
private  TransformationCatalogEntry constructTCEntryFromEnvironment()
          Returns a tranformation catalog entry object constructed from the environment An entry is constructed if either of the following environment variables are defined 1) CONDOR_HOME 2) CONDOR_LOCATION CONDOR_HOME takes precedence over CONDOR_LOCATION
private  TransformationCatalogEntry constructTCEntryFromEnvProfiles(ENV env)
          Returns a tranformation catalog entry object constructed from the environment An entry is constructed if either of the following environment variables are defined 1) CONDOR_HOME 2) CONDOR_LOCATION CONDOR_HOME takes precedence over CONDOR_LOCATION
private  TransformationCatalogEntry constructTCEntryFromPath()
          Returns a tranformation catalog entry object constructed from the path environment variable
private  TransformationCatalogEntry constructTransformationCatalogEntryForDAGMan(String path)
          Constructs TransformationCatalogEntry for DAGMan.
protected  String createSubmitDirectory(ADag dag, String dir, String user, String vogroup, boolean timestampBased)
          Creates the submit directory for the workflow.
protected  String createSubmitDirectory(String label, String dir, String user, String vogroup, boolean timestampBased)
          Creates the submit directory for the workflow.
protected  boolean createSymbolicLink(String source, String destination)
          This method generates a symlink between two files
protected  boolean createSymbolicLink(String source, String destination, boolean logErrorToDebug)
          This method generates a symlink between two files
 boolean createSymbolicLinktoCacheFile(PlannerOptions options, String label, String index)
          Creates a symbolic link to the DAX file in a dax sub directory in the submit directory
 String createSymbolicLinktoDAX(String submitDirectory, String dax)
          Creates a symbolic link to the DAX file in a dax sub directory in the submit directory
private  TransformationCatalogEntry defaultTCEntry(String site)
          Returns a default TC entry to be used in case entry is not found in the transformation catalog.
 Job generateCode(Job job)
          Generates code for a job
protected  String getBasename(String prefix, String suffix)
          Returns the basename of a dagman (usually) related file for a particular partition.
protected  String getCacheFileName(PlannerOptions options, String label, String index)
          Constructs the basename to the cache file that is to be used to log the transient files.
 Set<String> getParentsTransientRC(Job job)
          Returns a set containing the paths to the parent dax jobs transient replica catalogs.
protected  String getWorkflowFileName(PlannerOptions options, String label, String index, String suffix)
          Constructs the basename to a workflow file that.
 void initialize(PegasusBag bag, ADag dag, Graph workflow, PrintWriter dagWriter)
          Initializes the class.
protected static int parseInt(String s)
          Parses a string into an integer.
protected static void sanityCheck(File dir)
          Checks the destination location for existence, if it can be created, if it is writable etc.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

DEFAULT_SUBDAX_CATEGORY_KEY

public static final String DEFAULT_SUBDAX_CATEGORY_KEY
The default category for the sub dax jobs.

See Also:
Constant Field Values

GENERATE_SUBDAG_KEYWORD

public static final boolean GENERATE_SUBDAG_KEYWORD
Whether to generate the SUBDAG keyword or not.

See Also:
Constant Field Values

CACHE_FILE_SUFFIX

private static final String CACHE_FILE_SUFFIX
Suffix to be applied for cache file generation.

See Also:
Constant Field Values

CPLANNER_LOGICAL_NAME

public static final String CPLANNER_LOGICAL_NAME
The logical name with which to query the transformation catalog for cPlanner executable.

See Also:
Constant Field Values

CONDOR_DAGMAN_NAMESPACE

public static final String CONDOR_DAGMAN_NAMESPACE
The namespace to use for condor dagman.

See Also:
Constant Field Values

CONDOR_DAGMAN_LOGICAL_NAME

public static final String CONDOR_DAGMAN_LOGICAL_NAME
The logical name with which to query the transformation catalog for the condor_dagman executable, that ends up running the mini dag as one job.

See Also:
Constant Field Values

NAMESPACE

public static final String NAMESPACE
The namespace to which the job in the MEGA DAG being created refer to.

See Also:
Constant Field Values

RETRY_LOGICAL_NAME

public static final String RETRY_LOGICAL_NAME
The planner utility that needs to be called as a prescript.

See Also:
Constant Field Values

DAGMAN_KNOBS

public static final String[][] DAGMAN_KNOBS
The dagman knobs controlled through property. They map the property name to the corresponding dagman option.


mUser

private String mUser
The username of the user running the program.


mNumFormatter

private NumberFormat mNumFormatter
The number formatter to format the run submit dir entries.


mPegasusPlanOptions

private PlannerOptions mPegasusPlanOptions
The object containing all the options passed to the Concrete Planner.


mProps

private PegasusProperties mProps
The handle to Pegasus Properties.


mLogger

private LogManager mLogger
Handle to the logging manager.


mBag

private PegasusBag mBag
Bag of Pegasus objects


mDAGWriter

private PrintWriter mDAGWriter
The print writer handle to DAG file being written out.


mTCHandle

private TransformationCatalog mTCHandle
The handle to the transformation catalog


mCleanupScope

private PegasusProperties.CLEANUP_SCOPE mCleanupScope
The cleanup scope for the workflows.


mCondorVersion

private long mCondorVersion
The long value of condor version.


mDAXJobIDToSubmitDirectoryCacheFile

private Map<String,String> mDAXJobIDToSubmitDirectoryCacheFile
Maps a sub dax job id to it's submit directory. The population relies on top down traversal during Code Generation.


mWorkflow

private Graph mWorkflow

mDAG

private ADag mDAG
Constructor Detail

SUBDAXGenerator

public SUBDAXGenerator()
The default constructor.

Method Detail

initialize

public void initialize(PegasusBag bag,
                       ADag dag,
                       Graph workflow,
                       PrintWriter dagWriter)
Initializes the class.

Parameters:
bag - the bag of objects required for initialization
dag - the dag for which code is being generated
workflow - the graph representation of the dag
daxReplicaStore - the dax replica store.
dagWriter - handle to the dag writer

generateCode

public Job generateCode(Job job)
Generates code for a job

Parameters:
job - the job for which code has to be generated.
Returns:
a Job if a submit file needs to be generated for the job. Else return null.

constructDAGJob

protected Job constructDAGJob(Job subdaxJob,
                              File directory,
                              File subdaxDirectory,
                              String basenamePrefix)
Constructs a job that plans and submits the partitioned workflow, referred to by a Partition. The main job itself is a condor dagman job that submits the concrete workflow. The concrete workflow is generated by running the planner in the prescript for the job.

Parameters:
subdaxJob - the original subdax job.
directory - the directory where the submit file for dagman job has to be written out to.
subdaxDirectory - the submit directory where the submit files for the subdag reside.
basenamePrefix - the basename to be assigned to the files associated with DAGMan
Returns:
the constructed DAG job.

constructDAGManKnobs

public String constructDAGManKnobs(Job job)
Constructs Any extra arguments that need to be passed to dagman, as determined from the properties file.

Parameters:
job - the job
Returns:
any arguments to be added, else empty string

parseInt

protected static int parseInt(String s)
Parses a string into an integer. Non valid values returned as -1

Parameters:
s - the String to be parsed as integer
Returns:
the int value if valid, else -1

getBasename

protected String getBasename(String prefix,
                             String suffix)
Returns the basename of a dagman (usually) related file for a particular partition.

Parameters:
prefix - the prefix.
suffix - the suffix for the file basename.
Returns:
the basename.

getCacheFileName

protected String getCacheFileName(PlannerOptions options,
                                  String label,
                                  String index)
Constructs the basename to the cache file that is to be used to log the transient files. The basename is dependant on whether the basename prefix has been specified at runtime or not.

Parameters:
options - the options for the sub workflow.
label - the label for the workflow.
index - the index for the workflow.
Returns:
the name of the cache file

getWorkflowFileName

protected String getWorkflowFileName(PlannerOptions options,
                                     String label,
                                     String index,
                                     String suffix)
Constructs the basename to a workflow file that. The basename is dependant on whether the basename prefix has been specified at runtime or not.

Parameters:
options - the options for the sub workflow.
label - the label for the workflow.
index - the index for the workflow.
Returns:
the name of the cache file

defaultTCEntry

private TransformationCatalogEntry defaultTCEntry(String site)
Returns a default TC entry to be used in case entry is not found in the transformation catalog.

Parameters:
site - the site for which the default entry is required.
Returns:
the default entry.

constructTCEntryFromEnvironment

private TransformationCatalogEntry constructTCEntryFromEnvironment()
Returns a tranformation catalog entry object constructed from the environment An entry is constructed if either of the following environment variables are defined 1) CONDOR_HOME 2) CONDOR_LOCATION CONDOR_HOME takes precedence over CONDOR_LOCATION

Returns:
the constructed entry else null.

constructTCEntryFromPath

private TransformationCatalogEntry constructTCEntryFromPath()
Returns a tranformation catalog entry object constructed from the path environment variable

Parameters:
env - the environment profiles.
Returns:
the entry constructed else null if environment variables not defined.

constructTCEntryFromEnvProfiles

private TransformationCatalogEntry constructTCEntryFromEnvProfiles(ENV env)
Returns a tranformation catalog entry object constructed from the environment An entry is constructed if either of the following environment variables are defined 1) CONDOR_HOME 2) CONDOR_LOCATION CONDOR_HOME takes precedence over CONDOR_LOCATION

Parameters:
env - the environment profiles.
Returns:
the entry constructed else null if environment variables not defined.

constructTransformationCatalogEntryForDAGMan

private TransformationCatalogEntry constructTransformationCatalogEntryForDAGMan(String path)
Constructs TransformationCatalogEntry for DAGMan.

Parameters:
path - path to dagman
Returns:
TransformationCatalogEntry for dagman if path is not null, else null.

constructPegasusPlanPrescript

public String constructPegasusPlanPrescript(Job job,
                                            PlannerOptions options,
                                            String rootUUID,
                                            String properties,
                                            String log)
Constructs the pegasus plan prescript for the subdax

Parameters:
job - the subdax job
options - the planner options with which subdax has to be invoked
rootUUID - the root workflow uuid
properties - the properties file.
log - the log for the prescript output
Returns:
the prescript

createSymbolicLinktoCacheFile

public boolean createSymbolicLinktoCacheFile(PlannerOptions options,
                                             String label,
                                             String index)
Creates a symbolic link to the DAX file in a dax sub directory in the submit directory

Parameters:
options - the options for the sub workflow.
label - the label for the workflow.
index - the index for the workflow.
Returns:
boolean whether symlink is created or not

createSymbolicLinktoDAX

public String createSymbolicLinktoDAX(String submitDirectory,
                                      String dax)
Creates a symbolic link to the DAX file in a dax sub directory in the submit directory

Parameters:
submitDirectory - the submit directory for the sub workflow.
dax - the dax file to which the symbolic link has to be created.
Returns:
the symbolic link created.

createSubmitDirectory

protected String createSubmitDirectory(ADag dag,
                                       String dir,
                                       String user,
                                       String vogroup,
                                       boolean timestampBased)
                                throws IOException
Creates the submit directory for the workflow. This is not thread safe.

Parameters:
dag - the workflow being worked upon.
dir - the base directory specified by the user.
user - the username of the user.
vogroup - the vogroup to which the user belongs to.
timestampBased - boolean indicating whether to have a timestamp based dir or not
Returns:
the directory name created relative to the base directory passed as input.
Throws:
IOException - in case of unable to create submit directory.

createSubmitDirectory

protected String createSubmitDirectory(String label,
                                       String dir,
                                       String user,
                                       String vogroup,
                                       boolean timestampBased)
                                throws IOException
Creates the submit directory for the workflow. This is not thread safe.

Parameters:
label - the label of the workflow
dir - the base directory specified by the user.
user - the username of the user.
vogroup - the vogroup to which the user belongs to.
timestampBased - boolean indicating whether to have a timestamp based dir or not
Returns:
the directory name created relative to the base directory passed as input.
Throws:
IOException - in case of unable to create submit directory.

sanityCheck

protected static void sanityCheck(File dir)
                           throws IOException
Checks the destination location for existence, if it can be created, if it is writable etc.

Parameters:
dir - is the new base directory to optionally create.
Throws:
IOException - in case of error while writing out files.

createSymbolicLink

protected boolean createSymbolicLink(String source,
                                     String destination)
This method generates a symlink between two files

Parameters:
source - the file that has to be symlinked
destination - the destination of the symlink
Returns:
boolean indicating if creation of symlink was successful or not

createSymbolicLink

protected boolean createSymbolicLink(String source,
                                     String destination,
                                     boolean logErrorToDebug)
This method generates a symlink between two files

Parameters:
source - the file that has to be symlinked
destination - the destination of the symlink
logErrorToDebug - whether to log messeage to debug or not
Returns:
boolean indicating if creation of symlink was successful or not

getParentsTransientRC

public Set<String> getParentsTransientRC(Job job)
Returns a set containing the paths to the parent dax jobs transient replica catalogs.

Parameters:
job - the job
Returns:
Set of paths


Copyright © 2011 The University of Southern California. All Rights Reserved.