edu.isi.pegasus.planner.catalog.replica.impl
Class SimpleFile

java.lang.Object
  extended by edu.isi.pegasus.planner.catalog.replica.impl.SimpleFile
All Implemented Interfaces:
Catalog, ReplicaCatalog

public class SimpleFile
extends Object
implements ReplicaCatalog

This class implements a replica catalog on top of a simple file which contains two or more columns. It is not transactionally safe and is not recommended for production use of any kind. Multiple concurrent instances will clobber each other!

The site attribute should be specified whenever possible. The attribute key for the site attribute is "pool". For the shell planner, its value will always be "local".

The class is permissive in what inputs it accepts. The LFN may or may not be quoted. If it contains linear whitespace, quotes, a backslash or an equality sign, it must be quoted and escaped. The same holds for the PFN. The attribute key-value pairs are separated by an equality sign without any surrounding whitespace. The value may be quoted; the quoting rules for the LFN apply to it as well.

 LFN PFN
 LFN PFN a=b [..]
 LFN PFN a="b" [..]
 "LFN w/LWS" "PFN w/LWS" [..]
 
The class is strict when producing (storing) results. The LFN and PFN are only quoted and escaped, if necessary. The attribute values are always quoted and escaped.
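The output rule above can be pictured with a minimal, self-contained sketch. It is not the class's actual code: the names RecordFormatSketch, needsQuoting, escape and formatRecord are hypothetical, and the sketch assumes a double quote as the quote character and a single space as the column separator.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper illustrating the documented output rule; not part of Pegasus.
final class RecordFormatSketch {
    // True if s contains whitespace, a quote, a backslash or an equality sign.
    static boolean needsQuoting(String s) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (Character.isWhitespace(c) || c == '"' || c == '\\' || c == '=') {
                return true;
            }
        }
        return false;
    }

    // Backslash-escapes quotes and backslashes.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (c == '"' || c == '\\') {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    // LFN and PFN are quoted only if necessary; attribute values are always quoted.
    static String formatRecord(String lfn, String pfn, Map<String, String> attrs) {
        StringBuilder sb = new StringBuilder();
        sb.append(needsQuoting(lfn) ? "\"" + escape(lfn) + "\"" : lfn);
        sb.append(' ');
        sb.append(needsQuoting(pfn) ? "\"" + escape(pfn) + "\"" : pfn);
        for (Map.Entry<String, String> e : attrs.entrySet()) {
            sb.append(' ').append(e.getKey())
              .append("=\"").append(escape(e.getValue())).append('"');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> attrs = new LinkedHashMap<>();
        attrs.put("pool", "local");
        System.out.println(formatRecord("f.a", "file:///tmp/f.a", attrs));
    }
}
```

The attribute key is written unquoted here, which matches the sample lines above but is otherwise an assumption.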

Version:
$Revision: 2079 $
Author:
Jens-S. Vöckler

Field Summary
private static short[][] c_action
          Contains the actions to perform upon each state transition including transition into self state.
private static String[] c_final
          Provides the final states and associated messages.
private static short[][] c_state
          Contains the state transition tables.
protected  String m_filename
          Records the name of the on-disk representation.
protected  Map m_lfn
          Maintains a memory slurp of the file representation.
protected  boolean m_quote
          Records the quoting mode for LFNs and PFNs.
(package private)  boolean m_readonly
          A boolean indicating whether the catalog is read only or not.
static String READ_ONLY_KEY
          The name of the key that disables writing back to the cache file.
 
Fields inherited from interface edu.isi.pegasus.planner.catalog.ReplicaCatalog
BATCH_KEY, c_prefix, DB_PREFIX
 
Fields inherited from interface edu.isi.pegasus.planner.catalog.Catalog
DB_ALL_PREFIX
 
Constructor Summary
SimpleFile()
          Default empty constructor creates an object that is not yet connected to any database.
 
Method Summary
 int clear()
          Removes everything.
 void close()
          This operation will dump the in-memory representation back onto disk.
 boolean connect(Properties props)
          Establishes a connection to the database from the properties.
 boolean connect(String filename)
          Reads the on-disk map file into memory.
 int delete(Map x, boolean matchAttributes)
          Deletes multiple mappings from the replica catalog.
 int delete(String lfn, ReplicaCatalogEntry tuple)
          Deletes a very specific mapping from the replica catalog.
 int delete(String lfn, String pfn)
          Deletes a specific mapping from the replica catalog.
 int delete(String lfn, String name, Object value)
          Deletes all PFN entries for a given LFN from the replica catalog where the PFN attribute is found, and matches exactly the object value.
 int deleteByResource(String lfn, String handle)
          Deletes all PFN entries for a given LFN from the replica catalog where the resource handle is found.
private  boolean hasMatchingAttr(ReplicaCatalogEntry rce, String name, Object value)
          Looks for a match of an attribute value in a replica catalog entry.
 int insert(Map x)
          Inserts multiple mappings into the replica catalog.
 int insert(String lfn, ReplicaCatalogEntry tuple)
          Inserts a new mapping into the replica catalog.
 int insert(String lfn, String pfn, String handle)
          Inserts a new mapping into the replica catalog.
 boolean isClosed()
          Predicate to check whether the connection with the catalog's implementation is still active.
 Set list()
          Lists all logical filenames in the catalog.
 Set list(String constraint)
          Lists a subset of all logical filenames in the catalog.
 Map lookup(Map constraints)
          Retrieves multiple entries for a given logical filename, up to the complete catalog.
 Map lookup(Set lfns)
          Retrieves multiple entries for a given logical filename, up to the complete catalog.
 Map lookup(Set lfns, String handle)
          Retrieves multiple entries for a given logical filename, up to the complete catalog.
 Collection lookup(String lfn)
          Retrieves all entries for a given LFN from the replica catalog.
 String lookup(String lfn, String handle)
          Retrieves the entry for a given filename and site handle from the replica catalog.
 Map lookupNoAttributes(Set lfns)
          Retrieves multiple entries for a given logical filename, up to the complete catalog.
 Map lookupNoAttributes(Set lfns, String handle)
          Retrieves multiple entries for a given logical filename, up to the complete catalog.
 Set lookupNoAttributes(String lfn)
          Retrieves all entries for a given LFN from the replica catalog.
private  boolean matchMe(ReplicaCatalogEntry full, ReplicaCatalogEntry part)
          Checks whether all keys in the partial replica catalog entry are contained in the full replica catalog entry.
 boolean parse(String line, int lineno)
          Parses a line from the file replica catalog.
 String quote(Escape e, String s)
          Quotes a string only if necessary.
 int remove(Set lfns)
          Removes all mappings for a set of LFNs.
 int remove(String lfn)
          Removes all mappings for an LFN from the replica catalog.
 int removeByAttribute(String handle)
          Removes all entries associated with a particular resource handle.
 int removeByAttribute(String name, Object value)
          Removes all entries from the replica catalog where the PFN attribute is found, and matches exactly the object value.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

READ_ONLY_KEY

public static final String READ_ONLY_KEY
The name of the key that disables writing back to the cache file. It designates a static file, i.e. a read-only catalog.

See Also:
Constant Field Values

m_quote

protected boolean m_quote
Records the quoting mode for LFNs and PFNs. If false, only quote as necessary. If true, always quote all LFNs and PFNs.


m_filename

protected String m_filename
Records the name of the on-disk representation.


m_lfn

protected Map m_lfn
Maintains a memory slurp of the file representation.


m_readonly

boolean m_readonly
A boolean indicating whether the catalog is read only or not.


c_final

private static final String[] c_final
Provides the final states and associated messages.
 ---+----+--------------------
 F1 | 17 | final state, no record
 F2 | 16 | final state, valid record
 E1 | 18 | premature end
 E2 | 19 | illegal character
 E3 | 20 | incomplete record
 E4 | 21 | unterminated string
 


c_state

private static final short[][] c_state
Contains the state transition tables. The notes a through c mark similar states:
      | EOS | lws |  =  | ""  | \\  | else|
 -----+-----+-----+-----+-----+-----+-----+--------------
    0 | F1,-|  0,-|  E2 |  3,-|  E2 | 1,Sl| skip initial ws
 a  1 |  E3 | 2,Fl|  E2 |  E2 |  E2 | 1,Sl| LFN w/o quotes
    2 |  E3 |  2,-|  E2 |  6,-|  E2 | 5,Sp| skip ws between LFN and PFN
 b  3 |  E4 | 3,Sl| 3,Sl| 2,Fl|  4,-| 3,Sl| LFN in quotes
 c  4 |  E4 | 3,Sl| 3,Sl| 3,Sl| 3,Sl| 3,Sl| LFN backslash escape
 -----+-----+-----+-----+-----+-----+-----+--------------
 a  5 |F2,Fp| 8,Fp|  E2 |  E2 |  E2 | 5,Sp| PFN w/o quotes
 b  6 |  E4 | 6,Sp| 6,Sp| 8,Fp|  7,-| 6,Sp| PFN in quotes
 c  7 |  E4 | 6,Sp| 6,Sp| 6,Sp| 6,Sp| 6,Sp| PFN backslash escape
    8 | F2,-|  8,-|  E2 |  E2 |  E2 | 9,Sk| skip ws before attributes
    9 |  E1 |  E2 |10,Fk|  E2 |  E2 | 9,Sk| attribute key
   10 |  E1 |  E2 |  E2 | 12,-|  E2 |11,Sv| equals sign
 -----+-----+-----+-----+-----+-----+-----+--------------
 a 11 |F2,Fv| 8,Fv|  E2 |  E2 |  E2 |11,Sv| value w/o quotes
 b 12 |  E4 |12,Sv|12,Sv| 8,Fv| 13,-|12,Sv| value in quotes
 c 13 |  E4 |12,Sv|12,Sv|12,Sv|12,Sv|12,Sv| value backslash escape
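
The two tables can be read as a small table-driven parser. The following is a hedged sketch that transcribes the documented state and action tables into arrays and runs them over one line. The class name LineParserSketch, the fields lfn, pfn and attrs, and the decision to store each finished value in a map under its key are assumptions for illustration; the actual private arrays and the real parse method may be organised differently.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical table-driven line parser built from the documented tables.
final class LineParserSketch {
    // Final and error state numbers as documented for c_final.
    static final int F2 = 16, F1 = 17, E1 = 18, E2 = 19, E3 = 20, E4 = 21;

    // Rows: states 0..13. Columns: EOS, lws, equals, quote, backslash, else.
    static final int[][] STATE = {
        {F1,  0, E2,  3, E2,  1},  //  0 skip initial ws
        {E3,  2, E2, E2, E2,  1},  //  1 LFN w/o quotes
        {E3,  2, E2,  6, E2,  5},  //  2 skip ws between LFN and PFN
        {E4,  3,  3,  2,  4,  3},  //  3 LFN in quotes
        {E4,  3,  3,  3,  3,  3},  //  4 LFN backslash escape
        {F2,  8, E2, E2, E2,  5},  //  5 PFN w/o quotes
        {E4,  6,  6,  8,  7,  6},  //  6 PFN in quotes
        {E4,  6,  6,  6,  6,  6},  //  7 PFN backslash escape
        {F2,  8, E2, E2, E2,  9},  //  8 skip ws before attributes
        {E1, E2, 10, E2, E2,  9},  //  9 attribute key
        {E1, E2, E2, 12, E2, 11},  // 10 equals sign
        {F2,  8, E2, E2, E2, 11},  // 11 value w/o quotes
        {E4, 12, 12,  8, 13, 12},  // 12 value in quotes
        {E4, 12, 12, 12, 12, 12},  // 13 value backslash escape
    };

    // Actions: 0 no-op, 1 append to sb, 2 lfn := sb, 3 pfn := sb,
    // 4 key := sb, 5 value := sb.
    static final int[][] ACTION = {
        {0, 0, 0, 0, 0, 1}, {0, 2, 0, 0, 0, 1}, {0, 0, 0, 0, 0, 1},
        {0, 1, 1, 2, 0, 1}, {0, 1, 1, 1, 1, 1}, {3, 3, 0, 0, 0, 1},
        {0, 1, 1, 3, 0, 1}, {0, 1, 1, 1, 1, 1}, {0, 0, 0, 0, 0, 1},
        {0, 0, 4, 0, 0, 1}, {0, 0, 0, 0, 0, 1}, {5, 5, 0, 0, 0, 1},
        {0, 1, 1, 5, 0, 1}, {0, 1, 1, 1, 1, 1},
    };

    String lfn;
    String pfn;
    final Map<String, String> attrs = new LinkedHashMap<>();

    // Maps a character (or -1 for end of string) to its table column.
    static int charClass(int ch) {
        switch (ch) {
            case -1:   return 0;
            case ' ':
            case '\t': return 1;
            case '=':  return 2;
            case '"':  return 3;
            case '\\': return 4;
            default:   return 5;
        }
    }

    // Runs the automaton over one line and returns the final state number.
    int parse(String line) {
        StringBuilder sb = new StringBuilder();
        String key = null;
        int state = 0;
        for (int i = 0; state < F2; i++) {
            int ch = i < line.length() ? line.charAt(i) : -1;
            int cls = charClass(ch);
            switch (ACTION[state][cls]) {
                case 1: sb.append((char) ch); break;
                case 2: lfn = sb.toString(); sb.setLength(0); break;
                case 3: pfn = sb.toString(); sb.setLength(0); break;
                case 4: key = sb.toString(); sb.setLength(0); break;
                case 5: attrs.put(key, sb.toString()); sb.setLength(0); break;
                default: break;
            }
            state = STATE[state][cls];
        }
        return state;
    }
}
```

Every end-of-string column leads to a final or error state, so the loop stops at the latest one step past the last character. A return value of F2 means a valid record, F1 an empty line, and E1 through E4 one of the documented errors.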
 


c_action

private static final short[][] c_action
Contains the actions to perform upon each state transition including transition into self state.
    |   |
 ---+---+-------------------------------------------
  - | 0 | no op
  S*| 1 | append to sb
  Fl| 2 | lfn := sb
  Fp| 3 | pfn := sb
  Fk| 4 | key := sb
  Fv| 5 | value := sb
 

Constructor Detail

SimpleFile

public SimpleFile()
Default empty constructor creates an object that is not yet connected to any database. You must use support methods to connect before this instance becomes usable.

See Also:
connect( Properties )
Method Detail

parse

public boolean parse(String line,
                     int lineno)
Parses a line from the file replica catalog.

Parameters:
line - is the line to parse
lineno - is the line number of this line
Returns:
true if a valid element was generated

connect

public boolean connect(String filename)
Reads the on-disk map file into memory.

Parameters:
filename - is the name of the file to read.
Returns:
true, if the in-memory data structures appear sound.

connect

public boolean connect(Properties props)
Establishes a connection to the database from the properties. You will need to specify a "file" property to point to the location of the on-disk instance. If the property "quote" is set to a true value, LFNs and PFNs are always quoted. By default, and if false, LFNs and PFNs are only quoted as necessary.

Specified by:
connect in interface Catalog
Parameters:
props - is the property table with sufficient settings to establish a link with the database.
Returns:
true if connected, false if failed to connect.
Throws:
Error - subclasses for runtime errors in the class loader.
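
As a hedged illustration of the property table described above, the sketch below builds a Properties object with the two documented keys. The class name ConnectPropsSketch, the helper catalogProps and the path /tmp/rc.data are hypothetical, and it is an assumption that the string "true" counts as a true value for "quote". The call into SimpleFile is shown only as a comment because it needs the Pegasus classes on the classpath.

```java
import java.util.Properties;

// Hypothetical helper that prepares the property table for connect(Properties).
final class ConnectPropsSketch {
    static Properties catalogProps(String path, boolean alwaysQuote) {
        Properties props = new Properties();
        props.setProperty("file", path);                          // on-disk instance
        props.setProperty("quote", Boolean.toString(alwaysQuote)); // quoting mode
        return props;
    }

    public static void main(String[] args) {
        Properties props = catalogProps("/tmp/rc.data", false);
        // With Pegasus available, the table would be handed over like this:
        //   SimpleFile rc = new SimpleFile();
        //   boolean ok = rc.connect(props);
        System.out.println(props.getProperty("file"));
    }
}
```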

quote

public String quote(Escape e,
                    String s)
Quotes a string only if necessary. This method first determines whether a string requires quoting because it contains whitespace, an equality sign, quotes, or a backslash. If not, the string is not quoted. If the input contains any of these forbidden characters, it is placed into quotes, and quotes and backslashes are backslash-escaped.

However, if the property "quote" had a true value when connecting to the database, output will always be quoted.

Parameters:
e - is the Escape instance used to escape strings.
s - is the string that may require quoting
Returns:
either the original string, or a newly allocated instance to an escaped string.
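
The decision described above can be sketched in a few lines. This is an illustration under assumptions, not the actual method: the real signature takes an Escape instance, whereas the hypothetical QuoteSketch.quote below does its own escaping and receives the always-quote mode as a parameter.

```java
// Hypothetical stand-in for the documented quoting decision; not part of Pegasus.
final class QuoteSketch {
    static String quote(String s, boolean alwaysQuote) {
        boolean needed = alwaysQuote;
        for (int i = 0; !needed && i < s.length(); i++) {
            char c = s.charAt(i);
            needed = Character.isWhitespace(c) || c == '=' || c == '"' || c == '\\';
        }
        if (!needed) {
            return s; // the original instance is returned untouched
        }
        StringBuilder sb = new StringBuilder(s.length() + 2).append('"');
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '"' || c == '\\') {
                sb.append('\\'); // backslash-escape quote and backslash
            }
            sb.append(c);
        }
        return sb.append('"').toString();
    }
}
```

With alwaysQuote set to false, a plain name such as f.a comes back unchanged, while a=b is wrapped in quotes because of the equality sign.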

close

public void close()
This operation will dump the in-memory representation back onto disk. The store operation is strict in what it produces. The LFN and PFN are quoted only if they require it because they contain special characters. The attribute values are always quoted and thus quote-escaped.

Specified by:
close in interface Catalog

isClosed

public boolean isClosed()
Predicate to check whether the connection with the catalog's implementation is still active. This helps determine whether it makes sense to call close().

Specified by:
isClosed in interface Catalog
Returns:
true, if the implementation is disassociated, false otherwise.
See Also:
close()

lookup

public String lookup(String lfn,
                     String handle)
Retrieves the entry for a given filename and site handle from the replica catalog.

Specified by:
lookup in interface ReplicaCatalog
Parameters:
lfn - is the logical filename to obtain information for.
handle - is the resource handle to obtain entries for.
Returns:
the (first) matching physical filename, or null if no match was found.
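
A minimal sketch of the first-match semantics, assuming an in-memory map from LFN to a list of entries and the site attribute key "pool" mentioned in the class description. LookupSketch and its nested Entry type are hypothetical stand-ins, not the real ReplicaCatalogEntry.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of lookup(lfn, handle); not the Pegasus code.
final class LookupSketch {
    static final class Entry {
        final String pfn;
        final Map<String, String> attrs;

        Entry(String pfn, String pool) {
            this.pfn = pfn;
            this.attrs = Collections.singletonMap("pool", pool);
        }
    }

    // Returns the PFN of the first entry whose "pool" attribute equals handle.
    static String lookup(Map<String, List<Entry>> catalog, String lfn, String handle) {
        List<Entry> entries = catalog.get(lfn);
        if (entries == null) {
            return null; // unknown LFN
        }
        for (Entry e : entries) {
            if (handle.equals(e.attrs.get("pool"))) {
                return e.pfn;
            }
        }
        return null; // LFN known, but not at this site
    }
}
```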

lookup

public Collection lookup(String lfn)
Retrieves all entries for a given LFN from the replica catalog. Each entry in the result set is a tuple of a PFN and all its attributes.

Specified by:
lookup in interface ReplicaCatalog
Parameters:
lfn - is the logical filename to obtain information for.
Returns:
a collection of replica catalog entries
See Also:
ReplicaCatalogEntry

lookupNoAttributes

public Set lookupNoAttributes(String lfn)
Retrieves all entries for a given LFN from the replica catalog. Each entry in the result set is just a PFN string. Duplicates are reduced through the set paradigm.

Specified by:
lookupNoAttributes in interface ReplicaCatalog
Parameters:
lfn - is the logical filename to obtain information for.
Returns:
a set of PFN strings
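
The effect of the set paradigm can be shown with a short sketch: entries that share a PFN but differ in attributes collapse into a single string. PfnSetSketch and its Entry type are hypothetical, and whether such duplicates can occur in a given catalog file is not stated here.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration of reducing entries to a set of PFN strings.
final class PfnSetSketch {
    static final class Entry {
        final String pfn;
        final String pool;

        Entry(String pfn, String pool) {
            this.pfn = pfn;
            this.pool = pool;
        }
    }

    // Drops all attributes and keeps each PFN once, in encounter order.
    static Set<String> pfns(List<Entry> entries) {
        Set<String> result = new LinkedHashSet<>();
        for (Entry e : entries) {
            result.add(e.pfn);
        }
        return result;
    }
}
```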

lookup

public Map lookup(Set lfns)
Retrieves multiple entries for a given logical filename, up to the complete catalog. Retrieving full catalogs can be harmful, but may be helpful in an online display or portal.

Specified by:
lookup in interface ReplicaCatalog
Parameters:
lfns - is a set of logical filename strings to look up.
Returns:
a map indexed by the LFN. Each value is a collection of replica catalog entries for the LFN.
See Also:
org.griphyn.common.catalog.ReplicaCatalogEntry

lookupNoAttributes

public Map lookupNoAttributes(Set lfns)
Retrieves multiple entries for a given logical filename, up to the complete catalog. Retrieving full catalogs can be harmful, but may be helpful in an online display or portal.

Specified by:
lookupNoAttributes in interface ReplicaCatalog
Parameters:
lfns - is a set of logical filename strings to look up.
Returns:
a map indexed by the LFN. Each value is a set of PFN strings.

lookup

public Map lookup(Set lfns,
                  String handle)
Retrieves multiple entries for a given logical filename, up to the complete catalog. Retrieving full catalogs can be harmful, but may be helpful in an online display or portal.

Specified by:
lookup in interface ReplicaCatalog
Parameters:
lfns - is a set of logical filename strings to look up.
handle - is the resource handle, restricting the LFNs.
Returns:
a map indexed by the LFN. Each value is a collection of replica catalog entries (all attributes).
See Also:
ReplicaCatalogEntry

lookupNoAttributes

public Map lookupNoAttributes(Set lfns,
                              String handle)
Retrieves multiple entries for a given logical filename, up to the complete catalog. Retrieving full catalogs can be harmful, but may be helpful in an online display or portal.

Specified by:
lookupNoAttributes in interface ReplicaCatalog
Parameters:
lfns - is a set of logical filename strings to look up.
handle - is the resource handle, restricting the LFNs.
Returns:
a map indexed by the LFN. Each value is a set of physical filenames.

lookup

public Map lookup(Map constraints)
Retrieves multiple entries for a given logical filename, up to the complete catalog. Retrieving full catalogs can be harmful, but may be helpful in an online display or portal.

Specified by:
lookup in interface ReplicaCatalog
Parameters:
constraints - is mapping of keys 'lfn', 'pfn', or any attribute name, e.g. the resource handle 'pool', to a string that has some meaning to the implementing system. This can be a SQL wildcard for queries, or a regular expression for Java-based memory collections. Unknown keys are ignored. Using an empty map requests the complete catalog.
Returns:
a map indexed by the LFN. Each value is a collection of replica catalog entries.
See Also:
ReplicaCatalogEntry

list

public Set list()
Lists all logical filenames in the catalog.

Specified by:
list in interface ReplicaCatalog
Returns:
A set of all logical filenames known to the catalog.

list

public Set list(String constraint)
Lists a subset of all logical filenames in the catalog.

Specified by:
list in interface ReplicaCatalog
Parameters:
constraint - is a constraint for the logical filename only. It is a string that has some meaning to the implementing system. This can be a SQL wildcard for queries, or a regular expression for Java-based memory collections.
Returns:
A set of logical filenames that match. The set may be empty.
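
For a Java-based memory collection, the constraint can be treated as a regular expression. The sketch below assumes that the expression must match the whole LFN; whether this implementation anchors the match that way is not stated here. ListSketch is a hypothetical name.

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Pattern;

// Hypothetical regex filter over the LFN keys of an in-memory catalog.
final class ListSketch {
    static Set<String> list(Set<String> lfns, String constraint) {
        Pattern pattern = Pattern.compile(constraint);
        Set<String> result = new TreeSet<>();
        for (String lfn : lfns) {
            if (pattern.matcher(lfn).matches()) { // full match, an assumption
                result.add(lfn);
            }
        }
        return result; // may be empty
    }
}
```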

insert

public int insert(String lfn,
                  ReplicaCatalogEntry tuple)
Inserts a new mapping into the replica catalog. Any existing mapping of the same LFN and PFN will be replaced, including all its attributes.

Specified by:
insert in interface ReplicaCatalog
Parameters:
lfn - is the logical filename under which to book the entry.
tuple - is the physical filename and associated PFN attributes.
Returns:
number of insertions, should always be 1. On failure, throw an exception, don't use zero.
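
The replace-on-insert behaviour can be modelled as follows. This is a sketch over a hypothetical in-memory map (InsertSketch, Entry), not the class's own data structure: an existing entry with the same PFN under the same LFN is dropped before the new tuple is added, so its attributes are replaced wholesale.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical model of insert(lfn, tuple) with replace semantics.
final class InsertSketch {
    static final class Entry {
        final String pfn;
        final Map<String, String> attrs;

        Entry(String pfn, Map<String, String> attrs) {
            this.pfn = pfn;
            this.attrs = attrs;
        }
    }

    static int insert(Map<String, List<Entry>> catalog, String lfn, Entry tuple) {
        List<Entry> entries = catalog.computeIfAbsent(lfn, k -> new ArrayList<>());
        entries.removeIf(e -> e.pfn.equals(tuple.pfn)); // drop the old mapping, if any
        entries.add(tuple);
        return 1; // one insertion, whether or not something was replaced
    }
}
```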

insert

public int insert(String lfn,
                  String pfn,
                  String handle)
Inserts a new mapping into the replica catalog. This is a convenience function exposing the resource handle. Internally, the ReplicaCatalogEntry element will be constructed, and passed to the appropriate insert function.

Specified by:
insert in interface ReplicaCatalog
Parameters:
lfn - is the logical filename under which to book the entry.
pfn - is the physical filename associated with it.
handle - is a resource handle where the PFN resides.
Returns:
number of insertions, should always be 1. On failure, throw an exception, don't use zero.
See Also:
insert( String, ReplicaCatalogEntry ), ReplicaCatalogEntry

insert

public int insert(Map x)
Inserts multiple mappings into the replica catalog. The input is a map indexed by the LFN. The value for each LFN key is a collection of replica catalog entries. Note that this operation will replace existing entries.

Specified by:
insert in interface ReplicaCatalog
Parameters:
x - is a map from logical filename string to list of replica catalog entries.
Returns:
the number of insertions.
See Also:
org.griphyn.common.catalog.ReplicaCatalogEntry

delete

public int delete(String lfn,
                  String pfn)
Deletes a specific mapping from the replica catalog. The resource handle is ignored. More than one entry could theoretically be removed. Upon removal of an entry, all attributes associated with the PFN also evaporate (cascading deletion).

Specified by:
delete in interface ReplicaCatalog
Parameters:
lfn - is the logical filename in the tuple.
pfn - is the physical filename in the tuple.
Returns:
the number of removed entries.

delete

public int delete(Map x,
                  boolean matchAttributes)
Deletes multiple mappings from the replica catalog. The input is a map indexed by the LFN. The value for each LFN key is a collection of replica catalog entries. If matchAttributes is false, every entry whose LFN and PFN match an entry in the map is deleted, regardless of its attributes. Upon removal of an entry, all attributes associated with the PFN also evaporate (cascading deletion).

Specified by:
delete in interface ReplicaCatalog
Parameters:
x - is a map from logical filename string to list of replica catalog entries.
matchAttributes - whether mapping should be deleted only if all attributes match.
Returns:
the number of deletions.
See Also:
ReplicaCatalogEntry

matchMe

private boolean matchMe(ReplicaCatalogEntry full,
                        ReplicaCatalogEntry part)
Checks whether all keys in the partial replica catalog entry are contained in the full replica catalog entry.

Parameters:
full - is the full entry to check against.
part - is the partial entry to check with.
Returns:
true, if contained, false if not contained.
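
A hedged sketch of the containment test over plain attribute maps follows. The method containsKeys mirrors the description above; containsEntries additionally compares the values, which is an assumption about what a useful match would need and is not confirmed by this page. MatchSketch and both method names are hypothetical.

```java
import java.util.Map;
import java.util.Objects;

// Hypothetical containment checks between a partial and a full attribute map.
final class MatchSketch {
    // True if every key of part also occurs in full.
    static boolean containsKeys(Map<String, String> full, Map<String, String> part) {
        return full.keySet().containsAll(part.keySet());
    }

    // Stricter variant (an assumption): keys must exist and values must be equal.
    static boolean containsEntries(Map<String, String> full, Map<String, String> part) {
        for (Map.Entry<String, String> e : part.entrySet()) {
            if (!full.containsKey(e.getKey())
                    || !Objects.equals(full.get(e.getKey()), e.getValue())) {
                return false;
            }
        }
        return true;
    }
}
```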

delete

public int delete(String lfn,
                  ReplicaCatalogEntry tuple)
Deletes a very specific mapping from the replica catalog. The LFN, the PFN, and all PFN attributes specified in the replica catalog entry must match. More than one entry could theoretically be removed. Upon removal of an entry, all attributes associated with the PFN also evaporate (cascading deletion).

Specified by:
delete in interface ReplicaCatalog
Parameters:
lfn - is the logical filename in the tuple.
tuple - is a description of the PFN and its attributes.
Returns:
the number of removed entries, either 0 or 1.

hasMatchingAttr

private boolean hasMatchingAttr(ReplicaCatalogEntry rce,
                                String name,
                                Object value)
Looks for a match of an attribute value in a replica catalog entry.

Parameters:
rce - is the replica catalog entry
name - is the attribute key to match
value - is the value to match against
Returns:
true, if a match was found.

delete

public int delete(String lfn,
                  String name,
                  Object value)
Deletes all PFN entries for a given LFN from the replica catalog where the PFN attribute is found, and matches exactly the object value. This method may be useful to remove all replica entries that have a certain MD5 sum associated with them. It may also be harmful overkill.

Specified by:
delete in interface ReplicaCatalog
Parameters:
lfn - is the logical filename to look for.
name - is the PFN attribute name to look for.
value - is an exact match of the attribute value to match.
Returns:
the number of removed entries.

deleteByResource

public int deleteByResource(String lfn,
                            String handle)
Deletes all PFN entries for a given LFN from the replica catalog where the resource handle is found. Karan requested this convenience method, which can be coded like
  delete( lfn, RESOURCE_HANDLE, handle )
 

Specified by:
deleteByResource in interface ReplicaCatalog
Parameters:
lfn - is the logical filename to look for.
handle - is the resource handle
Returns:
the number of entries removed.
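
The delegation mentioned above can be sketched over a hypothetical in-memory map. It is assumed here that RESOURCE_HANDLE resolves to the site attribute key "pool" named in the class description; DeleteByResourceSketch and Entry are illustrative names, not Pegasus types.

```java
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical model of deleteByResource delegating to the generic delete.
final class DeleteByResourceSketch {
    static final String RESOURCE_HANDLE = "pool"; // assumed value of the constant

    static final class Entry {
        final String pfn;
        final Map<String, String> attrs;

        Entry(String pfn, Map<String, String> attrs) {
            this.pfn = pfn;
            this.attrs = attrs;
        }
    }

    // Removes entries of lfn whose attribute name exists and equals value.
    static int delete(Map<String, List<Entry>> catalog, String lfn, String name, Object value) {
        List<Entry> entries = catalog.get(lfn);
        if (entries == null) {
            return 0;
        }
        int before = entries.size();
        entries.removeIf(e -> e.attrs.containsKey(name)
                && Objects.equals(value, e.attrs.get(name)));
        return before - entries.size();
    }

    static int deleteByResource(Map<String, List<Entry>> catalog, String lfn, String handle) {
        return delete(catalog, lfn, RESOURCE_HANDLE, handle);
    }
}
```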

remove

public int remove(String lfn)
Removes all mappings for an LFN from the replica catalog.

Specified by:
remove in interface ReplicaCatalog
Parameters:
lfn - is the logical filename to remove all mappings for.
Returns:
the number of removed entries.

remove

public int remove(Set lfns)
Removes all mappings for a set of LFNs.

Specified by:
remove in interface ReplicaCatalog
Parameters:
lfns - is a set of logical filenames to remove all mappings for.
Returns:
the number of removed entries.
See Also:
remove( String )

removeByAttribute

public int removeByAttribute(String name,
                             Object value)
Removes all entries from the replica catalog where the PFN attribute is found, and matches exactly the object value.

Specified by:
removeByAttribute in interface ReplicaCatalog
Parameters:
name - is the PFN attribute key to look for.
value - is an exact match of the attribute value to match.
Returns:
the number of removed entries.

removeByAttribute

public int removeByAttribute(String handle)
Removes all entries associated with a particular resource handle. This is useful, if a site goes offline. It is a convenience method, which calls the generic removeByAttribute method.

Specified by:
removeByAttribute in interface ReplicaCatalog
Parameters:
handle - is the site handle to remove all entries for.
Returns:
the number of removed entries.
See Also:
removeByAttribute( String, Object )

clear

public int clear()
Removes everything. Use with caution!

Specified by:
clear in interface ReplicaCatalog
Returns:
the number of removed entries.


Copyright © 2011 The University of Southern California. All Rights Reserved.