org.forester.io.parsers.nhx
Class NHXParser

java.lang.Object
  extended by org.forester.io.parsers.nhx.NHXParser
All Implemented Interfaces:
PhylogenyParser

public final class NHXParser
extends java.lang.Object
implements PhylogenyParser


Field Summary
static java.util.regex.Pattern MB_BL_PATTERN
           
static java.util.regex.Pattern MB_PROB_PATTERN
           
static java.util.regex.Pattern MB_PROB_SD_PATTERN
           
static java.util.regex.Pattern NUMBERS_ONLY_PATTERN
           
static boolean REPLACE_UNDERSCORES_DEFAULT
           
static PhylogenyMethods.TAXONOMY_EXTRACTION TAXONOMY_EXTRACTION_DEFAULT
           
static java.util.regex.Pattern UC_LETTERS_NUMBERS_PATTERN
           
 
Constructor Summary
NHXParser()
           
 
Method Summary
 PhylogenyMethods.TAXONOMY_EXTRACTION getTaxonomyExtraction()
           
 boolean hasNext()
           
 Phylogeny[] parse()
          Parses the source previously set with setSource(java.lang.Object).
 Phylogeny parseNext()
           
static void parseNHX(java.lang.String s, PhylogenyNode node_to_annotate, PhylogenyMethods.TAXONOMY_EXTRACTION taxonomy_extraction, boolean replace_underscores)
           
 void setGuessRootedness(boolean guess_rootedness)
           
 void setIgnoreQuotes(boolean ignore_quotes)
           
 void setReplaceUnderscores(boolean replace_underscores)
           
 void setSource(java.lang.Object nhx_source)
          This sets the source to be parsed.
 void setTaxonomyExtraction(PhylogenyMethods.TAXONOMY_EXTRACTION taxonomy_extraction)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

TAXONOMY_EXTRACTION_DEFAULT

public static final PhylogenyMethods.TAXONOMY_EXTRACTION TAXONOMY_EXTRACTION_DEFAULT

REPLACE_UNDERSCORES_DEFAULT

public static final boolean REPLACE_UNDERSCORES_DEFAULT
See Also:
Constant Field Values

UC_LETTERS_NUMBERS_PATTERN

public static final java.util.regex.Pattern UC_LETTERS_NUMBERS_PATTERN

NUMBERS_ONLY_PATTERN

public static final java.util.regex.Pattern NUMBERS_ONLY_PATTERN

MB_PROB_PATTERN

public static final java.util.regex.Pattern MB_PROB_PATTERN

MB_PROB_SD_PATTERN

public static final java.util.regex.Pattern MB_PROB_SD_PATTERN

MB_BL_PATTERN

public static final java.util.regex.Pattern MB_BL_PATTERN
Constructor Detail

NHXParser

public NHXParser()
Method Detail

getTaxonomyExtraction

public PhylogenyMethods.TAXONOMY_EXTRACTION getTaxonomyExtraction()

hasNext

public boolean hasNext()

parse

public Phylogeny[] parse()
                  throws java.io.IOException,
                         NHXFormatException
Parses the source previously set with setSource(java.lang.Object) and returns the phylogenies found in it as a Phylogeny[]. Everything between '[' and ']' is treated as a comment and ignored, with two exceptions: NHX tags ("[&&NHX... ]") and a bracketed bootstrap value following a branch length (":digits and/or.[bootstrap]").

Specified by:
parse in interface PhylogenyParser
Returns:
Phylogeny[]
Throws:
java.io.IOException
NHXFormatException
PhylogenyParserException
See Also:
setSource(java.lang.Object), PhylogenyParser.parse()
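
The bracket rules stated for parse() and setSource(java.lang.Object) can be illustrated with a small, self-contained sketch. It does not use NHXParser and is not the parser's actual implementation. The class BracketClassifier, its Kind enum, and the NUMBERS_ONLY regular expression are hypothetical names introduced for illustration only; in particular, the real definition of NUMBERS_ONLY_PATTERN is not shown on this page, so the pattern below is an assumption.

```java
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only; not part of org.forester.
final class BracketClassifier {

    enum Kind { NHX_TAG, SUPPORT_VALUE, IGNORED_AMPERSAND, COMMENT }

    // Assumption: "numbers only" means digits with an optional decimal part.
    private static final Pattern NUMBERS_ONLY = Pattern.compile("[0-9]+(\\.[0-9]+)?");

    // Classifies one bracketed segment, e.g. "[90]" or "[&&NHX:S=Varanus_storri]".
    static Kind classify(String segment) {
        if (segment.length() < 2 || !segment.startsWith("[") || !segment.endsWith("]")) {
            throw new IllegalArgumentException("not a bracketed segment: " + segment);
        }
        String inner = segment.substring(1, segment.length() - 1);
        if (inner.startsWith("&&NHX")) {
            return Kind.NHX_TAG;            // NHX tags are kept and interpreted
        }
        if (inner.startsWith("&")) {
            return Kind.IGNORED_AMPERSAND;  // e.g. "[& some info]", ignored at node level
        }
        if (NUMBERS_ONLY.matcher(inner).matches()) {
            return Kind.SUPPORT_VALUE;      // e.g. "[90]"
        }
        return Kind.COMMENT;                // everything else is a comment
    }

    public static void main(String[] args) {
        String[] samples = {
            "[this is a comment]", "[&&NHX:S=Varanus_storri]", "[& some info]", "[90]"
        };
        for (String s : samples) {
            System.out.println(s + " -> " + classify(s));
        }
    }
}
```

The order of the checks matters in this sketch: the "&&NHX" prefix is tested before the generic "&" prefix, so NHX tags are not mistaken for ignored "[& ...]" sequences.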

parseNext

public Phylogeny parseNext()
                    throws java.io.IOException,
                           NHXFormatException
Throws:
java.io.IOException
NHXFormatException

setGuessRootedness

public void setGuessRootedness(boolean guess_rootedness)

setIgnoreQuotes

public void setIgnoreQuotes(boolean ignore_quotes)

setReplaceUnderscores

public void setReplaceUnderscores(boolean replace_underscores)

setSource

public void setSource(java.lang.Object nhx_source)
               throws PhylogenyParserException,
                      java.io.IOException
This sets the source to be parsed. The source can be a String, StringBuffer, char[], File, or InputStream, and it can contain more than one phylogeny in either New Hampshire (NH) or New Hampshire Extended (NHX) format. There is no need to separate phylogenies with any special character. White space is always ignored, as are semicolons between phylogenies. Example of a source describing two phylogenies (the source is a String in this example): "(A,(B,(C,(D,E)de)cde)bcde)abcde ((((A,B)ab,C)abc,D)abcd,E)abcde".
Everything between a '[' that is followed by any character other than '&' and the closing ']' is considered a comment and ignored (example: "[this is a comment]"). NHX tags are surrounded by '[&&NHX' and ']' (example: "[&&NHX:S=Varanus_storri]"). A sequence like "[& some info]" is ignored, too (at the PhylogenyNode level, though). Exception: brackets containing only numbers (e.g. [90]) are interpreted as support values.

Specified by:
setSource in interface PhylogenyParser
Parameters:
nhx_source - the source to be parsed (String, StringBuffer, char[], File, or InputStream)
Throws:
java.io.IOException
PhylogenyParserException
See Also:
parse(), PhylogenyParser.setSource(java.lang.Object)
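
To make the "no separator needed" rule concrete, the following sketch splits a source string into individual tree strings by tracking parenthesis depth. It is a simplified illustration under stated assumptions, not the logic NHXParser uses: the class NhSourceSplitter is hypothetical, and the sketch assumes balanced parentheses and does not handle quoted names or bracketed comments that contain parentheses or semicolons.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only; not part of org.forester.
final class NhSourceSplitter {

    // Splits a source holding several NH/NHX trees into one string per tree.
    static List<String> split(String source) {
        List<String> trees = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int depth = 0; // number of currently open parentheses
        for (int i = 0; i < source.length(); i++) {
            char c = source.charAt(i);
            if (Character.isWhitespace(c)) {
                continue; // white space is always ignored
            }
            // At depth 0, a ';' or a new '(' ends the tree collected so far.
            if (depth == 0 && (c == ';' || c == '(')) {
                if (current.length() > 0) {
                    trees.add(current.toString());
                    current.setLength(0);
                }
                if (c == ';') {
                    continue; // semicolons between trees are dropped
                }
            }
            if (c == '(') {
                depth++;
            } else if (c == ')') {
                depth--;
            }
            current.append(c);
        }
        if (current.length() > 0) {
            trees.add(current.toString()); // last tree without a trailing ';'
        }
        return trees;
    }

    public static void main(String[] args) {
        String source = "(A,(B,(C,(D,E)de)cde)bcde)abcde ((((A,B)ab,C)abc,D)abcd,E)abcde";
        for (String tree : split(source)) {
            System.out.println(tree);
        }
    }
}
```

Applied to the two-phylogeny example above, the sketch yields two strings, one per tree, because the second opening parenthesis is encountered after the first tree's parentheses have all been closed.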

setTaxonomyExtraction

public void setTaxonomyExtraction(PhylogenyMethods.TAXONOMY_EXTRACTION taxonomy_extraction)

parseNHX

public static void parseNHX(java.lang.String s,
                            PhylogenyNode node_to_annotate,
                            PhylogenyMethods.TAXONOMY_EXTRACTION taxonomy_extraction,
                            boolean replace_underscores)
                     throws NHXFormatException,
                            PhyloXmlDataFormatException
Throws:
NHXFormatException
PhyloXmlDataFormatException