main function defines the long options in its global g_longOpts
array. The g_longOpts and g_longEnd global variables are not declared
for general use in a header file but are declared in class specific headers
when needed. Currently that's only scenario/scenario.ih.
int main(int argc, char **argv)
try
{
Arg const &arg = Arg::initialize("a:B:C:D:hi:n:oP:R:S:t:vV",
g_longOpts, g_longEnd, argc, argv);
arg.versionHelp(usage, Icmake::version, arg.option('o') ? 0 : 1);
Simulator simulator; // construct the simulator
simulator.run(); // and run it.
}
The simulations themselves are controlled by a Simulator, having only one
public member: run, performing the simulations.
Simulator is constructed by main. Its constructor determines the
source of the analysis specifications (setAnalysisSource).
The class defines the following data members:
bool d_next = false; // true if 'run()' should run
// an analysis
uint16_t d_lineNr = 1; // updated by 'fileAnalysis'
std::ifstream d_ifstream; // multiple analysis specs
std::string (Simulator::*d_nextSpecs)(); // ptr to function handling
// the (next) analysis spec.
If the command-line option -o was specified then analysis specifications
may be provided as command-line arguments, handled by the member
cmdLineAnalysis.
Otherwise a specification file must be provided as first command-line
argument. This file is read until a line begins with analysis:, which may
then be followed by specifications, read by the member fileAnalysis.
In both cases the Simulator's member d_next is set to true.
The member d_nextSpecs is set to either cmdLineAnalysis or
fileAnalysis. Eventually these members set d_next to false, ending
the analyses.
run member runs all analyses. At the construction d_next may be
set to true indicating that an analysis must be performed:
void Simulator::run()
{
while (d_next)
{
uint16_t lineNr = d_lineNr;
// read the next analysis specs
string spec = (this->*d_nextSpecs)();
Analysis analysis{ istringstream{ spec }, lineNr };
analysis.run(); // run the simulation
}
}
If a simulation must be performed then the non-default specifications are
provided by the member to which d_nextSpec points.
The actual simulation analysis is then performed by an Analysis object,
which also has a run public member performing the actual analysis.
Analysis class objects handle one simulation analysis. Since multiple
analyses are performed independently from each other each Analysis object
initializes its own error count (class Error) and option handling (class
Options.
Since the options may specify the name of the file containing the
analysis-parameters Analysis objects also define a configuration file
object (ConfFile).
In simrisc's scientific context simulation parameters are also known as
`scenarios'. Scenarios contain information like the number of
iterations to perform and the number of cases to simulate.
The analysis's Scenario object is initialized by
Analysis's constructor, and used by its run member.
Here is an overview of Analysis's data members:
// for this analysis:
Error d_error; // error handling
Options d_options; // options
ConfFile d_confFile; // configuration (parameters) file
Scenario d_scenario; // performs one (1) simulation
The class Analysis uses the classes Scenario, Loop, ConfFile,
Options and Error, which are covered in separate sections.
run member may immediately end if errors were encountered in the
specifications of the Scanario and/or ConfigFile parameters.
One of the options defines the base directory into which the output is
written. The member requireBase checks whether the base directory exists,
and if not creates it. If that fails the program ends, showing the message
Cannot create base directory ...
where ... is replaced by the name of the base directory provided by the
Options object.
If all's well, the actual simulation is then performed by a Loop object.
Scenario class objects are defined by Analysis class objects and
contain parameter values of the simulation to perform. For each separate
analysis a new Scenario object is constructed.
Most of its members are accessors: const members returning values of
scenario parameters.
Some parameter values are stored in the Scenario object itself:
d_iterations, the number of iterations to perform (defaults to 1);
d_labels, a vector of strings containing the values of label:
lines that may be specified in analysis: sections to briefly
describe the essence of simulations;
d_nWomen, the number of cases to simulate (defaults to 100). This
name was used in the original versions of the simrisc program, and may be
changed to d_nCases in future versions. For now it is kept;
d_seed, the initial seed of the random number generator(s);
d_spread, when set to true some (otherwise fixed) configuration
parameters may show random fluctuations.
Configuration parameters start with identifying names, like costs: or
screeningRounds:. Those names are then followed by the parameter's
specifications. Those specifications are made available by the members
value, returning the value of a numeric parameter;
lines, containing iterators to the lines in a ConfFile identified
by the identifying name;
find, returning the iterator to a single line identified by the
identifying name.
Loop is the `workhorse' class. It performs the simulation as
specified by the Scenario object which is passed to its objects when they
are constructed. The class Loop uses many other classes, most of which
encapsulate parameters specified in the configuration file. Those classes read
configuration file parameters and prepare these parameters for their use by
Loop.
The constructor of a Loop object defines the following objects:
d_costs: a Costs object providing the treatment costs
parameters (cf. section 3.1);
d_densities: a Densities object providing the breast-densities
parameters (cf. section 3.2);
d_modalities: a Modalities object providing the details about
which screening modalities are used in a specific simulation
(cf. section 3.3);
d_screening: a Screening object providing screening parameters
(cf. section 3.4);
d_tumor: a Tumor object handling the simulation related to the
occurrence of tumors (cf. section 3.5);
d_tumorInfo: a TumorInfo object providing tumor-related
configuration parameters (like incidence, growth and survival
parameters. Cf. section 3.6);
It also defines a Random object (d_random), generating random numbers
(cf. section 4.1).
iterate member performs the number of iterations (scenario:
iterations). At each iteration
TumorInfo (cf. section 3.6) parameters may be
refreshed;
Loop's counters are reset:
d_totalCost, the total simulated treatment cost;
d_sumDeathAge, the total simulated dying ages;
d_nIntervals,;
d_nRoundFP, the number of false positive diagnoses per age round;
d_nRoundFN, the number of false negative diagnoses per age round;
d_nDetections, the number of self detected cancers per age round;
d_roundCost, the total treatment costs per age round;
d_modalities, the Modalities counters.
womenLoop;
womenLoop).
womenLoop performs one complete iteration. The option
nCases may be used to analyze a specific number of cases, and then only
write the data of the final case to file. By default the number of cases
specified in `scenario nWomen' are simulated.
womenLoop initializes the random number generator with the current
scenario seed. Then it iterates over all cases. For each case:
ALIVE
if (Nscr > 0 && (naturalDeathAge < 1st screening age || (tumor present
&& tumor.selfDetectAge() < 1st screening age)))
No prescreening is therefore performed by complementing this expression:
not (Nscr > 0 && (naturalDeathAge < 1st screening age || (tumor present
&& tumor.selfDetectAge() < 1st screening age)))
or (using De Morgan's rule: a && b == !a || !b):
not (Nscr > 0) or
not (naturalDeathAge < 1st screening age || (tumor present
&& tumor.selfDetectAge() < 1st screening age))
So no screening is performed if there are no screening rounds (not (Nscr > 0)) and also if
not (naturalDeathAge < 1st screening age || (tumor present
&& tumor.selfDetectAge() < 1st screening age))
is true.
Applying the not-operator to the condition and using De Morgan's rule !(a || b) == !a && !b results in
naturalDeathAge >= 1st screening age and
not (tumor present &&
tumor.selfDetectAge() < 1st screening age)
Once more aplying De Morgan's rule finally results in:
naturalDeathAge >= 1st screening age and
(not tumor present or
tumor.selfDetectAge() >= 1st screening age)
This condition is tested and if true no pre-screening is performed.
When pre-screening is used:
If there's no tumor or if there is a tumor, but the case's natural death happens before the death cause by the tumor, then there is a natural induced death in the pre-screening phase.
Alternatively, there is a self-detected tumor. In that case death may happen before the tumor causes the case's dath (a natural death at the case's natural death age) or the death caused by the tumor may be before the case's natural dath, resulting in a tumor caused pre-screen death at the simulated death age caused by the tumor.
The screening phase ends if the case is no alive anymore. At each round, if the case is alive, then it may be that death occurs.
To determine whether death occurs the following checks are performed:
If the natural death age occurs before the tumor-caused death age the case dies of a natural death. Otherwise, the death is caused by the cancer at the tumor determined death age;
If death has not occurred so far, then the screening continues for the current screening age. At this point the modalities are considered. Each of the modalities configured for the current screening age is considered in turn. They are considered in their order of specification in the configuration file. E.g., when specifying
screeningRound: 50 Mammo MRI
screeningRound: 52 MRI Mammo
Mammo is considered before MRI at screening age 50, and MRI is considered
before Mammo at screening age 52.
For each modality it may be that it is not considered. Whether it is
considered is determined by comparing the configured attendance rate (e.g.,
.80) with the random value round nr * modality nr from the pool of
attendance random values
note: this computation also results in multiple uses of random values for different round and modality combinations. E.g., round 1 and modality 2 use the same random value as round 2 and modality 1. So far multiple modalities haven't been used, but here, too, the used attendance pool index should avoid multiply using random values
If the screening rate's value is less than the random value then screening is not used for the current case at this screening age
note: this looks a bit strange: I would have expected that an attendance rate of .80 would mean that there is a probability of .80 that a screening round is used. This implementation does the opposite: 20% of the screening rounds are used. In the original code I readif ( (rndPoolAttendance[ screeningRound*(modality+1) ] <= attendanceRate) && (( (modality == 0) && (scr[screeningRound].T0)) || ( (modality == 1) && (scr[screeningRound].T1))) )in which case screening is performed. Thus no screening is performed when applying the not-operator to this condition:not ( (rndPoolAttendance[ screeningRound*(modality+1) ] <= attendanceRate) && (( (modality == 0) && (scr[screeningRound].T0)) || ( (modality == 1) && (scr[screeningRound].T1))) )The checks for used modality have already been performed, so they can be omitted, resulting in:
rndPoolAttendance[ screeningRound*(modality+1) ] > attendanceRateor:
attendanceRate < rndPoolAttendance[ screeningRound*(modality+1) ]resulting in not using this screening round: here too a high attendance rate results in low probabilities that the screening round is used.
If a round is used then its costs are added to the costs so far, and if there is a tumor then its characteristics are determined.
Then, if there is a tumor and the current screening age is at least equal to the tumor's detectable age it may be detected; otherwise there may be a false positive.
Maybe detecting the tumor uses the betafunction. It returns true if the random sensitivity pool value (again: collisions are possible) is at most equal to the computed beta function value. As a reference, here is the beta function's implementation:
bool Loop::betaFunction(uint16_t modalityNr)
{
double diameter = d_tumor.diameter();
double mValue = d_screening.m().value();
double expBeta =
exp(
d_screening.beta(0).value() +
d_screening.beta(1).value() * diameter +
d_screening.beta(2).value() * mValue +
d_screening.beta(3).value() * mValue / (diameter * diameter)
);
return d_random.uniform() <= 0.9 * expBeta / (1 + expBeta);
}
If the beta function returns true then if the case's natural death age is less than the tumor cause dying age the case dies naturally.
note: which looks weird to me. The original code specifies
naturalDeathAge < tumorDeathAge, but if the tumor caused death age
clearly exceeds the case's natural death age why would the case then die?
Consider the situation where the screening age is 50, and the natural
death age is 83.5: why would the case die at the screening round for age
50?
When the beta function returns true and the case's natural death age exceeds the tumor cause dying age then death is observed during this screening round, and a false positive occurs if the pool's false positives random value for the current round and modality exceeds the modality's specificity (and again: multiply used random values may occur).
If there is a tumor ad the tumor self-detection ages is earlier than the case's natural death then it is concluded that there either is a natural death or the tumor's characteristics may have caused death before the natural death age.
If there is no tumor or if the tumor's self-detection age is after the case's natural death then there is a natural death at the case's natural death age. In this latter case (there is a tumor, but it hasn't caused death) the tumor's characteristics are also determined (cf. section 3.5.2).