MaCh3  2.5.0
Reference Guide
Public Member Functions | Private Member Functions | Private Attributes | List of all members
PredictiveThrower Class Reference

Implementation of Prior/Posterior Predictive and Bayesian p-Value calculations following the approach described in [11] , [12], [14]. More...

#include <Fitters/PredictiveThrower.h>

Inheritance diagram for PredictiveThrower:
[legend]
Collaboration diagram for PredictiveThrower:
[legend]

Public Member Functions

 PredictiveThrower (Manager *const fitMan)
 Constructor. More...
 
virtual ~PredictiveThrower ()
 Destructor. More...
 
void ProduceToys ()
 Produce toys by throwing from MCMC. More...
 
void RunPredictiveAnalysis ()
 Main routine responsible for producing posterior predictive distributions and $p$-value. More...
 
void RunMCMC () override
 This is not used in this class. More...
 
- Public Member Functions inherited from FitterBase
 FitterBase (Manager *const fitMan)
 Constructor. More...
 
virtual ~FitterBase ()
 Destructor for the FitterBase class. More...
 
void AddSampleHandler (SampleHandlerInterface *sample)
 This function adds a sample PDF object to the analysis framework. The sample PDF object will be utilized in fitting procedures or likelihood scans. More...
 
void AddSystObj (ParameterHandlerBase *cov)
 This function adds a Covariance object to the analysis framework. The Covariance object will be utilized in fitting procedures or likelihood scans. More...
 
void DragRace (const int NLaps=100)
 Calculates the required time for each sample or covariance object in a drag race simulation. Inspired by Dan's feature. More...
 
void RunLLHScan ()
 Perform a 1D likelihood scan. More...
 
void RunLLHMap ()
 Perform a general multi-dimensional likelihood scan. More...
 
void GetStepScaleBasedOnLLHScan (const std::string &filename="")
 LLH scan is good first estimate of step scale. More...
 
void Run2DLLHScan ()
 Perform a 2D likelihood scan. More...
 
void RunSigmaVar ()
 Perform a 1D/2D sigma var for all samples. More...
 
virtual void StartFromPreviousFit (const std::string &FitName)
 Allow to start from previous fit/chain. More...
 
std::string GetName () const
 Get name of class. More...
 

Private Member Functions

void SetParamters (std::vector< std::string > &ParameterGroupsNotVaried, std::unordered_set< int > &ParameterOnlyToVary)
 This set some params to prior value this way you can evaluate errors from subset of errors. More...
 
void SetupToyGeneration (std::vector< std::string > &ParameterGroupsNotVaried, std::unordered_set< int > &ParameterOnlyToVary, std::vector< const M3::float_t * > &BoundValuePointer, std::vector< std::pair< double, double >> &ParamBounds)
 Setup useful variables etc before stating toy generation. More...
 
bool LoadToys ()
 Load existing toys. More...
 
void WriteToy (TDirectory *ToyDirectory, TDirectory *Toy_1DDirectory, TDirectory *Toy_2DDirectory, const int iToy)
 Save histograms for a single MCMC Throw/Toy. More...
 
void SetupSampleInformation ()
 Setup sample information. More...
 
std::vector< std::string > GetStoredFancyName (ParameterHandlerBase *Systematics) const
 Get Fancy parameters stored in mcmc chains for passed ParameterHandler. More...
 
std::vector< std::unique_ptr< TH1 > > MakePredictive (const std::vector< std::vector< std::unique_ptr< TH1 >>> &Toys, const std::vector< TDirectory * > &Director, const std::string &suffix, const bool DebugHistograms, const bool WriteHist)
 Produce posterior predictive distribution. More...
 
void Study1DProjections (const std::vector< TDirectory * > &SampleDirectories) const
 Load 1D projections and later produce violin plots for each. More...
 
void ProduceSpectra (const std::vector< std::vector< std::vector< std::unique_ptr< TH1D >>>> &Toys, const std::vector< TDirectory * > &Director, const std::string suffix) const
 Produce Violin style spectra. More...
 
void MakeFluctuatedHistogram (TH1 *FluctHist, TH1 *PolyHist)
 Make Poisson fluctuation of TH1D hist. More...
 
void PredictiveLLH (const std::vector< std::unique_ptr< TH1 >> &Data_histogram, const std::vector< std::unique_ptr< TH1 >> &PostPred_mc, const std::vector< std::unique_ptr< TH1 >> &PostPred_w, const std::vector< TDirectory * > &SampleDir)
 Calculate Posterior Predictive LLH. More...
 
void PosteriorPredictivepValue (const std::vector< std::unique_ptr< TH1 >> &PostPred_mc, const std::vector< TDirectory * > &SampleDir)
 Calculate Posterior Predictive $p$-value Compares observed data to toy datasets generated from: More...
 
void ExtractLLH (TH1 *DatHist, TH1 *MCHist, TH1 *W2Hist, const SampleHandlerInterface *SampleHandler) const
 Calculate the LLH for TH1, set the LLH to title of MCHist. More...
 
double CalcLLH (const double data, const double mc, const double w2, const SampleHandlerInterface *SampleHandler) const
 Calculates the -2LLH (likelihood) for a single sample. More...
 
double CalcLLH (const TH1 *DatHist, const TH1 *MCHist, const TH1 *W2Hist, const SampleHandlerInterface *SampleHandler) const
 Calculates the likelihood (-2LLH) for a single sample; dynamically casts to call the correct GetLLH overload. More...
 
double GetLLH (const TH1D *DatHist, const TH1D *MCHist, const TH1D *W2Hist, const SampleHandlerInterface *SampleHandler) const
 Helper functions to calculate likelihoods using TH1D. More...
 
double GetLLH (const TH2D *DatHist, const TH2D *MCHist, const TH2D *W2Hist, const SampleHandlerInterface *SampleHandler) const
 Helper functions to calculate likelihoods using TH2D. More...
 
double GetLLH (const TH2Poly *DatHist, const TH2Poly *MCHist, const TH2Poly *W2Hist, const SampleHandlerInterface *SampleHandler) const
 Helper functions to calculate likelihoods using TH2Poly. More...
 
void MakeChi2Plots (const std::vector< std::vector< double >> &Chi2_x, const std::string &Chi2_x_title, const std::vector< std::vector< double >> &Chi2_y, const std::string &Chi2_y_title, const std::vector< TDirectory * > &SampleDir, const std::string Title)
 Produce Chi2 plot for a single sample based on which $p$-value is calculated. More...
 
void StudyInformationCriterion (M3::kInfCrit Criterion, const std::vector< std::unique_ptr< TH1 >> &PostPred_mc, const std::vector< std::unique_ptr< TH1 >> &PostPred_w)
 Information Criterion. More...
 
void StudyBIC (const std::vector< std::unique_ptr< TH1 >> &PostPred_mc, const std::vector< std::unique_ptr< TH1 >> &PostPred_w)
 Study Bayesian Information Criterion (BIC) The BIC is defined as: More...
 
void StudyDIC (const std::vector< std::unique_ptr< TH1 >> &PostPred_mc, const std::vector< std::unique_ptr< TH1 >> &PostPred_w)
 KS: Get the Deviance Information Criterion (DIC) The deviance is defined as: More...
 
void StudyWAIC ()
 KS: Get the Watanabe-Akaike information criterion (WAIC) More...
 
std::string GetBinName (TH1 *hist, const bool uniform, const int Dim, const std::vector< int > &bins) const
 Construct a human-readable label describing a specific analysis bin. More...
 
std::vector< std::unique_ptr< TH1D > > PerBinHistogram (TH1 *hist, const int SampleId, const int Dim, const std::string &suffix) const
 Create per-bin posterior histograms for a given sample. More...
 
void StudyBetaParameters (TDirectory *PredictiveDir)
 Evaluate prior/post predictive distribution for beta parameters (used for evaluating impact MC statistical uncertainty) More...
 
void MakeCutEventRate (TH1D *Histogram, const double DataRate) const
 Make the 1D Event Rate Hist. More...
 
void RateAnalysis (const std::vector< std::vector< std::unique_ptr< TH1 >>> &Toys, const std::vector< TDirectory * > &SampleDirectories) const
 Produce distribution of number of events for each sample. More...
 

Private Attributes

bool FullLLH
 KS: Use Full LLH or only sample contribution based on discussion with Asher we almost always only want the sample likelihood. More...
 
int NModelParams
 KS: Count total number of model parameters which can be used for stuff like BIC. More...
 
bool Is_PriorPredictive
 Whether it is Prior or Posterior predictive. More...
 
int TotalNumberOfSamples
 Number of toys we are generating analysing. More...
 
std::vector< PredictiveSampleSampleInfo
 Handy struct for all sample info. More...
 
int Ntoys
 Number of toys we are generating analysing. More...
 
ParameterHandlerGenericModelSystematic
 Pointer to El Generico. More...
 
std::vector< std::unique_ptr< TH1 > > Data_Hist
 Vector of Data histograms. More...
 
std::vector< std::unique_ptr< TH1 > > MC_Nom_Hist
 Vector of MC histograms. More...
 
std::vector< std::unique_ptr< TH1 > > W2_Nom_Hist
 Vector of W2 histograms. More...
 
std::vector< std::vector< std::unique_ptr< TH1 > > > MC_Hist_Toy
 
std::vector< std::vector< std::unique_ptr< TH1 > > > W2_Hist_Toy
 
std::vector< double > ReweightWeight
 Reweighting factors applied for each toy, by default 1. More...
 
std::vector< double > PenaltyTerm
 Penalty term values for each toy by default 0. More...
 
bool StandardFluctuation
 KS: We have two methods for Poissonian fluctuation. More...
 

Additional Inherited Members

- Protected Member Functions inherited from FitterBase
void ProcessMCMC ()
 Process MCMC output. More...
 
void PrepareOutput ()
 Prepare the output file. More...
 
void SaveOutput ()
 Save output and close files. More...
 
void SanitiseInputs ()
 Remove obsolete memory and make other checks before fit starts. More...
 
void SaveSettings ()
 Save the settings that the MCMC was run with. More...
 
bool GetScanRange (std::map< std::string, std::vector< double >> &scanRanges) const
 YSP: Set up a mapping to store parameters with user-specified ranges, suggested by D. Barrow. More...
 
void GetParameterScanRange (const ParameterHandlerBase *cov, const int i, double &CentralValue, double &lower, double &upper, const int n_points, const std::string &suffix="") const
 Helper function to get parameter scan range, central value. More...
 
bool CheckSkipParameter (const std::vector< std::string > &SkipVector, const std::string &ParamName) const
 KS: Check whether we want to skip parameter using skip vector. More...
 
void CustomRange (const std::string &ParName, const double sigma, double &ParamShiftValue) const
 For comparison with other fitting frameworks (like P-Theta) we usually have to apply different parameter values then usual 1, 3 sigma. More...
 
- Protected Attributes inherited from FitterBase
ManagerfitMan
 The manager for configuration handling. More...
 
unsigned int step
 current state More...
 
double logLCurr
 current likelihood More...
 
double logLProp
 proposed likelihood More...
 
double accProb
 current acceptance prob More...
 
int accCount
 counts accepted steps More...
 
unsigned int stepStart
 step start, by default 0 if we start from previous chain then it will be different More...
 
std::vector< double > sample_llh
 store the llh breakdowns More...
 
std::vector< double > syst_llh
 systematic llh breakdowns More...
 
std::vector< SampleHandlerInterface * > samples
 Sample holder. More...
 
unsigned int TotalNSamples
 Total number of samples used, single SampleHandler can store more than one analysis sample! More...
 
std::vector< ParameterHandlerBase * > systematics
 Systematic holder. More...
 
std::unique_ptr< TStopwatch > clock
 tells global time how long fit took More...
 
std::unique_ptr< TStopwatch > stepClock
 tells how long single step/fit iteration took More...
 
double stepTime
 Time of single step. More...
 
std::unique_ptr< TRandom3 > random
 Random number. More...
 
TFile * outputFile
 Output. More...
 
TDirectory * CovFolder
 Output cov folder. More...
 
TDirectory * SampleFolder
 Output sample folder. More...
 
TTree * outTree
 Output tree with posteriors. More...
 
int auto_save
 auto save every N steps More...
 
bool fTestLikelihood
 Necessary for some fitting algorithms like PSO. More...
 
bool FileSaved
 Checks if file saved not repeat some operations. More...
 
bool SettingsSaved
 Checks if setting saved not repeat some operations. More...
 
bool OutputPrepared
 Checks if output prepared not repeat some operations. More...
 
std::string AlgorithmName
 Name of fitting algorithm that is being used. More...
 

Detailed Description

Implementation of Prior/Posterior Predictive and Bayesian p-Value calculations following the approach described in [11] , [12], [14].

For more information, visit the Posterior Predictive page.

Author
Asher Kaboth
Dan Barrow
Ed Atkin
Yashwanth S Prabhu
Kamil Skwarczynski
Patrick Dunne
Clarence Wret
Todo:

add BIC, DIC, WAIC

add Rate $p$-value

add plots by mode

Post Pred LLH

unify code with SampleSummary

Definition at line 38 of file PredictiveThrower.h.

Constructor & Destructor Documentation

◆ PredictiveThrower()

PredictiveThrower::PredictiveThrower ( Manager *const  fitMan)

Constructor.

Parameters
fitManA pointer to a manager object, which will handle all settings.

Definition at line 10 of file PredictiveThrower.cpp.

10  : FitterBase(man) {
11 // *************************
12  AlgorithmName = "PredictiveThrower";
13  if(!CheckNodeExists(fitMan->raw(), "Predictive")) {
14  MACH3LOG_ERROR("Predictive is missing in your main yaml config");
15  throw MaCh3Exception(__FILE__ , __LINE__ );
16  }
17 
18  StandardFluctuation = GetFromManager<bool>(fitMan->raw()["Predictive"]["StandardFluctuation"], true, __FILE__, __LINE__ );
19 
20  if(StandardFluctuation) MACH3LOG_INFO("Using standard method of statistical fluctuation");
21  else MACH3LOG_INFO("Using alternative method of statistical fluctuation, which is much slower");
22 
23  ModelSystematic = nullptr;
24  // Use the full likelihood for the Prior/Posterior predictive pvalue
25  FullLLH = GetFromManager<bool>(fitMan->raw()["Predictive"]["FullLLH"], false, __FILE__, __LINE__ );
26  NModelParams = 0;
27 
28  Is_PriorPredictive = Get<bool>(fitMan->raw()["Predictive"]["PriorPredictive"], __FILE__, __LINE__);
29  Ntoys = Get<int>(fitMan->raw()["Predictive"]["Ntoy"], __FILE__, __LINE__);
30 
31  ReweightWeight.resize(Ntoys);
32  PenaltyTerm.resize(Ntoys);
33 }
#define MACH3LOG_ERROR
Definition: MaCh3Logger.h:37
#define MACH3LOG_INFO
Definition: MaCh3Logger.h:35
bool CheckNodeExists(const YAML::Node &node, Args... args)
KS: Wrapper function to call the recursive helper.
Definition: YamlHelper.h:60
FitterBase(Manager *const fitMan)
Constructor.
Definition: FitterBase.cpp:15
std::string AlgorithmName
Name of fitting algorithm that is being used.
Definition: FitterBase.h:169
Manager * fitMan
The manager for configuration handling.
Definition: FitterBase.h:109
Custom exception class used throughout MaCh3.
YAML::Node const & raw() const
Return config.
Definition: Manager.h:47
std::vector< double > PenaltyTerm
Penalty term values for each toy by default 0.
bool FullLLH
KS: Use Full LLH or only sample contribution based on discussion with Asher we almost always only wan...
bool Is_PriorPredictive
Whether it is Prior or Posterior predictive.
int NModelParams
KS: Count total number of model parameters which can be used for stuff like BIC.
int Ntoys
Number of toys we are generating analysing.
bool StandardFluctuation
KS: We have two methods for Poissonian fluctuation.
ParameterHandlerGeneric * ModelSystematic
Pointer to El Generico.
std::vector< double > ReweightWeight
Reweighting factors applied for each toy, by default 1.

◆ ~PredictiveThrower()

PredictiveThrower::~PredictiveThrower ( )
virtual

Destructor.

Definition at line 37 of file PredictiveThrower.cpp.

37  {
38 // *************************
39 
40 }

Member Function Documentation

◆ CalcLLH() [1/2]

double PredictiveThrower::CalcLLH ( const double  data,
const double  mc,
const double  w2,
const SampleHandlerInterface SampleHandler 
) const
private

Calculates the -2LLH (likelihood) for a single sample.

Parameters
dataData value for the sample.
mcMC (Monte Carlo) value for the sample.
w2W2 value for the sample.
SampleHandlerPointer to SampleHandlerInterface providing the LLH test statistic.

Definition at line 1071 of file PredictiveThrower.cpp.

1074  {
1075 // *************************
1076  double llh = SampleHandler->GetTestStatLLH(data, mc, w2);
1077  //KS: do times 2 because banff reports chi2
1078  return 2*llh;
1079 }
double GetTestStatLLH(const double data, const double mc, const double w2) const
Calculate test statistic for a single bin. Calculation depends on setting of fTestStatistic....

◆ CalcLLH() [2/2]

double PredictiveThrower::CalcLLH ( const TH1 *  DatHist,
const TH1 *  MCHist,
const TH1 *  W2Hist,
const SampleHandlerInterface SampleHandler 
) const
private

Calculates the likelihood (-2LLH) for a single sample; dynamically casts to call the correct GetLLH overload.

Parameters
DatHistData histogram with data distribution for a single sample
MCHistMC histogram with MC distribution for a single sample
W2HistW2 histogram with W2 distribution for a single sample
SampleHandlerPointer to SampleHandlerInterface providing LLH test statistic

Definition at line 1082 of file PredictiveThrower.cpp.

1085  {
1086 // *************************
1087  // 1D case
1088  if (auto h1 = dynamic_cast<const TH1D*>(DatHist)) {
1089  return GetLLH(h1,
1090  static_cast<const TH1D*>(MCHist),
1091  static_cast<const TH1D*>(W2Hist),
1092  SampleHandler);
1093  }
1094 
1095  // 2D case
1096  if (auto h2 = dynamic_cast<const TH2D*>(DatHist)) {
1097  return GetLLH(h2,
1098  static_cast<const TH2D*>(MCHist),
1099  static_cast<const TH2D*>(W2Hist),
1100  SampleHandler);
1101  }
1102 
1103  // 2D poly case
1104  if (auto h2p = dynamic_cast<const TH2Poly*>(DatHist)) {
1105  return GetLLH(h2p,
1106  static_cast<const TH2Poly*>(MCHist),
1107  static_cast<const TH2Poly*>(W2Hist),
1108  SampleHandler);
1109  }
1110 
1111  MACH3LOG_ERROR("Unsupported histogram type in {}", __func__);
1112  throw MaCh3Exception(__FILE__ , __LINE__ );
1113 }
double GetLLH(const TH1D *DatHist, const TH1D *MCHist, const TH1D *W2Hist, const SampleHandlerInterface *SampleHandler) const
Helper functions to calculate likelihoods using TH1D.

◆ ExtractLLH()

void PredictiveThrower::ExtractLLH ( TH1 *  DatHist,
TH1 *  MCHist,
TH1 *  W2Hist,
const SampleHandlerInterface SampleHandler 
) const
private

Calculate the LLH for TH1, set the LLH to title of MCHist.

Parameters
DatHistData histogram with data distribution for a single sample
MCHistMC histogram with MC distribution for a single sample
W2HistW2 histogram with W2 distribution for a single sample
SampleHandlerPointer to SampleHandlerInterface providing LLH test statistic

Definition at line 1431 of file PredictiveThrower.cpp.

1431  {
1432 // ****************
1433  const double llh = CalcLLH(DatHist, MCHist, W2Hist, SampleHandler);
1434  std::stringstream ss;
1435  ss << "_2LLH=" << llh;
1436  MCHist->SetTitle((std::string(MCHist->GetTitle())+ss.str()).c_str());
1437  MACH3LOG_INFO("{:<55} {:<10.2f} {:<10.2f} {:<10.2f}", MCHist->GetName(), DatHist->Integral(), MCHist->Integral(), llh);
1438 }
double CalcLLH(const double data, const double mc, const double w2, const SampleHandlerInterface *SampleHandler) const
Calculates the -2LLH (likelihood) for a single sample.

◆ GetBinName()

std::string PredictiveThrower::GetBinName ( TH1 *  hist,
const bool  uniform,
const int  Dim,
const std::vector< int > &  bins 
) const
private

Construct a human-readable label describing a specific analysis bin.

Parameters
histHistogram providing the binning definition.
uniformFlag indicating whether the histogram uses regular axis binning (TH1/TH2) or irregular polygonal binning (e.g. TH2Poly).
DimDimensionality of the original distribution.
binsVector of per-dimension bin indices in analysis coordinates.

Definition at line 825 of file PredictiveThrower.cpp.

828  {
829 // *************************
830  std::string BinName = "";
831  if(Dim == 1) { // True 1D distribution using TH1D
832  const int b = bins[0];
833  const TAxis* ax = hist->GetXaxis();
834  const double low = ax->GetBinLowEdge(b);
835  const double up = ax->GetBinUpEdge(b);
836 
837  BinName = fmt::format("Dim0 ({:g}, {:g})", low, up);
838  } else if (Dim == 2) { // True 2D dsitrubitons
839  if(uniform == true) { //using TH2D
840  const int bx = bins[0];
841  const int by = bins[1];
842  const TAxis* ax = hist->GetXaxis();
843  const TAxis* ay = hist->GetYaxis();
844  BinName = fmt::format("Dim0 ({:g}, {:g}), ", ax->GetBinLowEdge(bx), ax->GetBinUpEdge(bx));
845  BinName += fmt::format("Dim1 ({:g}, {:g})", ay->GetBinLowEdge(by), ay->GetBinUpEdge(by));
846  } else { // using TH2Poly
847  TH2PolyBin* bin = static_cast<TH2PolyBin*>(static_cast<TH2Poly*>(hist)->GetBins()->At(bins[0]-1));
848  // Just make a little fancy name
849  BinName += fmt::format("Dim{} ({:g}, {:g})", 0, bin->GetXMin(), bin->GetXMax());
850  BinName += fmt::format("Dim{} ({:g}, {:g})", 1, bin->GetYMin(), bin->GetYMax());
851  }
852  } else { // N-dimensional distribution using flatten TH1D
853  BinName = hist->GetXaxis()->GetBinLabel(bins[0]);
854  }
855  return BinName;
856 }

◆ GetLLH() [1/3]

double PredictiveThrower::GetLLH ( const TH1D *  DatHist,
const TH1D *  MCHist,
const TH1D *  W2Hist,
const SampleHandlerInterface SampleHandler 
) const
private

Helper functions to calculate likelihoods using TH1D.

Parameters
DatHistData histogram with data distribution for a single sample
MCHistMC histogram with MC distribution for a single sample
W2HistW2 histogram with W2 distribution for a single sample
SampleHandlerPointer to SampleHandlerInterface providing LLH test statistic

Definition at line 1116 of file PredictiveThrower.cpp.

1119  {
1120 // *************************
1121  double llh = 0.0;
1122  for (int i = 1; i <= DatHist->GetXaxis()->GetNbins(); ++i)
1123  {
1124  const double data = DatHist->GetBinContent(i);
1125  const double mc = MCHist->GetBinContent(i);
1126  const double w2 = W2Hist->GetBinContent(i);
1127  llh += SampleHandler->GetTestStatLLH(data, mc, w2);
1128  }
1129  //KS: do times 2 because banff reports chi2
1130  return 2*llh;
1131 }

◆ GetLLH() [2/3]

double PredictiveThrower::GetLLH ( const TH2D *  DatHist,
const TH2D *  MCHist,
const TH2D *  W2Hist,
const SampleHandlerInterface SampleHandler 
) const
private

Helper functions to calculate likelihoods using TH2D.

Parameters
DatHistData 2D histogram with data distribution for a single sample
MCHistMC 2D histogram with MC distribution for a single sample
W2HistW2 2D histogram with W2 distribution for a single sample
SampleHandlerPointer to SampleHandlerInterface providing LLH test statistic

Definition at line 1152 of file PredictiveThrower.cpp.

1155  {
1156 // *************************
1157  double llh = 0.0;
1158 
1159  const int nBinsX = DatHist->GetXaxis()->GetNbins();
1160  const int nBinsY = DatHist->GetYaxis()->GetNbins();
1161 
1162  for (int i = 1; i <= nBinsX; ++i)
1163  {
1164  for (int j = 1; j <= nBinsY; ++j)
1165  {
1166  const double data = DatHist->GetBinContent(i, j);
1167  const double mc = MCHist->GetBinContent(i, j);
1168  const double w2 = W2Hist->GetBinContent(i, j);
1169 
1170  llh += SampleHandler->GetTestStatLLH(data, mc, w2);
1171  }
1172  }
1173 
1174  // KS: do times 2 because banff reports chi2
1175  return 2 * llh;
1176 }

◆ GetLLH() [3/3]

double PredictiveThrower::GetLLH ( const TH2Poly *  DatHist,
const TH2Poly *  MCHist,
const TH2Poly *  W2Hist,
const SampleHandlerInterface SampleHandler 
) const
private

Helper functions to calculate likelihoods using TH2Poly.

Parameters
DatHistData 2D poly histogram with data distribution for a single sample
MCHistMC 2D poly histogram with MC distribution for a single sample
W2HistW2 2D poly histogram with W2 distribution for a single sample
SampleHandlerPointer to SampleHandlerInterface providing LLH test statistic

Definition at line 1134 of file PredictiveThrower.cpp.

1137  {
1138 // *************************
1139  double llh = 0.0;
1140  for (int i = 1; i <= DatHist->GetNumberOfBins(); ++i)
1141  {
1142  const double data = DatHist->GetBinContent(i);
1143  const double mc = MCHist->GetBinContent(i);
1144  const double w2 = W2Hist->GetBinContent(i);
1145  llh += SampleHandler->GetTestStatLLH(data, mc, w2);
1146  }
1147  //KS: do times 2 because banff reports chi2
1148  return 2*llh;
1149 }

◆ GetStoredFancyName()

std::vector< std::string > PredictiveThrower::GetStoredFancyName ( ParameterHandlerBase Systematics) const
private

Get Fancy parameters stored in mcmc chains for passed ParameterHandler.

Definition at line 328 of file PredictiveThrower.cpp.

328  {
329 // *************************
330  TDirectory * ogdir = gDirectory;
331 
332  std::vector<std::string> FancyNames;
333  std::string Name = std::string("Config_") + Systematics->GetName();
334  auto PosteriorFileName = Get<std::string>(fitMan->raw()["Predictive"]["PosteriorFile"], __FILE__, __LINE__);
335 
336  TFile* file = TFile::Open(PosteriorFileName.c_str(), "READ");
337  TDirectory* CovarianceFolder = file->GetDirectory("CovarianceFolder");
338 
339  TMacro* FoundMacro = static_cast<TMacro*>(CovarianceFolder->Get(Name.c_str()));
340  if(FoundMacro == nullptr) {
341  file->Close();
342  delete file;
343  if(ogdir){ ogdir->cd(); }
344 
345  return FancyNames;
346  }
347  MACH3LOG_DEBUG("Found config for {}", Name);
348  YAML::Node Settings = TMacroToYAML(*FoundMacro);
349 
350  int params = int(Settings["Systematics"].size());
351  FancyNames.resize(params);
352  int iPar = 0;
353  for (auto const &param : Settings["Systematics"]) {
354  FancyNames[iPar] = Get<std::string>(param["Systematic"]["Names"]["FancyName"], __FILE__ , __LINE__);
355  iPar++;
356  }
357  file->Close();
358  delete file;
359  if(ogdir){ ogdir->cd(); }
360  return FancyNames;
361 }
#define MACH3LOG_DEBUG
Definition: MaCh3Logger.h:34
YAML::Node TMacroToYAML(const TMacro &macro)
KS: Convert a ROOT TMacro object to a YAML node.
Definition: YamlHelper.h:152
std::string GetName() const
Get name of covariance.
TFile * Open(const std::string &Name, const std::string &Type, const std::string &File, const int Line)
Opens a ROOT file with the given name and mode.

◆ LoadToys()

bool PredictiveThrower::LoadToys ( )
private

Load existing toys.

Definition at line 235 of file PredictiveThrower.cpp.

235  {
236 // *************************
237  auto PosteriorFileName = Get<std::string>(fitMan->raw()["Predictive"]["PosteriorFile"], __FILE__, __LINE__);
238  // Open the ROOT file
239  int originalErrorWarning = gErrorIgnoreLevel;
240  gErrorIgnoreLevel = kFatal;
241  TFile* file = TFile::Open(PosteriorFileName.c_str(), "READ");
242 
243  gErrorIgnoreLevel = originalErrorWarning;
244  TDirectory* ToyDir = nullptr;
245  if (!file || file->IsZombie()) {
246  return false;
247  } else {
248  // Check for the "toys" directory
249  if ((ToyDir = file->GetDirectory("Toys"))) {
250  MACH3LOG_INFO("Found toys in Posterior file will attempt toy reading");
251  } else {
252  file->Close();
253  delete file;
254  return false;
255  }
256  }
257 
258  // Finally get the TTree branch with the penalty vectors for each of the toy throws
259  TTree* PenaltyTree = static_cast<TTree*>(file->Get("ToySummary"));
260  if (!PenaltyTree) {
261  MACH3LOG_WARN("ToySummary TTree not found in file.");
262  file->Close();
263  delete file;
264  return false;
265  }
266 
267  Ntoys = static_cast<int>(PenaltyTree->GetEntries());
268  int ConfigNtoys = Get<int>(fitMan->raw()["Predictive"]["Ntoy"], __FILE__, __LINE__);;
269  if (Ntoys != ConfigNtoys) {
270  MACH3LOG_WARN("Found different number of toys in saved file than asked to run!");
271  MACH3LOG_INFO("Will read _ALL_ toys in the file");
272  MACH3LOG_INFO("Ntoys in file: {}", Ntoys);
273  MACH3LOG_INFO("Ntoys specified: {}", ConfigNtoys);
274  }
275 
276  PenaltyTerm.resize(Ntoys);
277  ReweightWeight.resize(Ntoys);
278 
279  double Penalty = 0, Weight = 1;
280  PenaltyTree->SetBranchAddress("Penalty", &Penalty);
281  PenaltyTree->SetBranchAddress("Weight", &Weight);
282  PenaltyTree->SetBranchAddress("NModelParams", &NModelParams);
283 
284  for (int i = 0; i < Ntoys; ++i) {
285  PenaltyTree->GetEntry(i);
286  if (FullLLH) {
287  PenaltyTerm[i] = Penalty;
288  } else {
289  PenaltyTerm[i] = 0.0;
290  }
291 
292  ReweightWeight[i] = Weight;
293  }
294  // Resize all vectors and get sample names
296 
297  for (int sample = 0; sample < TotalNumberOfSamples; ++sample) {
298  TH1* DataHist1D = static_cast<TH1*>(ToyDir->Get((SampleInfo[sample].Name + "_data").c_str()));
299  Data_Hist[sample] = M3::Clone(DataHist1D);
300 
301  TH1* MCHist1D = static_cast<TH1*>(ToyDir->Get((SampleInfo[sample].Name + "_mc").c_str()));
302  MC_Nom_Hist[sample] = M3::Clone(MCHist1D);
303 
304  TH1* W2Hist1D = static_cast<TH1*>(ToyDir->Get((SampleInfo[sample].Name + "_w2").c_str()));
305  W2_Nom_Hist[sample] = M3::Clone(W2Hist1D);
306  }
307 
308 
309  for (int iToy = 0; iToy < Ntoys; ++iToy)
310  {
311  if (iToy % 100 == 0) MACH3LOG_INFO(" Loaded toy {}", iToy);
312 
313  for (int sample = 0; sample < TotalNumberOfSamples; ++sample) {
314  TH1* MCHist1D = static_cast<TH1*>(ToyDir->Get((SampleInfo[sample].Name + "_mc_" + std::to_string(iToy)).c_str()));
315  TH1* W2Hist1D = static_cast<TH1*>(ToyDir->Get((SampleInfo[sample].Name + "_w2_" + std::to_string(iToy)).c_str()));
316 
317  MC_Hist_Toy[sample][iToy] = M3::Clone(MCHist1D);
318  W2_Hist_Toy[sample][iToy] = M3::Clone(W2Hist1D);
319  }
320  }
321 
322  file->Close();
323  delete file;
324  return true;
325 }
#define MACH3LOG_WARN
Definition: MaCh3Logger.h:36
std::vector< std::unique_ptr< TH1 > > W2_Nom_Hist
Vector of W2 histograms.
std::vector< std::vector< std::unique_ptr< TH1 > > > W2_Hist_Toy
void SetupSampleInformation()
Setup sample information.
std::vector< std::unique_ptr< TH1 > > MC_Nom_Hist
Vector of MC histograms.
int TotalNumberOfSamples
Number of toys we are generating analysing.
std::vector< std::unique_ptr< TH1 > > Data_Hist
Vector of Data histograms.
std::vector< std::vector< std::unique_ptr< TH1 > > > MC_Hist_Toy
std::unique_ptr< ObjectType > Clone(const ObjectType *obj, const std::string &name="")
KS: Creates a copy of a ROOT-like object and wraps it in a smart pointer.
KS: Store info about MC sample.

◆ MakeChi2Plots()

void PredictiveThrower::MakeChi2Plots ( const std::vector< std::vector< double >> &  Chi2_x,
const std::string &  Chi2_x_title,
const std::vector< std::vector< double >> &  Chi2_y,
const std::string &  Chi2_y_title,
const std::vector< TDirectory * > &  SampleDir,
const std::string  Title 
)
private

Produce Chi2 plot for a single sample based on which $p$-value is calculated.

Definition at line 1295 of file PredictiveThrower.cpp.

1300  {
1301 // *************************
1302  for (int iSample = 0; iSample < TotalNumberOfSamples+1; ++iSample) {
1303  SampleDir[iSample]->cd();
1304 
1305  // Transpose to extract chi2 values for a given sample across all toys
1306  std::vector<double> chi2_y_sample(Ntoys);
1307  std::vector<double> chi2_x_per_sample(Ntoys);
1308 
1309  for (int iToy = 0; iToy < Ntoys; ++iToy) {
1310  chi2_y_sample[iToy] = Chi2_y[iSample][iToy];
1311  chi2_x_per_sample[iToy] = Chi2_x[iSample][iToy];
1312  }
1313 
1314  const double min_val = std::min(*std::min_element(chi2_y_sample.begin(), chi2_y_sample.end()),
1315  *std::min_element(chi2_x_per_sample.begin(), chi2_x_per_sample.end()));
1316  const double max_val = std::max(*std::max_element(chi2_y_sample.begin(), chi2_y_sample.end()),
1317  *std::max_element(chi2_x_per_sample.begin(), chi2_x_per_sample.end()));
1318 
1319  auto chi2_hist = std::make_unique<TH2D>((SampleInfo[iSample].Name+ Title).c_str(),
1320  (SampleInfo[iSample].Name+ Title).c_str(),
1321  50, min_val, max_val, 50, min_val, max_val);
1322  chi2_hist->SetDirectory(nullptr);
1323  chi2_hist->GetXaxis()->SetTitle(Chi2_x_title.c_str());
1324  chi2_hist->GetYaxis()->SetTitle(Chi2_y_title.c_str());
1325 
1326  for (int iToy = 0; iToy < Ntoys; ++iToy) {
1327  chi2_hist->Fill(chi2_x_per_sample[iToy], chi2_y_sample[iToy]);
1328  }
1329 
1330  Get2DBayesianpValue(chi2_hist.get());
1331  chi2_hist->Write();
1332  }
1333 }
void Get2DBayesianpValue(TH2D *Histogram)
Calculates the 2D Bayesian p-value and generates a visualization.

◆ MakeCutEventRate()

void PredictiveThrower::MakeCutEventRate ( TH1D *  Histogram,
const double  DataRate 
) const
private

Make the 1D Event Rate Hist.

Definition at line 1442 of file PredictiveThrower.cpp.

1442  {
1443 // ****************
1444  // Open the ROOT file
1445  int originalErrorWarning = gErrorIgnoreLevel;
1446  gErrorIgnoreLevel = kFatal;
1447 
1448  // For the event rate histogram add a TLine to the data rate
1449  auto TempLine = std::make_unique<TLine>(DataRate, Histogram->GetMinimum(), DataRate, Histogram->GetMaximum());
1450  TempLine->SetLineColor(kRed);
1451  TempLine->SetLineWidth(2);
1452  // Also fit a Gaussian because why not?
1453  auto Fitter = std::make_unique<TF1>("Fit", "gaus", Histogram->GetBinLowEdge(1), Histogram->GetBinLowEdge(Histogram->GetNbinsX()+1));
1454  Histogram->Fit(Fitter.get(), "RQ");
1455  Fitter->SetLineColor(kRed-5);
1456  // Calculate a p-value
1457  double Above = 0.0;
1458  for (int z = 0; z < Histogram->GetNbinsX(); ++z) {
1459  const double xvalue = Histogram->GetBinCenter(z+1);
1460  if (xvalue >= DataRate) {
1461  Above += Histogram->GetBinContent(z+1);
1462  }
1463  }
1464  const double pvalue = Above/Histogram->Integral();
1465  TLegend Legend(0.4, 0.75, 0.98, 0.90);
1466  Legend.SetFillColor(0);
1467  Legend.SetFillStyle(0);
1468  Legend.SetLineWidth(0);
1469  Legend.SetLineColor(0);
1470  Legend.AddEntry(TempLine.get(), Form("Data, %.0f, p-value=%.2f", DataRate, pvalue), "l");
1471  Legend.AddEntry(Histogram, Form("MC, #mu=%.1f#pm%.1f", Histogram->GetMean(), Histogram->GetRMS()), "l");
1472  Legend.AddEntry(Fitter.get(), Form("Gauss, #mu=%.1f#pm%.1f", Fitter->GetParameter(1), Fitter->GetParameter(2)), "l");
1473  std::string TempTitle = std::string(Histogram->GetName());
1474  TempTitle += "_canv";
1475  TCanvas TempCanvas(TempTitle.c_str(), TempTitle.c_str(), 1024, 1024);
1476  TempCanvas.SetGridx();
1477  TempCanvas.SetGridy();
1478  TempCanvas.SetRightMargin(0.03);
1479  TempCanvas.SetBottomMargin(0.08);
1480  TempCanvas.SetLeftMargin(0.10);
1481  TempCanvas.SetTopMargin(0.06);
1482  TempCanvas.cd();
1483  Histogram->Draw();
1484  TempLine->Draw("same");
1485  Fitter->Draw("same");
1486  Legend.Draw("same");
1487  TempCanvas.Write();
1488  Histogram->Write();
1489  gErrorIgnoreLevel = originalErrorWarning;
1490 }

◆ MakeFluctuatedHistogram()

void PredictiveThrower::MakeFluctuatedHistogram ( TH1 *  FluctHist,
TH1 *  PolyHist 
)
private

Make Poisson fluctuation of TH1D hist.

Parameters
FluctHistHistogram to store fluctuated values (must match Hist type)
HistOriginal histogram to fluctuate

Definition at line 1180 of file PredictiveThrower.cpp.

1180  {
1181 // ****************
1182  // Determine which fluctuation function to call
1183  auto applyFluctuation = [&](auto* f, auto* h) {
1184  if (StandardFluctuation) {
1186  } else {
1188  }
1189  };
1190 
1191  if (Hist->InheritsFrom(TH2Poly::Class())) {
1192  applyFluctuation(static_cast<TH2Poly*>(FluctHist), static_cast<TH2Poly*>(Hist));
1193  }
1194  else if (Hist->InheritsFrom(TH2D::Class())) {
1195  applyFluctuation(static_cast<TH2D*>(FluctHist), static_cast<TH2D*>(Hist));
1196  }
1197  else if (Hist->InheritsFrom(TH1D::Class())) {
1198  applyFluctuation(static_cast<TH1D*>(FluctHist), static_cast<TH1D*>(Hist));
1199  }
1200  else {
1201  MACH3LOG_ERROR("Unsupported histogram type");
1202  throw MaCh3Exception(__FILE__ , __LINE__ );
1203  }
1204 }
void MakeFluctuatedHistogramAlternative(TH1D *FluctHist, TH1D *PolyHist, TRandom3 *rand)
Make Poisson fluctuation of TH1D hist using slow method which is only for cross-check.
void MakeFluctuatedHistogramStandard(TH1D *FluctHist, TH1D *PolyHist, TRandom3 *rand)
Make Poisson fluctuation of TH1D hist using default fast method.
std::unique_ptr< TRandom3 > random
Random number.
Definition: FitterBase.h:145

◆ MakePredictive()

std::vector< std::unique_ptr< TH1 > > PredictiveThrower::MakePredictive ( const std::vector< std::vector< std::unique_ptr< TH1 >>> &  Toys,
const std::vector< TDirectory * > &  Director,
const std::string &  suffix,
const bool  DebugHistograms,
const bool  WriteHist 
)
private

Produce posterior predictive distribution.

  1. Fill histograms, thread safe as all histograms are allocated before and we loop over samples

Definition at line 916 of file PredictiveThrower.cpp.

920  {
921 // *************************
922  std::vector<std::unique_ptr<TH1>> PostPred(TotalNumberOfSamples);
923  std::vector<std::vector<std::unique_ptr<TH1D>>> Posterior_hist(TotalNumberOfSamples);
924  // 1.initialisation
925  for (int sample = 0; sample < TotalNumberOfSamples; ++sample) {
926  const int nDims = SampleInfo[sample].Dimenstion;
927  const std::string Sample_Name = SampleInfo[sample].Name;
928  Posterior_hist[sample] = PerBinHistogram(Toys[sample][0].get(), sample, nDims, suffix);
929  auto PredictiveHist = M3::Clone(Toys[sample][0].get());
930  // Clear the bin contents
931  PredictiveHist->Reset();
932  PredictiveHist->SetName((Sample_Name + "_" + suffix + "_PostPred").c_str());
933  PredictiveHist->SetTitle((Sample_Name + "_" + suffix + "_PostPred").c_str());
934  PredictiveHist->SetDirectory(nullptr);
935  PostPred[sample] = std::move(PredictiveHist);
936  }
937 
939  #ifdef MULTITHREAD
940  #pragma omp parallel for
941  #endif
942  for (int sample = 0; sample < TotalNumberOfSamples; ++sample) {
943  const int nDims = SampleInfo[sample].Dimenstion;
944  auto& hist = Toys[sample][0];
945  for (size_t iToy = 0; iToy < Toys[sample].size(); ++iToy) {
946  if(nDims == 2) {
947  if(std::string(hist->ClassName()) == "TH2Poly") {
948  for (int i = 1; i <= static_cast<TH2Poly*>(hist.get())->GetNumberOfBins(); ++i) {
949  double content = Toys[sample][iToy]->GetBinContent(i);
950  Posterior_hist[sample][i-1]->Fill(content, ReweightWeight[iToy]);
951  }
952  } else {
953  int nbinsx = hist->GetNbinsX();
954  int nbinsy = hist->GetNbinsY();
955  for (int iy = 1; iy <= nbinsy; ++iy) {
956  for (int ix = 1; ix <= nbinsx; ++ix) {
957  int Bin = (iy-1) * nbinsx + (ix-1);
958  double content = Toys[sample][iToy]->GetBinContent(ix, iy);
959  Posterior_hist[sample][Bin]->Fill(content, ReweightWeight[iToy]);
960  } // end loop over X bins
961  } // end loop over Y bins
962  }
963  } else {
964  int nbinsx = hist->GetNbinsX();
965  for (int i = 1; i <= nbinsx; ++i) {
966  double content = Toys[sample][iToy]->GetBinContent(i);
967  Posterior_hist[sample][i-1]->Fill(content, ReweightWeight[iToy]);
968  } // end loop over bins
969  } // end if over dimensions
970  } // end loop over toys
971  } // end loop over samples
972 
973  // 3.save
974  for (int sample = 0; sample < TotalNumberOfSamples; ++sample) {
975  const int nDims = SampleInfo[sample].Dimenstion;
976  auto& hist = Toys[sample][0];
977  Directory[sample]->cd();
978  if(nDims == 2) {
979  if(std::string(hist->ClassName()) == "TH2Poly") {
980  for (int i = 1; i <= static_cast<TH2Poly*>(hist.get())->GetNumberOfBins(); ++i) {
981  PostPred[sample]->SetBinContent(i, Posterior_hist[sample][i-1]->GetMean());
982  // KS: If ROOT below 6.18 one need -1 only for error due to stupid bug...
983  PostPred[sample]->SetBinError(i, Posterior_hist[sample][i-1]->GetRMS());
984  if (DebugHistograms) Posterior_hist[sample][i-1]->Write();
985  } // end loop over poly bins
986  } else {
987  int nbinsx = hist->GetNbinsX();
988  int nbinsy = hist->GetNbinsY();
989  for (int iy = 1; iy <= nbinsy; ++iy) {
990  for (int ix = 1; ix <= nbinsx; ++ix) {
991  int Bin = (iy-1) * nbinsx + (ix-1);
992  if (DebugHistograms) Posterior_hist[sample][Bin]->Write();
993  PostPred[sample]->SetBinContent(ix, iy, Posterior_hist[sample][Bin]->GetMean());
994  PostPred[sample]->SetBinError(ix, iy, Posterior_hist[sample][Bin]->GetRMS());
995  } // end loop over x
996  } // end loop over y
997  }
998  } else {
999  int nbinsx = hist->GetNbinsX();
1000  for (int i = 1; i <= nbinsx; ++i) {
1001  PostPred[sample]->SetBinContent(i, Posterior_hist[sample][i-1]->GetMean());
1002  PostPred[sample]->SetBinError(i, Posterior_hist[sample][i-1]->GetRMS());
1003  if (DebugHistograms) Posterior_hist[sample][i-1]->Write();
1004  }
1005  }
1006  if(WriteHist) PostPred[sample]->Write();
1007  } // end loop over samples
1008  return PostPred;
1009 }
std::vector< std::unique_ptr< TH1D > > PerBinHistogram(TH1 *hist, const int SampleId, const int Dim, const std::string &suffix) const
Create per-bin posterior histograms for a given sample.

◆ PerBinHistogram()

std::vector< std::unique_ptr< TH1D > > PredictiveThrower::PerBinHistogram ( TH1 *  hist,
const int  SampleId,
const int  Dim,
const std::string &  suffix 
) const
private

Create per-bin posterior histograms for a given sample.

For each analysis bin of the input histogram, this function allocates a new 1D histogram intended to accumulate the distribution of predicted event counts (e.g. across throws, toys, or posterior evaluations).

The number of output histograms therefore equals the number of physical bins:

  • TH1 → N histograms
  • TH2 → Nx × Ny histograms
  • TH2Poly → one histogram per polygon bin
Parameters
histInput histogram defining the bin structure for this sample.
SampleIdIndex identifying the sample in SampleInfo.
DimDimensionality of the original distribution.
suffixString appended to histogram names (e.g. to distinguish stages).

Definition at line 859 of file PredictiveThrower.cpp.

862  {
863 // *************************
864  std::vector<std::unique_ptr<TH1D>> PosteriorHistVec;
865  constexpr int nBins = 100;
866  const std::string Sample_Name = SampleInfo[SampleId].Name;
867  if (Dim == 2) {
868  if(std::string(hist->ClassName()) == "TH2Poly") {
869  for (int i = 1; i <= static_cast<TH2Poly*>(hist)->GetNumberOfBins(); ++i) {
870  std::string ProjName = fmt::format("{} {} Bin: {}",
871  Sample_Name, suffix,
872  GetBinName(hist, false, Dim, {i}));
873  // KS: When a histogram is created with an axis lower limit greater or equal to its upper limit ROOT will automatically adjust histogram range
874  // https://root.cern.ch/doc/master/classTH1.html#auto-bin
875  auto PosteriorHist = std::make_unique<TH1D>(ProjName.c_str(), ProjName.c_str(), nBins, 1, -1);
876  PosteriorHist->SetDirectory(nullptr);
877  PosteriorHist->GetXaxis()->SetTitle("Events");
878  PosteriorHistVec.push_back(std::move(PosteriorHist));
879  } //end loop over bin
880  } else {
881  int nbinsx = hist->GetNbinsX();
882  int nbinsy = hist->GetNbinsY();
883  for (int iy = 1; iy <= nbinsy; ++iy) {
884  for (int ix = 1; ix <= nbinsx; ++ix) {
885  std::string ProjName = fmt::format("{} {} Bin: {}",
886  Sample_Name, suffix,
887  GetBinName(hist, true, Dim, {ix,iy}));
888  //KS: When a histogram is created with an axis lower limit greater or equal to its upper limit ROOT will automatically adjust histogram range
889  // https://root.cern.ch/doc/master/classTH1.html#auto-bin
890  auto PosteriorHist = std::make_unique<TH1D>(ProjName.c_str(), ProjName.c_str(), nBins, 1, -1);
891  PosteriorHist->SetDirectory(nullptr);
892  PosteriorHist->GetXaxis()->SetTitle("Events");
893  PosteriorHistVec.push_back(std::move(PosteriorHist));
894  }
895  }
896  }
897  } else {
898  int nbinsx = hist->GetNbinsX();
899  PosteriorHistVec.reserve(nbinsx);
900  for (int i = 1; i <= nbinsx; ++i) {
901  std::string ProjName = fmt::format("{} {} Bin: {}",
902  Sample_Name, suffix,
903  GetBinName(hist, true, Dim, {i}));
904  //KS: When a histogram is created with an axis lower limit greater or equal to its upper limit ROOT will automatically adjust histogram range
905  // https://root.cern.ch/doc/master/classTH1.html#auto-bin
906  auto PosteriorHist = std::make_unique<TH1D>(ProjName.c_str(), ProjName.c_str(), nBins, 1, -1);
907  PosteriorHist->SetDirectory(nullptr);
908  PosteriorHist->GetXaxis()->SetTitle("Events");
909  PosteriorHistVec.push_back(std::move(PosteriorHist));
910  }
911  }
912  return PosteriorHistVec;
913 }
std::string GetBinName(TH1 *hist, const bool uniform, const int Dim, const std::vector< int > &bins) const
Construct a human-readable label describing a specific analysis bin.

◆ PosteriorPredictivepValue()

void PredictiveThrower::PosteriorPredictivepValue ( const std::vector< std::unique_ptr< TH1 >> &  PostPred_mc,
const std::vector< TDirectory * > &  SampleDir 
)
private

Calculate Posterior Predictive $p$-value Compares observed data to toy datasets generated from:

  • fitted model fluctuations ("Draw")
  • posterior predictive distribution ("Pred")

Computes two discrepancy metrics: • Shape+Rate : bin-by-bin likelihood • Rate-only : total event normalization

TODO This can be multithreaded but be careful for Clone!!!

Definition at line 1207 of file PredictiveThrower.cpp.

1208  {
1209 // *************************
1210  // Step 1: Initialize per-toy accumulators once
1211  // [Sample] [Toys]
1212  auto make_matrix = [&](double init = 0.0) {
1213  return std::vector<std::vector<double>>(
1215  std::vector<double>(Ntoys, init));
1216  };
1217  auto chi2_dat = make_matrix();
1218  auto chi2_mc = make_matrix();
1219  auto chi2_pred = make_matrix();
1220  auto chi2_rate_dat = make_matrix();
1221  auto chi2_rate_mc = make_matrix();
1222  auto chi2_rate_pred = make_matrix();
1223 
1224  // 2. Add penalty terms to global bin
1225  for (int iToy = 0; iToy < Ntoys; ++iToy) {
1226  chi2_dat[TotalNumberOfSamples][iToy] = PenaltyTerm[iToy];
1227  chi2_mc[TotalNumberOfSamples][iToy] = PenaltyTerm[iToy];
1228  chi2_pred[TotalNumberOfSamples][iToy] = PenaltyTerm[iToy];
1229 
1230  chi2_rate_dat[TotalNumberOfSamples][iToy] = PenaltyTerm[iToy];
1231  chi2_rate_mc[TotalNumberOfSamples][iToy] = PenaltyTerm[iToy];
1232  chi2_rate_pred[TotalNumberOfSamples][iToy] = PenaltyTerm[iToy];
1233  }
1234 
1236  for (int iSample = 0; iSample < TotalNumberOfSamples; ++iSample) {
1237  auto SampleHandler = SampleInfo[iSample].SamHandler;
1238  for (int iToy = 0; iToy < Ntoys; ++iToy) {
1239  // Clone histograms to avoid modifying originals
1240  auto DrawFluctHist = M3::Clone(MC_Hist_Toy[iSample][iToy].get());
1241  auto PredFluctHist = M3::Clone(PostPred_mc[iSample].get());
1242 
1243  // Apply fluctuations
1244  MakeFluctuatedHistogram(DrawFluctHist.get(), MC_Hist_Toy[iSample][iToy].get());
1245  MakeFluctuatedHistogram(PredFluctHist.get(), PostPred_mc[iSample].get());
1246 
1247  // I. SHAPE + RATE (bin-by-bin likelihood)
1248  chi2_dat[iSample][iToy] = CalcLLH(Data_Hist[iSample].get(), MC_Hist_Toy[iSample][iToy].get(), W2_Hist_Toy[iSample][iToy].get(), SampleHandler);
1249  chi2_mc[iSample][iToy] = CalcLLH(DrawFluctHist.get(), MC_Hist_Toy[iSample][iToy].get(), W2_Hist_Toy[iSample][iToy].get(), SampleHandler);
1250  chi2_pred[iSample][iToy] = CalcLLH(PredFluctHist.get(), MC_Hist_Toy[iSample][iToy].get(), W2_Hist_Toy[iSample][iToy].get(), SampleHandler);
1251 
1252  // II. RATE-ONLY (total normalization)
1253  chi2_rate_dat[iSample][iToy] = CalcLLH(Data_Hist[iSample]->Integral(), MC_Hist_Toy[iSample][iToy]->Integral(), W2_Hist_Toy[iSample][iToy]->Integral(), SampleHandler);
1254  chi2_rate_mc[iSample][iToy] = CalcLLH(DrawFluctHist->Integral(), MC_Hist_Toy[iSample][iToy]->Integral(), W2_Hist_Toy[iSample][iToy]->Integral(), SampleHandler);
1255  chi2_rate_pred[iSample][iToy] = CalcLLH(PredFluctHist->Integral(), MC_Hist_Toy[iSample][iToy]->Integral(), W2_Hist_Toy[iSample][iToy]->Integral(), SampleHandler);
1256 
1257  // III. accumulate global sums ---
1258  chi2_dat[TotalNumberOfSamples][iToy] += chi2_dat[iSample][iToy];
1259  chi2_mc[TotalNumberOfSamples][iToy] += chi2_mc[iSample][iToy];
1260  chi2_pred[TotalNumberOfSamples][iToy] += chi2_pred[iSample][iToy];
1261 
1262  chi2_rate_dat[TotalNumberOfSamples][iToy] += chi2_rate_dat[iSample][iToy];
1263  chi2_rate_mc[TotalNumberOfSamples][iToy] += chi2_rate_mc[iSample][iToy];
1264  chi2_rate_pred[TotalNumberOfSamples][iToy] += chi2_rate_pred[iSample][iToy];
1265  }
1266  }
1267 
1268  // 4. Produce pValue plots
1269  // Shape+rate posterior predictive checks
1270  MakeChi2Plots(chi2_mc, "-2LLH (Draw Fluc, Draw)", chi2_dat, "-2LLH (Data, Draw)", SampleDir, "_drawfluc_draw");
1271  MakeChi2Plots(chi2_pred, "-2LLH (Pred Fluc, Draw)", chi2_dat, "-2LLH (Data, Draw)", SampleDir, "_predfluc_draw");
1272 
1273  // Rate-only posterior predictive checks
1274  MakeChi2Plots(chi2_rate_mc, "-2LLH (Rate Draw Fluc, Draw)", chi2_rate_dat, "-2LLH (Rate Data, Draw)", SampleDir, "_rate_drawfluc_draw");
1275  MakeChi2Plots(chi2_rate_pred, "-2LLH (Rate Pred Fluc, Draw)", chi2_rate_dat, "-2LLH (Rate Data, Draw)", SampleDir, "_rate_predfluc_draw");
1276 }
void MakeChi2Plots(const std::vector< std::vector< double >> &Chi2_x, const std::string &Chi2_x_title, const std::vector< std::vector< double >> &Chi2_y, const std::string &Chi2_y_title, const std::vector< TDirectory * > &SampleDir, const std::string Title)
Produce Chi2 plot for a single sample based on which $p$-value is calculated.
void MakeFluctuatedHistogram(TH1 *FluctHist, TH1 *PolyHist)
Make Poisson fluctuation of TH1D hist.

◆ PredictiveLLH()

void PredictiveThrower::PredictiveLLH ( const std::vector< std::unique_ptr< TH1 >> &  Data_histogram,
const std::vector< std::unique_ptr< TH1 >> &  PostPred_mc,
const std::vector< std::unique_ptr< TH1 >> &  PostPred_w,
const std::vector< TDirectory * > &  SampleDir 
)
private

Calculate Posterior Predictive LLH.

Definition at line 1279 of file PredictiveThrower.cpp.

1282  {
1283 // *************************
1284  MACH3LOG_INFO("{:<55} {:<10} {:<10} {:<10}", "Sample", "DataInt", "MCInt", "-2LLH");
1285  MACH3LOG_INFO("{:-<55} {:-<10} {:-<10} {:-<10}", "", "", "", "");
1286  for (int iSample = 0; iSample < TotalNumberOfSamples; ++iSample) {
1287  SampleDir[iSample]->cd();
1288  ExtractLLH(Data_histogram[iSample].get(), PostPred_mc[iSample].get(), PostPred_w[iSample].get(), SampleInfo[iSample].SamHandler);
1289  PostPred_mc[iSample]->Write();
1290  }
1291 }
void ExtractLLH(TH1 *DatHist, TH1 *MCHist, TH1 *W2Hist, const SampleHandlerInterface *SampleHandler) const
Calculate the LLH for TH1, set the LLH to title of MCHist.

◆ ProduceSpectra()

void PredictiveThrower::ProduceSpectra ( const std::vector< std::vector< std::vector< std::unique_ptr< TH1D >>>> &  Toys,
const std::vector< TDirectory * > &  Director,
const std::string  suffix 
) const
private

Produce Violin style spectra.

Definition at line 732 of file PredictiveThrower.cpp.

734  {
735 // *************************
736  std::vector<std::vector<double>> MaxValue(TotalNumberOfSamples);
737 
738  // 1. Create Max value
739  for (int sample = 0; sample < TotalNumberOfSamples; ++sample) {
740  const int nDims = SampleInfo[sample].Dimenstion;
741  MaxValue[sample].assign(nDims, 0);
742  }
743 
744  // 2. Find maximum entries over all toys
745  #ifdef MULTITHREAD
746  #pragma omp parallel for
747  #endif
748  for (int sample = 0; sample < TotalNumberOfSamples; ++sample) {
749  for (int toy = 0; toy < Ntoys; ++toy) {
750  const int nDims = SampleInfo[sample].Dimenstion;
751  for (int dim = 0; dim < nDims; dim++) {
752  double max_val = Toys[sample][toy][dim]->GetMaximum();
753  MaxValue[sample][dim] = std::max(MaxValue[sample][dim], max_val);
754  }
755  }
756  }
757 
758  // 3. Make actual spectra histogram (this is because making ROOT histograms is not save)
759  // And we now have actual max values
760  std::vector<std::vector<std::unique_ptr<TH2D>>> Spectra(TotalNumberOfSamples);
761  for (int sample = 0; sample < TotalNumberOfSamples; ++sample) {
762  const int nDims = SampleInfo[sample].Dimenstion;
763  Spectra[sample].resize(nDims);
764  for (int dim = 0; dim < nDims; dim++) {
765  // Get MC histogram x-axis binning
766  TH1D* refHist = Toys[sample][0][dim].get();
767 
768  const int n_bins_x = refHist->GetNbinsX();
769  std::vector<double> x_bin_edges(n_bins_x + 1);
770  for (int b = 0; b < n_bins_x; ++b) {
771  x_bin_edges[b] = refHist->GetXaxis()->GetBinLowEdge(b + 1);
772  }
773  x_bin_edges[n_bins_x] = refHist->GetXaxis()->GetBinUpEdge(n_bins_x);
774 
775  constexpr int n_bins_y = 400;
776  constexpr double y_min = 0.0;
777  const double y_max = MaxValue[sample][dim] * 1.05;
778 
779  // Create TH2D with variable binning on x axis
780  Spectra[sample][dim] = std::make_unique<TH2D>(
781  (SampleInfo[sample].Name + "_" + suffix + "_dim" + std::to_string(dim)).c_str(), // name
782  (SampleInfo[sample].Name + "_" + suffix + "_dim" + std::to_string(dim)).c_str(), // title
783  n_bins_x, x_bin_edges.data(), // x axis bins
784  n_bins_y, y_min, y_max // y axis bins
785  );
786 
787  Spectra[sample][dim]->GetXaxis()->SetTitle(refHist->GetXaxis()->GetTitle());
788  Spectra[sample][dim]->GetYaxis()->SetTitle("Events");
789 
790  Spectra[sample][dim]->SetDirectory(nullptr);
791  Spectra[sample][dim]->Sumw2(true);
792  }
793  }
794 
795  // 4. now we can actually fill our projections
796  #ifdef MULTITHREAD
797  #pragma omp parallel for collapse(2)
798  #endif
799  for (int sample = 0; sample < TotalNumberOfSamples; ++sample) {
800  for (int toy = 0; toy < Ntoys; ++toy) {
801  const int nDims = SampleInfo[sample].Dimenstion;
802  for (int dim = 0; dim < nDims; dim++) {
803  FastViolinFill(Spectra[sample][dim].get(), Toys[sample][toy][dim].get());
804  }
805  }
806  }
807 
808  // 5. Save histograms which is not thread save
809  for (int sample = 0; sample < TotalNumberOfSamples; ++sample) {
810  SampleDirectories[sample]->cd();
811  const int nDims = SampleInfo[sample].Dimenstion;
812  for (long unsigned int dim = 0; dim < Spectra[sample].size(); dim++) {
813  Spectra[sample][dim]->Write();
814  // For case of 2D make additional histograms
815  if(nDims == 2) {
816  const std::string name = SampleInfo[sample].Name + "_" + suffix+ "_PostPred_dim" + std::to_string(dim);
817  auto Summary = MakeSummaryFromSpectra(Spectra[sample][dim].get(), name);
818  Summary->Write();
819  }
820  }
821  }
822 }
void FastViolinFill(TH2D *violin, TH1D *hist_1d)
KS: Fill Violin histogram with entry from a toy.
std::unique_ptr< TH1D > MakeSummaryFromSpectra(const TH2D *Spectra, const std::string &name)
Build a 1D posterior-predictive summary from a violin spectrum.

◆ ProduceToys()

void PredictiveThrower::ProduceToys ( )

Produce toys by throwing from MCMC.

KS: Names of parameter groups that will not be varied

KS: Index of parameters that will be varied

this store value of parameters sampled from a chain

Definition at line 439 of file PredictiveThrower.cpp.

439  {
440 // *************************
441  // If we found toys then skip process of making new toys
442  if(LoadToys()) return;
443 
445  std::vector<std::string> ParameterGroupsNotVaried;
447  std::unordered_set<int> ParameterOnlyToVary;
448  // For study where one would like to apply bounds
449  std::vector<const M3::float_t*> BoundValuePointer;
450  std::vector<std::pair<double, double>> ParamBounds;
451 
452  // Setup useful information for toy generation
453  SetupToyGeneration(ParameterGroupsNotVaried, ParameterOnlyToVary,
454  BoundValuePointer, ParamBounds);
455 
456  auto PosteriorFileName = Get<std::string>(fitMan->raw()["Predictive"]["PosteriorFile"], __FILE__, __LINE__);
457 
458  MACH3LOG_INFO("Starting {}", __func__);
459 
460  outputFile->cd();
461  double Penalty = 0, Weight = 1;
462  int Draw = 0;
463 
464  TTree *ToyTree = new TTree("ToySummary", "ToySummary");
465  ToyTree->Branch("Penalty", &Penalty, "Penalty/D");
466  ToyTree->Branch("Weight", &Weight, "Weight/D");
467  ToyTree->Branch("Draw", &Draw, "Draw/I");
468  ToyTree->Branch("NModelParams", &NModelParams, "NModelParams/I");
469 
470  // KS: define branches so we can keep track of what params we are throwing
471  std::vector<double> ParamValues(NModelParams);
472  std::vector<const M3::float_t*> ParampPointers(NModelParams);
473  int ParamCounter = 0;
474  for (size_t iSys = 0; iSys < systematics.size(); iSys++)
475  {
476  for (int iPar = 0; iPar < systematics[iSys]->GetNumParams(); iPar++)
477  {
478  ParampPointers[ParamCounter] = systematics[iSys]->RetPointer(iPar);
479  std::string Name = systematics[iSys]->GetParFancyName(iPar);
480  //CW: Also strip out - signs because it messes up TBranches
481  while (Name.find("-") != std::string::npos) {
482  Name.replace(Name.find("-"), 1, std::string("_"));
483  }
484  ToyTree->Branch(Name.c_str(), &ParamValues[ParamCounter], (Name + "/D").c_str());
485  ParamCounter++;
486  }
487  }
488  TDirectory* ToyDirectory = outputFile->mkdir("Toys");
489  ToyDirectory->cd();
490  int SampleCounter = 0;
491  for (size_t iPDF = 0; iPDF < samples.size(); iPDF++)
492  {
493  auto* MaCh3Sample = samples[iPDF];
494  for (int SampleIndex = 0; SampleIndex < MaCh3Sample->GetNSamples(); ++SampleIndex)
495  {
496  // Get nominal spectra and event rates
497  const TH1* DataHist = MaCh3Sample->GetDataHist(SampleIndex);
498  Data_Hist[SampleCounter] = M3::Clone(DataHist, MaCh3Sample->GetSampleTitle(SampleIndex) + "_data");
499  Data_Hist[SampleCounter]->Write((MaCh3Sample->GetSampleTitle(SampleIndex) + "_data").c_str());
500 
501  const TH1* MCHist = MaCh3Sample->GetMCHist(SampleIndex);
502  MC_Nom_Hist[SampleCounter] = M3::Clone(MCHist, MaCh3Sample->GetSampleTitle(SampleIndex) + "_mc");
503  MC_Nom_Hist[SampleCounter]->Write((MaCh3Sample->GetSampleTitle(SampleIndex) + "_mc").c_str());
504 
505  const TH1* W2Hist = MaCh3Sample->GetW2Hist(SampleIndex);
506  W2_Nom_Hist[SampleCounter] = M3::Clone(W2Hist, MaCh3Sample->GetSampleTitle(SampleIndex) + "_w2");
507  W2_Nom_Hist[SampleCounter]->Write((MaCh3Sample->GetSampleTitle(SampleIndex) + "_w2").c_str());
508  SampleCounter++;
509  }
510  }
511 
512  TDirectory* Toy_1DDirectory = outputFile->mkdir("Toys_1DHistVar");
513  TDirectory* Toy_2DDirectory = outputFile->mkdir("Toys_2DHistVar");
515  std::vector<std::vector<double>> branch_vals(systematics.size());
516  std::vector<std::vector<std::string>> branch_name(systematics.size());
517 
518  TChain* PosteriorFile = nullptr;
519  unsigned int burn_in = 0;
520  unsigned int maxNsteps = 0;
521  unsigned int Step = 0;
522  if(!Is_PriorPredictive)
523  {
524  PosteriorFile = new TChain("posteriors");
525  PosteriorFile->Add(PosteriorFileName.c_str());
526 
527  PosteriorFile->SetBranchAddress("step", &Step);
528  if (PosteriorFile->GetBranch("Weight")) {
529  PosteriorFile->SetBranchStatus("Weight", true);
530  PosteriorFile->SetBranchAddress("Weight", &Weight);
531  } else {
532  MACH3LOG_WARN("Not applying reweighting weight");
533  Weight = 1.0;
534  }
535 
536  for (size_t s = 0; s < systematics.size(); ++s) {
537  auto fancy_names = GetStoredFancyName(systematics[s]);
538  systematics[s]->MatchMaCh3OutputBranches(PosteriorFile, branch_vals[s], branch_name[s], fancy_names);
539  }
540 
541  //Get the burn-in from the config
542  burn_in = Get<unsigned int>(fitMan->raw()["Predictive"]["BurnInSteps"], __FILE__, __LINE__);
543 
544  //DL: Adding sanity check for chains shorter than burn in
545  maxNsteps = static_cast<unsigned int>(PosteriorFile->GetMaximum("step"));
546  if(burn_in >= maxNsteps)
547  {
548  MACH3LOG_ERROR("You are running on a chain shorter than burn in cut");
549  MACH3LOG_ERROR("Maximal value of nSteps: {}, burn in cut {}", maxNsteps, burn_in);
550  MACH3LOG_ERROR("You will run into infinite loop");
551  MACH3LOG_ERROR("You can make new chain or modify burn in cut");
552  throw MaCh3Exception(__FILE__,__LINE__);
553  }
554  }
555 
556  TStopwatch TempClock;
557  TempClock.Start();
558  for(int i = 0; i < Ntoys; i++)
559  {
560  if(Ntoys >= 10 && i % (Ntoys/10) == 0) {
562  }
563  if(!Is_PriorPredictive){
564  int entry = 0;
565  Step = 0;
566  // KS This allow to set additional bounds like mass ordering
567  bool WithinBounds = false;
568  //YSP: Ensures you get an entry from the mcmc even when burn_in is set to zero (Although not advised :p ).
569  //Take 200k burn in steps, WP: Eb C in 1st peaky
570  // If we have combined chains by hadd need to check the step in the chain
571  // Note, entry is not necessarily same as step due to merged ROOT files, so can't choose entry in the range BurnIn - nEntries :(
572  while(Step < burn_in || !WithinBounds) {
573  entry = random->Integer(static_cast<unsigned int>(PosteriorFile->GetEntries()));
574  PosteriorFile->GetEntry(entry);
575  // KS: This might be bit hacky... but BoundValuePointer refer to values in ParameterHandler
576  // so we need to update them
577  if(BoundValuePointer.size() > 0) {
578  for (size_t s = 0; s < systematics.size(); ++s) {
579  systematics[s]->SetParameters(branch_vals[s]);
580  }
581  }
582  WithinBounds = CheckBounds(BoundValuePointer, ParamBounds);
583  }
584  Draw = entry;
585  }
586  for (size_t s = 0; s < systematics.size(); ++s)
587  {
588  //KS: Below line can help you get prior predictive distributions which are helpful for getting pre and post ND fit spectra
589  //YSP: If not set in the config, the code runs SK Posterior Predictive distributions by default. If true, then the code runs SK prior predictive.
590  if(Is_PriorPredictive) {
591  systematics[s]->ThrowParameters();
592  } else {
593  systematics[s]->SetParameters(branch_vals[s]);
594  }
595  }
596 
597  // This set some params to prior value this way you can evaluate errors from subset of errors
598  SetParamters(ParameterGroupsNotVaried, ParameterOnlyToVary);
599 
600  Penalty = 0;
601  if(FullLLH) {
602  for (size_t s = 0; s < systematics.size(); ++s) {
603  //KS: do times 2 because banff reports chi2
604  Penalty = 2.0 * systematics[s]->GetLikelihood();
605  }
606  }
607 
608  PenaltyTerm[i] = Penalty;
609  ReweightWeight[i] = Weight;
610 
611  for (size_t iPDF = 0; iPDF < samples.size(); iPDF++) {
612  samples[iPDF]->Reweight();
613  }
614  // Save histograms to file
615  WriteToy(ToyDirectory, Toy_1DDirectory, Toy_2DDirectory, i);
616 
617  // Fill parameter value so we know throw values
618  for (size_t iPar = 0; iPar < ParamValues.size(); iPar++) {
619  ParamValues[iPar] = *ParampPointers[iPar];
620  }
621 
622  ToyTree->Fill();
623  }//end of toys loop
624  TempClock.Stop();
625 
626  if(PosteriorFile) delete PosteriorFile;
627  ToyDirectory->Close(); delete ToyDirectory;
628  Toy_1DDirectory->Close(); delete Toy_1DDirectory;
629  Toy_2DDirectory->Close(); delete Toy_2DDirectory;
630 
631  outputFile->cd();
632  ToyTree->Write(); delete ToyTree;
633 
634  MACH3LOG_INFO("{} took {:.2f}s to finish for {} toys", __func__, TempClock.RealTime(), Ntoys);
635 }
bool CheckBounds(const std::vector< const M3::float_t * > &BoundValuePointer, const std::vector< std::pair< double, double >> &ParamBounds)
TFile * outputFile
Output.
Definition: FitterBase.h:148
std::vector< SampleHandlerInterface * > samples
Sample holder.
Definition: FitterBase.h:130
std::vector< ParameterHandlerBase * > systematics
Systematic holder.
Definition: FitterBase.h:135
void WriteToy(TDirectory *ToyDirectory, TDirectory *Toy_1DDirectory, TDirectory *Toy_2DDirectory, const int iToy)
Save histograms for a single MCMC Throw/Toy.
bool LoadToys()
Load existing toys.
void SetupToyGeneration(std::vector< std::string > &ParameterGroupsNotVaried, std::unordered_set< int > &ParameterOnlyToVary, std::vector< const M3::float_t * > &BoundValuePointer, std::vector< std::pair< double, double >> &ParamBounds)
Setup useful variables etc before stating toy generation.
std::vector< std::string > GetStoredFancyName(ParameterHandlerBase *Systematics) const
Get Fancy parameters stored in mcmc chains for passed ParameterHandler.
void SetParamters(std::vector< std::string > &ParameterGroupsNotVaried, std::unordered_set< int > &ParameterOnlyToVary)
This set some params to prior value this way you can evaluate errors from subset of errors.
void PrintProgressBar(const Long64_t Done, const Long64_t All)
KS: Simply print progress bar.
Definition: Monitor.cpp:229

◆ RateAnalysis()

void PredictiveThrower::RateAnalysis ( const std::vector< std::vector< std::unique_ptr< TH1 >>> &  Toys,
const std::vector< TDirectory * > &  SampleDirectories 
) const
private

Produce distribution of number of events for each sample.

Definition at line 1493 of file PredictiveThrower.cpp.

1494  {
1495 // *************************
1496  std::vector<std::unique_ptr<TH1D>> EventHist(TotalNumberOfSamples+1);
1497  for (int iSample = 0; iSample < TotalNumberOfSamples+1; ++iSample) {
1498  std::string Title = "EventHist: ";
1499  if (iSample == TotalNumberOfSamples) {
1500  Title = "Total";
1501  } else {
1502  Title = SampleInfo[iSample].Name;
1503  }
1504  Title += "_sum";
1505  //KS: When a histogram is created with an axis lower limit greater or equal to its upper limit ROOT will automatically adjust histogram range
1506  // https://root.cern.ch/doc/master/classTH1.html#auto-bin
1507  EventHist[iSample] = std::make_unique<TH1D>(Title.c_str(), Title.c_str(), 100, 1, -1);
1508  EventHist[iSample]->SetDirectory(nullptr);
1509  EventHist[iSample]->GetXaxis()->SetTitle("Total event rate");
1510  EventHist[iSample]->GetYaxis()->SetTitle("Counts");
1511  EventHist[iSample]->SetLineWidth(2);
1512  }
1513 
1514  // First fill per-sample histograms
1515  #ifdef MULTITHREAD
1516  #pragma omp parallel for
1517  #endif
1518  for (int iSample = 0; iSample < TotalNumberOfSamples; ++iSample) {
1519  for (int iToy = 0; iToy < Ntoys; ++iToy) {
1520  double Count = Toys[iSample][iToy]->Integral();
1521  EventHist[iSample]->Fill(Count);
1522  }
1523  }
1524 
1525  // Now fill total histogram properly (per toy)
1526  for (int iToy = 0; iToy < Ntoys; ++iToy) {
1527  double TotalCount = 0.0;
1528  for (int iSample = 0; iSample < TotalNumberOfSamples; ++iSample) {
1529  TotalCount += Toys[iSample][iToy]->Integral();
1530  }
1531  EventHist[TotalNumberOfSamples]->Fill(TotalCount);
1532  }
1533 
1534  double DataRate = 0.0;
1535  std::vector<double> DataRates(TotalNumberOfSamples+1);
1536  #ifdef MULTITHREAD
1537  #pragma omp parallel for reduction(+:DataRate)
1538  #endif
1539  for (int i = 0; i < TotalNumberOfSamples; ++i) {
1540  DataRates[i] = Data_Hist[i]->Integral();
1541  DataRate += DataRates[i];
1542  }
1543  DataRates[TotalNumberOfSamples] = DataRate;
1544 
1545  for (int SampleNum = 0; SampleNum < TotalNumberOfSamples+1; ++SampleNum) {
1546  SampleDirectories[SampleNum]->cd();
1547  //Make fancy event rate histogram
1548  MakeCutEventRate(EventHist[SampleNum].get(), DataRates[SampleNum]);
1549  }
1550 }
void MakeCutEventRate(TH1D *Histogram, const double DataRate) const
Make the 1D Event Rate Hist.

◆ RunMCMC()

void PredictiveThrower::RunMCMC ( )
inlineoverridevirtual

This is not used in this class.

Implements FitterBase.

Definition at line 53 of file PredictiveThrower.h.

53  {
54  MACH3LOG_ERROR("{} is not supported in {}", __func__, GetName());
55  throw MaCh3Exception(__FILE__ , __LINE__ );
56  };
std::string GetName() const
Get name of class.
Definition: FitterBase.h:71

◆ RunPredictiveAnalysis()

void PredictiveThrower::RunPredictiveAnalysis ( )

Main routine responsible for producing posterior predictive distributions and $p$-value.

Definition at line 1013 of file PredictiveThrower.cpp.

1013  {
1014 // *************************
1015  // Remove not useful stuff
1016  SanitiseInputs();
1017 
1018  MACH3LOG_INFO("Starting {}", __func__);
1019  MACH3LOG_WARN("\033[0;31mCurrent Total RAM usage is {:.2f} GB\033[0m", M3::Utils::getValue("VmRSS") / 1048576.0);
1020  MACH3LOG_WARN("\033[0;31mOut of Total available RAM {:.2f} GB\033[0m", M3::Utils::getValue("MemTotal") / 1048576.0);
1021 
1022  TStopwatch TempClock;
1023  TempClock.Start();
1024 
1025  auto DebugHistograms = GetFromManager<bool>(fitMan->raw()["Predictive"]["DebugHistograms"], false, __FILE__, __LINE__);
1026  auto StudyBeta = GetFromManager<bool>(fitMan->raw()["Predictive"]["StudyBetaParameters"], true, __FILE__, __LINE__);
1027 
1028  TDirectory* PredictiveDir = outputFile->mkdir("Predictive");
1029  std::vector<TDirectory*> SampleDirectories;
1030  SampleDirectories.resize(TotalNumberOfSamples+1);
1031 
1032  // open directory for every sample
1033  for (int sample = 0; sample < TotalNumberOfSamples+1; ++sample) {
1034  SampleDirectories[sample] = PredictiveDir->mkdir(SampleInfo[sample].Name.c_str());
1035  }
1036 
1037  // Produce Violin style spectra
1038  Study1DProjections(SampleDirectories);
1039  // Produce posterior predictive distribution for mc
1040  auto PostPred_mc = MakePredictive(MC_Hist_Toy, SampleDirectories, "mc", DebugHistograms, false);
1041  // Produce posterior predictive distribution for w2
1042  auto PostPred_w2 = MakePredictive(W2_Hist_Toy, SampleDirectories, "w2", false, false);
1043  // Calculate Posterior Predictive LLH
1044  PredictiveLLH(Data_Hist, PostPred_mc, PostPred_w2, SampleDirectories);
1045  // Calculate Posterior Predictive $p$-value
1046  PosteriorPredictivepValue(PostPred_mc, SampleDirectories);
1047  // Check how number of events changed
1048  RateAnalysis(MC_Hist_Toy, SampleDirectories);
1049 
1050  // Studying information criterion
1051  StudyInformationCriterion(M3::kWAIC, PostPred_mc, PostPred_w2);
1052 
1053  // Close directories
1054  for (int sample = 0; sample < TotalNumberOfSamples+1; ++sample) {
1055  SampleDirectories[sample]->Close();
1056  delete SampleDirectories[sample];
1057  }
1058  // Perform beta analysis for mc statical uncertainty
1059  if(StudyBeta) StudyBetaParameters(PredictiveDir);
1060 
1061  PredictiveDir->Close();
1062  delete PredictiveDir;
1063 
1064  outputFile->cd();
1065 
1066  TempClock.Stop();
1067  MACH3LOG_INFO("{} took {:.2f}s to finish for {} toys", __func__, TempClock.RealTime(), Ntoys);
1068 }
void SanitiseInputs()
Remove obsolete memory and make other checks before fit starts.
Definition: FitterBase.cpp:221
void PosteriorPredictivepValue(const std::vector< std::unique_ptr< TH1 >> &PostPred_mc, const std::vector< TDirectory * > &SampleDir)
Calculate Posterior Predictive $p$-value Compares observed data to toy datasets generated from:
void StudyInformationCriterion(M3::kInfCrit Criterion, const std::vector< std::unique_ptr< TH1 >> &PostPred_mc, const std::vector< std::unique_ptr< TH1 >> &PostPred_w)
Information Criterion.
void PredictiveLLH(const std::vector< std::unique_ptr< TH1 >> &Data_histogram, const std::vector< std::unique_ptr< TH1 >> &PostPred_mc, const std::vector< std::unique_ptr< TH1 >> &PostPred_w, const std::vector< TDirectory * > &SampleDir)
Calculate Posterior Predictive LLH.
void RateAnalysis(const std::vector< std::vector< std::unique_ptr< TH1 >>> &Toys, const std::vector< TDirectory * > &SampleDirectories) const
Produce distribution of number of events for each sample.
void Study1DProjections(const std::vector< TDirectory * > &SampleDirectories) const
Load 1D projections and later produce violin plots for each.
void StudyBetaParameters(TDirectory *PredictiveDir)
Evaluate prior/post predictive distribution for beta parameters (used for evaluating impact MC statis...
std::vector< std::unique_ptr< TH1 > > MakePredictive(const std::vector< std::vector< std::unique_ptr< TH1 >>> &Toys, const std::vector< TDirectory * > &Director, const std::string &suffix, const bool DebugHistograms, const bool WriteHist)
Produce posterior predictive distribution.
int getValue(const std::string &Type)
CW: Get info like RAM.
Definition: Monitor.cpp:252
@ kWAIC
Watanabe-Akaike information criterion.

◆ SetParamters()

void PredictiveThrower::SetParamters ( std::vector< std::string > &  ParameterGroupsNotVaried,
std::unordered_set< int > &  ParameterOnlyToVary 
)
private

This set some params to prior value this way you can evaluate errors from subset of errors.

Have ability to not throw legacy matrices

Alternatively vary only selected params

Definition at line 43 of file PredictiveThrower.cpp.

44  {
45 // *************************
46  // WARNING This should be removed in the future
47  auto DoNotThrowLegacyCov = GetFromManager<std::vector<std::string>>(fitMan->raw()["Predictive"]["DoNotThrowLegacyCov"], {}, __FILE__, __LINE__);
49  for (size_t i = 0; i < DoNotThrowLegacyCov.size(); ++i) {
50  for (size_t s = 0; s < systematics.size(); ++s) {
51  if (systematics[s]->GetName() == DoNotThrowLegacyCov[i]) {
52  systematics[s]->SetParameters();
53  break;
54  }
55  }
56  }
57 
58  // Set groups to prefit values if they were set to not be varies
59  if(ModelSystematic && ParameterGroupsNotVaried.size() > 0) {
60  ModelSystematic->SetGroupOnlyParameters(ParameterGroupsNotVaried);
61  }
62 
64  if (ModelSystematic && !ParameterOnlyToVary.empty()) {
65  for (int i = 0; i < ModelSystematic->GetNumParams(); ++i) {
66  // KS: If parameter is in map then we are skipping this, otherwise for params that we don't want to vary we simply set it to prior
67  if (ParameterOnlyToVary.find(i) == ParameterOnlyToVary.end()) {
69  }
70  }
71  }
72 }
int GetNumParams() const
Get total number of parameters.
void SetParProp(const int i, const double val)
Set proposed parameter value.
double GetParInit(const int i) const
Get prior parameter value.
void SetGroupOnlyParameters(const std::string &Group, const std::vector< double > &Pars={})
KS Function to set to prior parameters of a given group or values from vector.

◆ SetupSampleInformation()

void PredictiveThrower::SetupSampleInformation ( )
private

Setup sample information.

Definition at line 75 of file PredictiveThrower.cpp.

75  {
76 // *************************
78  for (size_t iPDF = 0; iPDF < samples.size(); iPDF++) {
79  TotalNumberOfSamples += samples[iPDF]->GetNSamples();
80  }
81 
87 
89 
90 
91  for (int sample = 0; sample < TotalNumberOfSamples; ++sample) {
92  MC_Hist_Toy[sample].resize(Ntoys);
93  W2_Hist_Toy[sample].resize(Ntoys);
94  }
95  int counter = 0;
96  for (size_t iPDF = 0; iPDF < samples.size(); iPDF++) {
97  for (int SampleIndex = 0; SampleIndex < samples[iPDF]->GetNSamples(); ++SampleIndex) {
98  SampleInfo[counter].Name = samples[iPDF]->GetSampleTitle(SampleIndex);
99  SampleInfo[counter].LocalId = SampleIndex;
100  SampleInfo[counter].SamHandler = samples[iPDF];
101  SampleInfo[counter].Dimenstion = SampleInfo[counter].SamHandler->GetNDim(SampleIndex);
102  counter++;
103  }
104  }
105  SampleInfo[TotalNumberOfSamples].Name= "Total";
106 }

◆ SetupToyGeneration()

void PredictiveThrower::SetupToyGeneration ( std::vector< std::string > &  ParameterGroupsNotVaried,
std::unordered_set< int > &  ParameterOnlyToVary,
std::vector< const M3::float_t * > &  BoundValuePointer,
std::vector< std::pair< double, double >> &  ParamBounds 
)
private

Setup useful variables etc before stating toy generation.

Let's ask the manager what are the file with covariance matrix

Definition at line 110 of file PredictiveThrower.cpp.

113  {
114 // *************************
115  int counter = 0;
116  for (size_t s = 0; s < systematics.size(); ++s) {
117  auto* MaCh3Params = dynamic_cast<ParameterHandlerGeneric*>(systematics[s]);
118  if(MaCh3Params) {
119  ModelSystematic = MaCh3Params;
120  counter++;
121  }
122  }
123 
125 
126  if(Is_PriorPredictive) {
127  MACH3LOG_INFO("You've chosen to run Prior Predictive Distribution");
128  } else {
129  auto PosteriorFileName = Get<std::string>(fitMan->raw()["Predictive"]["PosteriorFile"], __FILE__, __LINE__);
130  //KS: We use MCMCProcessor to get names of covariances that were actually used to produce given chain
131  MCMCProcessor Processor(PosteriorFileName);
132  Processor.Initialise();
133 
134  // For throwing FD predictions from ND-only chain we have to allow having different yaml configs
135  auto AllowDifferentConfigs = GetFromManager<bool>(fitMan->raw()["Predictive"]["AllowDifferentConfigs"], false, __FILE__, __LINE__);
136 
138  YAML::Node ConfigInChain = Processor.GetCovConfig(kXSecPar);
139  if(ModelSystematic) {
140  YAML::Node ConfigNow = ModelSystematic->GetConfig();
141  if (!compareYAMLNodes(ConfigNow, ConfigInChain))
142  {
143  if(AllowDifferentConfigs){
144  MACH3LOG_WARN("Yaml configs used for your ParameterHandler for chain you want sample from ({}) and one currently initialised are different", PosteriorFileName);
145  } else {
146  MACH3LOG_ERROR("Yaml configs used for your ParameterHandler for chain you want sample from ({}) and one currently initialised are different", PosteriorFileName);
147  throw MaCh3Exception(__FILE__ , __LINE__ );
148  }
149  }
150  }
151  }
152  if(counter > 1) {
153  MACH3LOG_ERROR("Found {} ParmaterHandler inheriting from ParameterHandlerGeneric, I can accept at most 1", counter);
154  throw MaCh3Exception(__FILE__, __LINE__);
155  }
156 
157  for (size_t s = 0; s < systematics.size(); ++s) {
158  NModelParams += systematics[s]->GetNumParams();
159  }
160 
161  if (ModelSystematic) {
162  auto ThrowParamGroupOnly = GetFromManager<std::vector<std::string>>(fitMan->raw()["Predictive"]["ThrowParamGroupOnly"], {}, __FILE__, __LINE__);
163  auto UniqueParamGroup = ModelSystematic->GetUniqueParameterGroups();
164  auto ParameterOnlyToVaryString = GetFromManager<std::vector<std::string>>(fitMan->raw()["Predictive"]["ThrowSinlgeParams"], {}, __FILE__, __LINE__);
165 
166  if (!ThrowParamGroupOnly.empty() && !ParameterOnlyToVaryString.empty()) {
167  MACH3LOG_ERROR("Can't use ThrowParamGroupOnly and ThrowSinlgeParams at the same time");
168  throw MaCh3Exception(__FILE__, __LINE__);
169  }
170 
171  if (!ParameterOnlyToVaryString.empty()) {
172  MACH3LOG_INFO("I will throw only: {}", fmt::join(ParameterOnlyToVaryString, ", "));
173  std::vector<int> ParameterVary(ParameterOnlyToVaryString.size());
174 
175  for (size_t i = 0; i < ParameterOnlyToVaryString.size(); ++i) {
176  ParameterVary[i] = ModelSystematic->GetParIndex(ParameterOnlyToVaryString[i]);
177  if (ParameterVary[i] == M3::_BAD_INT_) {
178  MACH3LOG_ERROR("Can't proceed if param {} is missing", ParameterOnlyToVaryString[i]);
179  throw MaCh3Exception(__FILE__, __LINE__);
180  }
181  }
182  ParameterOnlyToVary = std::unordered_set<int>(ParameterVary.begin(), ParameterVary.end());
183  } else {
184  MACH3LOG_INFO("I have following parameter groups: {}", fmt::join(UniqueParamGroup, ", "));
185  if (ThrowParamGroupOnly.empty()) {
186  MACH3LOG_INFO("I will vary all");
187  } else {
188  std::unordered_set<std::string> throwOnlySet(ThrowParamGroupOnly.begin(), ThrowParamGroupOnly.end());
189  ParameterGroupsNotVaried.clear();
190 
191  for (const auto& group : UniqueParamGroup) {
192  if (throwOnlySet.find(group) == throwOnlySet.end()) {
193  ParameterGroupsNotVaried.push_back(group);
194  }
195  }
196 
197  MACH3LOG_INFO("I will vary: {}", fmt::join(ThrowParamGroupOnly, ", "));
198  MACH3LOG_INFO("Exclude: {}", fmt::join(ParameterGroupsNotVaried, ", "));
199  }
200  }
201  }
202 
203  auto paramNode = fitMan->raw()["Predictive"]["ParameterBounds"];
204  for (const auto& p : paramNode) {
205  // Extract name
206  std::string name = p[0].as<std::string>();
207 
208  // Extract bounds: min and max
209  double minVal = p[1][0].as<double>();
210  double maxVal = p[1][1].as<double>();
211  ParamBounds.emplace_back(minVal, maxVal);
212 
213  for (size_t s = 0; s < systematics.size(); ++s) {
214  for(int iPar = 0; iPar < systematics[s]->GetNParameters(); iPar++){
215  if(systematics[s]->GetParFancyName(iPar) == name){
216  BoundValuePointer.push_back(systematics[s]->RetPointer(iPar));
217  break;
218  }
219  }
220  }
221  if(ParamBounds.size() != BoundValuePointer.size()){
222  MACH3LOG_ERROR("Ddin't find paramter {}", name);
223  throw MaCh3Exception(__FILE__,__LINE__);
224  }
225  MACH3LOG_INFO("Parameter: {} with : [{}, {}]", name, minVal, maxVal);
226  }
227  if(Is_PriorPredictive && ParamBounds.size() > 0) {
228  MACH3LOG_ERROR("Additional bounds not supported by prior predictive right now");
229  throw MaCh3Exception(__FILE__,__LINE__);
230  }
231 }
@ kXSecPar
Definition: MCMCProcessor.h:46
bool compareYAMLNodes(const YAML::Node &node1, const YAML::Node &node2, bool Mute=false)
Compare if yaml nodes are identical.
Definition: YamlHelper.h:186
Class responsible for processing MCMC chains, performing diagnostics, generating plots,...
Definition: MCMCProcessor.h:61
int GetParIndex(const std::string &name) const
Get index based on name.
YAML::Node GetConfig() const
Getter to return a copy of the YAML node.
Class responsible for handling of systematic error parameters with different types defined in the con...
std::vector< std::string > GetUniqueParameterGroups() const
KS: Get names of all unique parameter groups.
constexpr static const int _BAD_INT_
Default value used for int initialisation.
Definition: Core.h:55

◆ Study1DProjections()

void PredictiveThrower::Study1DProjections ( const std::vector< TDirectory * > &  SampleDirectories) const
private

Load 1D projections and later produce violin plots for each.

Definition at line 638 of file PredictiveThrower.cpp.

638  {
639 // *************************
640  MACH3LOG_INFO("Starting {}", __func__);
641 
642  TDirectory * ogdir = gDirectory;
643  auto PosteriorFileName = Get<std::string>(fitMan->raw()["Predictive"]["PosteriorFile"], __FILE__, __LINE__);
644  // Open the ROOT file
645  int originalErrorWarning = gErrorIgnoreLevel;
646  gErrorIgnoreLevel = kFatal;
647 
648  TFile* file = TFile::Open(PosteriorFileName.c_str(), "READ");
649 
650  gErrorIgnoreLevel = originalErrorWarning;
651  TDirectory* ToyDir = file->GetDirectory("Toys_1DHistVar");
652  // If toys not amiable in posterior file this means they must be in output file
653  if(ToyDir == nullptr) {
654  ToyDir = outputFile->GetDirectory("Toys_1DHistVar");
655  }
656  // [sample], [toy], [dim]
657  std::vector<std::vector<std::vector<std::unique_ptr<TH1D>>>> ProjectionToys(TotalNumberOfSamples);
658  for (int sample = 0; sample < TotalNumberOfSamples; ++sample) {
659  ProjectionToys[sample].resize(Ntoys);
660  const int nDims = SampleInfo[sample].Dimenstion;
661  for (int iToy = 0; iToy < Ntoys; ++iToy) {
662  ProjectionToys[sample][iToy].resize(nDims);
663  }
664  }
665 
666  for (int iToy = 0; iToy < Ntoys; ++iToy) {
667  if (iToy % 100 == 0) MACH3LOG_INFO(" Loaded Projection toys {}", iToy);
668  for (int sample = 0; sample < TotalNumberOfSamples; ++sample) {
669  const int nDims = SampleInfo[sample].Dimenstion;
670  for(int iDim = 0; iDim < nDims; iDim ++){
671  std::string ProjectionSuffix = "_1DProj" + std::to_string(iDim) + "_" + std::to_string(iToy);
672  TH1D* MCHist1D = static_cast<TH1D*>(ToyDir->Get((SampleInfo[sample].Name + ProjectionSuffix).c_str()));
673  ProjectionToys[sample][iToy][iDim] = M3::Clone(MCHist1D);
674  }
675  } // end loop over samples
676  } // end loop over toys
677  file->Close(); delete file;
678  if(ogdir){ ogdir->cd(); }
679 
680  ProduceSpectra(ProjectionToys, SampleDirectories, "mc");
681  for (int sample = 0; sample < TotalNumberOfSamples; ++sample) {
682  const int nDims = SampleInfo[sample].Dimenstion;
683  // KS: We only care about doing projections for 2D, for 1D we have well 1D, for beyond 2D we have flattened TH1D
684  if(nDims == 2){
685  auto hist = Data_Hist[sample].get();
686  SampleDirectories[sample]->cd();
687 
688  std::string nameX = "Data_" + SampleInfo[sample].Name + "_Dim0";
689  std::string nameY = "Data_" + SampleInfo[sample].Name + "_Dim1";
690 
691  if(std::string(hist->ClassName()) == "TH2Poly") {
692  TAxis* xax = ProjectionToys[sample][0][0]->GetXaxis();
693  TAxis* yax = ProjectionToys[sample][0][1]->GetXaxis();
694 
695  std::vector<double> XBinning(xax->GetNbins()+1);
696  std::vector<double> YBinning(yax->GetNbins()+1);
697 
698  for(int i=0;i<=xax->GetNbins();++i)
699  XBinning[i] = xax->GetBinLowEdge(i+1);
700 
701  for(int i=0;i<=yax->GetNbins();++i)
702  YBinning[i] = yax->GetBinLowEdge(i+1);
703 
704  TH1D* ProjectionX = PolyProjectionX(static_cast<TH2Poly*>(hist), nameX.c_str(), XBinning, false);
705  TH1D* ProjectionY = PolyProjectionY(static_cast<TH2Poly*>(hist), nameY.c_str(), YBinning, false);
706 
707  ProjectionX->SetDirectory(nullptr);
708  ProjectionY->SetDirectory(nullptr);
709 
710  ProjectionX->Write(nameX.c_str());
711  ProjectionY->Write(nameY.c_str());
712 
713  delete ProjectionX;
714  delete ProjectionY;
715  } else { //TH2D
716  TH1D* ProjectionX = static_cast<TH2D*>(hist)->ProjectionX(nameX.c_str());
717  TH1D* ProjectionY = static_cast<TH2D*>(hist)->ProjectionY(nameY.c_str());
718 
719  ProjectionX->SetDirectory(nullptr);
720  ProjectionY->SetDirectory(nullptr);
721 
722  ProjectionX->Write(nameX.c_str());
723  ProjectionY->Write(nameY.c_str());
724  delete ProjectionX;
725  delete ProjectionY;
726  }
727  }
728  }
729 }
TH1D * PolyProjectionX(TObject *poly, std::string TempName, const std::vector< double > &xbins, const bool computeErrors)
WP: Poly Projectors.
TH1D * PolyProjectionY(TObject *poly, std::string TempName, const std::vector< double > &ybins, const bool computeErrors)
WP: Poly Projectors.
void ProduceSpectra(const std::vector< std::vector< std::vector< std::unique_ptr< TH1D >>>> &Toys, const std::vector< TDirectory * > &Director, const std::string suffix) const
Produce Violin style spectra.

◆ StudyBetaParameters()

void PredictiveThrower::StudyBetaParameters ( TDirectory *  PredictiveDir)
private

Evaluate prior/post predictive distribution for beta parameters (used for evaluating impact MC statistical uncertainty)

  1. Initialise Beta histogram
  2. Fill histograms, thread safe as all histograms are allocated before and we loop over samples

ROOT enumerates from 1 while MaCh3 from 0

  1. Write to file

Definition at line 1337 of file PredictiveThrower.cpp.

1337  {
1338 // *************************
1339  bool StudyBeta = GetFromManager<bool>(fitMan->raw()["Predictive"]["StudyBetaParameters"], true, __FILE__, __LINE__ );
1340  if (StudyBeta == false) return;
1341 
1342  MACH3LOG_INFO("Starting {}", __func__);
1343  TDirectory* BetaDir = PredictiveDir->mkdir("BetaParameters");
1344  std::vector<std::vector<std::unique_ptr<TH1D>>> BetaHist(TotalNumberOfSamples);
1345  std::vector<TDirectory *> DirBeta(TotalNumberOfSamples);
1346  // initialise directory for each sample
1347  for (int sample = 0; sample < TotalNumberOfSamples; ++sample) {
1348  BetaDir->cd();
1349  DirBeta[sample] = BetaDir->mkdir(SampleInfo[sample].Name.c_str());
1350  }
1351 
1353  for (int iSample = 0; iSample < TotalNumberOfSamples; ++iSample) {
1354  const int nDims = SampleInfo[iSample].Dimenstion;
1355  // Use any histogram that defines the binning structure
1356  TH1* RefHist = Data_Hist[iSample].get();
1357  BetaHist[iSample] = PerBinHistogram(RefHist, iSample, nDims, "Beta_Parameter");
1358  // Change x-axis title
1359  for (size_t i = 0; i < BetaHist[iSample].size(); ++i) {
1360  BetaHist[iSample][i]->GetXaxis()->SetTitle("beta parameter");
1361  }
1362  }
1363 
1365  #ifdef MULTITHREAD
1366  #pragma omp parallel for
1367  #endif
1368  for (int iSample = 0; iSample < TotalNumberOfSamples; ++iSample) {
1369  const int nDims = SampleInfo[iSample].Dimenstion;
1370  const auto likelihood = SampleInfo[iSample].SamHandler->GetTestStatistic();
1371  for (int iToy = 0; iToy < Ntoys; ++iToy) {
1372  if (nDims == 2) {
1373  if(std::string(Data_Hist[iSample]->ClassName()) == "TH2Poly") {
1374  for (int i = 1; i <= static_cast<TH2Poly*>(Data_Hist[iSample].get())->GetNumberOfBins(); ++i) {
1375  const double Data = Data_Hist[iSample]->GetBinContent(i);
1376  const double MC = MC_Hist_Toy[iSample][iToy]->GetBinContent(i);
1377  const double w2 = W2_Hist_Toy[iSample][iToy]->GetBinContent(i);
1378 
1379  const double BetaParam = GetBetaParameter(Data, MC, w2, likelihood);
1380  BetaHist[iSample][i-1]->Fill(BetaParam, ReweightWeight[iToy]);
1381  } // end loop over poly bins
1382  } else {
1383  const int nX = Data_Hist[iSample]->GetNbinsX();
1384  const int nY = Data_Hist[iSample]->GetNbinsY();
1385  for (int iy = 1; iy <= nY; ++iy) {
1386  for (int ix = 1; ix <= nX; ++ix) {
1387  const int FlatBin = (iy-1) * nX + (ix-1);
1388 
1389  const double Data = Data_Hist[iSample]->GetBinContent(ix, iy);
1390  const double MC = MC_Hist_Toy[iSample][iToy]->GetBinContent(ix, iy);
1391  const double w2 = W2_Hist_Toy[iSample][iToy]->GetBinContent(ix, iy);
1392 
1393  const double BetaParam = GetBetaParameter(Data, MC, w2, likelihood);
1394  BetaHist[iSample][FlatBin]->Fill(BetaParam, ReweightWeight[iToy]);
1395  }
1396  } // end loop over x
1397  } // end loop over y
1398  } else {
1399  int nbinsx = Data_Hist[iSample]->GetNbinsX();
1400  for (int ix = 1; ix <= nbinsx; ++ix) {
1402  const double Data = Data_Hist[iSample]->GetBinContent(ix);
1403  const double MC = MC_Hist_Toy[iSample][iToy]->GetBinContent(ix);
1404  const double w2 = W2_Hist_Toy[iSample][iToy]->GetBinContent(ix);
1405 
1406  const double BetaParam = GetBetaParameter(Data, MC, w2, likelihood);
1407  BetaHist[iSample][ix-1]->Fill(BetaParam, ReweightWeight[iToy]);
1408  } // end loop over bins
1409  }
1410  } // end loop over toys
1411  } // end loop over samples
1412 
1414  for (int iSample = 0; iSample < TotalNumberOfSamples; ++iSample) {
1415  for (size_t iBin = 0; iBin < BetaHist[iSample].size(); iBin++) {
1416  DirBeta[iSample]->cd();
1417  BetaHist[iSample][iBin]->Write();
1418  }
1419  DirBeta[iSample]->Close();
1420  delete DirBeta[iSample];
1421  }
1422  BetaDir->Close();
1423  delete BetaDir;
1424 
1425  PredictiveDir->cd();
1426 }
double GetBetaParameter(const double data, const double mc, const double w2, const TestStatistic TestStat)
KS: Calculate Beta parameter which will be different based on specified test statistic.

◆ StudyBIC()

void PredictiveThrower::StudyBIC ( const std::vector< std::unique_ptr< TH1 >> &  PostPred_mc,
const std::vector< std::unique_ptr< TH1 >> &  PostPred_w 
)
private

Study Bayesian Information Criterion (BIC) The BIC is defined as:

\[ \mathrm{BIC} = -2 \log L + k \log(n) \]

where:

  • \(L\) is the likelihood evaluated at the fitted model
  • \(k\) is the number of model parameters
  • \(n\) is the number of observations

[13]

Definition at line 1584 of file PredictiveThrower.cpp.

1585  {
1586 // ****************
1587  //make fancy event rate histogram
1588  double DataRate = 0.0;
1589  double BinsRate = 0.0;
1590  double TotalLLH = 0.0;
1591  #ifdef MULTITHREAD
1592  #pragma omp parallel for reduction(+:DataRate, BinsRate, TotalLLH)
1593  #endif
1594  for (int i = 0; i < TotalNumberOfSamples; ++i)
1595  {
1596  auto SampleHandler = SampleInfo[i].SamHandler;
1597  auto* h = Data_Hist[i].get();
1598  DataRate += h->Integral();
1599  if (auto h1 = dynamic_cast<TH1D*>(h)) {
1600  BinsRate += h1->GetNbinsX();
1601  } else if (auto h2 = dynamic_cast<TH2D*>(h)) {
1602  BinsRate += h2->GetNbinsX() * h2->GetNbinsY();
1603  } else if (auto h2poly = dynamic_cast<TH2Poly*>(h)) {
1604  BinsRate += h2poly->GetNumberOfBins();
1605  } else {
1606  MACH3LOG_WARN("Unknown histogram type in DataHist[{}]", i);
1607  }
1608  TotalLLH += CalcLLH(Data_Hist[i].get(), PostPred_mc[i].get(), PostPred_w[i].get(), SampleHandler);
1609  }
1610 
1611  const double EventRateBIC = GetBIC(TotalLLH, DataRate, NModelParams);
1612  const double BinBasedBIC = GetBIC(TotalLLH, BinsRate, NModelParams);
1613  MACH3LOG_INFO("Calculated Bayesian Information Criterion using global number of events: {:.2f}", EventRateBIC);
1614  MACH3LOG_INFO("Calculated Bayesian Information Criterion using global number of bins: {:.2f}", BinBasedBIC);
1615  MACH3LOG_INFO("Additional info: NModelParams: {}, DataRate: {:.2f}, BinsRate: {:.2f}", NModelParams, DataRate, BinsRate);
1616 }
double GetBIC(const double llh, const int data, const int nPars)
Get the Bayesian Information Criterion (BIC) or Schwarz information criterion (also SIC,...

◆ StudyDIC()

void PredictiveThrower::StudyDIC ( const std::vector< std::unique_ptr< TH1 >> &  PostPred_mc,
const std::vector< std::unique_ptr< TH1 >> &  PostPred_w 
)
private

KS: Get the Deviance Information Criterion (DIC) The deviance is defined as:

\[ D(\theta) = -2 \log L(\theta) \]

The DIC statistic is then:

\[ \mathrm{DIC} = D(\hat{\theta}) + 2p_D \]

where:

\[ p_D = \bar{D} - D(\hat{\theta}) \]

is the effective number of parameters, and

\[ \bar{D} = E_{\theta|y}[D(\theta)] \]

[32] [34]

Definition at line 1620 of file PredictiveThrower.cpp.

1621  {
1622 // ****************
1623  //The posterior mean of the deviance
1624  double Dbar = 0.;
1625  double TotalLLH = 0.0;
1626 
1627  #ifdef MULTITHREAD
1628  #pragma omp parallel for reduction(+:Dbar)
1629  #endif
1630  for (int iSample = 0; iSample < TotalNumberOfSamples; ++iSample)
1631  {
1632  auto SampleHandler = SampleInfo[iSample].SamHandler;
1633  TotalLLH += CalcLLH(Data_Hist[iSample].get(), PostPred_mc[iSample].get(), PostPred_w[iSample].get(), SampleHandler);
1634  double LLH_temp = 0.;
1635  for (int iToy = 0; iToy < Ntoys; ++iToy)
1636  {
1637  LLH_temp += CalcLLH(Data_Hist[iSample].get(), MC_Hist_Toy[iSample][iToy].get(), W2_Hist_Toy[iSample][iToy].get(), SampleHandler);
1638  }
1639  Dbar += LLH_temp;
1640  }
1641  Dbar = Dbar / Ntoys;
1642 
1643  // A point estimate of the deviance
1644  const double Dhat = TotalLLH;
1645 
1646  //Effective number of parameters
1647  const double p_D = std::fabs(Dbar - Dhat);
1648 
1649  //Actual test stat
1650  const double DIC_stat = Dhat + 2 * p_D;
1651  MACH3LOG_INFO("Effective number of parameters following DIC formalism is equal to: {:.2f}", p_D);
1652  MACH3LOG_INFO("DIC test statistic = {:.2f}", DIC_stat);
1653 }

◆ StudyInformationCriterion()

void PredictiveThrower::StudyInformationCriterion ( M3::kInfCrit  Criterion,
const std::vector< std::unique_ptr< TH1 >> &  PostPred_mc,
const std::vector< std::unique_ptr< TH1 >> &  PostPred_w 
)
private

Information Criterion.

Definition at line 1554 of file PredictiveThrower.cpp.

1556  {
1557 // ****************
1558  MACH3LOG_INFO("******************************");
1559  switch(Criterion) {
1560  case M3::kInfCrit::kBIC:
1561  // Study Bayesian Information Criterion
1562  StudyBIC(PostPred_mc, PostPred_w);
1563  break;
1564  case M3::kInfCrit::kDIC:
1565  // Study Deviance Information Criterion
1566  StudyDIC(PostPred_mc, PostPred_w);
1567  break;
1568  case M3::kInfCrit::kWAIC:
1569  // Study Watanabe-Akaike information criterion (WAIC)
1570  StudyWAIC();
1571  break;
1573  MACH3LOG_ERROR("kInfCrits is not a valid kInfCrit!");
1574  throw MaCh3Exception(__FILE__, __LINE__);
1575  default:
1576  MACH3LOG_ERROR("UNKNOWN Information Criterion SPECIFIED!");
1577  MACH3LOG_ERROR("You gave {}", static_cast<int>(Criterion));
1578  throw MaCh3Exception(__FILE__ , __LINE__ );
1579  }
1580  MACH3LOG_INFO("******************************");
1581 }
void StudyBIC(const std::vector< std::unique_ptr< TH1 >> &PostPred_mc, const std::vector< std::unique_ptr< TH1 >> &PostPred_w)
Study Bayesian Information Criterion (BIC) The BIC is defined as:
void StudyWAIC()
KS: Get the Watanabe-Akaike information criterion (WAIC)
void StudyDIC(const std::vector< std::unique_ptr< TH1 >> &PostPred_mc, const std::vector< std::unique_ptr< TH1 >> &PostPred_w)
KS: Get the Deviance Information Criterion (DIC) The deviance is defined as:
@ kInfCrits
This only enumerates.
@ kBIC
Bayesian Information Criterion.
@ kDIC
Deviance Information Criterion.

◆ StudyWAIC()

void PredictiveThrower::StudyWAIC ( )
private

KS: Get the Watanabe-Akaike information criterion (WAIC)

WAIC is a fully Bayesian measure of model fit that estimates the predictive accuracy, taking into account the effective number of parameters in the model.

It is defined as:

\[ \text{WAIC} = -2 \left( \text{lppd} - p_\text{WAIC} \right), \]

where

\[ \text{lppd} = \sum_{i=1}^n \log \left( \frac{1}{S} \sum_{s=1}^S p(y_i \mid \theta_s) \right) \]

is the log pointwise predictive density and

\[ p_\text{WAIC} = \sum_{i=1}^n \text{Var}_{s=1,\dots,S} \left[ \log p(y_i \mid \theta_s) \right] \]

is the effective number of parameters (variance of log-likelihood over posterior samples).

[13] [18]

Definition at line 1691 of file PredictiveThrower.cpp.

1691  {
1692 // ****************
1693  // log pointwise predictive density
1694  double lppd = 0.;
1695  // effective number of parameters
1696  double p_WAIC = 0.;
1697 
1698  #ifdef MULTITHREAD
1699  #pragma omp parallel for reduction(+:lppd, p_WAIC)
1700  #endif
1701  for (int iSample = 0; iSample < TotalNumberOfSamples; ++iSample) {
1702  auto SampleHandler = SampleInfo[iSample].SamHandler;
1703  auto* hData = Data_Hist[iSample].get();
1704 
1705  if (auto h2poly = dynamic_cast<TH2Poly*>(hData)) {
1706  // TH2Poly: irregular bins, linear indexing
1707  for (int i = 1; i <= h2poly->GetNumberOfBins(); ++i) {
1708  const double data = Data_Hist[iSample]->GetBinContent(i);
1709  double mean_llh = 0.;
1710  double sum_exp_llh = 0;
1711  double mean_llh_squared = 0.;
1712 
1713  for (int iToy = 0; iToy < Ntoys; ++iToy) {
1714  const double mc = MC_Hist_Toy[iSample][iToy]->GetBinContent(i);
1715  const double w2 = W2_Hist_Toy[iSample][iToy]->GetBinContent(i);
1716  // Get the -log-likelihood for this sample and bin
1717  double neg_LLH_temp = SampleHandler->GetTestStatLLH(data, mc, w2);
1718  AccumulateWAICToy(neg_LLH_temp, mean_llh, mean_llh_squared, sum_exp_llh);
1719  }
1720  AccumulateWAICBin(mean_llh, mean_llh_squared, sum_exp_llh, Ntoys, lppd, p_WAIC);
1721  }
1722  } else if (auto h2 = dynamic_cast<TH2D*>(hData)) {
1723  // TH2D: nested loops over X and Y
1724  for (int ix = 1; ix <= h2->GetNbinsX(); ++ix) {
1725  for (int iy = 1; iy <= h2->GetNbinsY(); ++iy) {
1726  const double data = hData->GetBinContent(ix, iy);
1727  double mean_llh = 0.;
1728  double mean_llh_squared = 0.;
1729  double sum_exp_llh = 0.;
1730  for (int iToy = 0; iToy < Ntoys; ++iToy) {
1731  const double mc = MC_Hist_Toy[iSample][iToy]->GetBinContent(ix, iy);
1732  const double w2 = W2_Hist_Toy[iSample][iToy]->GetBinContent(ix, iy);
1733  // Get the -log-likelihood for this sample and bin
1734  double neg_LLH_temp = SampleHandler->GetTestStatLLH(data, mc, w2);
1735  AccumulateWAICToy(neg_LLH_temp, mean_llh, mean_llh_squared, sum_exp_llh);
1736  }
1737  AccumulateWAICBin(mean_llh, mean_llh_squared, sum_exp_llh, Ntoys, lppd, p_WAIC);
1738  }
1739  }
1740  } else if (auto h1 = dynamic_cast<TH1D*>(hData)) {
1741  // TH1D: 1D histogram
1742  for (int iBin = 1; iBin <= h1->GetNbinsX(); ++iBin) {
1743  const double data = hData->GetBinContent(iBin);
1744  double mean_llh = 0.;
1745  double mean_llh_squared = 0.;
1746  double sum_exp_llh = 0.;
1747  for (int iToy = 0; iToy < Ntoys; ++iToy) {
1748  const double mc = MC_Hist_Toy[iSample][iToy]->GetBinContent(iBin);
1749  const double w2 = W2_Hist_Toy[iSample][iToy]->GetBinContent(iBin);
1750 
1751  // Get the -log-likelihood for this sample and bin
1752  double neg_LLH_temp = SampleHandler->GetTestStatLLH(data, mc, w2);
1753  AccumulateWAICToy(neg_LLH_temp, mean_llh, mean_llh_squared, sum_exp_llh);
1754  }
1755  AccumulateWAICBin(mean_llh, mean_llh_squared, sum_exp_llh, Ntoys, lppd, p_WAIC);
1756  }
1757  }
1758  }
1759 
1760  // Compute WAIC, see Eq. 13 in Gelman2014
1761  double WAIC = -2 * (lppd - p_WAIC);
1762  MACH3LOG_INFO("Effective number of parameters following WAIC formalism is equal to: {:.2f}", p_WAIC);
1763  MACH3LOG_INFO("WAIC = {:.2f}", WAIC);
1764 }
void AccumulateWAICToy(const double neg_LLH_temp, double &mean_llh, double &mean_llh_squared, double &sum_exp_llh)
void AccumulateWAICBin(double &mean_llh, double &mean_llh_squared, double &sum_exp_llh, const unsigned int Ntoys, double &lppd, double &p_WAIC)

◆ WriteToy()

void PredictiveThrower::WriteToy ( TDirectory *  ToyDirectory,
TDirectory *  Toy_1DDirectory,
TDirectory *  Toy_2DDirectory,
const int  iToy 
)
private

Save histograms for a single MCMC Throw/Toy.

Definition at line 365 of file PredictiveThrower.cpp.

368  {
369 // *************************
370  int SampleCounter = 0;
371  for (size_t iPDF = 0; iPDF < samples.size(); iPDF++)
372  {
373  auto* SampleHandler = samples[iPDF];
374  for (int iSample = 0; iSample < SampleHandler->GetNSamples(); ++iSample)
375  {
376  ToyDirectory->cd();
377 
378  auto SampleName = SampleHandler->GetSampleTitle(iSample);
379  const TH1* MCHist = SampleHandler->GetMCHist(iSample);
380  MC_Hist_Toy[SampleCounter][iToy] = M3::Clone(MCHist, SampleName + "_mc_" + std::to_string(iToy));
381  MC_Hist_Toy[SampleCounter][iToy]->Write();
382 
383  const TH1* W2Hist = SampleHandler->GetW2Hist(iSample);
384  W2_Hist_Toy[SampleCounter][iToy] = M3::Clone(W2Hist, SampleName + "_w2_" + std::to_string(iToy));
385  W2_Hist_Toy[SampleCounter][iToy]->Write();
386 
387  // now get 1D projection for every dimension
388  Toy_1DDirectory->cd();
389  for(int iDim = 0; iDim < SampleHandler->GetNDim(iSample); iDim++) {
390  std::string ProjectionName = SampleHandler->GetKinVarName(iSample, iDim);
391  std::string ProjectionSuffix = "_1DProj" + std::to_string(iDim) + "_" + std::to_string(iToy);
392 
393  auto hist = SampleHandler->Get1DVarHist(iSample, ProjectionName);
394  hist->SetTitle((SampleName + ProjectionSuffix).c_str());
395  hist->SetName((SampleName + ProjectionSuffix).c_str());
396  hist->Write();
397  }
398 
399  Toy_2DDirectory->cd();
400  // now get 2D projection for every combination
401  for(int iDim1 = 0; iDim1 < SampleHandler->GetNDim(iSample); iDim1++) {
402  for (int iDim2 = iDim1 + 1; iDim2 < SampleHandler->GetNDim(iSample); ++iDim2) {
403  // Get the names for the two dimensions
404  std::string XVarName = SampleHandler->GetKinVarName(iSample, iDim1);
405  std::string YVarName = SampleHandler->GetKinVarName(iSample, iDim2);
406 
407  // Get the 2D histogram for this pair
408  auto hist2D = SampleHandler->Get2DVarHist(iSample, XVarName, YVarName);
409 
410  // Write the histogram
411  std::string suffix2D = "_2DProj_" + std::to_string(iDim1) + "_vs_" + std::to_string(iDim2) + "_" + std::to_string(iToy);
412  hist2D->SetTitle((SampleName + suffix2D).c_str());
413  hist2D->SetName((SampleName + suffix2D).c_str());
414  hist2D->Write();
415  }
416  }
417  SampleCounter++;
418  }
419  }
420 }

Member Data Documentation

◆ Data_Hist

std::vector<std::unique_ptr<TH1> > PredictiveThrower::Data_Hist
private

Vector of Data histograms.

Definition at line 302 of file PredictiveThrower.h.

◆ FullLLH

bool PredictiveThrower::FullLLH
private

KS: Use Full LLH or only sample contribution based on discussion with Asher we almost always only want the sample likelihood.

Definition at line 283 of file PredictiveThrower.h.

◆ Is_PriorPredictive

bool PredictiveThrower::Is_PriorPredictive
private

Whether it is Prior or Posterior predictive.

Definition at line 287 of file PredictiveThrower.h.

◆ MC_Hist_Toy

std::vector<std::vector<std::unique_ptr<TH1> > > PredictiveThrower::MC_Hist_Toy
private

Vector of MC histograms per sample and toy experiment. Indexed as [sample][toy].

Definition at line 310 of file PredictiveThrower.h.

◆ MC_Nom_Hist

std::vector<std::unique_ptr<TH1> > PredictiveThrower::MC_Nom_Hist
private

Vector of MC histograms.

Definition at line 304 of file PredictiveThrower.h.

◆ ModelSystematic

ParameterHandlerGeneric* PredictiveThrower::ModelSystematic
private

Pointer to El Generico.

Definition at line 299 of file PredictiveThrower.h.

◆ NModelParams

int PredictiveThrower::NModelParams
private

KS: Count total number of model parameters which can be used for stuff like BIC.

Definition at line 285 of file PredictiveThrower.h.

◆ Ntoys

int PredictiveThrower::Ntoys
private

Number of toys we are generating analysing.

Definition at line 296 of file PredictiveThrower.h.

◆ PenaltyTerm

std::vector<double> PredictiveThrower::PenaltyTerm
private

Penalty term values for each toy by default 0.

Definition at line 318 of file PredictiveThrower.h.

◆ ReweightWeight

std::vector<double> PredictiveThrower::ReweightWeight
private

Reweighting factors applied for each toy, by default 1.

Definition at line 316 of file PredictiveThrower.h.

◆ SampleInfo

std::vector<PredictiveSample> PredictiveThrower::SampleInfo
private

Handy struct for all sample info.

Definition at line 293 of file PredictiveThrower.h.

◆ StandardFluctuation

bool PredictiveThrower::StandardFluctuation
private

KS: We have two methods for Poissonian fluctuation.

Definition at line 321 of file PredictiveThrower.h.

◆ TotalNumberOfSamples

int PredictiveThrower::TotalNumberOfSamples
private

Number of toys we are generating analysing.

Definition at line 290 of file PredictiveThrower.h.

◆ W2_Hist_Toy

std::vector<std::vector<std::unique_ptr<TH1> > > PredictiveThrower::W2_Hist_Toy
private

Vector of W² histograms per sample and toy experiment. Indexed as [sample][toy]

Definition at line 313 of file PredictiveThrower.h.

◆ W2_Nom_Hist

std::vector<std::unique_ptr<TH1> > PredictiveThrower::W2_Nom_Hist
private

Vector of W2 histograms.

Definition at line 306 of file PredictiveThrower.h.


The documentation for this class was generated from the following files: