MaCh3  2.4.2
Reference Guide
BinningHandler Class Reference

KS: Class handling binning for multiple samples.

#include <Samples/BinningHandler.h>


Public Member Functions

 BinningHandler ()
 Constructor.

virtual ~BinningHandler ()
 Destructor.

int FindGlobalBin (const int iSample, const std::vector< const double * > &KinVar, const std::vector< int > &NomBin) const
 Find Global bin including.

int FindNominalBin (const int iSample, const int iDim, const double Var) const
 Find the nominal bin for a given variable in a given sample and dimension.

int GetGlobalBinSafe (const int iSample, const std::vector< int > &Bins) const
 Get global bin based on sample and dimension of each sample, with additional checks.

int GetBinSafe (const int iSample, const std::vector< int > &Bins) const
 Get global bin based on sample and dimension of each sample, without any safety checks.

int GetNBins () const
 Get total number of bins over all samples/kinematic bins etc.

int GetNBins (const int iSample) const
 Get total number of bins for a given sample.

std::string GetBinName (const int GlobalBin) const
 Get fancy name for a given bin, to help match it with global properties.

std::string GetBinName (const int iSample, const int SampleBin) const
 Get fancy name for a given bin, to help match it with global properties.

std::string GetBinName (const int iSample, const std::vector< int > &Bins) const
 Get fancy name for a given bin, to help match it with global properties.

std::vector< double > GetBinEdges (const int iSample, const int iDim) const
 Get N-dim bin edges for a given sample.

int GetNAxisBins (const int iSample, const int iDim) const
 Get Number of N-axis bins for a given sample.

bool IsUniform (const int iSample) const
 Tells whether given sample is using uniform binning.

int GetSampleStartBin (const int iSample) const
 Get bin number corresponding to where given sample starts.

int GetSampleEndBin (const int iSample) const
 Get bin number corresponding to where given sample ends.

const std::vector< BinInfo > GetNonUniformBins (const int iSample) const
 Return non-uniform bins, for example to check their extents.

void SetGlobalBinNumbers ()
 Sets the GlobalOffset for each SampleBinningInfo to enable linearization of multiple 2D binning samples.

void SetupSampleBinning (const YAML::Node &Settings, SampleInfo &SingleSample)
 Function to set up the binning of your sample histograms and the underlying arrays that get handled in fillArray() and fillArray_MP().
 

Private Attributes

int TotalNumberOfBins
 Total number of bins.

std::vector< SampleBinningInfo > SampleBinning
 Binning info for individual sample.
 

Detailed Description

KS: Class handling binning for multiple samples.

Introduction

Each sample can define its own binning in an arbitrary number of dimensions. Internally, every sample's multi-dimensional binning is linearised into a single 1D array. All samples are then concatenated into one global bin index space, allowing the entire analysis to be treated as a single large vector of bins.

The concept of a "global bin" refers to the position of a bin in this linearised, analysis-wide bin index space. Local (sample) bins are always enumerated starting from zero, while global bins span all samples consecutively.

Example layout of global bins with offsets:

Sample 0 (GlobalOffset = 0, nBins = 4):
Local bins: [0] [1] [2] [3]
Global bins: [0] [1] [2] [3]
Sample 1 (GlobalOffset = 4, nBins = 3):
Local bins: [0] [1] [2]
Global bins: [4] [5] [6]
Sample 2 (GlobalOffset = 7, nBins = 2):
Local bins: [0] [1]
Global bins: [7] [8]
Global bin index space:
------------------------------------------------
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
------------------------------------------------
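
The layout above amounts to a running sum of per-sample bin counts. The following is a minimal sketch of that idea, not the MaCh3 implementation: `SampleLayout`, `AssignOffsets`, and `ToGlobal` are hypothetical names introduced only for illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a per-sample binning record (illustration only).
struct SampleLayout {
  int nBins = 0;        // number of local bins in this sample
  int GlobalOffset = 0; // global index of this sample's local bin 0
};

// Assign consecutive offsets and return the total number of global bins.
inline int AssignOffsets(std::vector<SampleLayout>& samples) {
  int counter = 0;
  for (auto& s : samples) {
    s.GlobalOffset = counter;
    counter += s.nBins;
  }
  return counter;
}

// Convert a local bin of a given sample into a global bin.
inline int ToGlobal(const std::vector<SampleLayout>& samples, int iSample, int localBin) {
  return samples[static_cast<std::size_t>(iSample)].GlobalOffset + localBin;
}
```

With three samples of 4, 3, and 2 bins, this sketch assigns offsets 0, 4, and 7, so local bin 1 of sample 2 maps to global bin 8, as in the layout above.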

Uniform and Non-Uniform Binning Scheme

MaCh3 supports both uniform and non-uniform binning schemes.

In the non-uniform scheme, bin sizes may vary along each dimension, but all bins must be axis-aligned hyper-rectangles. Arbitrary or irregular bin shapes, such as banana-shaped bins, are not supported.

Example of Uniform

+--------+--------+--------+--------+
| Bin 0  | Bin 1  | Bin 2  | Bin 3  |
| (0,0)  | (1,0)  | (2,0)  | (3,0)  |
+--------+--------+--------+--------+
| Bin 4  | Bin 5  | Bin 6  | Bin 7  |
| (0,1)  | (1,1)  | (2,1)  | (3,1)  |
+--------+--------+--------+--------+
| Bin 8  | Bin 9  | Bin 10 | Bin 11 |
| (0,2)  | (1,2)  | (2,2)  | (3,2)  |
+--------+--------+--------+--------+

Example of Non-Uniform

+--------+------------+-------+---------------------+
| Bin 0  |   Bin 1    | Bin 2 |        Bin 3        |
|        |            |       |                     |
+--------+------------+-------+---------------------+
| Bin 4  |            |       |                     |
|        |            | Bin 6 |                     |
+--------+   Bin 5    +-------+                     |
| Bin 7  |            |       |        Bin 9        |
|        |            | Bin 8 |                     |
+--------+------------+-------+---------------------+

Bin Finding Algorithm

Because MaCh3 supports event migration, the bin-finding algorithm must be fast enough to locate bins efficiently while a fit is running. MaCh3 caches the nominal bin of each event, on the assumption that during the fit an event migrates only to bins close to its nominal one. After a shift, MaCh3 therefore first checks whether the event still falls into the nominal bin, and then checks the adjacent bins. If neither matches, it falls back to a binary search.
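
The strategy described above can be sketched for a single dimension as follows. This is an illustrative sketch under stated assumptions, not the actual `SampleBinningInfo::FindBin` code: the names `InBin`, `FindBinWithHint`, and `kOutOfRange` are hypothetical, and bins are assumed to be half-open intervals [low, high), matching the range check shown in FindNominalBin().

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Sentinel for "outside the binning range" (hypothetical value for this sketch).
constexpr int kOutOfRange = -1;

// Return true if value lies in the half-open interval [edges[bin], edges[bin+1]).
inline bool InBin(const std::vector<double>& edges, int bin, double value) {
  if (bin < 0 || bin + 1 >= static_cast<int>(edges.size())) return false;
  const auto b = static_cast<std::size_t>(bin);
  return value >= edges[b] && value < edges[b + 1];
}

// Find the bin of `value`, trying the cached nominal bin and its neighbours
// before falling back to a binary search over the sorted edges.
inline int FindBinWithHint(const std::vector<double>& edges, double value, int nomBin) {
  if (edges.size() < 2 || value < edges.front() || value >= edges.back()) return kOutOfRange;
  if (InBin(edges, nomBin, value)) return nomBin;
  if (InBin(edges, nomBin - 1, value)) return nomBin - 1;
  if (InBin(edges, nomBin + 1, value)) return nomBin + 1;
  const auto it = std::upper_bound(edges.begin(), edges.end(), value);
  return static_cast<int>(it - edges.begin()) - 1;
}
```

When the shift is small, the first or second comparison succeeds and the binary search is never reached, which is the point of caching the nominal bin.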

Uniform

For uniform binning the algorithm above is straightforward to apply, because it is performed for each dimension independently: find the X bin, then the Y bin, and so on. The per-dimension bins are then combined into a single bin in the flattened 1D space.
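
Combining per-dimension bins into a flat index can be sketched with strides, assuming (as the comments in GetBinName() state) that dimension 0 is the fastest-changing axis. `MakeStrides` and `Flatten` are hypothetical helper names for this sketch, not MaCh3 functions.

```cpp
#include <cstddef>
#include <vector>

// Build strides so that dimension 0 is the fastest-changing axis.
inline std::vector<int> MakeStrides(const std::vector<int>& nBinsPerDim) {
  std::vector<int> strides(nBinsPerDim.size(), 1);
  for (std::size_t i = 1; i < nBinsPerDim.size(); ++i) {
    strides[i] = strides[i - 1] * nBinsPerDim[i - 1];
  }
  return strides;
}

// Combine per-dimension bin indices into one flat bin index.
inline int Flatten(const std::vector<int>& bins, const std::vector<int>& strides) {
  int flat = 0;
  for (std::size_t i = 0; i < bins.size(); ++i) flat += bins[i] * strides[i];
  return flat;
}
```

For the 4 x 3 uniform example above, the strides are {1, 4}, so the per-dimension indices (1,1) flatten to bin 5 and (2,2) to bin 10.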

Non-Uniform

Internally, non-uniform binning is implemented using two levels:

  1. MegaBins (mapping bins): These form a coarse, uniform grid that spans the full phase space. Each MegaBin acts as a container for one or more non-uniform bins.
  2. Non-Uniform bins: The actual analysis bins, defined as hyper-rectangles with arbitrary extents inside a MegaBin.

Bin finding procedure

For a given event, the bin-finding algorithm proceeds as follows:

  1. Locate the MegaBin using the same fast per-dimension logic as in uniform binning.
  2. Once the MegaBin is identified, loop over all non-uniform bins associated with that MegaBin.
  3. The first bin whose extents fully contain the event is selected.
  4. If no bin matches, the event is assigned to the under/overflow bin.
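
Steps 2 to 4 of the procedure above can be sketched as below, assuming the MegaBin index has already been found with the per-dimension logic. This is an illustration only: `RectBin`, `FindNonUniformBin`, and `kUnderOverFlow` are hypothetical names, and the half-open containment test is an assumption rather than the documented behaviour of `BinInfo::IsEventInside`.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Hypothetical axis-aligned bin: one [low, high) interval per dimension.
struct RectBin {
  std::vector<std::array<double, 2>> Extent;

  // True if the point lies inside the extent in every dimension.
  bool Contains(const std::vector<double>& point) const {
    for (std::size_t d = 0; d < Extent.size(); ++d) {
      if (point[d] < Extent[d][0] || point[d] >= Extent[d][1]) return false;
    }
    return true;
  }
};

constexpr int kUnderOverFlow = -1; // hypothetical sentinel for this sketch

// megaBinToBins[m] lists the indices of all bins associated with MegaBin m.
inline int FindNonUniformBin(const std::vector<RectBin>& bins,
                             const std::vector<std::vector<int>>& megaBinToBins,
                             int megaBin, const std::vector<double>& point) {
  if (megaBin < 0 || megaBin >= static_cast<int>(megaBinToBins.size())) return kUnderOverFlow;
  // Only the few candidates attached to this MegaBin are tested.
  for (int candidate : megaBinToBins[static_cast<std::size_t>(megaBin)]) {
    if (bins[static_cast<std::size_t>(candidate)].Contains(point)) return candidate;
  }
  return kUnderOverFlow;
}
```

The coarse grid keeps the candidate list short, so the linear loop over hyper-rectangles stays cheap even when a sample has many non-uniform bins.
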
Author
Kamil Skwarczynski
Dan Barrow

Definition at line 121 of file BinningHandler.h.

Constructor & Destructor Documentation

◆ BinningHandler()

BinningHandler::BinningHandler ( )

Constructor.

Definition at line 4 of file BinningHandler.cpp.

4  {
5 // ************************************************
6 }

◆ ~BinningHandler()

virtual BinningHandler::~BinningHandler ( )
inline virtual

destructor

Definition at line 128 of file BinningHandler.h.

128 {};

Member Function Documentation

◆ FindGlobalBin()

int BinningHandler::FindGlobalBin ( const int  iSample,
const std::vector< const double * > &  KinVar,
const std::vector< int > &  NomBin 
) const

Find Global bin including.

Parameters
iSample  Index of a given sample
KinVar   Vector of pointers to kinematic variables, e.g. Erec
NomBin   Vector of nominal bin indices for this event, one per dimension

Definition at line 42 of file BinningHandler.cpp.

44  {
45 // ************************************************
46  //DB Find the relevant bin in the PDF for each event
47  const int Dim = static_cast<int>(KinVar.size());
48  const SampleBinningInfo& _restrict_ Binning = SampleBinning[NomSample];
49  int GlobalBin = 0;
50 
51  for(int i = 0; i < Dim; ++i) {
52  const double Var = *KinVar[i];
53  const int Bin = Binning.FindBin(i, Var, NomBin[i]);
54  // KS: If we are outside of range in only one dimension this mean out of bounds, we can simply quickly finish
55  if(Bin < 0) return M3::UnderOverFlowBin;
56  // KS: inline GetBin computation to avoid any memory allocation, which in reweight loop is very costly
57  GlobalBin += Bin * Binning.Strides[i];
58  }
59 
60  if(Binning.Uniform) {
61  GlobalBin += static_cast<int>(Binning.GlobalOffset);
62  return GlobalBin;
63  } else {
64  const auto& _restrict_ BinMapping = Binning.BinGridMapping[GlobalBin];
65  const size_t nNonUniBins = BinMapping.size();
66  for(size_t iBin = 0; iBin < nNonUniBins; iBin++) {
67  const int BinNumber = BinMapping[iBin];
68  const auto& _restrict_ NonUniBin = Binning.Bins[BinNumber];
69  if(NonUniBin.IsEventInside(KinVar)){
70  return BinNumber + Binning.GlobalOffset;
71  }
72  }
73  MACH3LOG_DEBUG("Didn't find any bin so returning UnderOverFlowBin");
74  return M3::UnderOverFlowBin;
75  }
76 }

◆ FindNominalBin()

int BinningHandler::FindNominalBin ( const int  iSample,
const int  iDim,
const double  Var 
) const

Find the nominal bin for a given variable in a given sample and dimension.

Parameters
iSample  Sample index
iDim     Dimension index (0 = X, 1 = Y, ...)
Var      Kinematic variable value

Definition at line 79 of file BinningHandler.cpp.

81  {
82 // ************************************************
83  const SampleBinningInfo& info = SampleBinning[iSample];
84 
85  const auto& edges = info.BinEdges[iDim];
86 
87  // Outside binning range
88  if (Var < edges.front() || Var >= edges.back()) {
89  return M3::UnderOverFlowBin;
90  }
91  return static_cast<int>(std::distance(edges.begin(), std::upper_bound(edges.begin(), edges.end(), Var)) - 1);
92 }

◆ GetBinEdges()

std::vector<double> BinningHandler::GetBinEdges ( const int  iSample,
const int  iDim 
) const
inline

Get N-dim bin edges for a given sample.

Parameters
iSample  Index of a given sample
iDim     Dimension for which we extract bin edges

Definition at line 171 of file BinningHandler.h.

171 {return SampleBinning[iSample].BinEdges.at(iDim);};

◆ GetBinName() [1/3]

std::string BinningHandler::GetBinName ( const int  GlobalBin) const

Get fancy name for a given bin, to help match it with global properties.

Parameters
GlobalBin  Global Bin integrated over all samples

Definition at line 210 of file BinningHandler.cpp.

210  {
211 // ************************************************
212  int SampleBin = GetSampleFromGlobalBin(SampleBinning, GlobalBin);
213  int LocalBin = GetLocalBinFromGlobalBin(SampleBinning, GlobalBin);
214  return GetBinName(SampleBin, LocalBin);
215 }
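
The bodies of GetSampleFromGlobalBin() and GetLocalBinFromGlobalBin() are not shown on this page. As a hedged sketch of how such a lookup could work given the per-sample global offsets, the hypothetical helper `SplitGlobalBin` below returns both the sample index and the local bin; it is an assumption-based illustration, not the MaCh3 code.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Given the sorted global offsets of each sample (offsets[0] == 0), return
// {sample index, local bin} for a global bin.
// Assumes 0 <= globalBin < total number of bins.
inline std::pair<int, int> SplitGlobalBin(const std::vector<int>& offsets, int globalBin) {
  // The first offset strictly greater than globalBin belongs to the next sample.
  const auto it = std::upper_bound(offsets.begin(), offsets.end(), globalBin);
  const int sample = static_cast<int>(it - offsets.begin()) - 1;
  return {sample, globalBin - offsets[static_cast<std::size_t>(sample)]};
}
```

With offsets {0, 4, 7}, global bin 5 resolves to sample 1, local bin 1, consistent with the layout in the Introduction.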

◆ GetBinName() [2/3]

std::string BinningHandler::GetBinName ( const int  iSample,
const int  SampleBin 
) const

Get fancy name for a given bin, to help match it with global properties.

Parameters
iSample    Index of a given sample
SampleBin  Local bin index within the given sample

Definition at line 156 of file BinningHandler.cpp.

156  {
157 // ************************************************
158  const auto& Binning = SampleBinning[iSample];
159 
160  // Safety checks
161  if (SampleBin < 0 || SampleBin >= static_cast<int>(Binning.nBins)) {
162  MACH3LOG_ERROR("Requested bin {} is out of range for sample {}", SampleBin, iSample);
163  throw MaCh3Exception(__FILE__, __LINE__);
164  }
165  std::string BinName;
166 
167  if(Binning.Uniform) {
168  int Dim = static_cast<int>(Binning.Strides.size());
169  std::vector<int> Bins(Dim, 0);
170  int Remaining = SampleBin;
171 
172  // Convert the flat/global bin index into per-dimension indices
173  // Dim0 is the fastest-changing axis, Dim1 the next, etc.
174  //
175  // For example (2D):
176  // x = bin % Nx
177  // y = bin / Nx
178  //
179  // For 3D:
180  // x = bin % Nx
181  // y = (bin / Nx) % Ny
182  // z = bin / (Nx * Ny)
183  for (int i = 0; i < Dim; ++i) {
184  const int nBinsDim = static_cast<int>(Binning.BinEdges[i].size()) - 1;
185  Bins[i] = Remaining % nBinsDim;
186  Remaining /= nBinsDim;
187  }
188 
189  for (int i = 0; i < Dim; ++i) {
190  if (i > 0) BinName += ", ";
191  const double min = Binning.BinEdges[i].at(Bins[i]);
192  const double max = Binning.BinEdges[i].at(Bins[i] + 1);
193  BinName += fmt::format("Dim{} ({:g}, {:g})", i, min, max);
194  }
195  } else{
196  const BinInfo& bin = Binning.Bins[SampleBin];
197  const int Dim = static_cast<int>(bin.Extent.size());
198 
199  for (int i = 0; i < Dim; ++i) {
200  if (i > 0) BinName += ", ";
201  const double min = bin.Extent[i][0];
202  const double max = bin.Extent[i][1];
203  BinName += fmt::format("Dim{} ({:g}, {:g})", i, min, max);
204  }
205  }
206  return BinName;
207 }
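
The index arithmetic in the uniform branch above can be isolated into a small standalone function. The sketch below mirrors the modulo/divide loop from the listing; `Unflatten` is a hypothetical name and takes the per-dimension bin counts directly instead of reading them from `BinEdges`.

```cpp
#include <cstddef>
#include <vector>

// Recover per-dimension bin indices from a flat bin index,
// assuming dimension 0 is the fastest-changing axis.
inline std::vector<int> Unflatten(int flatBin, const std::vector<int>& nBinsPerDim) {
  std::vector<int> bins(nBinsPerDim.size(), 0);
  int remaining = flatBin;
  for (std::size_t i = 0; i < nBinsPerDim.size(); ++i) {
    bins[i] = remaining % nBinsPerDim[i]; // index along dimension i
    remaining /= nBinsPerDim[i];          // drop dimension i from the flat index
  }
  return bins;
}
```

For a 4 x 3 binning, flat bin 10 unflattens to (2, 2); for a 2 x 3 x 4 binning, flat bin 23 unflattens to (1, 2, 3), since 1 + 2*2 + 3*6 = 23.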

◆ GetBinName() [3/3]

std::string BinningHandler::GetBinName ( const int  iSample,
const std::vector< int > &  Bins 
) const

Get fancy name for a given bin, to help match it with global properties.

Parameters
iSample  Index of a given sample
Bins     Vector of bin indices along each dimension

Definition at line 144 of file BinningHandler.cpp.

144  {
145 // ************************************************
146  const auto& Binning = SampleBinning[iSample];
147  if(!Binning.Uniform) {
148  MACH3LOG_ERROR("When using Non-Uniform binning for sample {} please use One bin instead of Axis bins", iSample);
149  throw MaCh3Exception(__FILE__, __LINE__);
150  }
151  return GetBinName(iSample, GetBinSafe(iSample, Bins));
152 }

◆ GetBinSafe()

int BinningHandler::GetBinSafe ( const int  iSample,
const std::vector< int > &  Bins 
) const

Get global bin based on sample and dimension of each sample, without any safety checks.

Parameters
iSample  Index of a given sample
Bins     Vector of bin indices along each dimension

Definition at line 95 of file BinningHandler.cpp.

95  {
96 // ************************************************
97  const int GlobalBin = SampleBinning[Sample].GetBinSafe(Bins);
98  return GlobalBin;
99 }

◆ GetGlobalBinSafe()

int BinningHandler::GetGlobalBinSafe ( const int  iSample,
const std::vector< int > &  Bins 
) const

Get global bin based on sample and dimension of each sample, with additional checks.

Parameters
iSample  Index of a given sample
Bins     Vector of bin indices along each dimension

Definition at line 102 of file BinningHandler.cpp.

102  {
103 // ************************************************
104  const int GlobalBin = SampleBinning[Sample].GetBinSafe(Bins) + static_cast<int>(SampleBinning[Sample].GlobalOffset);
105  return GlobalBin;
106 }

◆ GetNAxisBins()

int BinningHandler::GetNAxisBins ( const int  iSample,
const int  iDim 
) const

Get Number of N-axis bins for a given sample.

Parameters
iSample  Index of a given sample
iDim     Dimension for which we extract number of bins

Definition at line 218 of file BinningHandler.cpp.

218  {
219 // ************************************************
220  const auto& Binning = SampleBinning[iSample];
221  if(!Binning.Uniform) {
222  MACH3LOG_ERROR("When using Non-Uniform binning for sample {} please use global bin instead of {}", iSample, __func__);
223  throw MaCh3Exception(__FILE__, __LINE__);
224  } else{
225  return static_cast<int>(Binning.AxisNBins.at(iDim));
226  }
227 }

◆ GetNBins() [1/2]

int BinningHandler::GetNBins ( ) const
inline

Get total number of bins over all samples/kinematic bins etc.

Definition at line 153 of file BinningHandler.h.

153 {return TotalNumberOfBins;};

◆ GetNBins() [2/2]

int BinningHandler::GetNBins ( const int  iSample) const
inline

Get total number of bins for a given sample.

Definition at line 155 of file BinningHandler.h.

155 {return static_cast<int>(SampleBinning[iSample].nBins);};

◆ GetNonUniformBins()

const std::vector< BinInfo > BinningHandler::GetNonUniformBins ( const int  iSample) const

Return non-uniform bins, for example to check their extents.

Definition at line 237 of file BinningHandler.cpp.

237  {
238 // ************************************************
239  const auto& Binning = SampleBinning[iSample];
240  if(!Binning.Uniform) {
241  return Binning.Bins;
242  } else{
243  MACH3LOG_ERROR("{} for sample {} will not work becasue binnin is unfiorm", __func__, iSample);
244  throw MaCh3Exception(__FILE__, __LINE__);
245  }
246 }

◆ GetSampleEndBin()

int BinningHandler::GetSampleEndBin ( const int  iSample) const

Get bin number corresponding to where given sample ends.

Parameters
iSample  Index of a given sample

Definition at line 115 of file BinningHandler.cpp.

115  {
116 // ************************************************
117  if (Sample == static_cast<int>(SampleBinning.size()) - 1) {
118  return GetNBins();
119  } else {
120  return static_cast<int>(SampleBinning[Sample+1].GlobalOffset);
121  }
122 }
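
As the listing shows, the end bin is either the next sample's GlobalOffset or the total number of bins, so it is one past the last bin of the sample. A loop over one sample's slice of a global array would therefore use a half-open range. The sketch below illustrates this with a hypothetical `SumSample` helper and a plain vector standing in for the global bin contents.

```cpp
#include <cstddef>
#include <vector>

// Sum the contents of one sample's bins in a global array.
// startBin is inclusive and endBin is exclusive, i.e. the range [startBin, endBin).
inline double SumSample(const std::vector<double>& globalContent, int startBin, int endBin) {
  double total = 0.0;
  for (int b = startBin; b < endBin; ++b) {
    total += globalContent[static_cast<std::size_t>(b)];
  }
  return total;
}
```

In real use the two bounds would come from GetSampleStartBin(iSample) and GetSampleEndBin(iSample).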

◆ GetSampleStartBin()

int BinningHandler::GetSampleStartBin ( const int  iSample) const

Get bin number corresponding to where given sample starts.

Parameters
iSample  Index of a given sample

Definition at line 109 of file BinningHandler.cpp.

109  {
110 // ************************************************
111  return static_cast<int>(SampleBinning[Sample].GlobalOffset);
112 }

◆ IsUniform()

bool BinningHandler::IsUniform ( const int  iSample) const

Tells whether given sample is using uniform binning.

Parameters
iSample  Index of a given sample

Definition at line 230 of file BinningHandler.cpp.

230  {
231 // ************************************************
232  const auto& Binning = SampleBinning[iSample];
233  return Binning.Uniform;
234 }

◆ SetGlobalBinNumbers()

void BinningHandler::SetGlobalBinNumbers ( )

Sets the GlobalOffset for each SampleBinningInfo to enable linearization of multiple 2D binning samples.

Definition at line 126 of file BinningHandler.cpp.

126  {
127 // ************************************************
128  if (SampleBinning.empty()) {
129  MACH3LOG_ERROR("No binning samples provided.");
130  throw MaCh3Exception(__FILE__, __LINE__);
131  }
132 
133  int GlobalOffsetCounter = 0;
134  for(size_t iSample = 0; iSample < SampleBinning.size(); iSample++){
135  SampleBinning[iSample].GlobalOffset = GlobalOffsetCounter;
136  GlobalOffsetCounter += SampleBinning[iSample].nBins;
137  }
138  // lastly modify total number of bins
139  TotalNumberOfBins = GlobalOffsetCounter;
140 }

◆ SetupSampleBinning()

void BinningHandler::SetupSampleBinning ( const YAML::Node &  Settings,
SampleInfo &  SingleSample 
)

Function to setup the binning of your sample histograms and the underlying arrays that get handled in fillArray() and fillArray_MP().

Definition at line 13 of file BinningHandler.cpp.

13  {
14 // ************************************************
15  MACH3LOG_INFO("Setting up Sample Binning");
16  //Binning
17  SingleSample.VarStr = Get<std::vector<std::string>>(Settings["VarStr"], __FILE__ , __LINE__);
18  SingleSample.nDimensions = static_cast<int>(SingleSample.VarStr.size());
19 
20  SampleBinningInfo SingleBinning;
21  bool Uniform = Get<bool>(Settings["Uniform"], __FILE__ , __LINE__);
22  if(Uniform == false) {
23  auto Bins = Get<std::vector<std::vector<std::vector<double>>>>(Settings["Bins"], __FILE__, __LINE__);
24  SingleBinning.InitNonUniform(Bins);
25  } else {
26  auto Bin_Edges = Get<std::vector<std::vector<double>>>(Settings["VarBins"], __FILE__ , __LINE__);
27  SingleBinning.InitUniform(Bin_Edges);
28  }
29  if(SingleSample.VarStr.size() != SingleBinning.BinEdges.size()) {
30  MACH3LOG_ERROR("Number of variables ({}) does not match number of bin edge sets ({}) in sample config '{}'",
31  SingleSample.VarStr.size(), SingleBinning.BinEdges.size(),SingleSample.SampleTitle);
32  throw MaCh3Exception(__FILE__, __LINE__);
33  }
34 
35  SampleBinning.emplace_back(SingleBinning);
36 
37  // now setup global numbering
38  SetGlobalBinNumbers();
39 }

Member Data Documentation

◆ SampleBinning

std::vector<SampleBinningInfo> BinningHandler::SampleBinning
private

Binning info for individual sample.

Definition at line 199 of file BinningHandler.h.

◆ TotalNumberOfBins

int BinningHandler::TotalNumberOfBins
private

Total number of bins.

Definition at line 197 of file BinningHandler.h.


The documentation for this class was generated from the following files: