MadMiner Versions

Machine learning–based inference toolkit for particle physics

v0.4.5

4 years ago

New features:

  • Histograms in AsymptoticLimits now by default use one set of weighted events for all parameter points, reweighting them appropriately
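
As a plain NumPy illustration of this default (not MadMiner internals, all arrays made up): one fixed set of events is histogrammed at every parameter point by swapping in per-theta event weights, instead of drawing new toys at each point.

    import numpy as np

    rng = np.random.default_rng(42)
    x = rng.normal(size=10_000)  # observable values, drawn once
    weights = {
        "theta0": rng.exponential(size=10_000),  # made-up per-theta event weights
        "theta1": rng.exponential(size=10_000),
    }

    # one histogram per parameter point, reusing the same events
    histograms = {theta: np.histogram(x, bins=20, weights=w)[0]
                  for theta, w in weights.items()}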

API / breaking changes:

  • The keyword n_toys_per_theta in the AsymptoticLimits functions has been renamed to n_histo_toys

Internal changes:

  • Refactored HDF5 file access

v0.4.4

4 years ago

New features:

  • More options for some plotting functions

Bug fixes:

  • Fixed AsymptoticLimits functions for LikelihoodEstimator instances
  • Fixed bug in histogram_of_fisher_information()
  • Fixed a floating-point precision bug in particle smearing by changing < to <=
  • Improved Python 2 compatibility

v0.4.3

4 years ago

Bug fixes:

  • Fixed wrong results and excessive run times in AsymptoticLimits when more than one parameter batch was used
  • Fixed wrong factor of 1 / test_split in histogram_of_fisher_information()
  • Fixed AsymptoticLimits not accepting LikelihoodEstimator instances to calculate the kinematic likelihood

v0.4.2

4 years ago

New features:

  • AsymptoticLimits functions are more memory efficient. The new keyword histo_theta_batchsize sets the number of parameter points for which histogram data is loaded in parallel.
  • New keyword n_observables for AsymptoticLimits.observed_limits() allows overwriting the number of events (i.e., rescaling the log likelihood). See the sketch below.
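
A minimal sketch of the two new keywords; only histo_theta_batchsize and n_observables are quoted from this release, while the remaining arguments, file names, and return value are assumptions and may differ between versions.

    import numpy as np
    from madminer.limits import AsymptoticLimits

    x_obs = np.load("data/x_observed.npy")  # hypothetical observed sample

    limits = AsymptoticLimits("data/madminer_data.h5")  # hypothetical file
    results = limits.observed_limits(
        x_observed=x_obs,           # assumed keyword
        mode="histo",               # assumed: histogram-based likelihood
        histo_theta_batchsize=100,  # histogram data for 100 theta points loaded in parallel
        n_observables=36,           # rescale the log likelihood to 36 events
    )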

Bug fixes:

  • Fixed bugs that caused the training of DoubleParameterizedRatioEstimator and LikelihoodEstimator instances to crash
  • Added more sanity checks when evaluating observables, both in LHEReader and DelphesReader

v0.4.1

4 years ago

New features:

  • LHEReader.add_observable_from_function() now expects functions with the signature observable(truth_particles, leptons, photons, jets, met), where truth_particles are the original particles in the LHE file and the other arguments have smearing applied to them (see the sketch after this list).
  • New dof keyword in limit functions to overwrite degrees of freedom
  • Trimmed sampling to remove events with largest weights
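
A short sketch of the new function signature; the five arguments are quoted from the release note, while the setup-file name and the required keyword are assumptions.

    from madminer.lhe import LHEReader

    # truth_particles are the unsmeared LHE particles; leptons, photons, jets,
    # and met have smearing applied (as described in the release note)
    def leading_jet_pt(truth_particles, leptons, photons, jets, met):
        return jets[0].pt if len(jets) > 0 else float("nan")

    reader = LHEReader("setup.h5")  # hypothetical setup file
    reader.add_observable_from_function("j1_pt", leading_jet_pt, required=False)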

Breaking / API changes:

  • Changed the construction of the parameter grid in limit functions to NumPy's "ij" indexing mode to make the 2d and >2d cases more consistent
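
For reference, the difference in plain NumPy (not MadMiner code): with indexing="ij", the output grid shape follows the order of the input axes, which generalizes consistently beyond two dimensions.

    import numpy as np

    theta0 = np.linspace(-1.0, 1.0, 3)
    theta1 = np.linspace(0.0, 2.0, 5)

    grid_ij = np.meshgrid(theta0, theta1, indexing="ij")  # each array has shape (3, 5)
    grid_xy = np.meshgrid(theta0, theta1, indexing="xy")  # each array has shape (5, 3)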

Bug fixes:

  • Fixed bugs with efficiencies in LHEReader

v0.4.0

5 years ago

New features:

  • Smarter sampling: MadMiner now keeps track of which events were generated (sampled) from which benchmark point (at the MadGraph stage). The new keyword sample_only_from_closest_benchmark in the SampleAugmenter functions and plot_distributions() allows the user to restrict the unweighting / resampling at a given parameter point to events from the closest benchmark point. This can significantly reduce the weights of individual events and thus the variance.
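
A hedged sketch of the new keyword in a sampling call; sample_only_from_closest_benchmark is quoted from this release, while the surrounding call follows the madminer.sampling API introduced in v0.3.0, and the file name and benchmark name are made up.

    from madminer.sampling import SampleAugmenter, benchmark

    sampler = SampleAugmenter("data/madminer_data.h5")  # hypothetical file
    samples = sampler.sample_train_plain(
        theta=benchmark("sm"),                    # hypothetical benchmark name
        n_samples=100_000,
        sample_only_from_closest_benchmark=True,  # unweight only events generated
                                                  # at the closest benchmark point
    )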

API / breaking changes:

  • k-factors are now automatically applied when there are subsamples generated at different benchmarks. For instance, if we add a sample with 30k events generated at theta0 and a sample with 70k events generated at theta1, and calculate cross sections from the full sample, MadMiner will automatically apply k-factors of 0.3 and 0.7 to the two samples, as illustrated below.
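
The arithmetic behind this example as a standalone illustration (not MadMiner code):

    # each subsample's weights are scaled by its share of the combined sample
    n_events = {"theta0": 30_000, "theta1": 70_000}
    n_total = sum(n_events.values())
    k_factors = {name: n / n_total for name, n in n_events.items()}
    print(k_factors)  # {'theta0': 0.3, 'theta1': 0.7}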

Bug fixes:

  • Various small bug fixes, mostly related to nuisance parameters

Internal changes:

  • More information stored in MadMiner HDF5 file

v0.3.1

5 years ago

New features:

  • Fisher information in histograms can be calculated for a custom binning (see the sketch after this list)
  • MET noise in LHEReader
  • The number of toys for AsymptoticLimits histograms is now a keyword and no longer limited to 1k
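
As background for the custom-binning feature, a standalone toy example of Fisher information from histogram yields (this is not the MadMiner API): for expected per-bin yields nu_b(theta), the information is I_ij = sum_b (d nu_b / d theta_i)(d nu_b / d theta_j) / nu_b.

    import numpy as np

    # made-up model: expected yields in four custom bins, linear in one parameter
    def yields(theta):
        return np.array([120.0, 80.0, 30.0, 5.0]) + np.array([10.0, 5.0, 8.0, 2.0]) * theta[0]

    theta, eps = np.array([0.0]), 1e-4
    nu = yields(theta)
    grad = np.array([
        (yields(theta + eps * e) - yields(theta - eps * e)) / (2 * eps)
        for e in np.eye(theta.size)
    ])  # finite-difference gradients, shape (n_params, n_bins)
    fisher_information = np.einsum("ib,jb->ij", grad, grad / nu)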

Bug fixes:

  • Fixed wrong cross sections in AsymptoticLimits
  • Fixed unzipping of event samples to support older versions of gunzip
  • Various smaller bug fixes

Tutorials and documentation:

  • Restructured tutorials

Internal changes:

  • Improved automatic testing of likelihood ratio estimation

v0.3.0

5 years ago

New features:

  • Nuisance parameters for ratio-based methods! This required a major refactoring of madminer.sampling.
  • New madminer.limits submodule that calculates expected and observed p-values.
  • SALLY can profile the estimated score over nuisance parameters (then its output is n-dimensional rather than (n+k)-dimensional, where n is the number of parameters of interest and k is the number of nuisance parameters).
  • New madminer.analysis submodule with a generic DataAnalyzer class that can be used to access the general setup, weighted events, and cross sections. The classes SampleAugmenter, FisherInformation, and AsymptoticLimits subclass it, leading to a more unified interface (see the sketch after this list).
  • Sampling speed-up with parallelization when using n_processes>1 for any sampling call.
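
A hedged sketch of the unified interface; the DataAnalyzer class and its subclasses are quoted from this release, but the method name and return values below are assumptions and may not match this version exactly.

    from madminer.analysis import DataAnalyzer

    analyzer = DataAnalyzer("data/madminer_data.h5")  # hypothetical file
    x, weights = analyzer.weighted_events()           # assumed method and return values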

Breaking / API changes:

  • New file format for trained neural networks (only the *.json files are different) -- old models won't load
  • The one-size-fits-all class MLForge is replaced by four different classes: ParameterizedRatioEstimator, DoubleParameterizedRatioEstimator, LikelihoodEstimator, and ScoreEstimator (see the sketch after this list).
  • EnsembleForge is now Ensemble, with renamed functions and less clutter (at the cost of some unimportant functionality)
  • Renaming: DelphesProcessor -> DelphesReader, LHEProcessor -> LHEReader
  • madminer.morphing now lives in madminer.utils.morphing, and the Morpher class was renamed PhysicsMorpher (since we also have a NuisanceMorpher class)
  • The madminer.sampling submodule has changed in a number of ways. The high-level functions have been renamed: extract_samples_train_plain() -> sample_train_plain(), extract_samples_train_local() -> sample_train_local(), extract_samples_train_global() -> sample_train_density(), extract_samples_train_ratio() -> sample_train_ratio(), extract_samples_train_more_ratios() -> sample_train_more_ratios(), extract_samples_test() -> sample_test(), extract_cross_sections() -> cross_sections(). In addition to the physical parameters theta, they all now take descriptions of the nuisance parameters nu as input arguments, for which new helper functions exist.
  • The helper functions in madminer.sampling are also different: constant_benchmark_theta() -> benchmark(), multiple_benchmark_thetas() -> benchmarks(), constant_morphing_theta() -> morphing_point(), multiple_morphing_thetas() -> morphing_points(), random_morphing_thetas() -> random_morphing_points()
  • New default evaluation split of 0.2 (so with the defaults, there will be 60% train, 20% validation, 20% test events)
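
A hedged sketch of the new class structure; the class names are quoted from this release, while the constructor and training arguments are assumptions based on later MadMiner documentation.

    from madminer.ml import ParameterizedRatioEstimator

    estimator = ParameterizedRatioEstimator(n_hidden=(100, 100))  # assumed constructor args
    estimator.train(
        method="alices",                   # assumed: one of the ratio-based methods
        x="samples/x_train.npy",           # hypothetical training-sample files
        y="samples/y_train.npy",
        theta="samples/theta0_train.npy",
        r_xz="samples/r_xz_train.npy",
        t_xz="samples/t_xz_train.npy",
    )
    estimator.save("models/alices")        # writes the new *.json model format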

Internal changes:

  • madminer.ml largely rewritten
  • madminer.sampling largely rewritten
  • madminer.utils.analysis was removed; the old functions are now split between madminer.analysis and madminer.sampling

v0.2.8

5 years ago

Breaking / API changes:

  • The keyword trainer in MLForge.train() has been renamed to optimizer
  • The gradient of the network output with respect to x is no longer calculated; the corresponding keywords in MLForge.evaluate() and MLForge.train() are gone
  • MLForge has new methods evaluate_log_likelihood(), evaluate_log_likelihood_ratio(), and evaluate_score(); MLForge.evaluate() just wraps around them now
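
A hedged sketch of the refactored evaluation interface; the method name is quoted from this release, while the loading call, file names, and argument names are assumptions.

    import numpy as np
    from madminer.ml import MLForge

    forge = MLForge()
    forge.load("models/alices")  # hypothetical trained model

    x_test = np.load("samples/x_test.npy")          # hypothetical files
    theta_grid = np.load("samples/theta_grid.npy")
    log_r = forge.evaluate_log_likelihood_ratio(x=x_test, theta=theta_grid)  # assumed args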

Bug fixes:

  • Fixed a number of bugs affecting neural density estimation methods ("NDE" and "SCANDAL")
  • Fixed bugs when trying to use CASCAL

Internal changes:

  • Rewrote training procedure from scratch
  • Some more refactoring of the ML parts

v0.2.7

5 years ago

New features:

  • b and tau tags, which can be used for cuts and observables (see the sketch after this list)
  • New plot_uncertainty() function
  • More options for plot_distributions()
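
A hedged sketch of tags in cuts and observables, using the class name of this release (DelphesProcessor, renamed DelphesReader in v0.3.0); the import path, expression syntax, and the b_tag attribute name are assumptions based on this note.

    from madminer.delphes import DelphesProcessor  # renamed DelphesReader in v0.3.0

    proc = DelphesProcessor("setup.h5")  # hypothetical setup file

    # count b-tagged jets as an observable, and cut on a b-tagged leading jet
    proc.add_observable("n_btags", "sum(j[i].b_tag for i in range(len(j)))", default=0)
    proc.add_cut("j[0].b_tag >= 1")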

Breaking / API changes:

  • In MLForge.evaluate() and EnsembleForge.evaluate(), the keyword x_filename is replaced by x, which supports ndarrays as an alternative to filenames

Documentation and examples:

  • Improved logging output
  • New test of nuisance setup and Fisher information calculation

Bug fixes:

  • Fixed MadMiner crashing with stable W bosons in LHE files
  • Fixed calculation of MET in LHEProcessor when visible particle pTs are smeared
  • Fixed range calculation for plot_distributions()