HackAnalysis 2
This is a hackable code for implementing particle-level analyses which may have non-standard features, e.g. disappearing tracks. It is hosted at the HackAnalysis website.
Version one is stored on the llpRecasting github.
Together these make up 140 inverse femtobarns of Run 2 data and are known as CMS_DT.
With version 1.2, the HackAnalysis code developed alongside the implementation in MadAnalysis and reported in arXiv:2112.05163 is finally released:
Notably, pileup can be included in HackAnalysis.
This repository holds the code for recasting the above searches corresponding to v2 of "HackAnalysis" along with validation material and auxiliary scripts.
Needed are:

- pythia 8
- HepMC 2
- YODA
- fastjet
For HEPMC we need lengths to be in mm and energies in GeV, so e.g. for version 2.06.10:
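For instance (these are the standard HepMC2 configure flags for choosing units; the install prefix is illustrative):

```
./configure --with-momentum=GEV --with-length=MM --prefix=/path/to/hepmc/install
make
make install
```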
Optionally, the ONNX library can be linked.
Some code has been taken/modified from the following:
Within the installation directory, edit the Paths.inc file:
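A sketch of what this file might contain (apart from YAMLpath and USEYODATWO, which appear below, the variable names and locations here are illustrative guesses; check the Paths.inc shipped with the code for the exact ones):

```
# Locations of the external tools (illustrative paths; variable names are guesses)
PYTHIA8path=/usr/local/pythia8310
HEPMCpath=/usr/local/HepMC-2.06.10
FASTJETpath=/usr/local/fastjet-3.4.2
YODApath=/usr/local/YODA-2.0.1
# yaml-cpp as shipped inside YODA (see the note below about finding it)
YAMLpath=$(YODApath)/src/yamlcpp
# Only for YODA version 2 and above (see below)
USEYODATWO=1
```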
If in any doubt about the YAMLpath, just look for where the directory src/yamlcpp is located within your YODA installation.
If you are using YODA 2, you must specify
USEYODATWO=1
For earlier versions, delete or comment out that line; this is needed because the include paths changed within YODA from version 2 onwards.
All necessary tools can be installed, and the Paths.inc file created, by the provided installColliderTools.py script.
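For example (assuming a standard python3 environment):

```
python3 installColliderTools.py
```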
Then the code can be compiled with
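(a typical invocation, assuming the provided Makefile; adjust the number of parallel jobs to taste)

```
make -j4
```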
which will produce the executables.
They can of course also be built separately. If you want to disable HEPMC (or you cannot install it) then you can skip that executable, but you will also need to remove references to its headers in the code. Similarly, fastjet could (as the code currently runs) be replaced with fjcore with some small modifications, but if you want to trial e.g. pileup subtraction you need the full fastjet.
To test that the installation has succeeded, run the basic executable:
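For example (the executable name here is an assumption; use whichever binary the build produced for the LO pythia mode):

```
./analysePYTHIA.exe InputFiles/LOpythia.yaml
```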
This will generate 1000 events for winos of 700 GeV and decay length 1000 cm (as specified in the LOpythia.yaml file) and will produce LO.yoda, LO.eff and LO_cf.eff as outputs. You can then check that the results are sensible by executing the limits script:
which should tell you that the point is challenged by the DT_CMS analysis (CLs limit ~0.93) and excluded by the HSCP_ATLAS analysis (CLs limit ~0.999). See below for more about this script.
Each of the three executables takes a YAML file as input in place of a long list of settings. Examples are provided in the InputFiles subdirectory, so they can be launched by:
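(schematically; substitute the executable and input file of interest)

```
./<executable>.exe InputFiles/<settings>.yaml
```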
The exception is analyseHEPMC.exe, which can also accept just the hepmc filename, e.g.:
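(a sketch; the filename is illustrative)

```
./analyseHEPMC.exe myevents.hepmc.gz
```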
In this case the code runs with default options.
This file will run leading-order event generation within pythia, so it requires a configuration file following the standard pythia settings. It can run on multiple cores and so is rather fast. However, I have not included validation of this mode because it cannot accurately simulate the MET distribution; it is nevertheless the fastest way to test that things are running.
This file requires Les Houches Event files (e.g. from MadGraph) split into one chunk per core. This can be done with the provided splitlhe.py:
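A hypothetical invocation (the argument conventions are an assumption; check the script's usage message for the real ones):

```
# hypothetical: the LHE file followed by the number of chunks/cores
python3 splitlhe.py unweighted_events.lhe.gz 8
```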
The file can be either compressed or uncompressed. The output will be a number of compressed files, placed in the subdirectory "Split" of the run directory and labelled "Split/split_0.lhe.gz", "Split/split_1.lhe.gz", ..., which can then be read by the main executable. In this case the YAML file looks like, e.g.:
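(a sketch of such a file; only the Split stub is taken from the text above, and the exact key names should be copied from the InputFiles examples)

```
# Illustrative yaml for the LHE running mode; all key names are assumptions
Input File: Split/split          # stub of the split LHE files
Cores: 8                         # one split file per core
Pythia Config: pythia_CKKWL.cfg  # showering/merging settings (see below)
```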
Here we need a configuration file for pythia with any settings for the showering. Since the events should be generated with up to two additional partons to (more) accurately simulate the MET and pT distributions, sample configurations are provided for MLM and CKKW-L matching/merging.
Note that the matching/merging scale, as in MadGraph, needs to be set appropriately for the hard process being simulated. This is specified within the .cfg files for pythia.
This executable requires a hepmc v2 file as input, either compressed or uncompressed. It is especially intended for events simulated and showered within MadGraph. It can only function in single-core mode (as it can only read one event at a time from the file...) but then requires no special settings, e.g.:
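(a minimal sketch; the key name is an assumption, and the InputFiles examples give the real one)

```
Input File: myevents.hepmc.gz
```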
This is probably the simplest way to run the program, but it is relatively slow due to the file manipulations and the lack of multicore running (presumably putting the hepmc events into a fifo would be faster).
In order to save (a lot of) time when developing an analysis, it is possible to save the results of the "detector simulation" step. These can be created by setting the options in the yaml file when running one of the other executables:
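Schematically (Store Hadrons is the option discussed below; the key switching the event output on is a hypothetical stand-in):

```
Store Events: YES    # hypothetical key name for enabling the output
Store Hadrons: NO    # drop final-state hadrons (the default; see below)
```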
The option Store Hadrons determines whether we store all the final-state hadronic particles. These add a significant amount to the file size and are only necessary if your analysis handles isolation itself, so by default they are dropped.
The created files are named hackanalysis_events_0.ha.gz, hackanalysis_events_1.ha.gz, ..., up to the number of cores. In the yaml file for analyseHAEVENT.exe the stub for these should be given as:
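(schematically; the key name is a hypothetical stand-in)

```
Input File: hackanalysis_events
```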
or if only a single core is desired, the full name should be given:
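(again with a hypothetical key name)

```
Input File: hackanalysis_events_0.ha.gz
```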
In this way, millions of events can be reprocessed in a few seconds, leading to huge time savings when making modifications to an analysis. The event format is very similar to that of LHE files, and in the future, if there is demand, it could easily be made compatible.
The code is able to include the effect of pileup, even though it does not appear important for many analyses. The default pileup filename is minbias.dat.gz, which should be in the directory from which the code is launched; to generate this file, some code is provided in the GeneratePileup directory.
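Schematically, pileup might be switched on in the yaml file with something like (both key names are hypothetical stand-ins):

```
Include Pileup: YES           # hypothetical key
Pileup File: minbias.dat.gz   # the default filename
```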
The code writes an "efficiency file", which is an SLHA-style format with information about the cross-section and the efficiencies of each cut; a "cutflow file", which is a rather verbose text file with information about the cutflows; and a YODA file with the booked histograms. In addition, all results are stored in a json file, by default HackAnalysis.json (this can be changed via the parameter JSON File in the yaml input file), which can easily be read by e.g. python programs for processing statistics, writing latex, etc.
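For example, a minimal python snippet (nothing is assumed here about the structure of the json beyond it being a single document):

```python
import json

# Read the results written by HackAnalysis (default output name)
with open("HackAnalysis.json") as f:
    results = json.load(f)

# Inspect what was stored before processing it further
print(results.keys())
```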
A provided script can be used with the efficiency file to calculate limits for the original analyses:
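(schematically; the script name is a hypothetical stand-in, and LO.eff is the efficiency file produced in the test run above)

```
python3 getLimits.py LO.eff [cross-section]
```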
where the cross-section is optional. In the PYTHIA running modes the cross-section is taken from pythia and written into the efficiency file. However, you may want to use NLO-NLL values from elsewhere, in which case the option is useful. Moreover, in HEPMC mode the cross-section is not provided at all, so it must be specified by the user. In the case of running from MadGraph it can be found in the output file. Note that the output of the limit script will give a description of the merged regions, and the final lines report the "CLs limit" for each analysis.
By "CLs limit" we really mean 1-CLs; values above 0.95 are excluded at 95% confidence level.
A new python package for general handling of the analyses that use pyhf or simplified likelihoods is provided in the Statistics directory. It can simply be used by:
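(a hypothetical sketch of such usage; the module and function names are illustrative stand-ins, not the package's actual API)

```python
# Illustrative only: the names below are hypothetical stand-ins
from Statistics import limits

# Compute the CLs for one analysis from the stored results
cls = limits.get_cls("HackAnalysis.json", analysis="HSCP_ATLAS")
print(cls)
```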
Extensive validation has been performed; please see the github, the website or the papers for more information (it is not included here to save space).
Thanks go to:
Having borrowed code from elsewhere, it is only reasonable that this code can be reused/repurposed by anyone who finds it useful, provided that they credit its origin!
It is therefore understood to be released under the GNU GENERAL PUBLIC LICENSE v3 (as required by the corresponding licence of heputils).