% New Commands for variables
%\newcommand{\MET}{$E_{T}^{miss} $\xspace}
%\newcommand{\MTTWO}{$M_{T2} $\xspace}
%\newcommand{\HT}{$H_{T} $\xspace}
%\newcommand{\NJETS}{$N_{jets} $\xspace}
%\newcommand{\NTOPS}{$N_{tops} $\xspace}
%\newcommand{\NBJETS}{$N_{b-jets} $\xspace}
%\newcommand{\ETA}{$\eta $\xspace}
%\newcommand{\DELTAR}{$\Delta R $\xspace}
%\newcommand{\PT}{$p_{T} $\xspace}
%\newcommand{\DELTAPHI}{$\Delta\phi $\xspace}
%\newcommand{\PHI}{$\phi $\xspace}
%\newcommand{\GEV}{$GeV $\xspace}
%\newcommand{\TTBAR}{$t\bar{t} $\xspace}
\newcommand{\IDELTAPHI}{$\Delta\bar{\phi} $\xspace}
%Background estimation
\chapter{Background Estimation}
\label{ch:bg}
\section{Background from Top and $W$ decays}
\label{sec:bg_twdecay}
The background from \TTBAR, $W$ + jets, and single-top events that are not removed by the lepton veto and isolated track veto discussed in Chapter~\ref{ch:analysis} is the largest background in the analysis. These events contain either a hadronically decaying tau lepton or a light lepton (electron or muon) that is not isolated, not identified or reconstructed, or outside the acceptance region. Both types of background are estimated using the ``translation factor (TF) method''. Because this method is relatively new, its validity is tested against the well-established ``classic lost lepton method''~\cite{2.3fbpaper}.\\
\subsection{Translation Factor Method}
\label{sssec:tfmethod}
When $W$ bosons decay into leptons ($e$, $\mu$, or $\tau$) and neutrinos, the energy carried away by the neutrino results in \MET. Such an event can pass the selection criteria described in Section~\ref{sec:bg_twdecay}. Events with taus that are reconstructed as jets in the final state can also pass the selection criteria. This background is estimated using a control region of single-lepton events selected from data using the search triggers. Both muon and electron control samples (CS) are used to reduce the statistical uncertainty.
The muon control sample is prepared by requiring exactly one muon with transverse momentum $p_{T}^{\mu} > 10$ \GEV and $|\eta| < 2.4$ in $\mu + \text{jets}$ events; for the electron control sample we require exactly one electron with $p_{T}^{e} > 10$ \GEV and $|\eta| < 2.5$. A cut on the transverse mass of the $W$, $m_{T} = \sqrt{2 p_{T}^{l}E_{T}^{miss}(1 - \cos \Delta \phi)} < 100$ \GEV, is applied in order to select events containing a $W \rightarrow l \nu$ decay and to suppress possible contamination from a new-physics signal. Here, $l$ denotes either an electron or a muon and $\Delta\phi$ is the azimuthal angle between the $\vec{p}_{T}^{\,l}$ and \MET directions.\\
\begin{figure}[!htbp]
\begin{center}
\includegraphics[width=0.75\linewidth]{figure/HadTau/comp_CS_no_any_SF/NJets_muCS.pdf}
\includegraphics[width=0.75\linewidth]{figure/HadTau/comp_CS_no_any_SF/NJets_eleCS.pdf} \\
\caption{A shape comparison for the \NJETS for the muon (top) and electron (bottom) control samples.}
\label{fig:shapeNj}
\end{center}
\end{figure}
In order to predict the number of events in the signal regions, we measure the ``translation factor'' (TF) from simulation. The translation factor is defined as $TF^{i} = N_{SR}^{i} /N_{CS}^{i}$, where $N_{SR}^{i}$ is the number of either hadronic tau or lost lepton events in the $i^{\text{th}}$ signal region and $N_{CS}^{i}$ is the number of CS events in the corresponding signal region. Apart from the difference in the $W$ boson decay, the CS and the signal region have similar hadronic activity.
The lepton (muon and electron) control sample is selected from either simulation or collider data by applying the same criteria as discussed in Chapter~\ref{ch:analysis}, except that we require exactly one muon for the muon CS and exactly one electron for the electron CS.\\
\begin{figure}[!htbp]
\begin{center}
\includegraphics[width=0.75\linewidth]{figure/HadTau/comp_CS_no_any_SF/NTops_muCS.pdf}
\includegraphics[width=0.75\linewidth]{figure/HadTau/comp_CS_no_any_SF/NTops_eleCS.pdf} \\
\caption{A shape comparison for the \NTOPS for the muon (top) and electron (bottom) control samples.}
\label{fig:shapeNt}
\end{center}
\end{figure}
The differences between MC and data were studied through shape comparisons of several kinematic variables for both the electron and muon control samples. For a shape comparison, the overall MC has been scaled down by 73$\%$ for the muon CS and 71$\%$ for the electron CS. We find a data versus MC shape difference for \NTOPS, \NJETS, \MTTWO, and \MET, while the agreement between data and MC for \NBJETS is already quite good. Shape comparisons between MC and data for kinematic variables are shown in Figs.~\ref{fig:shapeNj} and~\ref{fig:shapeNt}. To account for the differences between data and MC, the MC sample is corrected by three scale factors: the initial-state radiation (ISR) correction, the b-tagged jet scale factors, and the lepton efficiency scale factors. The lepton detection efficiency is particularly important because leptons are treated differently in the CS, where they are selected, and in the signal region, where they are vetoed.
In the signal region we therefore do not have a well-identified lepton to begin with, and we use the following relation to propagate the correction factors:
\begin{equation}
N_{prod}^{i} = N_{lost}^{i} + N_{sel}^{i},
\end{equation}
where $N_{prod}^{i}$ is the total number of events produced in the $i^{\text{th}}$ signal region with a $W$ boson decaying to a muon, an electron, or a hadronically decaying tau ($\tau_{h}$); $N_{lost}^{i}$ is the number of events that end up in our signal region after all the search cuts, including the lepton and isolated track vetoes, are applied; and $N_{sel}^{i}$ is the number of events selected with an identified muon, electron, or isolated track for vetoing. $N_{prod}^{i}$ remains unchanged regardless of corrections to the lepton data/MC scale factors. Therefore, a change in the simulated number of events due to the lepton SF on $N_{sel}^{i}$ can easily be propagated to the quantity we are interested in, namely $N_{lost}^{i}$. \\\\
\begin{figure}[!htbp]
\begin{center}
\includegraphics[width=0.75\linewidth]{figure/HadTau/comp_CS_all_SF/NJets_muCS.pdf}
\includegraphics[width=0.75\linewidth]{figure/HadTau/comp_CS_all_SF/NJets_eleCS.pdf} \\
\end{center}
\caption{Shape comparison for the \NJETS for the muon CS (top) and electron CS (bottom) after applying various scale factors.}
\label{fig:shapeNjsf}
\end{figure}
After applying the scale factors, the event variable shape comparisons are shown in Figs.~\ref{fig:shapeNjsf} and~\ref{fig:shapeNtsf}. For the shape comparisons, the muon MC CS is scaled by 84$\%$ and the electron MC CS is scaled by 83$\%$.
We can clearly see the improvement in the \NJETS and \NTOPS distributions.\\
\begin{figure}[!htbp]
\begin{center}
\includegraphics[width=0.75\linewidth]{figure/HadTau/comp_CS_all_SF/NTops_muCS.pdf}
\includegraphics[width=0.75\linewidth]{figure/HadTau/comp_CS_all_SF/NTops_eleCS.pdf}
\end{center}
\caption{Shape comparison for the \NTOPS for the muon CS (top) and electron CS (bottom) after applying various scale factors.}
\label{fig:shapeNtsf}
\end{figure}
A single translation factor in each search region is evaluated from simulated \TTBAR, $W$ + jets, and single-top events. The translation factor is the ratio of the number of $\tau_{h}$ or lost lepton events passing the full search selection to the number of lepton CS events selected with the criteria discussed above. Two sets of TFs are measured, one for the muon CS and one for the electron CS. The uncertainty on the data/MC scale factors is included as a systematic uncertainty. The data-corrected translation factors for the lepton control samples are shown in Fig.~\ref{fig:tf}. As expected, the TFs from the electron and muon CS follow a similar trend across all the search bins, within the uncertainties. \\
\begin{figure}[!htbp]
\begin{center}
\includegraphics[width=0.45\linewidth]{figure/HadTau/comp_TF_hadtau_comb.pdf}
\includegraphics[width=0.45\linewidth]{figure/HadTau/comp_TF_lostle_comb.pdf}\\
\end{center}
\caption{Translation factors for the $\tau_{h}$ (left) and the lost lepton (right) background prediction with their uncertainties from limited MC statistics for both muon and electron CS.}
\label{fig:tf}
\end{figure}
\subsection{Systematic uncertainties}
\label{ssec:sysunc_tf}
The major source of systematic uncertainty in this method is the statistical uncertainty on the translation factor. The translation factor incorporates all the cuts, correction factors, and selection efficiencies. The uncertainty on the lepton efficiency affects both the lepton selection and the lepton veto.
The jet energy scale uncertainty and the b-tag SF uncertainty affect the jet and b-jet selection, respectively. Since the prediction is obtained by multiplying the translation factor by the data CS yield, the change in the prediction can be obtained from the change in the translation factor. Each factor folded into the translation factor is varied within its uncertainty to determine the change in the ratio. The dominant sources of uncertainty and their contributions to the overall uncertainty are listed in Table~\ref{tab:TAUpredsys} for the hadronic tau background and in Table~\ref{tab:LLpredsys} for the lost lepton background.\\
\begin{table}[htbp]
\fontsize{10 pt}{1.2 em}
\selectfont
\begin{centering}
\caption{\label{tab:TAUpredsys} Contributions from different sources of systematic uncertainty to the $\tau_{h}$ background prediction.}
\hspace*{-4ex}
\begin{tabular}{|c|c|c|}
\hline
Process & Source & Effect on $\tau_{h}$ \\
 & & prediction in $\%$\\
\hline
$\tau$ translation factor & Statistics of MC SR and CS events & 1 to 50 \\
statistical error & & \\
\hline
Lepton efficiency SF & Data-MC correction from & \\
(including isolated tracks) & tag and probe method and studies & 5 to 52 \\
\hline
B-tag SF & Uncertainty on b-tag SF & 0 to 1 \\
\hline
%$m_{T}$ cut efficiency & Variation of \met enery scale & 0 to 0.5 \\
%\hline
%Isolated track veto & Data-MC correction on hadronic track veto efficiency of $\tauh$ & 4 to 6.5 \\
%\hline
\MET magnitude and $\phi$ & Uncertainty related to \MET magnitude and $\phi$ & 0 to 54 \\
\hline
JEC & Jet energy correction uncertainty & 0 to 52 \\
\hline
ISR & Variation of ISR weight & 0 to 11 \\
\hline
PDF & PDF uncertainty & 0 to 31 \\
\hline
%Trigger & Uncertainty on trigger efficiency & TODO \\
%\hline
\end{tabular}
\par\end{centering}
\end{table}
\subsection{Prediction}
\label{ssec:pred_hadtau}
After applying the TF measured from simulation to the data CS events, we obtain the hadronic tau and lost lepton background predictions.
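Written out explicitly, this step amounts to the following. The channel index $\ell$ and the symbols $N_{\text{pred}}^{i,\ell}$ and $N_{\text{CS,data}}^{i,\ell}$ are notation introduced here for illustration only, and the numbers in the example are hypothetical rather than taken from this analysis:
\begin{equation}
N_{\text{pred}}^{i,\ell} = TF^{i,\ell} \cdot N_{\text{CS,data}}^{i,\ell}, \qquad \ell \in \{\mu, e\},
\end{equation}
where $TF^{i,\ell}$ is the translation factor evaluated in simulation with the muon or electron CS in the denominator, and $N_{\text{CS,data}}^{i,\ell}$ is the number of CS events observed in data for search region $i$. For instance, a translation factor of $0.30$ applied to $50$ observed CS events would give a prediction of $0.30 \times 50 = 15$ events in that search region.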
Since we have data CS from both the electron and muon channels, we average the predictions from the two CS to estimate the final overall systematic uncertainty. Fig.~\ref{fig:llhtpred} shows the predictions for all search regions. The error bars in the figures include both statistical and total systematic uncertainties. The statistical uncertainties are obtained by propagating the Poisson uncertainty on the observed data CS event counts from both the electron and muon channels, given by the Garwood interval~\cite{GarWOOD:1936}. While a Poisson distribution of mean $\mu_{s}$ has a variance equal to $\mu_{s}$, the interval $(\mu_{s} - \sqrt{\mu_{s}}, \mu_{s} + \sqrt{\mu_{s}})$ may result in under-coverage, especially if $\mu_{s}$ is not large. This situation arises either in low-statistics histograms or in high-statistics ones drawn on a semi-logarithmic scale. The ``correct coverage'' vertical bars, first derived by Garwood in 1936~\cite{GarWOOD:1936}, are obtained from the Neyman construction using the central interval convention.\\
\begin{table}[htbp]
\fontsize{10 pt}{1.2 em}
\selectfont
\begin{centering}
\caption{\label{tab:LLpredsys} Contributions from different sources of systematic uncertainty to the lost lepton background prediction.}
\hspace*{-4ex}
\begin{tabular}{|c|c|c|}
\hline
Process & Source & Effect on lost lepton \\
 & & prediction in $\%$\\
\hline
Lost lepton translation & Statistics of MC & \\
factor statistical error & SR and CS events & 2 to 51 \\
\hline
Lepton efficiency SF & Data-MC correction & \\
(including isolated tracks) & from tag and probe method and studies & 7 to 46 \\
\hline
B-tag SF & Uncertainty on b-tag SF & 0 to 2 \\
\hline
%$m_{T}$ cut efficiency & Variation of \met enery scale & 0 to 0.5 \\
%\hline
%Isolated track veto & Data-MC correction on hadronic track veto efficiency of $\tauh$ & 4 to 6.5 \\
%\hline
\MET magnitude and $\phi$ & Uncertainty related to \MET magnitude and $\phi$ & 0 to 40 \\
\hline
JEC & Jet energy correction uncertainty & 0 to 56 \\
\hline
ISR & Variation of ISR weight & 0 to 13 \\
\hline
PDF & PDF uncertainty & 0 to 32 \\
\hline
%Trigger & Uncertainty on trigger efficiency & TODO \\
%\hline
\end{tabular}
\par\end{centering}
\end{table}
\begin{figure}[!htbp]
\begin{center}
\includegraphics[width=0.45\linewidth]{figure/HadTau/pred_full_hadtau_comb.pdf}
\includegraphics[width=0.45\linewidth]{figure/HadTau/pred_zoomin_hadtau_comb.pdf} \\
\includegraphics[width=0.45\linewidth]{figure/HadTau/pred_full_lostle_comb.pdf}
\includegraphics[width=0.45\linewidth]{figure/HadTau/pred_zoomin_lostle_comb.pdf}\\
\end{center}
\caption{Predicted $\tau_{h}$ background (top) and lost lepton background (bottom) yields for 35.9~fb$^{-1}$ of data in all the search regions. The right plots are zoomed versions of the left plots. Both statistical and total systematic uncertainties are shown.}
\label{fig:llhtpred}
\end{figure}
The prediction from the average TF method was compared with the classic, well-established lost lepton method~\cite{Khachatryan:2015wza}. The two methods agree well within statistical uncertainties. A comparison of the predictions from the two methods is shown in Fig.~\ref{fig:TFvsLL}.
\begin{figure}[!htbp]
\begin{center}
\includegraphics[width=0.8\linewidth]{figure/HadTau/v4_DataCardCampare_0_50_mu_cs.pdf}
\end{center}
\caption{Lost lepton background predictions from the muon control sample (red) and the results obtained with the average TF method (blue). The uncertainties include both the statistical and systematic uncertainties.}
\label{fig:TFvsLL}
\end{figure}
%Zinv
\section{Backgrounds From Neutrinos in $Z$ Decays}
\label{sec:bg_zinv}
This background arises from $Z$ + jets events in which the $Z$ boson decays into a pair of neutrinos. The neutrinos contribute to \MET and the presence of jets may allow the event to enter the search region. Due to a small branching ratio for $Z\rightarrow \nu\nu$ we do not have large data samples with which to study this irreducible background.
Instead of using a direct data-driven method, we use a data-validated Monte Carlo method. In this multistage process, the final estimate is taken from the $Z\rightarrow \nu\nu$ MC, which is corrected for data/MC differences observed in a control region with loosened cuts. \\
The $Z\rightarrow \nu\nu$ background prediction for each search bin $B$ can be written as
\begin{equation}
\hat{N}_{B} = R_{\text{norm}} \cdot \sum_{\text{events} \in B} S_{DY}(N_{\text{jet}}) W_{\text{MC}},
\label{eq:Znunu}
\end{equation}
where $\hat{N}_{B}$ is the predicted number of $Z\rightarrow \nu\nu$ background events in search bin $B$, and $W_{\text{MC}}$ is a standard MC event weight that includes the estimated $Z\rightarrow \nu\nu$ cross section, the data luminosity, the $b$-tag scale factors, and the measured trigger efficiency. Each MC event is corrected using two additional scale factors. The first, $R_{\text{norm}}$, is an overall normalization factor for the $Z\rightarrow \nu\nu$ simulation that is derived in a tight control region in data. This tight control region has the same selection as the search region, except that two muons are required (and treated as if they were neutrinos) and events with any b-tagged jet multiplicity are allowed, so it is a very good proxy for the signal region. The second scale factor, $S_{DY}$, depends on the \NJETS in the event and is derived in a loose control region in which the signal region requirements on \MET, \MTTWO, and \NTOPS are relaxed. \\
The corrected MC is further validated in three steps.\\
\begin{itemize}
\item To make sure that Drell--Yan ($DY \rightarrow\mu\mu$) is a good proxy for $Z\rightarrow \nu\nu$, we need to match the two samples. Some corrections are needed, and we introduce a single scale factor, which we eventually apply to the $Z\rightarrow \nu\nu$ MC.
\item A second layer of validation uses the loose control region, for which a reasonable number of events is available in data, to check the shape agreement between data and the simulated distributions. Any disagreement is incorporated as a systematic uncertainty in the prediction.
\item We also need to compare the data/MC agreement in the loose region, where the shape systematics are assessed, with the data/MC agreement in the tight region, which is the proxy for the region we want to predict.
\end{itemize}
\subsection{Samples and Control Regions}
\label{ssec:CRSamples}
The data set used in the $Z\rightarrow\nu\nu$ background estimate corresponds to data taken with the dimuon trigger. These data samples contain exactly two oppositely charged selected leptons: either two muons, for the $DY \rightarrow\mu\mu$ validation, or an electron and a muon, for the validation of the \TTBAR MC. To ensure that the leptons originate from a $Z$ decay, we require the dilepton invariant mass to lie within the window $81 < m_{ll} < 101$ \GEV. The loose and tight selections are defined relative to the baseline cuts to match the control and signal regions, respectively. The main goal of the loose control region is to provide a data sample that is close to the signal region in terms of kinematic requirements, e.g.\ the number of jets, but loose enough to contain sufficient events for a shape comparison of the main analysis variables. The tight region is very similar to the search region in terms of kinematic properties, but it suffers from a lack of statistics. Therefore, we cannot divide it into all the search bins and use it only to derive an overall normalization for the simulation.\\
\subsection{Scale Factor Calculation}
\begin{itemize}
\item \textbf{\TTBAR Reweighting:} One of the ways the search region differs from the loose control region is that the latter is allowed to include events with zero b-tagged jets.
When $DY \rightarrow\mu\mu$ is used to predict the $Z\rightarrow \nu\nu$ background and an event contains one b-tagged jet, a significant number of dimuon events can come from \TTBAR processes. Therefore, to properly validate the DY MC sample against data in the dimuon control region, we need a dependable prediction for the \TTBAR component in that region. Given that the $e\mu$ control region has the highest purity of \TTBAR events, we use this control region to validate the \TTBAR MC. Following standard CMS SUSY group practice, we apply a scale factor that accounts for initial-state radiation (ISR reweighting) to improve the agreement. Fig.~\ref{fig:ttbarISR} shows that data and MC agree well after the ISR weight is applied to the \TTBAR MC.
\begin{figure}[!htbp]
\begin{center}
\includegraphics[width=0.45\linewidth]{figure/Zinvisible/DataMC_SingleMuon_nb_elmuZinv_loose0_mt2_MET.pdf}
\includegraphics[width=0.45\linewidth]{figure/Zinvisible/DataMC_SingleMuon_nt_elmuZinv_loose0_mt2_MET.pdf}\\
\end{center}
\caption{Comparison between data and MC samples for the b-tagged jet multiplicity distribution (left) and the distribution of the number of reconstructed tops in the event (right), in the loose $e\mu$ control region after applying the ISR weight to \TTBAR.}
\label{fig:ttbarISR}
\end{figure}
\item \textbf{Data/MC correction factors from the loose $\mu\mu$ control region ($S_{DY}(N_{\text{jet}})$):} To validate the DY samples against data in the loose $\mu\mu$ control region, which has a high purity of $DY \rightarrow\mu\mu$ events, it is important to correct for the \TTBAR contribution. Although data and MC agree reasonably well after this correction, the lower jet multiplicity bins show a larger shape disagreement.
This disagreement can be seen in Fig.~\ref{fig:datamcDY}.\\
\begin{figure}[!htbp]
\begin{center}
\includegraphics[width=0.45\linewidth]{figure/Zinvisible/DataMCwtt_SingleMuon_nj_muZinv_0b_loose0_mt2_MET.pdf}
\includegraphics[width=0.45\linewidth]{figure/Zinvisible/DataMCwtt_SingleMuon_nj_muZinv_g1b_loose0_mt2_MET.pdf}\\
\end{center}
\caption{Comparison between DY data and MC for the jet multiplicity distribution in the loose $\mu\mu$ control region for events with 0 b-tagged jets (left) and $\geq 1$ b-tagged jets (right) after applying the ISR weight to \TTBAR.}
\label{fig:datamcDY}
\end{figure}
The data/MC ratio used to correct for the difference seen above is derived from the loose $\mu\mu$ region separately for each b-tagged jet bin. All other backgrounds are subtracted from the data; the \TTBAR contribution is corrected by its own scale factor, $S_{t\bar{t}}$, before being subtracted. The DY MC sample is reweighted with $S_{DY}(N_{\text{jet}})$, which is given by
\begin{equation}
S_{DY}^{i} = \frac{ \text{Data}^{i} - S_{t\bar{t}}^{i}\, \text{MC}_{t\bar{t}}^{i} - \text{MC}_{\text{other}}^{i}}{\text{MC}_{DY}^{i}},
\end{equation}
where $i$ denotes a given $N_{\text{jet}}$ bin. The discrepancies seen in the \NJETS distribution in Fig.~\ref{fig:datamcDY} are no longer present after applying $S_{DY}(N_{\text{jet}})$, as seen in Fig.~\ref{fig:datamcDYCorr}. \\
\begin{figure}[!htbp]
\begin{center}
\includegraphics[width=0.45\linewidth]{figure/Zinvisible/DataMCw_SingleMuon_nj_muZinv_0b_loose0_mt2_MET.pdf}
\includegraphics[width=0.45\linewidth]{figure/Zinvisible/DataMCw_SingleMuon_nj_muZinv_g1b_loose0_mt2_MET.pdf}\\
\end{center}
\caption{Comparison between DY data and MC samples for the jet multiplicity distribution in the loose $\mu\mu$ control region for events with 0 b-tagged jets (left) and $\geq 1$ b-tagged jets (right) after applying both the \TTBAR and DY scale factors.}
\label{fig:datamcDYCorr}
\end{figure}
\item \textbf{$R_{\text{norm}}$ from tight control region:} So far we have derived scale factors from the loose control region.
However, as mentioned earlier, the tight control region is a better proxy for the search region. First, we apply the $N_{\text{jet}}$-dependent scale factors, $S_{DY}(N_{\text{jet}})$ and $S_{t\bar{t}}(N_{\text{jet}})$, to the relevant MC samples. Then we extract the ratio of the total event yield in data to that in simulation for the tight control region, denoted by $R_{\text{norm}}$. We find that
\begin{equation}
R_{\text{norm}} = 1.070 \pm 0.085,
\end{equation}
where the uncertainty includes only the statistical uncertainties on data and simulation. Comparisons between data and MC after applying all scale factors are shown in Fig.~\ref{fig:datamcFinalCorr}. \\
\begin{figure}[!htbp]
\begin{center}
\includegraphics[width=0.45\linewidth]{figure/Zinvisible/DataMCww_SingleMuon_nj_muZinv_blnotag.pdf}
\includegraphics[width=0.45\linewidth]{figure/Zinvisible/DataMCww_SingleMuon_nt_muZinv_blnotag.pdf}\\
\includegraphics[width=0.45\linewidth]{figure/Zinvisible/DataMCww_SingleMuon_met_muZinv_blnotag.pdf}
\includegraphics[width=0.45\linewidth]{figure/Zinvisible/DataMCww_SingleMuon_mt2_muZinv_blnotag.pdf}\\
\end{center}
\caption{Comparison between DY data and MC samples for the \NJETS (top left), \NTOPS (top right), \MET (bottom left), and \MTTWO (bottom right) in the tight $\mu\mu$ control region after applying both the \TTBAR and DY scale factors, as well as the normalization weight $R_{\text{norm}}$.}
\label{fig:datamcFinalCorr}
\end{figure}
\end{itemize}
%Stopped here
\subsection{Systematic Uncertainty and Prediction}
\label{ssec:sysPredZinv}
The systematic uncertainties for the $Z \rightarrow\nu\nu$ background prediction fall into two categories: uncertainties associated with the use of MC simulations and uncertainties specifically associated with the background prediction method.
The first set includes uncertainties in the parton distribution functions, the renormalization and factorization scales, the jet energy corrections, the \MET energy scale, the b-tag scale factors, and the trigger efficiency scale factors. Systematic uncertainties inherent to the prediction method include the uncertainty in the normalization factor $R_{\text{norm}}$ and the residual differences between data and MC. The major sources and their contributions are shown in Table~\ref{tab:Zinv_systematics}.
\begin{table}[htbp]
\centering
\caption{\label{tab:Zinv_systematics} Contributions from different sources of systematic uncertainty to the $Z\rightarrow\nu\nu$ background prediction.}
\begin{tabular}{|c|c|}
\hline
Source & Relative uncertainty in $\%$\\
\hline
$R_\textrm{norm}$ & 7.9 \\
\hline
Data/MC shape differences & 9 -- 55 \\
\hline
Stat. uncertainty on Data/MC comparison & 11 -- 56 \\
\hline
$Z\rightarrow\nu\nu$ MC statistics & 1 -- 100 \\
\hline
Shape variation due to $\mu_R$, $\mu_F$ variations & ${<}1$ -- 39 \\
\hline
Shape variation due to PDF variations & 1 -- 45 \\
\hline
Jet energy scale & 2 -- 75 \\
\hline
\MET energy scale & 1 -- 28 \\
\hline
b-tag SF & 1 -- 23 \\
\hline
b-mistag SF & ${<}1$ -- 16 \\
\hline
Trigger efficiency & ${<}14$ \\
\hline
%ISR Uncertainties &
\end{tabular}
\end{table}
The $Z\rightarrow\nu\nu$ background prediction in each bin, along with the statistical and systematic uncertainties, is shown in Fig.~\ref{fig:ZinvPred}.
For bins that have zero events, the statistical uncertainty is taken as the average weight (the sum of the squared weights divided by the sum of the weights) times the Poisson uncertainty on zero events, which is 1.8.\\
\begin{figure}[!htbp]
\begin{center}
\includegraphics[width=0.85\linewidth]{figure/Zinvisible/moneyplot.pdf}\\
\end{center}
\caption{$Z\rightarrow\nu\nu$ background prediction for all search bins, including the breakdown of the various uncertainties.}
\label{fig:ZinvPred}
\end{figure}
\section{Background from QCD Multijet Events}
\label{sec:bg_qcd}
In the standard model, QCD processes produce multijet final states. If the energy of one or more jets is under-measured, the event acquires a spurious energy imbalance, \MET. Such events, which consist of multiple jets and missing energy, can easily enter our search regions. Even though a QCD event is only rarely misreconstructed as containing a bottom or top quark, the very large QCD cross section makes it necessary to estimate the QCD contribution in the signal region. Monte Carlo studies have shown that cuts on \MET and on the angle between the jets and the \MET vector (\DELTAPHI) suppress most of the QCD background. However, the control samples used in the background estimation also have low statistics for the QCD background. Moreover, the contribution from \TTBAR processes makes it difficult to use the more common background estimation techniques, which would simply extrapolate QCD-dominated distributions from the ``sidebands'' into the signal regions. The procedure used to estimate the background involves selecting QCD-enriched but signal-depleted data samples, from which the \TTBAR, $Z$+jets, and $W$+jets contributions are subtracted. Due to the lack of statistics, we use MC samples to derive the translation factors, but we normalize them to a data measurement in the 200 \GEV $<$ \MET $<$ 250 \GEV bin, just below the signal region, where there are enough events.
\\
\subsection{Translation Factor Method and Measurement}
\label{ssec:qcdtf}
We create a QCD-enriched sideband sample by applying all baseline cuts to the data but inverting the \DELTAPHI requirement. Fig.~\ref{fig:QCDTopo} shows typical events that pass the \DELTAPHI cut and the inverted \DELTAPHI (\IDELTAPHI) cut.\\
\begin{figure}[!htbp]
\begin{center}
\includegraphics[width=0.85\linewidth]{figure/QCD/DPhiEventVsFlipDPhiEvent.pdf}
\end{center}
\caption{(a) Example of an event passing the \DELTAPHI cut: \MET is well separated from the three leading-\PT jets. (b) Example of an event failing the \DELTAPHI cut: \MET is closely aligned with one of the leading jets and most likely arises from jet mismeasurement.}
\label{fig:QCDTopo}
\end{figure}
The number of QCD events in the \IDELTAPHI region is obtained by subtracting the lost lepton, hadronic tau, and $Z\rightarrow\nu\nu$ contributions from data:
\begin{equation}
N^{\Delta\bar{\phi}}_{QCD} = N^{\Delta\bar{\phi}}_{Data} - N^{\Delta\bar{\phi}}_{LL} - N^{\Delta\bar{\phi}}_{\tau_{h}} - N^{\Delta\bar{\phi}}_{Z\rightarrow \nu\nu},
\label{eq:Nqcd}
\end{equation}
where $N^{\Delta\bar{\phi}}_{X}$ is the number of events of type $X$ in the \IDELTAPHI sideband. The contributions subtracted in Eq.~\ref{eq:Nqcd} are estimated using the techniques discussed in the previous sections. The translation factor, $T^{MC}_{QCD}$, is defined as the ratio of the MC predictions for the \DELTAPHI and \IDELTAPHI samples:
\begin{equation}
T_{QCD}^{MC} = \frac{N^{\Delta\phi}_{\text{MC-QCD}}}{N^{\Delta\bar{\phi}}_{\text{MC-QCD}}},
\label{eq:TqcdMC}
\end{equation}
while the final QCD background prediction in the search regions is calculated as
\begin{equation}
N^{SR}_{QCD} = N^{\Delta\bar{\phi}}_{QCD} \times T_{QCD}^{Scale},
\label{eq:QCDformula}
\end{equation}
where $N^{\Delta\bar{\phi}}_{QCD}$ comes from data (as defined in Eq.~\ref{eq:Nqcd}) and $T_{QCD}^{Scale}$ is $T_{QCD}^{MC}$ normalized to a translation factor measured in data in the 200 \GEV $<$ \MET $<$ 250 \GEV sideband. This normalization provides a more accurate estimate of the true translation factors: although we trust the shape of the MC distributions used to calculate them (within the assigned uncertainties), we do not trust their absolute values, which are therefore corrected using the $T_{QCD}^{Data}$ measurement in the 200 \GEV $<$ \MET $<$ 250 \GEV sideband. \\ \\
The procedure to derive the translation factors is the following:
\begin{itemize}
\item Calculate $T_{QCD}^{MC}$ from QCD MC.
\item Measure $T_{QCD}^{Data}$ from data in the low-\MET sideband.
\item Obtain $T_{QCD}^{Scale}$ by normalizing the $T_{QCD}^{MC}$ versus \MET functions to the sideband $T_{QCD}^{Data}$ factors measured in data in the 200 \GEV $<$ \MET $<$ 250 \GEV bin. The $T_{QCD}^{Scale}$ factors are then applied to obtain the final QCD background predictions.
\end{itemize}
\subsection{Systematic Uncertainties and Prediction}
\label{ssec:qcdsyspred}
A systematic uncertainty in the QCD multijet prediction for each search region is evaluated as the difference between the event yield obtained directly from the QCD multijet simulation for that region and the prediction obtained by applying the background prediction procedure to simulated QCD multijet samples (30$\%$ to 500$\%$). Additional sources of uncertainty are the statistical uncertainty in the translation factors (30$\%$ to 300$\%$) and the subtraction of the non-QCD-multijet SM contributions from the QCD control sample (2$\%$ to 50$\%$). The validity of the method is checked by a closure test.
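One way to summarize the level of closure in each search region $i$, given here as an illustrative assumption rather than as the exact definition used in the analysis, is the relative difference
\begin{equation}
\delta^{i}_{\text{closure}} = \frac{\left| N^{i}_{\text{MC,direct}} - N^{i}_{\text{MC,pred}} \right|}{N^{i}_{\text{MC,pred}}},
\end{equation}
where $N^{i}_{\text{MC,direct}}$ is the yield taken directly from the QCD simulation in the search region and $N^{i}_{\text{MC,pred}}$ is the yield obtained by applying Eq.~\ref{eq:QCDformula} to the simulated \IDELTAPHI sideband. The symbols $\delta^{i}_{\text{closure}}$, $N^{i}_{\text{MC,direct}}$, and $N^{i}_{\text{MC,pred}}$ are notation introduced here, and the choice of the predicted yield as the denominator is likewise an assumption.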
In the closure test, yields taken directly from simulation are compared with predictions obtained by treating the simulated samples as if they were data. \\
\section{Background From Other Processes}
\label{sec:bg_other}
Besides the dominant backgrounds discussed above, other SM backgrounds with small cross sections were considered and estimated for this analysis. These rare processes contribute only a small fraction of the total background and have only a small effect on the final result. They are diboson or multiboson processes and processes in which bosons are produced in association with top quark pairs. Fig.~\ref{fig:rare} shows the $t\bar{t}Z$ and $t\bar{t}W^{+}$ production mechanisms in proton-proton collisions. \\
\begin{figure}[!htbp]
\begin{center}
\includegraphics[width=0.4\linewidth]{figure/ttZ/TTZ.png}
\includegraphics[width=0.4\linewidth]{figure/ttZ/TTW.png}
\end{center}
\caption{Dominant Feynman diagrams for $t\bar{t}Z$ (left) and $t\bar{t}W^{+}$ (right).}
\label{fig:rare}
\end{figure}
Estimates of the rates of rare background processes are taken directly from the simulation. Processes such as $t\bar{t}Z$ form irreducible backgrounds when the $Z$ decays to $\nu\nu$ and both top quarks decay hadronically. The $t\bar{t}Z$ cross section at 13 TeV is 782.6 fb, so the predicted yield of $t\bar{t}Z$ events in the search bins is less than 10$\%$ of the total background. Given the small cross section associated with this process, we rely on simulation for the prediction, although the estimate is validated using collider data. A generator-level veto on leptonically decaying $W$ or $Z$ bosons is applied to avoid double counting with the lost lepton and hadronic tau backgrounds. All the other backgrounds, apart from the $t\bar{t}Z$ process, are combined into the rare background. The yields of the $t\bar{t}Z$ and rare processes are shown in Fig.~\ref{fig:rarettzy}.
In the data sample used for this validation, the event yields from simulation and data are found to agree within a statistical uncertainty of 30$\%$, which is taken as the systematic uncertainty in the $t\bar{t}Z$ background estimate.\\
\begin{figure}[!htbp]
\begin{center}
\includegraphics[width=0.7\linewidth]{figure/ttZ/TTZ_Stat.pdf}\\
\includegraphics[width=0.7\linewidth]{figure/ttZ/Rare_Stat.pdf}
\end{center}
\caption{Yield of the $t\bar{t}Z$ (top) and rare background (bottom) prediction normalized to 36 fb$^{-1}$.}
\label{fig:rarettzy}
\end{figure}