1 Introduction

Events in which two hard parton-parton interactions occur within a single proton-proton (\(\text {p}\text {p}\)) collision—referred to as double-parton scattering (DPS) processes—have been discussed theoretically since the introduction of the parton model [1,2,3,4,5,6,7,8]. Experimentally, such processes have been studied at hadron colliders at different center-of-mass energies using multiple final states [9,10,11,12,13,14,15,16,17,18,19,20,21,22].

The cross section for a single hard scattering (SHS) can be factorized into a term containing the parton distribution functions (PDFs) and the partonic cross section of the process at hand, but this approach becomes nontrivial for DPS processes. Although the factorized partonic cross sections remain unchanged, the PDF term in the DPS case contains elements from two distinct partons in each proton. This term includes a distance parameter between the partons in the plane transverse to the direction of motion of each proton. Precise calculations of the involved dynamics have been carried out for such a case [7]. Assuming that both the partonic cross sections and the transverse and longitudinal parts of the PDF terms factorize, the DPS cross section can be written in a simplified model as

$$\begin{aligned} \sigma _{\mathrm {AB}}^{\mathrm {DPS}} = \frac{n}{2} \frac{\sigma _{\mathrm {A}} \sigma _{\mathrm {B}}}{\sigma _{\text {eff}}}, \end{aligned}$$
(1)

where “A” and “B” denote the SHS processes, and \(\sigma _{\mathrm {A}}\) and \(\sigma _{\mathrm {B}}\) are their respective production cross sections. The factor n is equal to unity if processes A and B are the same, and is equal to two otherwise. The parameter \(\sigma _{\text {eff}}\), the effective cross section of DPS processes, is related to the extent of the parton distribution in the plane orthogonal to the direction of motion of the protons. It was measured at different hadron colliders and center-of-mass energies in a variety of final-state processes with comparatively large uncertainties (\({\approx }30\)%). Its value ranges between 15 and \(26\,\text {mb} \) for processes involving a vector boson [13,14,15,16,17,18, 21, 23]. Significantly lower values, down to \(2.2\,\text {mb} \), are measured for processes involving heavy-flavor production [22].

One of the most promising processes to study DPS is the case in which both hard scatterings lead to the production of a \(\text {W}\) boson, and, in particular, the final state with two same-charge \(\text {W}\) bosons [24]. The SHS \(\text {W}^{\pm }\text {W}^{\pm }\) production includes two additional partons and its cross section is therefore suppressed at the matrix-element level. Figure 1 illustrates the production of \(\text {W}^{\pm }\text {W}^{\pm }\) via the DPS process (left) and via SHS processes (middle and right) at leading order (LO) in perturbative quantum chromodynamics (QCD). The absence of jets in the \(\text {W}^{\pm }\text {W}^{\pm }\) production via DPS at LO in perturbation theory provides an additional handle to reduce the contributions from the SHS backgrounds by introducing an upper limit on the number of jets. Moreover, when both \(\text {W}\) bosons decay leptonically, this event exhibits a clean final state in the detector, and the excellent reconstruction and resolution of leptons in the CMS detector provides an accurate measurement of the \(\text {W}\text {W}\) DPS cross section.

Fig. 1
figure 1

Schematic diagrams corresponding to the production of \(\text {W}^{\pm }\text {W}^{\pm }\) via the DPS process (left) and via SHS processes (middle and right), with both \(\text {W}\) bosons further decaying leptonically

However, DPS \(\text {W}\text {W}\) production has not been observed experimentally. An observation of this process would permit the validation of the factorization approach, which is prevalent in current Monte Carlo (MC) event generators. In addition, it is proposed that angular observables in the DPS \(\text {W}^{\pm }\text {W}^{\pm }\) process are sensitive to nontrivial longitudinal momentum correlations among the partons [25,26,27]. The DPS \(\text {W}^{\pm }\text {W}^{\pm }\) process also constitutes a background in searches for new physics at the CERN LHC, e.g., in searches for the electroweak production of supersymmetric particles [28]. A measurement of the DPS \(\text {W}\text {W}\) production cross section (\(\sigma _{\mathrm {DPS\,\text {W}\text {W}}}\)) would improve the reach of such searches.

A search for the production of \(\text {W}^{\pm }\text {W}^{\pm }\) via DPS was reported in the past by the CMS Collaboration using \(\text {p}\text {p}\) collisions at \(\sqrt{s} = 8\,\text {TeV} \), and an upper limit of \(0.32\,\text {pb} \) was set on its production cross section at 95% confidence level [19]. An increased production cross section at \(\sqrt{s} = 13\,\text {TeV} \) and a larger data set collected using the CMS detector allow a more detailed study of this rare and interesting physics process. This paper presents a measurement of this process performed with \(\text {p}\text {p}\) collision data, recorded using the CMS experiment at a center-of-mass energy of \(13\,\text {TeV} \) in 2016 and 2017. The analyzed data sample corresponds to an integrated luminosity of \(77.4{\,\text {fb}^{-1}} \).

The analysis focuses on the leptonic decay of the \(\text {W}\) bosons in final states consisting of a same-charge electron-muon (\(\text {e}^\pm {\upmu }^\pm \)) or dimuon (\({\upmu }^\pm {\upmu }^\pm \)) pair, which include small contributions from leptonic \(\tau \) decays. The dielectron final state is not considered because of the relatively higher level of backgrounds.

2 The CMS detector

The central feature of the CMS apparatus is a superconducting solenoid, \(13\,\text {m} \) in length and \(6\,\text {m} \) in diameter, which provides an axial magnetic field of \(3.8\,\text {T} \). The bore of the solenoid is outfitted with various particle detection systems. Charged-particle trajectories are measured in the silicon pixel and strip trackers, covering \(0<\phi <2\pi \) in azimuth and \(|\eta |<2.5\), where the pseudorapidity \(\eta \) is defined as \(\eta = -\ln [\tan (\theta /2)]\), with \(\theta \) being the polar angle of the trajectory of the particle with respect to the counterclockwise direction. A crystal electromagnetic calorimeter (ECAL), and a brass/scintillator hadron calorimeter surround the tracking volume. In this analysis, the calorimetry provides high resolution energy and direction measurements of electrons and hadronic jets. A preshower detector consisting of two planes of silicon sensors interleaved with lead is located in front of the ECAL at \(|\eta |>1.479\). Muons are measured in gas detectors embedded in the steel flux-return yoke outside the solenoid. The detector is nearly hermetic, allowing energy balance measurements in the plane transverse to the beam directions. A two-tier trigger system selects the most interesting pp collision events for use in physics analysis [29]. A more detailed description of the CMS detector can be found in Ref. [30].

3 Event selection criteria

A particle-flow (PF) technique is used to reconstruct and identify particles in the event [31]. It combines subdetector-level information to reconstruct individual particles and identify them as charged and neutral hadrons, photons, and leptons. Electron and muon candidates are reconstructed by associating a charged-particle track reconstructed in the silicon detectors with a cluster of energy in the ECAL [32] or a track in the muon system [33]. These PF candidates are used to reconstruct higher-level objects, such as jets, hadronically decaying \(\tau \) leptons (\({\uptau }_\mathrm {h}\)), and missing transverse momentum (\(p_{\mathrm {T}} ^\text {miss}\)). The missing transverse momentum vector \({\vec {p}}_{\mathrm {T}}^{\text {miss}} \) is computed as the negative vector \(p_{\mathrm {T}}\) sum of all the PF candidates in an event, and its magnitude is denoted as \(p_{\mathrm {T}} ^\text {miss}\) [34]. The \({\uptau }_\mathrm {h}\) candidates are reconstructed via the “hadrons plus strips” algorithm [35] and are further selected using a multivariate (MVA) classifier to reduce the misidentification rate of light-quark and gluon jets.

The reconstructed vertex with the largest value of summed physics-object \(p_{\mathrm {T}} ^2\) is the primary \(\text {p}\text {p}\) interaction vertex. Jets are reconstructed from charged and neutral PF candidates clustered using the anti-\(k_{\mathrm {T}}\) clustering algorithm [36, 37] with a distance parameter of 0.4, as implemented in the FastJet package [37, 38].

Two \(\text {b}\) tagging algorithms, which depend on the year of data taking [39, 40], are used to identify jets originating from \(\text {b}\) quarks. They are based on neural networks and combine information on tracks and secondary vertices. The chosen working points correspond to a \(\text {b}\) tagging efficiency in the range of 80–90% and a mistagging rate around 10%. Reconstructed jets must not overlap with identified electrons, muons, or \({\uptau }_\mathrm {h}\) within \({\varDelta }R= \sqrt{\smash [b]{({\varDelta }\eta )^2+({\varDelta }\phi )^2}}<0.4\). To suppress jets originating from instrumental background or from additional \(\text {p}\text {p}\) interactions in the same and nearby bunch crossings (pileup), a jet quality requirement based on the energy fraction of neutral hadrons and charged hadrons associated with the primary vertex is applied [41]. The energy scale of jets is corrected for the nonlinear energy response of the calorimeters and the residual differences between the jet energy scale in the data and in the simulation, separately in the different data taking periods. The jet energy scale corrections are propagated to \(p_{\mathrm {T}} ^\text {miss}\).

Leptons are required to originate from the primary vertex of the event to mitigate pileup effects. An MVA classifier is used to distinguish between “prompt” electrons and muons coming from \(\text {W}\), \(\text {Z}\), or \({\uptau }\) lepton decays and “nonprompt” leptons originating from heavy-quark decays or quark and gluon jets incorrectly reconstructed as leptons. This MVA classifier is trained using a set of observables related to the lepton kinematics, isolation, and identification, as well as variables relating the lepton to the nearest reconstructed PF jet, as described in Ref. [42]. The requirement of this lepton MVA classifier, referred to as the “tight” selection, corresponds to a selection efficiency for prompt leptons of about 90%, and has an efficiency for nonprompt leptons at the percent level. Further selection criteria are applied to ensure the correct assignment of the electric charge in the reconstruction. These selection criteria include requirements on the number of hits in the pixel system for electrons and on the agreement in the charge assignments of multiple reconstruction algorithms for muons.

Events are selected using a combination of dilepton and single-lepton triggers with different lepton \(p_{\mathrm {T}}\) thresholds. The minimum \(p_{\mathrm {T}}\) threshold requirements on the leading (subleading) lepton for the electron-muon and dimuon triggers are 23 (8) and \(17\,(8)\,\text {GeV} \), respectively. The single-lepton triggers, used to increase the trigger efficiency, employ lepton \(p_{\mathrm {T}}\) thresholds of 32 or \(35\,\text {GeV} \) for electrons and 24 or \(27\,\text {GeV} \) for muons.

The signal process is characterized by the presence of a pair of leptons of the same electric charge, along with a moderate amount of \(p_{\mathrm {T}} ^\text {miss}\) originating from the neutrinos in the \(\text {W}\) boson decays.

A “loose” set of requirements is imposed to retain a large set of events for training the boosted decision trees (BDT) that separate the signal from the main backgrounds [43]. Events are selected by requiring exactly two leptons of the same charge, \(\text {e}^\pm {\upmu }^\pm \) or \({\upmu }^\pm {\upmu }^\pm \), with \(p_{\mathrm {T}}\) greater than \(25\,(20)\,\text {GeV} \) for the leading (subleading) lepton, and \(|\eta |<2.5\) (2.4) for electrons (muons). Events are vetoed if there are additional leptons fulfilling looser identification and isolation requirements. The \(p_{\mathrm {T}}\) thresholds for these additional leptons are \(7\,\text {GeV} \) for electrons, \(5\,\text {GeV} \) for muons, and \(20\,\text {GeV} \) for \({\uptau }_\mathrm {h}\) candidates. A lower threshold of \(15\,\text {GeV} \) is applied to \(p_{\mathrm {T}} ^\text {miss}\), which retains most of the signal events, while significantly reducing the contributions from QCD multijet production, i.e., events from heavy- and light-flavor jets produced via strong interactions. The signal process involves no jet activity at LO although around 25% of signal events contain at least one reconstructed jet with \(p_{\mathrm {T}} ^{\text {jet}}>30\,\text {GeV} \) within \(|\eta _{\text {jet}} |<2.5\). To ensure high signal efficiency, a requirement of at most one such jet is imposed. Processes with \(\text {b}\) quark jets, such as \(\hbox {t}{\bar{\hbox {t}}}\), are further suppressed by rejecting events with at least one \(\text {b}\)-tagged jet having \(p_{\mathrm {T}} ^{\text {b}\,\text {jet}}>25\,\text {GeV} \) and \(|\eta _{\text {b}\,\text {jet}} |<2.4\). The event selection criteria are summarized in Table 1.

Table 1 Event selection criteria

4 Simulated samples

A set of simulated samples is used to estimate the signal and some of the backgrounds, whereas other backgrounds are estimated using the data control regions, as described below.

The signal process is simulated at LO in perturbation theory using the pythia 8.226 [44] event generator with the underlying tune CUETP8M1 [23] for 2016, and pythia 8.230 with the tune CP5 [45] for 2017 conditions. The resulting values for \(\sigma _{\text {eff}}\) of the two pythia tunes are \(29.9\,\text {mb} \) for CUETP8M1 and \(19.5\,\text {mb} \) for CP5. The large difference between these values and the resulting tune dependence of \(\sigma _{\mathrm {DPS\,\text {W}\text {W}}}\) underline the importance of measuring \(\sigma _{\mathrm {DPS\,\text {W}\text {W}}}\) experimentally. For the interpretation of the results a production cross section of \(1.92 \,\text {pb} \), obtained with the CP5 tune, is used.

Another set of signal events is simulated using the MC event generator herwig++ [46] with tune CUETHppS1 [23] and the CTEQ6L1 [47] PDF set. The kinematic observables are described consistently with the pythia and herwig++ event generators. Neither the underlying generator tune, nor the different PDF sets used to generate the samples, impact the kinematic observables relevant to the analysis.

The \(\text {W}\text {Z}\) process is simulated at next-to-LO (NLO) with powheg version 2.0 [48, 49] and MadGraph 5_amc@nlo 2.3.3 [50]. The former is used for the central prediction of this background, while the latter is used for the study of systematic differences in kinematic distributions. The \(\text {W}{\upgamma }\) and \(\text {Z}{\upgamma }\) samples, relevant to the \(\text {e}^\pm {\upmu }^\pm \) final state, are generated with the MadGraph 5_amc@nlo event generator at NLO and LO, respectively. To account correctly for parton multiplicities larger than one in the matrix element calculations, the FxFx jet merging scheme [51] is used for the NLO samples, while the MLM jet merging scheme [52] is used for the LO samples. The background contributions arising from \(\text {W}{\upgamma }^{*}\) and \(\text {Z}\text {Z}\) production processes are simulated at NLO with the powheg event generator, while MadGraph 5_amc@nlo is used to simulate the SHS \(\text {W}\text {W}\) process.

The generators are interfaced with pythia to model parton showering and hadronization with the same underlying tunes used for the signal generation. The NNPDF PDF sets with version 3.0 [53] are used for 2016, while NNPDF v3.1 [54] PDF sets are used for 2017 conditions in the simulation of all processes. The CMS detector response is modeled in the simulated events using Geant4 [55], and are reconstructed with the same algorithms used for the data. Simulated events are weighted to reproduce the pileup distribution measured in the data. The average pileup in data was 23 in 2016 and 32 in 2017. The simulated MC events are scaled to correspond to the respective theoretical cross sections using the highest order prediction available in each case [56, 57].

5 Background estimation

Background processes can be separated into two categories. The first category consists of processes with genuine same-charge lepton pairs from leptonic decays of bosons produced in the hard scattering. These processes include first and foremost the \(\text {W}\text {Z}\) process, in which both bosons decay leptonically and one of the leptons from the \(\text {Z}\) boson decay is either out of detector acceptance or does not pass the identification criteria. Other such processes include \(\text {W}{\upgamma }^{*}\), \(\text {W}{\upgamma }\), \(\text {Z}{\upgamma }\), and \(\text {Z}\text {Z}\) production, as well as—to a lesser extent—the SHS \(\text {W}^{\pm }\text {W}^{\pm }\) and \(\text {W}\text {W}\text {W}\) processes. Processes involving associated production of \(\text {W}/\text {Z}\) bosons and photons contribute via asymmetric conversions of the photons into lepton pairs inside the detector. All these background components are estimated from MC simulation after applying scale factors to correct for residual differences between simulation and data in the selection, reconstruction, and the modeling of the trigger. These scale factors are measured using a “tag-and-probe” method [42].

The second category consists of two types of experimental backgrounds that resemble the production of prompt, same-charge lepton pairs. The first type includes nonprompt lepton backgrounds in which one or two of the selected leptons do not originate from the decay of a massive boson from the hard scattering. This background component is dominated by \(\text {W}\)+jets and QCD multijet events, with smaller contributions from \(\hbox {t}{\bar{\hbox {t}}}\) production. The second type of experimental background arises from the misassignment of the charge of an electron in the reconstruction and is dominated by \(\text {Z}\rightarrow \tau \tau \) when both \(\tau \) leptons decay leptonically to form an electron-muon pair.

Nonprompt leptons arise largely from leptonic heavy-flavor decays and from jets misidentified as leptons. The main difference between a nonprompt and a prompt lepton is the presence of larger hadronic activity around the lepton direction for the former. This hadronic activity influences the lepton isolation and identification variables, and consequently the lepton MVA classifier used for the selection of leptons. The selection criterion on this lepton MVA variable is relaxed to define loose lepton selection criteria, and the leptons selected with this relaxed MVA threshold are called “loose” leptons.

The lepton misidentification rate, which is defined as the probability of a “loose” nonprompt lepton to pass the “tight” lepton selection criteria, is estimated directly from the data in a sample dominated by nonprompt leptons from QCD multijet and \(\text {W}\)+jets processes [42]. This control sample is constructed by requiring exactly one “loose” lepton and at least one jet with \({\varDelta }R{(\text {jet},\,\ell )}>1.0\) away from the lepton. To suppress contributions of prompt leptons from electroweak processes, an upper limit of \(40\,\text {GeV} \) is imposed on both \(p_{\mathrm {T}} ^\text {miss}\) and the transverse mass of the lepton and \(p_{\mathrm {T}} ^\text {miss}\). The transverse mass of two objects is defined as

$$\begin{aligned} m_{\mathrm {T}} (1,\,2) = \sqrt{2p_{\mathrm {T}} ^{(1)}p_{\mathrm {T}} ^{(2)}[1 - \cos {{\varDelta }\phi (1,\,2)}]}, \end{aligned}$$
(2)

where \({\varDelta }\phi (1,\,2)\) corresponds to the azimuthal angular difference between the momenta of the two objects.

The residual contamination of prompt leptons in this control sample is subtracted using simulation. The lepton misidentification rate is measured separately for electrons and muons as a function of the lepton \(p_{\mathrm {T}}\) and \(|\eta |\).

To estimate the contribution of events with nonprompt leptons to the signal region, another control sample of events is defined in the data. It is composed of events in which either one or both leptons fail the lepton MVA selection criteria but pass the “loose” selection, resulting in a sample of “tight-loose” and “loose-loose” lepton pairs. These events are reweighted as a function of the lepton misidentification rates to obtain the estimated contribution from this background in the signal region.

Similar to the nonprompt lepton background, the probability for the charge of a lepton to be incorrectly reconstructed is calculated and applied to the selected opposite-charge dilepton events in data. This lepton \(p_{\mathrm {T}}\)-\(|\eta |\) dependent charge misidentification rate is measured in \(\text {Z}\rightarrow \text {e}\text {e}\) events as the ratio of same-charge to opposite-charge dilepton events. Its value ranges from 0.02 (0.01)% for electrons in the barrel to 0.40 (0.16)% for electrons in the endcaps for 2016 (2017) data. The charge misidentification rate for muons is negligible.

6 Multivariate classifier training

The major background contributions arise from \(\text {W}\text {Z}\) production and processes with nonprompt leptons. To separate the signal from these two background components, two separate MVA classifiers are trained using a set of kinematic variables.

The \(\text {W}\text {Z}\) background is kinematically very similar to the signal, because they both have two prompt leptons with moderate \(p_{\mathrm {T}} ^\text {miss}\). Neither the signal nor the \(\text {W}\text {Z}\) process feature any hadronic activity in the form of high-\(p_{\mathrm {T}}\) jets at LO, and the masses of the bosons decaying to leptons are very similar, resulting in similar \(p_{\mathrm {T}}\) spectra for the leptons. The main difference between the signal and \(\text {W}\text {Z}\) production is that in \(\text {W}\text {Z}\) production the bosons share a Lorentz boost along the z-axis, whereas the bosons in the signal process are approximately uncorrelated.

In the case of nonprompt lepton production, dominated by \(\text {W}\)+jets and QCD multijet processes, the kinematic differences with respect to the signal are larger. However, these processes have production cross sections that are orders of magnitude larger than that for the signal process. Therefore, even with a low probability of passing the event selection criteria, the impact of these background processes is considerable.

A BDT-based framework combines this information to discriminate between the signal and the background events. The BDT training against the \(\text {W}\text {Z}\) sample is done using its simulated sample, whereas the training against nonprompt leptons is carried out using a “tight-loose” control sample in data.

The following set of eleven input variables based on the lepton and event kinematics are used to train the BDTs:

  • \(p_{\mathrm {T}} ^{\ell _1}\) and \(p_{\mathrm {T}} ^{\ell _2}\): transverse momenta of the two leptons;

  • \(p_{\mathrm {T}} ^\text {miss}\);

  • \(\eta _{\ell _1} \eta _{\ell _2}\): product of pseudorapidities of the two leptons;

  • \(|\eta _{\ell _1}{+}\eta _{\ell _2} |\): absolute sum of \(\eta \) of the two leptons;

  • \(m_{\mathrm {T}} (\ell _1,\,p_{\mathrm {T}} ^\text {miss})\): transverse mass of the leading lepton and \(p_{\mathrm {T}} ^\text {miss}\);

  • \(m_{\mathrm {T}} (\ell _1,\,\ell _2)\): transverse mass of the two leptons;

  • \(|{\varDelta }\phi (\ell _1,\,\ell _2) |\): azimuthal angular separation between the leptons;

  • \(|{\varDelta }\phi (\ell _2,\,p_{\mathrm {T}} ^\text {miss}) |\): azimuthal angular separation between the subleading lepton and \(p_{\mathrm {T}} ^\text {miss}\);

  • \(|{\varDelta }\phi (\ell _1\ell _2,\,\ell _2) |\): azimuthal angular separation between the dilepton system and the subleading lepton;

  • \(m_{\mathrm {T2}} (\ell _{1},\,\ell _{2})\): the so-called “stransverse mass” of the dilepton and \(p_{\mathrm {T}} ^\text {miss}\) system [58, 59].

The stransverse mass is defined as

$$\begin{aligned} m_{\mathrm {T2}} (\ell _{1},\,\ell _{2}) = \min \limits _{\smash [b]{{{\vec {p}}_{\mathrm {T}} ^{\text {miss}\,(1)}}} {+}\smash [b]{{{\vec {p}}_{\mathrm {T}} ^{\text {miss}\,(2)}}} ={\vec {p}}_{\mathrm {T}}^{\text {miss}}} \left[ \mathrm {max}\left( m_{\mathrm {T}} ^{(1)}, m_{\mathrm {T}} ^{(2)}\right) \right] , \end{aligned}$$
(3)

in which \({\vec {p}}_{\mathrm {T}}^{\text {miss}}\) is divided into two missing momentum vectors, \(\smash [b]{{{\vec {p}}_{\mathrm {T}} ^{\text {miss}\,(1)}}} \) and \(\smash [b]{{{\vec {p}}_{\mathrm {T}} ^{\text {miss}\,(2)}}} \), to produce the transverse masses \(m_{\mathrm {T}} ^{(1/2)}\) with the leptons in the event. In the case where both leptons and both neutrinos originate from mother particles of equal mass, the \(m_{\mathrm {T2}} (\ell _{1},\,\ell _{2}) \) variable exhibits an end point at the mother particle mass. All these variables show significant discrimination between the signal and background processes. The background estimations describe the data well for these variables.

The two classifiers are mapped into a single two-dimensional (2D) classifier by combining contiguous regions in the 2D plane of the two separate classifiers. These regions are chosen to optimize the constraining power of the maximum likelihood fit. Namely, bins are chosen so that some exhibit large signal-to-background ratio, while others are chosen to have small signal contribution, but large contributions of either of the two main backgrounds. In total, the 2D plane is split into 15 such bins, on which the final fit is performed. Several different choices of mapping the 2D plane into a one-dimensional (1D) classifier are tested according to these criteria, and the one exhibiting the largest expected significance for the signal is chosen.

Events are analyzed separately in the two distinct lepton flavor channels and the two—positive and negative—charge configurations. Because the signal process is expected to be enhanced in the \(\ell ^+\ell ^+\) configuration, while the background processes exhibit more symmetry between the two charges, the classification into the two charge configurations increases the sensitivity of the analysis.

7 Systematic uncertainties

Various sources of systematic uncertainties, experimental and theoretical, can be grouped into two categories. The first type changes the overall normalization of one or more processes, whereas the second one can change both the normalization and the shape of the final 1D classifier distribution. Their values and their correlation structure among the different data-taking periods and processes are described below.

The uncertainty in the integrated luminosity is 2.5 (2.3)% for the 2016 (2017) data-taking period [60, 61]. The two values are considered uncorrelated between the two years and are applied to all background processes estimated from simulation, as well as the signal.

The dominant source of experimental systematic uncertainty is associated with the method adopted for the estimation of nonprompt lepton contributions. A normalization uncertainty of 40 (25)% for the \(\text {e}^\pm {\upmu }^\pm \) (\({\upmu }^\pm {\upmu }^\pm \)) final state is applied to account for the observed variations in the performance of the background estimation method when applied to MC simulations. Variations in the misidentification rate as a function of \(p_{\mathrm {T}}\) and \(\eta \) of the leptons are included in addition to the uncertainties stemming from the kinematics of the event sample used to measure the lepton misidentification rate. These kinematic variations are estimated by varying the \(p_{\mathrm {T}}\) of the jets in this sample, leading to shape variations of the order of 5–10% for the final classifier. The overall normalization uncertainty is considered correlated between the years, but uncorrelated between the two flavor channels, whereas the shape uncertainties are considered fully uncorrelated between the years and flavor channels.

A 30% normalization uncertainty is applied to the “charge misid.” background in the \(\text {e}^\pm {\upmu }^\pm \) final state, covering the differences in the measurement of the charge misidentification rate in data and simulation. This uncertainty is treated as fully correlated between the years.

Normalization uncertainties for the main backgrounds estimated from simulation are derived in dedicated 3 (4) lepton control regions for the \(\text {W}\text {Z}\) (\(\text {Z}\text {Z}\)) processes. The scale factors for the \(\text {W}\text {Z}\) and \(\text {Z}\text {Z}\) processes are measured to be \(1.01\pm 0.16\) and \(0.97\pm 0.06\), respectively. The normalization uncertainties are estimated from the statistical uncertainty and purity of these control samples and their scale factors, and take values of 16 (6)% for the \(\text {W}\text {Z}\) (\(\text {Z}\text {Z}\)) process. A 50% normalization uncertainty is applied to all other simulated backgrounds, accounting for the theoretical uncertainties in the predicted cross sections and the lack of proper control samples in the data. The shape of the \(\text {W}\text {Z}\) process is allowed to vary between the shapes coming from the two event generators, powheg and MadGraph 5_amc@nlo, and the corresponding uncertainty is considered correlated between 2016 and 2017. The shape agreement in the kinematic observables between the prediction and observation is of the order of 5% in the \(\text {W}\text {Z}\) and \(\text {Z}\text {Z}\) control regions. It is assumed that the shape agreement is similar in all remaining simulation-derived backgrounds. Therefore a 5% shape uncertainty is applied to these components, which allows the shape of the final classifier to vary by up to 5% linearly and quadratically along the classifier distribution. The effect of variations in the renormalization and factorization scales is negligible for the most important background component, \(\text {W}\text {Z}\), and is therefore neglected. Uncertainties in the PDF sets are expected to play a small role compared to the uncertainties described above. Both the relevant generator level distributions such as the rapidity of the \(\text {W}\) boson and the observable kinematic variables used in the analysis are consistent between NNPDF sets v3.0 and v3.1. An additional complication in the estimation of the uncertainty in the PDFs arises because the standard procedures for evaluating such uncertainties are ill-defined in the case of the signal process. For instance, varying a PDF set by any number of replicas is an inadequate estimation of the modeling uncertainty that emerges because the two PDF terms of the separate hard scatters are factorized when simulating signal events. Therefore, such uncertainties are not considered. Rather, future measurements with larger data sets will allow the study of the production cross section differentially in observables that are sensitive to such nonfactorization effects.

The uncertainty in the pileup modeling is 1% in the total yield for all simulated backgrounds and the signal, and is assumed correlated among all flavors, charges, and years. No significant differences in the kinematics are observed because of the pileup modeling. The uncertainty in the \(\text {b}\) tagging is considerably smaller than the statistical uncertainty in the simulated samples and is therefore neglected. The acceptance effect of the uncertainty in the jet energy scale is 2% in the signal and the simulated background samples and is considered fully correlated among all channels.

The trigger efficiency uncertainty associated with the combination of single-lepton and dilepton triggers is 1–2%, whereas the uncertainty in the data-to-simulation scale factors for the “loose” lepton selection is 2%. These uncertainties are considered correlated among the flavor channels but uncorrelated between the years. The uncertainty in the “tight” lepton selection is 2–3%, and is considered correlated between the two years.

Any residual model dependence of the signal process is estimated by allowing the shape of the DPS \(\text {W}\text {W}\) process to vary between the pythia and herwig++ simulations. The corresponding variations in the final BDT classifier are small.

Finally, the statistical uncertainty arising from the limited number of events in the simulated samples is included independently for each bin of the final discriminant distribution for each final state and the two data-taking periods, and is treated as fully uncorrelated [62].

8 Statistical analysis and results

Results are obtained after combining all the background and signal processes in the two separate flavor configurations, \(\text {e}{\upmu }\) and \({\upmu }{\upmu }\), and two separate charge configurations, \(\ell ^+\ell ^+\) and \(\ell ^-\ell ^-\), resulting in four independent distributions of the final BDT classifier. The final maximum likelihood fit is performed simultaneously in these four distinct flavor and charge categories [63,64,65]. The classification of events into the two charge configurations increases the sensitivity of the analysis by 10%.

Systematic uncertainties are represented in the likelihood by individual nuisance parameters, and are profiled in the fit as described in Ref. [66]. The number of events in each bin of the final classifier distribution used to extract the signal is modeled as a Poisson random variable, with a mean value that is equal to the sum of signal and background contributions.

In total, 4921 events are observed in the four lepton-charge and flavor combinations. Table 2 summarizes the yields of the various background and signal components along with their associated total uncertainties after the ML fit (postfit).

Table 2 Postfit background and signal yields and their uncertainties, and the observed event counts in the four charge and flavor combinations. The uncertainties include both statistical and systematic components. The SHS \(\text {W}^{\pm }\text {W}^{\pm }\) and \(\text {W}\text {W}\text {W}\) contributions are grouped as the “Rare” background

Figure 2 shows the distribution of the final BDT classifier in the two charge configurations in the \(\text {e}{\upmu }\) channel in the upper row, and the two charge configurations in the \({\upmu }{\upmu }\) channel in the lower row, under the same scenario as in Table 2, i.e., postfit background and signal yields, together with the postfit total uncertainties.

Fig. 2
figure 2

Distribution of the final BDT classifier output for \(\text {e}{\upmu }\) (upper) and \({\upmu }{\upmu }\) (lower) final states, in the positive (left) and negative (right) charge configurations. Observed data are shown in black markers while the backgrounds and signal are shown in colored histograms with their postfit yields. The SHS \(\text {W}^{\pm }\text {W}^{\pm }\) and \(\text {W}\text {W}\text {W}\) contributions are grouped in the “Rare” background category. The bottom panels show the ratio of data to the sum of all background contributions in the black markers along with the signal shown using a red line. The band represents the postfit background uncertainty, which includes both the statistical and systematic components

Although the fit is performed with all the kinematic requirements applied, the following cross sections are quoted as inclusive production cross sections for DPS \(\text {W}\text {W}\). The kinematic acceptance, defined as the ratio of events having a same-charge electron-muon or dimuon pair from the \(\text {W}\) boson decays and passing the analysis-level kinematic selection to the total number of generated events, is measured using the pythia generator. In this definition, the leptons are used at the “dressed” level where the momentum of a lepton is defined by combining its pre-final-state radiation four-momentum with that of photons radiated within a cone defined by \({\varDelta }R= 0.1\) around the lepton. The kinematic acceptance is measured to be \(4.70\pm 0.02\,\text {(stat)} \pm 0.94\,\text {(model)} \) %. The model uncertainty accounts for the differences in acceptance measured using different PDF sets (NNPDF v3.0 and NNPDF v3.1), different pythia generator tunes (CUETP8M1 and CP5), and with different event generators (pythia and herwig++). This uncertainty is dominated by the differences seen between the pythia and herwig++ event generators.

The prediction of any DPS \(\text {W}\text {W}\) cross section suffers from large uncertainties. For the factorization approach from Eq. (1), the largest uncertainty comes from the imprecise knowledge of \(\sigma _{\text {eff}}\), which differs substantially between different measurements in different final states [17]. Any predicted cross section from an MC simulation, such as the one obtained from pythia also suffers from large uncertainties because of the tuning of generator parameters sensitive to the modeling of the underlying event. Although the kinematic observables are tested to be unaffected by these tuning parameters, the predicted cross section varies by as much as 50%. It is therefore essential to interpret any “predicted” number in the following, either from the factorization approach or from pythia, only as a rough estimate rather than a precisely derived quantity. Conversely, any observed cross section and the corresponding significance do not depend on the predicted cross section, but only on the kinematics of the MC generator. These limitations emphasize the importance of measuring the cross section of the DPS \(\text {W}\text {W}\) process from data.

For this analysis, two predicted cross sections are used. The pythia event generator with the CP5 tune gives a cross section of \(1.92\,\text {pb} \). Alternatively, using Eq. (1) with the highest order cross section for inclusive \(\text {W}\) boson production and decay at next-to-NLO accuracy in QCD and NLO in electroweak corrections [67, 68], \(189\pm 7 \,\text {nb} \), along with \(\sigma _{\text {eff}} = 20.7\pm 6.6 \,\text {mb} \) [17], results in an expected cross section for the inclusive DPS \(\text {W}\text {W}\) process of \(0.87\pm 0.28 \,\text {pb} \). The value for \(\sigma _{\text {eff}}\) is chosen as a representative number from a DPS cross section measurement based on a final state containing a \(\text {W}\) boson. A different choice of \(\sigma _{\text {eff}}\) would alter the prediction of the cross section from the factorization approach accordingly.

The following quantities are obtained from the simultaneous fit to the final BDT classifier in the four lepton charge and flavor combinations:

  • the expected significance assuming the signal process follows the pythia kinematics with the input cross section as \(\sigma _{\mathrm {DPS\,\text {W}\text {W},\,exp}}^{{\textsc {pythia}}}\);

  • the expected significance assuming the signal process exhibits pythia-like kinematics with a production cross section, \(\sigma _{\mathrm {DPS\,\text {W}\text {W},\,exp}}^{\text {factorized}}\), extracted based on the factorization approach using the inclusive \(\text {W}\) production cross section and value of \(\sigma _{\text {eff}}\) mentioned above;

  • the observed cross section \(\sigma _{\mathrm {DPS\,\text {W}\text {W},\,obs}}\) and the corresponding significance, assuming pythia-like kinematics, independent of the assumed cross section;

  • \(\sigma _{\text {eff}}\) using the inclusive \(\text {W}\) production cross section and \(\sigma _{\mathrm {DPS\,\text {W}\text {W},\,obs}}\).

A maximum likelihood fit is performed separately for different lepton charge configurations and their combination. The values obtained for the DPS \(\text {W}^{\pm }\text {W}^{\pm }\) cross section are then extrapolated to the inclusive \(\text {W}\text {W}\) phase space. Table 3 summarizes the numbers extracted from the maximum likelihood fit to the final classifier distribution for the combination of the \(\ell ^+\ell ^+\) and \(\ell ^-\ell ^-\) final states.

Table 3 Results obtained from the maximum likelihood fit to the final classifier distribution

The observed inclusive DPS \(\text {W}\text {W}\) production cross section is \(1.41\pm 0.28\,\text {(stat)} \pm 0.28\,\text {(syst)} \,\text {pb} \) with an observed significance of 3.9 standard deviations with respect to the background-only hypothesis. This value lies between the prediction from pythia, which gives a cross section of \(1.92\,\text {pb} \) with an expected significance of 5.4 standard deviations, and the one of the factorization approach, which predicts a cross section of \(0.87\,\text {pb} \) with an expected significance of 2.5 standard deviations.

The values of the inclusive DPS \(\text {W}\text {W}\) production cross sections, obtained from the positive and negative lepton charge configurations, along with their combination, are shown in Fig. 3. The expected values for \(\sigma _{\mathrm {DPS\,\text {W}\text {W}}}\), taken from pythia and the factorization approach, are also shown. The positive charge configuration results in a measured inclusive cross section of \(1.36\pm 0.33\,\text {(stat)} \pm 0.32\,\text {(syst)} \,\text {pb} \), whereas for the negative charge configuration the value is \(1.96\pm 0.54\,\text {(stat)} \pm 0.51\,\text {(syst)} \,\text {pb} \).

Fig. 3
figure 3

Observed cross section values for inclusive DPS \(\text {W}\text {W}\) production from the two lepton charge configurations and their combination. These values are obtained from the extrapolation of the observed DPS \(\text {W}^{\pm }\text {W}^{\pm }\) cross section to the inclusive \(\text {W}\text {W}\) case. The statistical and systematic uncertainties are shown using shaded bands. The predictions from pythia and from the factorization approach are represented with the red dotted and green dashed lines, respectively

A value of \(\sigma _{\text {eff}}\) is extracted from Eq. (1) in the following way. The SHS cross sections for inclusive \(\text {W}\) boson production are taken from theoretical calculations at next-to-NLO in QCD and NLO in electroweak corrections, as described before. These cross sections are then combined with the measured DPS \(\text {W}^{\pm }\text {W}^{\pm }\) cross section, extrapolated to the full \(\text {W}\text {W}\) phase space, to extract a value for \(\sigma _{\text {eff}}\). This procedure results in a value for \(\sigma _{\text {eff}}\) of \(12.7^{+ 5.0}_{- 2.9} \,\text {mb} \), consistent with previous measurements of this quantity from other final states [20]. This hybrid approach employed for calculating \(\sigma _{\text {eff}}\) using, on the one hand, a theoretical prediction and, on the other hand, the measured DPS \(\text {W}\text {W}\) cross section results from the following consideration. Because the statistical uncertainty dominates the measured \(\sigma _{\mathrm {DPS\,\text {W}\text {W}}}\) and the leading systematic uncertainties are specific to the \(\ell ^{\pm }\ell ^{\pm }\) final state, these would not cancel with the uncertainties in a measurement of the single \(\text {W}\) boson production cross section. Therefore, the benefit of measuring the single \(\text {W}\) boson production cross section to extract a fully experimental value for \(\sigma _{\text {eff}}\) is negligible.

9 Summary

A study of \(\text {W}\text {W}\) production from double-parton scattering (DPS) processes in proton-proton collisions at \(\sqrt{s} = 13\,\text {TeV} \) has been reported. The analyzed data set corresponds to an integrated luminosity of \(77.4{\,\text {fb}^{-1}} \), collected using the CMS detector in 2016 and 2017 at the LHC. The \(\text {W}\text {W}\) candidates are selected in same-charge electron-muon or dimuon events with moderate missing transverse momentum and low jet multiplicity. Multivariate classifiers based on boosted decision trees are used to discriminate between the signal and the dominant background processes. A maximum likelihood fit is performed to extract the signal cross section, which is compared to the predictions from simulation and from an approximate factorization approach. A measurement of the DPS \(\text {W}\text {W}\) cross section is achieved for the first time, and a cross section of \(1.41\pm 0.28\,\text {(stat)} \pm 0.28\,\text {(syst)} \,\text {pb} \) is extracted with an observed significance of 3.9 standard deviations. This cross section leads to an effective cross section parameter of \(\sigma _{\text {eff}}=12.7^{+ 5.0}_{- 2.9} \,\text {mb} \). The results in this paper constitute the first evidence for \(\text {W}\text {W}\) production from DPS.