1 Introduction

In high-energy hadron–hadron collisions, the vector bosons \(W\) and \(Z/\gamma ^*\) are produced via quark–antiquark annihilation [1], and can be observed with very small backgrounds by using their leptonic decay modes. The vector bosons have non-zero momentum transverse to the beam direction due to the emission of quarks and gluons from the initial-state partons as well as to the intrinsic transverse momentum of the initial-state partons in the proton. Phenomenologically, the spectrum at low transverse momentum of the Z boson, \(p_{\text {T}} ^{\ell \ell }\) , reconstructed through the decay into a pair of charged leptons, can be described using soft-gluon resummation [2,3,4,5,6,7] and non-perturbative models to account for the intrinsic transverse momentum of partons. At high \(p_{\text {T}} ^{\ell \ell }\) the spectrum can be calculated by fixed-order perturbative quantum chromodynamics (QCD) predictions [8,9,10,11,12], and next-to-leading-order electroweak (NLO EW) effects are expected to be important [13,14,15]. Parton-shower models [16,17,18] or resummation may be matched to fixed-order calculations to describe the full spectrum.

A precise measurement of the \(p_{\text {T}} ^{\ell \ell }\) spectrum provides an important input to the background prediction in searches for beyond the Standard Model (SM) processes, e.g. in the monojet signature [19], as well as to SM precision measurements. In particular, the measurement of the mass of the \(W\) boson [20] relies on the measurement of the \(p_{\text {T}} ^{\ell \ell }\) distribution to constrain the transverse momentum spectrum of the \(W\) boson, \(p_{\text {T}} ^{W}\) , since a direct measurement of the transverse momentum distribution of \(W\) bosons is experimentally challenging [21]. The \(p_{\text {T}} ^{\ell \ell }\) spectrum was measured previously in proton–proton (pp) collisions at the Large Hadron Collider (LHC) by the ATLAS Collaboration at centre-of-mass energies of \(\sqrt{s} = 7\) \(\text {Te}\text {V}\) and 8 \(\text {Te}\text {V}\) [22, 23], including several mass regions near and away from the \(Z\) -boson resonance. Related measurements were also made by the CMS [24,25,26,27,28] and the LHCb [29,30,31] collaborations at the LHC and by the CDF [32] and D0 [33, 34] collaborations in \(p\bar{p}\) collisions at the TeVatron.

Compared to measurements at lower \(\sqrt{s}\), \(Z\) -boson production at 13 \(\text {Te}\text {V}\) is characterized by a smaller parton momentum fraction of the colliding protons, leading to a different flavour composition and a larger phase space for hard QCD radiation. A precise measurement will test this energy dependence and play an important role in future studies of the \(W\) -boson mass using the 13 \(\text {Te}\text {V}\) data.

The granularity of the measurement in the low-\(p_{\text {T}} ^{\ell \ell }\) domain is limited by the lepton momentum resolution. To overcome this limitation, the \(\phi _\eta ^*\) observable was introduced [35] as an alternative probe of \(p_{\text {T}} ^{\ell \ell }\) . It is defined as

$$ \phi _\eta ^* = \tan \left( \frac{\pi -\Delta \phi }{2}\right) \times \sin (\theta ^*_\eta )\,, $$

where \(\Delta \phi \) is the azimuthal angle in radians between the two leptons. The angle \(\theta _\eta ^*\) is a measure of the scattering angle of the leptons relative to the proton beam direction in the rest frame of the dilepton system and is defined by \(\cos (\theta _\eta ^*) = \tanh [(\eta ^- - \eta ^+)/2]\), where \(\eta ^-\) and \(\eta ^+\) are the pseudorapiditiesFootnote 1 of the negatively and positively charged lepton, respectively. Therefore, \(\phi ^{*}_{\eta }\) depends exclusively on the directions of the two leptons, which are measured more precisely than their momenta.

In this paper, measurements of the \(p_{\text {T}} ^{\ell \ell }\) and the \(\phi ^{*}_{\eta }\) spectra are presented using pp collision data at \(\sqrt{s} =13\) \(\text {Te}\text {V}\) collected in 2015 and 2016 with the ATLAS detector, corresponding to an integrated luminosity of 36.1 fb\(^{-1}\). Both the dielectron and dimuon final states \(Z/\gamma ^* \rightarrow \ell \ell \) (\(\ell = e\) or \(\mu \)) are analysed in a dilepton mass window of \(m_{\ell \ell } = 66\)–116 \(\text {Ge}\text {V}\). The measurement is performed in a fiducial phase space that is close to the detector acceptance for leptons in transverse momentum \(p_{\text {T}} ^\ell \) and pseudorapidity \(\eta _\ell \).

2 The ATLAS detector

The ATLAS experiment uses a multipurpose detector [36,37,38] with a cylindrical geometry and almost \(4\pi \) coverage in solid angle. The collision point is surrounded by tracking detectors, collectively referred to as the inner detector (ID), followed by a superconducting solenoid providing a 2 T axial magnetic field, a calorimeter system and a muon spectrometer. The ID provides precise measurements of charged-particle tracks in the pseudorapidity range \(|\eta |<2.5\). It consists of three subdetectors arranged in a coaxial geometry around the beam axis: a silicon pixel detector, a silicon microstrip detector and a transition radiation tracker.

The electromagnetic calorimeter covers the region \(|\eta |<3.2\) and is based on a high-granularity, lead/liquid- argon (LAr) sampling technology. The hadronic calorimeter uses a steel/scintillator-tile detector in the region \(|\eta |<1.7\) and a copper/LAr detector in the region \(1.5<|\eta |<3.2\). The forward calorimeter (FCAL) covers the range \(3.2< |\eta | < 4.9\) and also uses LAr as the active material and copper or tungsten absorbers for the EM and hadronic sections, respectively.

The muon spectrometer (MS) consists of separate trigger and high-precision tracking chambers to measure the deflection of muons in a magnetic field generated by three large superconducting toroids arranged with an eightfold azimuthal coil symmetry around the calorimeters. The high-precision chambers cover a range of \(|\eta |<2.7\). The muon trigger system covers the range \(|\eta |<2.4\) with resistive-plate chambers in the barrel and thin-gap chambers in the endcap regions.

A two-level trigger system is used to select events in real time [39]. It consists of a hardware-based first-level trigger and a software-based high-level trigger. The latter employs algorithms similar to those used offline and is used to identify electrons and muons.

3 Analysis methodology

3.1 Description of the measurements

The \(Z\)-boson differential cross-sections are measured as a function of \(p_{\text {T}} ^{\ell \ell }\) and \(\phi ^{*}_{\eta }\) separately for the dielectron and dimuon decay channels. Only small background contributions are expected. The results are reported within a fiducial phase space chosen to be close to the experimental acceptance defined by the lepton transverse momenta \(p_{\text {T}} ^\ell > 27\) \(\text {Ge}\text {V}\), the absolute lepton pseudorapidity \(|\eta _\ell | < 2.5\) and the dilepton invariant mass \(m_{\ell \ell } = 66\)–116 \(\text {Ge}\text {V}\).

The lepton kinematics can be described at different levels regarding the effects of final-state photon radiation (QED FSR). Cross-sections at Born level employ the lepton kinematics before QED FSR, while the bare level is defined by leptons after emission of QED FSR. A dressed lepton is defined by combining the bare four-momenta of each lepton with that of QED FSR photons radiated from the lepton within a cone of size \(\Delta R = 0.1\) around the lepton. The results in this paper are reported at the dressed and Born levels.

The differential cross-sections in \(p_{\text {T}} ^{\ell \ell }\) and \(\phi ^{*}_{\eta }\) are measured and their normalized spectra derived. The total systematic uncertainty of the latter is significantly reduced due to large correlations in many sources of uncertainty between the measurement bins.

3.2 Simulated event samples

Events simulated by Monte Carlo (MC) generators are used to predict the detector response to the signal process in order to correct the data for detector inefficiencies and resolution as well as to estimate most of the background from processes other than \(Z/\gamma ^* \rightarrow \ell \ell \) in the selected data sample.

The \(Z/\gamma ^* \rightarrow \ell \ell \) signal process was generated with the Powheg-Box V1 MC event generator [40,41,42,43] at next-to-leading order in \(\alpha _{\text {S}} \) interfaced to Pythia 8.186  [17] for the modelling of the parton shower, hadronization, and underlying event, with parameters set according to the AZNLO tune [22]. The CT10 (NLO) set of parton distribution functions (PDF)  [44] was used for the hard-scattering processes, whereas the CTEQ6L1 PDF set [45] was used for the parton shower. The effect of final-state photon radiation was simulated with Photos++ v3.52 [46, 47]. The EvtGen v1.2.0 program [48] was used to decay bottom and charm hadrons.

Powheg +Pythia 8 was also used to simulate the majority of the background processes considered. The \(Z \rightarrow \tau \tau \) and the diboson processes WW, WZ and ZZ [49] (requiring \(m_{\ell \ell }>4\) \(\text {Ge}\text {V}\) for any pair of same-flavour opposite-charge leptons) used the same tune and PDF as the signal process. The \(t\bar{t}\) and single-top-quark [50, 51] backgrounds to the dielectron channel were simulated with Powheg +Pythia 6 [52] with the P2012 tune [53] and CT10 PDF, while for the dimuon channel Powheg +Pythia 8 with the A14 tune [54] and the NNPDF3.0 PDF [55] was used. It was found that the prediction of the \(t\bar{t}\) background is in very good agreement for both generators. The photon-induced background \(\gamma \gamma \rightarrow \ell \ell \) was generated with Pythia 8 using the NNPDF2.3 QED PDF [56].

The effect of multiple interactions in the same and neighbouring bunch crossings (pile-up) was modelled by overlaying the hard-scattering event with simulated minimum-bias events generated with Pythia 8.186 using the MSTW2008LO set of PDFs [57] and the A2 tune [58]. The simulated event samples were reweighted to describe the distribution of the number of pile-up interactions in the data, and further reweighted such that the distribution of the longitudinal position of the primary pp collision vertex matches that in data. The primary vertex is defined as the vertex with at least two reconstructed tracks with \(p_{\text {T}} >0.4\) GeV and with the highest sum of squared transverse momenta of associated tracks. The Geant 4 program was used to simulate the passage of particles through the ATLAS detector [59, 60]. The simulated events are reconstructed with the same analysis procedure as the data. The reconstruction, trigger and isolation efficiencies as well as lepton momentum scale and resolution in the MC simulation are corrected to match those determined in data [61,62,63].

3.3 Event selection

Candidate Z \(\rightarrow ee\) events are triggered requiring at least one identified electron with \(p_{\text {T}} > 24\) \(\text {Ge}\text {V}\) in 2015 and \(p_{\text {T}} > 26\) \(\text {Ge}\text {V}\) in 2016 data [64]. In addition to the increased \(p_{\text {T}} \) threshold, the electron also has to satisfy isolation criteria in the 2016 data. Candidate \(Z \rightarrow \mu \mu \) events were recorded with triggers that require at least one isolated muon with \(p_{\text {T}} > 20\) \(\text {Ge}\text {V}\) in 2015 and \(p_{\text {T}} > 26\) \(\text {Ge}\text {V}\) in 2016 data.

Electron candidates are reconstructed from clusters of energy in the electromagnetic calorimeter matched to ID tracks [62]. They are required to have \(p_{\text {T}} > 27\) \(\text {Ge}\text {V}\) and \(|\eta |<2.47\) (excluding the transition regions between the barrel and the endcap electromagnetic calorimeters, \(1.37<|\eta |<1.52\)). Electron candidates are required to pass the ‘medium‘ identification requirement, and are also required to be isolated according to the ‘gradient’ isolation criterion [62].

Muon candidates are reconstructed by combining tracks reconstructed in the inner detector with tracks reconstructed in the MS [61]. They are required to have \(p_{\text {T}} > 27\) \(\text {Ge}\text {V}\) and \(|\eta |<2.5\) and satisfy identification criteria corresponding to the ‘medium’ working point [61]. Track quality requirements are imposed to suppress backgrounds, and the muon candidates are required to be isolated according to the ‘gradient’ isolation criterion [61], which is \(p_{\text {T}}\)- and \(\eta \)-dependent and based on the calorimeter and track information.

Electron and muon candidates are required to originate from the primary vertex. Thus, the significance of the track’s transverse impact parameter calculated relative to the beam line, \(|d_0/\sigma _{d_{0}}|\), must be smaller than 3.0 for muons and less than 5.0 for electrons. Furthermore, the longitudinal impact parameter, \(z_0\) (the difference between the z-coordinate of the point on the track at which \(d_0\) is defined and the longitudinal position of the primary vertex), is required to satisfy \(|z_0 \cdot \sin (\theta )|<\) 0.5 mm.

Events are required to contain exactly two same-flavour leptons passing the lepton selection. The two leptons must be of opposite electric charge and their invariant mass must satisfy \(66< m_{\ell \ell } < 116\) \(\text {Ge}\text {V}\). No additional veto on the presence of leptons of different flavour is applied. Table 1 shows the number of events satisfying the above selection criteria in the electron channel and the muon channel. Also given are the estimated contributions from the background sources described below in Section 3.4.

Table 1 Selected signal candidate events in data for both decay channels as well as the expected background contributions including their total uncertainties

3.4 Estimation of backgrounds

The backgrounds from all sources other than multijet processes are estimated using the MC samples detailed in Sect. 3.2. The number and properties of the background events where one or two reconstructed lepton candidates originate from hadrons or hadron decay products, i.e. multijet processes as well as W+jets, are estimated using the data-driven techniques described in the following for both decay channels.

In the electron channel, a multijet-dominated sample is selected from data with two same-charge electron candidates satisfying the ‘loose’ identification criteria, but not the ‘medium’ criteria [62], i.e. they are more likely to be caused by misidentified jets. This sample is collected by a di-electron trigger without isolation criteria [64]. In the muon channel, a multijet sample is obtained by selecting two same-charge muons. The residual contamination from processes with prompt leptons is estimated using the simulation and subtracted.

The normalization of the multijet template in the electron channel is determined in a fit to the distribution of the electron isolation using all event-selection criteria except those for the isolation variables. Systematic uncertainties in the normalization are estimated by varying the fit range on the electron isolation distribution.

In the muon channel, the normalization is obtained using the ratio of number of opposite-charge dimuon events to the number of same-charge dimuon events where the muons fail to satisfy the isolation criterion. Assuming no correlation between the isolation of muons in multijet events and their charge, this ratio can be applied to a control sample, defined by pairs of isolated same-charge muons passing the signal-kinematic selection, to determine the multijet contamination in the signal region. The systematic uncertainty in the estimate is obtained by varying the isolation criterion for the muons.

The total fraction of selected data events originating from background processes is about 0.6% in both the electron and muon channels. The background is dominated by contributions from diboson and \(t\bar{t}\) processes. An overview of the estimated number of background events is given in Table 1, together with the corresponding total uncertainties.

Figure 1 shows the dilepton invariant mass and the lepton pseudorapidity distribution, for the electron and muon channels separately. The predictions are in fair agreement with the data. The impact of the residual differences between these distributions on the \(p_{\text {T}} ^{\ell \ell }\) and \(\phi ^{*}_{\eta }\) measurements is estimated by reweighting the MC signal sample to data and then repeating the measurement procedure. Figure 2 compares the measured \(p_{\text {T}} ^{\ell \ell }\) and \(\phi ^{*}_{\eta }\) distributions for both channels with the signal MC predictions. The disagreement between the data and the predictions for large values of \(p_{\text {T}} ^{\ell \ell }\) and \(\phi ^{*}_{\eta }\) is expected because Powheg +Pythia 8 is effectively a computation at leading-order in \(\alpha _{\text {S}} \) in this region.

Fig. 1
figure 1

The distribution of events passing the selection requirements in the electron channel (left) and muon channel (right) as a function of dilepton invariant mass \(m_{\ell \ell }\) (upper row) and lepton pseudorapidity \(\eta \) (lower row), the latter with one entry for each lepton per event. The MC signal sample is simulated using Powheg +Pythia 8. The statistical uncertainties of the data points are generally smaller than the size of the markers. The predictions of the MC signal sample together with the MC background samples are normalized to the integral of the data and the total experimental uncertainty of the predicted values is shown as a grey band in the ratio of the prediction to data

Fig. 2
figure 2

The distribution of events passing the selection requirements in the electron channel (left) and muon channel (right) as a function of dilepton transverse momentum (upper row) and \(\phi ^{*}_{\eta }\) (lower row). The MC signal sample is simulated using Powheg +Pythia 8. The statistical uncertainties of the data points are generally smaller than the size of the markers. The predictions are normalized to the integral of the data and the total experimental uncertainty of the predicted values is shown as a grey band in the ratio of the prediction to data

3.5 Correction for detector effects

The production cross-section times the branching ratio for decays into a single lepton flavour are measured in fiducial volumes as defined in Sect. 3.1. Integrated fiducial cross-sections in the electron and muon channels are computed following the equation

$$\begin{aligned} \sigma ^\mathrm {fid}_{Z/\gamma ^* \rightarrow \ell \ell } = \frac{N_\mathrm {Data}-N_\mathrm {Bkg}}{C_Z \cdot L}, \end{aligned}$$

where \(N_\mathrm {Data}\) is the number of observed signal candidates and \(N_\mathrm {Bkg}\) is the number of background events expected in the selected sample. The integrated luminosity of the sample is \(L = 36.1\) fb\(^{-1}\). A correction for the event detection efficiency is applied with the factor \(C_Z\), which is obtained from the simulation of signal events as the ratio of the sum of event weights after simulation, reconstruction and selection, to the sum of MC event weights for events satisfying the fiducial requirements. The factor \(C_Z\) is affected by experimental uncertainties, described in Sect. 4, while theory and modelling uncertainties are negligible.

The differential distributions within the fiducial volume are corrected for detector effects and bin-to-bin migrations using an iterative Bayesian unfolding method [65,66,67]. First, the data are corrected for events that pass the detector-level selection but not the particle-level selection. Then, the iterative Bayesian unfolding technique is used as a regularized way to correct for the detector resolution in events that pass both the detector-level and particle-level selections. The method is applied with four iterations implemented in the RooUnfold framework [67]. After the application of the response matrix, a final correction is applied to account for events that pass the particle-level but not detector-level selection, resulting in unfolded distributions on Born and dressed particle level. The response matrices, which connect the distributions at reconstruction and particle level, as well as the correction factors are derived using the Powheg +Pythia signal MC sample.

4 Statistical and systematic uncertainties

Uncertainties in the measurement are assessed for each aspect of the analysis, including the background subtraction, event detection efficiencies, response matrix, and unfolding method. The entire analysis procedure is repeated for each systematic uncertainty. Each source of uncertainty is varied to estimate the effect on the final result.

The effect on the measurement from the size of the data and MC samples is estimated by generating pseudo-experiment variations of the respective samples. The resulting statistical uncertainties are considered as uncorrelated between bins and between channels.

Uncertainties in the scale and resolution of the electron energy scale [63] and muon momentum scale [61] are among the dominant uncertainties in the \(p_{\text {T}} ^{\ell \ell }\) measurement. Furthermore, uncertainties related to lepton reconstruction and selection efficiencies are considered [39, 61, 62, 64], covering the lepton identification, reconstruction, isolation, triggering and track-to-vertex matching processes. The lepton related systematic uncertainties have only a small statistical component. There is an additional uncertainty in the muon channel to cover charge-dependent biases in the muon momentum measurement. The majority of these experimental uncertainties are considered correlated between bins of \(p_{\text {T}} ^{\ell \ell }\) and \(\phi ^{*}_{\eta }\) . An exception are the components of the reconstruction and identification efficiencies which have a significant statistical component due to the limited number of events in the data samples used to derive the efficiency corrections. Uncertainties related to electron or muon reconstruction and identification are always assumed to be uncorrelated with each other. They dominate the uncertainty in the fiducial cross-section measurement.

The uncertainties in the MC background estimates are obtained by independently varying the theory cross-sections used to normalize the corresponding samples and observing the effect on the measured \(p_{\text {T}} ^{\ell \ell }\) and \(\phi ^{*}_{\eta }\) cross-sections. These background uncertainties are considered correlated between bins of \(p_{\text {T}} ^{\ell \ell }\) and \(\phi ^{*}_{\eta }\) and between the electron and muon channels. As described in Sect. 3.4, the uncertainty in the multijet background in the electron channel is obtained by changing the input range of the template used to estimate the multijet background. For the muon channel, the measurement is performed again with a modified isolation variable used in the normalization procedure. The differences between the nominal and modified measurements are used as uncertainty. The estimated multijet backgrounds are assumed to be uncorrelated between the channels.

An uncertainty is derived to cover the mis-modelling of the simulated pile-up activity following the measurement of the cross-section of inelastic pp collisions [68]. Also, the uncertainty in modelling the distribution of the longitudinal position of the primary vertex is considered. These uncertainties are treated as correlated between the electron channel and muon channel.

The uncertainty from the unfolding method is determined by repeating the procedure with a different simulation where the nominal particle-level spectrum is reweighted so that the simulated detector-level spectrum is in good agreement with the data. The modified detector-level distribution is unfolded with the nominal response matrix and the difference between the result and the reweighted particle-level spectrum is taken as the bias of the unfolding method due to the choice of prior. The closure of the unfolding procedure is also tested using the generator-level distributions of the Sherpa MC sample described in Sect. 5.2, where consistent results within the assigned unfolding uncertainties are found.

The uncertainty from the choice of PDF used in the signal MC generator is evaluated by reweighting the signal MC simulation to the 52 error sets of the CT10 PDF set and computing the resulting variation of the results [44, 69]. The differences found in this way are negligible, similar to scale-choice uncertainties. The uncertainty in the combined 2015–2016 integrated luminosity is 2.1% [70], obtained using the LUCID-2 detector [71] for the primary luminosity measurements. This uncertainty only applies to the absolute cross-section measurements.

Table 2 Overview of the detector efficiency correction factors, \(C_Z\), for the electron and muon channels and their systematic uncertainty contributions

The experimental uncertainties of \(C_Z\) for the integrated fiducial cross-section measurements in the electron and muon channels are summarized in Table 2. The electron identification efficiency and muon reconstruction efficiency contribute a large fraction of the total systematic uncertainty for both the integrated and absolute differential measurements. These uncertainties are greatly reduced for the normalized measurement of differential distributions. A summary of the uncertainties in the normalized differential cross-section measurements is provided in Fig. 3 as a function of \(p_{\text {T}} ^{\ell \ell }\) and \(\phi ^{*}_{\eta }\) for both decay channels. The statistical uncertainties for the electron and muon channel measurements are a combination of the uncertainties due to limited data and MC sample sizes. The systematic uncertainties are divided into categories and originate from lepton scales and resolutions, reconstruction and identification efficiencies, as well as the MC signal modelling in the unfolding procedure and further smaller uncertainty sources such as the subtraction of background contributions. These smaller contributions are summarized as “other” uncertainties.

Fig. 3
figure 3

The systematic uncertainties for the electron channel measurement (left) and muon channel measurement (right) for the normalized \(p_{\text {T}} ^{\ell \ell }\) (upper row) and normalized \(\phi ^{*}_{\eta }\) (lower row). The statistical uncertainties are a combination of the uncertainties due to limited data and MC sample sizes. The \(p_{\text {T}} ^{\ell \ell } {}\) distribution is split into linear and logarithmic scales at 30 \(\text {Ge}\text {V}\). Some uncertainties are larger than 2% for \(p_{\text {T}} ^{\ell \ell } {}>200\) GeV and hence cannot be displayed. The corresponding uncertainties are also summarized in Table 4

5 Results and discussion

5.1 Combination

The fiducial cross-sections measured in the \(Z/\gamma ^* \rightarrow ee\) and \(Z/\gamma ^* \rightarrow \mu \mu \) channels are presented in Table 3 including statistical, systematic and luminosity uncertainties. When correcting for the more restrictive fiducial volume definition, these results are in good agreement with the previous ATLAS measurements at 13 \(\text {Te}\text {V}\) [72]. The electron- and muon-channel cross-sections are combined using \(\chi ^2\) minimization, following the best linear unbiased estimator prescription (Blue) [73,74,75]. The combination is performed on Born level, resulting in a combined cross-section of \(\sigma _\mathrm {fid} (pp\rightarrow Z/\gamma ^* \rightarrow \ell \ell ) =\) 736.2 ± 0.2(stat) ± 6.4 (sys) ± 14.7 (lumi) pb (Table 3)Footnote 2. There is a reduction of the uncertainty compared to individual electron- and muon-channel measurements since the dominant detector-related systematic uncertainty sources are largely uncorrelated. The uncertainties due to pile-up, physics modelling and luminosity are treated as correlated between the two decay channels.

Table 3 Measured integrated cross-section in the fiducial volume in the electron and muon decay channels at Born level and their combination as well as the theory prediction at NNLO in \(\alpha _{\text {S}} \) using the CT14 PDF set

The normalized differential cross-sections \(1/\sigma _\mathrm {fid} \times \mathrm {d}\sigma _\mathrm {fid}/\mathrm {d}p_{\text {T}} ^{\ell \ell } {}\) and \(1/\sigma _\mathrm {fid} \times \mathrm {d}\sigma _\mathrm {fid}/\mathrm {d}\phi ^{*}_{\eta } {}\) measured in the two decay channels as well as their combination are illustrated in Fig. 4. When building the \(\chi ^2\) for combination procedure, the measurement uncertainties are separated into those from bin-to-bin uncorrelated sources and those from bin-to-bin correlated sources, the latter largely reduced due to the normalization by the fiducial cross-section. The normalized differential measurements are combined at Born level following the Blue prescription. The resulting \(\chi ^2/N_\mathrm {dof} = 47/44\) for the combination for \(p_{\text {T}} ^{\ell \ell }\) and the \(\chi ^2/N_\mathrm {dof} = 32/36\) for \(\phi ^{*}_{\eta }\) indicate good agreement between the two channels.Footnote 3 The combined precision is between 0.1% and 0.5% for \(p_{\text {T}} ^{\ell \ell } {} < 100\) \(\text {Ge}\text {V}\), rising to 10% towards the high end of the spectrum, where the overall precision is limited by the data and MC sample size. The combined results for both distributions are presented in Table 4 including statistical and bin-to-bin uncorrelated and correlated systematic uncertainties. The measurement results are reported at Born level and factors \(k_\mathrm {dr}\), the binwise ratio of dressed and born level results, are given to transfer to the dressed particle level.

Table 4 The measured combined normalized differential cross-sections, divided by the bin-width, in the fiducial volume at Born level as well as a factor \(k_\mathrm {dr}\) to translate from the Born particle level to the dressed particle level
Fig. 4
figure 4

The measured normalized cross section as a function of \(p_{\text {T}} ^{\ell \ell } {}\) (left) and \(\phi ^{*}_{\eta }\) (right) for the electron and muon channels and the combined result as well as their ratio together with the total uncertainties, shown as a blue band. The pull distribution between the electron and muon channels, defined as the difference between the two channels divided by the combined uncorrelated uncertainty, is also shown. The \(p_{\text {T}} ^{\ell \ell } {}\) distribution is split into linear and logarithmic scales at 30 \(\text {Ge}\text {V}\)

5.2 Comparison with predictions

The integrated fiducial cross-section is compared with a fixed-order theory prediction that is computed in the same way as in Ref. [76]. The speed-optimized DYTurbo [77] version of the DYNNLO 1.5 [10] program with the CT14 NNLO set of PDFs [78] is used to obtain a prediction at next-to-next-to-leading order (NNLO) in \(\alpha _{\text {S}} \) in the \(G_\mu \) EW scheme [79]. The FEWZ 3.1 [9] program is used to compute next-to-leading-order (NLO) electroweak corrections and to cross-check the DYNNLO calculation. The prediction is shown in Table 3 together with its uncertainties estimated as follows. The dominant uncertainty is from limited knowledge of the proton PDFs and is estimated using the eigenvectors of the respective CT14 PDF set, rescaled from 90% to 68% confidence level. The uncertainties due to the strong coupling constant are estimated by varying \(\alpha _{\text {S}} \) by \(\pm 0.001\). Missing higher-order QCD corrections are estimated by variations of the renormalization (\(\mu _\textsc {r} \)) and factorization (\(\mu _\textsc {f} \)) scales by factors of two with an additional constraint of \(0.5\le \mu _\textsc {r}/\mu _\textsc {f} \le 2\) around the nominal value of \(m_{\ell \ell }\). The deviation from the FEWZ calculation is taken as an intrinsic uncertainty in the NNLO QCD calculation. A more detailed discussion of the agreement with theory predictions using different PDF sets is given in Ref. [72].

Fig. 5
figure 5

Comparison of the normalized \(p_{\text {T}} ^{\ell \ell }\) (left) and \(\phi ^{*}_{\eta }\) (right) distributions predicted by different computations: Pythia 8 with the AZ tune, Powheg +Pythia 8 with the AZNLO tune, Sherpa v2.2.1 and RadISH with the Born level combined measurement. The uncertainties of the measurement are shown as vertical bars and uncertainties of the Sherpa and RadISH predictions are indicated by the coloured bands

The differential measurements are compared with a variety of predictions of the \(p_{\text {T}} ^{\ell \ell }\) and \(\phi ^{*}_{\eta } \) spectra that are based on different theoretical approaches to take into account both the soft and hard emissions from the initial state (ISR). Unless stated otherwise, the predictions do not consider NLO EW effects. The comparisons between the combined result corrected to QED Born level and the various predictions are shown in Figs. 5 and 6. Systematic uncertainties in the theoretical predictions are evaluated for this comparison where feasible.

The first prediction is obtained from Pythia 8 with matrix elements at LO in \(\alpha _{\text {S}} \) supplemented with a parton shower with the AZ set of tuned parameters [22]. The AZ tune optimized the intrinsic \(k_\mathrm {T}\) and parton shower ISR parameters to optimally describe the ATLAS 7 \(\text {Te}\text {V}\) \(p_{\text {T}} ^{\ell \ell }\) and \(\phi ^{*}_{\eta }\) data [22, 80]. It was later used in the measurement of the W-boson mass using 7 \(\text {Te}\text {V}\) data [20], which requires a high-precision description of the W-boson transverse momentum spectrum at low \(p_{\text {T}}\).

The second prediction is based on Powheg +Pythia 8 using NLO matrix elements with the Pythia 8 parton shower parameters set according to the AZNLO tune [22] derived using the same data as the Pythia 8 AZ tune. The predictions using the AZ and AZNLO tunes describe the 13 \(\text {Te}\text {V}\) data to within 2–4% in the region of low \(p_{\text {T}} ^{\ell \ell } < 40\) \(\text {Ge}\text {V}\) and \(\phi ^{*}_{\eta } < 0.5\), and in this region the prediction using the Pythia 8 AZ tune is the one that agrees best with the data. This shows that predictions based on tunes to 7 \(\text {Te}\text {V}\) collision data can also provide a good description at significantly higher centre-of-mass energies for low \(p_{\text {T}} ^{\ell \ell }\) . At high \(p_{\text {T}} ^{\ell \ell }\) these predictions are well below the data due to missing higher-order matrix elements, similar to the situation observed at lower \(\sqrt{s}\).

The third prediction is simulated with the Sherpa  v2.2.1 [18] generator. In this set-up, NLO-accurate matrix elements for up to two partons, and LO-accurate matrix elements for up to four partons are calculated with the Comix [81] and OpenLoops [82, 83] libraries. The default Sherpa parton shower [84] based on Catani–Seymour dipole factorisation and the cluster hadronization model [85] is used with the dedicated set of tuned parameters developed by the authors for the NNPDF3.0nnlo PDF set [55]. The NLO matrix elements of a given parton multiplicity are matched to the parton shower using a colour-exact variant of the MC@NLO algorithm [86]. Different parton multiplicities are then merged into an inclusive sample using an improved CKKW matching procedure [87, 88] which is extended to NLO accuracy using the MEPS@NLO prescription [89]. The merging threshold is set to 20 \(\text {Ge}\text {V}\). Uncertainties from missing higher orders are evaluated [90] using seven variations of the QCD factorization and renormalization scales in the matrix elements by factors of 0.5 and 2, avoiding variations in opposite directions. For the computation of uncertainties in the normalized spectra the effect of a certain variation is fully correlated across the full spectrum and an envelope of all variations is taken at the end, which results in uncertainties of 3–4% at low \(p_{\text {T}} ^{\ell \ell }\) and up to 25% at high \(p_{\text {T}} ^{\ell \ell }\) . The effects of uncertainties in the PDF set are evaluated using 100 replica variations and are found to be very small, typically \(<1\%\) up to \(p_{\text {T}} ^{\ell \ell } < 100\) \(\text {Ge}\text {V}\) and a few percent above. Sherpa does describe the data in the high \(p_{\text {T}} ^{\ell \ell } > 30\) \(\text {Ge}\text {V}\) and \(\phi ^{*}_{\eta } > 0.1\) region to within about 4% up to the point where statistical uncertainties in the data exceed that level, which is better than the uncertainty estimate obtained from scale variations. On the other hand, the Sherpa prediction disagrees with the shape of the data at low \(p_{\text {T}} ^{\ell \ell } < 25\) \(\text {Ge}\text {V}\) and somewhat less with the \(\phi ^{*}_{\eta }\) distribution. The data may be useful in improving the parton shower settings in this regime.

Finally a prediction based on the RadISH program [91, 92] is presented that combines a fixed-order NNLO prediction of Z+jet production (\(O(\alpha _{\text {S}} ^3)\)) from NNLOjet [93] with resummation of \(\log (m_{\ell \ell }/p_{\text {T}} ^{\ell \ell })\) terms at next-to-next-to-next-to-leading-logarithm (N\(^3\)LL) accuracy [7]. The NNPDF3.1nnlo set of PDFs [94] is used with QCD scales set to \(\mu _\textsc {r} =\mu _\textsc {f} =\sqrt{(m_{\ell \ell })^2+(p_{\text {T}} ^{\ell \ell })^2}\) and the resummation scale set to \(Q = m_{\ell \ell }/2\). Uncertainties in this prediction are derived from variations of \(\mu _\textsc {r} \) and \(\mu _\textsc {f} \) in the same way as for the Sherpa prediction described above and, in addition, two variations of Q by a factor of two up and down, assuming that the effects of scale variations are fully correlated across the full spectrum. Within the uncertainties of typically 1–3% the RadISH prediction agrees with the data over the full spectrum of \(p_{\text {T}} ^{\ell \ell }\) and \(\phi ^{*}_{\eta }\) , apart from a small tension in the very low \(p_{\text {T}} ^{\ell \ell }\) and \(\phi ^{*}_{\eta }\) region where non-perturbative effects are relevant, highlighting the benefits of this state-of-the-art prediction.

Fig. 6
figure 6

Comparison of the normalized \(p_{\text {T}} ^{\ell \ell }\) distribution in the range \(p_{\text {T}} ^{\ell \ell } > 10\) \(\text {Ge}\text {V}\). The Born level combined measurement is compared with predictions by Sherpa v2.2.1, fixed-order NNLOjet and NNLOjet supplied with NLO electroweak corrections. The uncertainties in the measurement are shown as vertical bars and the uncertainties in the predictions are indicated by the coloured bands

Figure 6 compares the \(p_{\text {T}} ^{\ell \ell }\) measurement with predictions in the range of \(p_{\text {T}} ^{\ell \ell } > 10\) \(\text {Ge}\text {V}\). In addition to the Sherpa prediction described above, the data are compared with the fixed-order NNLOjet prediction described above both with and without NLO EW corrections [13]. The NNLOjet prediction is only expected to describe the data at sufficiently large \(p_{\text {T}} ^{\ell \ell } {} \gtrsim 15\,\text {Ge}\text {V}\), while deviations for smaller values are expected due to large logarithms \(\ln (m_{\ell \ell }/p_{\text {T}} ^{\ell \ell })\) [93]. At the highest \(p_{\text {T}} ^{\ell \ell }\) values probed, the application of NLO EW leads to a suppression of up to \(20\%\) due to large Sudakov logarithms. The theoretical uncertainties on these corrections are not shown, but have been elsewhere estimated to be up to 5% for \(p_{\text {T}} ^{\ell \ell } {} \approx 1\,\text {Te}\text {V}\) [15]. In this region, NNLOjet without NLO EW corrections is generally above the data, and when including these corrections it tends to be lower than the data. However, the difference is not significantly larger than the uncertainties in the measurements.

6 Conclusion

Measurements of the \(Z/\gamma ^* \rightarrow ee\) and \(Z/\gamma ^* \rightarrow \mu \mu \) cross-sections, differential in the transverse momentum and \(\phi ^{*}_{\eta }\) , have been performed in a fiducial volume defined by \(p_{\text {T}} ^\ell > 27\) \(\text {Ge}\text {V}\), \(|\eta _\ell | < 2.5\) and \(66< m_{\ell \ell } < 116~\text {Ge}\text {V}\), using 36.1 fb\(^{-1}\)of data from proton–proton collisions recorded in 2015 and 2016 at a centre-of-mass energy of 13 \(\text {Te}\text {V}\) with the ATLAS experiment at the LHC. This data-set allows coverage of a kinematic range up to the \(\text {Te}\text {V}\)-range. The cross-section results from the individual channels were combined and good agreement between the two was observed. The relative precision of the combined result is better than 0.2% for \(p_{\text {T}} ^{\ell \ell } {}<30\,\text {Ge}\text {V}\), which provides crucial information to validate and tune MC event generators and will constrain models of vector-boson production in future measurements of the W-boson mass.

The integrated fiducial cross-section measurements are compared with fixed-order perturbative QCD predictions. Differential spectra in \(p_{\text {T}} ^{\ell \ell }\) and \(\phi ^{*}_{\eta }\) are compared with a selection of calculations implementing resummation and non-perturbative effects through parton showers or analytic calculations. The predictions based on the Pythia 8 parton shower with parameters tuned to 7 \(\text {Te}\text {V}\) data are found to describe the 13 \(\text {Te}\text {V}\) data well at low \(p_{\text {T}} ^{\ell \ell }\) and \(\phi ^{*}_{\eta }\) . The Sherpa prediction based on merging of higher-order, high-multiplicity matrix elements gives an excellent description of the data at high \(p_{\text {T}} ^{\ell \ell } \), while the very accurate RadISH NNLO+N\(^3\)LL prediction agrees with data for the full spectrum. The fixed-order NNLOjet prediction with and without NLO EW effects describes the data well for high \(p_{\text {T}} ^{\ell \ell } \).