Modeling the Uncertainties of Solar System Ephemerides for Robust Gravitational-wave Searches with Pulsar-timing Arrays

, , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , and

Published 2020 April 21 © 2020. The American Astronomical Society. All rights reserved.
, , Citation M. Vallisneri et al 2020 ApJ 893 112 DOI 10.3847/1538-4357/ab7b67

Download Article PDF
DownloadArticle ePub

You need an eReader or compatible software to experience the benefits of the ePub3 file format.

0004-637X/893/2/112

Abstract

The regularity of pulsar emissions becomes apparent once we reference the pulses' times of arrivals to the inertial rest frame of the solar system. It follows that errors in the determination of Earth's position with respect to the solar system barycenter can appear as a time-correlated bias in pulsar-timing residual time series, affecting the searches for low-frequency gravitational waves performed with pulsar-timing arrays. Indeed, recent array data sets yield different gravitational-wave background upper limits and detection statistics when analyzed with different solar system ephemerides. Crucially, the ephemerides do not generally provide usable error representations. In this article, we describe the motivation, construction, and application of a physical model of solar system ephemeris uncertainties, which focuses on the degrees of freedom (Jupiter's orbital elements) most relevant to gravitational-wave searches with pulsar-timing arrays. This model, BayesEphem, was used to derive ephemeris-robust results in NANOGrav's 11 yr stochastic-background search, and it provides a foundation for future searches by NANOGrav and other consortia. The analysis and simulations reported here suggest that ephemeris modeling reduces the gravitational-wave sensitivity of the 11 yr data set and that this degeneracy will vanish with improved ephemerides and with pulsar-timing data sets that extend well beyond a single Jovian orbital period.

Export citation and abstract BibTeX RIS

1. Introduction

Pulsar timing exploits the remarkable regularity of millisecond-pulsar emissions to extract accurate system parameters from time-of-arrival (TOA) data sets (Lorimer & Kramer 2012), by fitting precise timing models that account for all pulse delays and advances, from generation near the neutron stars to detection at the radio telescopes (Lommen & Demorest 2013). The largest time-dependent term in the model is the Rømer delay (Rømer 1676) due to the motion of Earth around the solar system barycenter (SSB), with magnitude ∼500 s. Solar system ephemerides (SSEs), such as those produced by the Jet Propulsion Laboratory (JPL; see Folkner et al. 2009, 2014, 2016; Folkner & Park 2016, 2018), are used to convert observatory TOAs to the notional coordinate time of an inertial frame centered at the SSB. It follows that errors in our estimate of Earth's trajectory around the SSB produce a time-dependent bias in the TOAs.

Throughout many years of pulsar-timing studies and discoveries, SSE errors were always considered negligible compared to all other sources of noise and uncertainty (Foster & Backer 1990a; Edwards et al. 2006a). TOAs are now being collected with ever greater time spans and timing precisions, especially so by the pulsar-timing array (PTA) collaborations seeking to detect gravitational waves (GWs) as correlated residuals (i.e., TOAs minus timing model) in multipulsar data sets (Sazhin 1978; Detweiler 1979; Foster & Backer 1990b; Hobbs 2013; McLaughlin 2013; Desvignes et al. 2016; Verbiest et al. 2016). The estimated magnitude of the GW signature is ∼100 ns or less, leading to the recent suggestion that SSE errors could measurably bias GW results obtained from these data sets (Tiburzi et al. 2016). Indeed, while SSE corrections and GWs have different spatial-correlation signatures across multiple pulsars, the two may yet be degenerate when sampled with a limited number of PTA pulsars (Roebber 2019).

Within the North American Nanohertz Observatory for Gravitational Waves (NANOGrav), we encountered this bias first hand in early 2017, when we observed that switching between the more recent SSEs issued by JPL (specifically DE421, DE430, DE435, and DE436: Folkner et al. 2009, 2014, 2016; Folkner & Park 2016) led to discrepant results (GW upper limits and detection statistics) in the search for a stochastic signal from the population of supermassive black-hole binaries, formed in the centers of galaxies following major-merger events, as performed on the NANOGrav 11 yr data set (Arzoumanian et al. 2018a, 2018b). See Figure 1 for the Bayesian posterior probabilities of the GW-background (GWB) amplitude, as obtained with a range of SSEs. We verified that other recent PTA results (Shannon et al. 2015; Arzoumanian et al. 2016) would be similarly affected.

Figure 1.

Figure 1. (Left) Bayesian posteriors for the amplitude AGWB (at f = yr−1) of the GW stochastic background, modeled as a fixed power law with slope γ = 11/3, as appropriate for a population of inspiraling and GW-emitting supermassive black-hole binaries. The posteriors are computed for the NANOGrav 11 yr data set using individual JPL ephemerides (dotted lines) and BayesEphem (solid lines). The GW model ("model 2A" in Arzoumanian et al. 2018a) does not include Hellings–Downs correlations, which however modify results only marginally. The incorporation of explicit SSE uncertainties into the analysis via BayesEphem leads to substantially lower evidence of a GW background and to much greater consistency among the different SSEs. (Right) Same posteriors as in the left panel, shown with a logarithmic vertical scale that can be mapped to approximate GW vs. noise-only Bayes factors, by way of the Savage–Dickey formula $P({A}_{\mathrm{GWB}}=0)/P({A}_{\mathrm{GW}}\ne 0)=p({A}_{\mathrm{GWB}}=0| \mathrm{data})/{p}_{\mathrm{prior}}({A}_{\mathrm{GWB}}=0)$ (Dickey 1971). The application of BayesEphem brings all Bayes factors close to unity, indicating no evidence for GWs once SSE uncertainties are taken into consideration. Bayes factors and 95% AGWB upper limits calculated from these curves are listed in Tables 1 and 2.

Standard image High-resolution image

In this article, we report on BayesEphem, the physical model of SSE uncertainties that we developed and integrated with the NANOGrav GW analysis so that we could produce SSE-robust results (Arzoumanian et al. 2018a) by marginalizing our Bayesian statistics over the SSE model parameters. Our model complements published SSEs, which do not generally include usable time-domain representations of orbit uncertainties. We adopted the conservative goal of bridging the JPL SSEs so that our analysis would yield the same GW-amplitude posteriors, no matter which SSE was used to offset the TOAs initially. As discussed in Section 4, the crucial element in our approach turns out to be the modeling of uncertainties in Jupiter's orbit, which create apparent SSB motions with periods comparable to the duration of our data set, and with amplitudes of ∼50 m. These correspond to ∼170 ns delays, within the GW sensitivity of our PTA. By marginalizing GW posteriors over a set of SSE correction parameters that include Jupiter's orbital elements, we achieve our bridging criterion for the SSEs adopted in Arzoumanian et al. (2018a) and indeed also for the newer DE438 (Folkner & Park 2018); see Figure 1. By contrast, while we include SSB corrections due to perturbations in the masses of the outer planets (Champion et al. 2010), we conclude that the current sensitivity of our data set is insufficient to constrain these masses better than recent spacecraft tracking and Doppler data sets (Jacobson et al. 2000, 2006; Jacobson 2009, 2014; Caballero et al. 2018).

In our 11 yr analysis (Arzoumanian et al. 2018a), we declined to adopt an aggressive stance that would have given more credence to the more recent available SSEs (DE435 and DE436), which are based on longer sets of solar system observations that fully cover the span of our PTA data set, and on more sophisticated analysis techniques. The resulting GW-amplitude posteriors imply less evidence for GWs than those obtained with earlier SSEs (see Table 1) and intermediate estimates for 95% amplitude upper limits (see Table 2). Nevertheless, even if DE435 and DE436 are described as having only minor differences (Folkner et al. 2016; Folkner & Park 2016), the resulting posteriors are still at variance—especially so in the low GW-amplitude limit, which affects the Savage–Dickey Bayes ratio used as our GW detection statistic. Thus, uncertainty modeling remains important even if we concentrate on newer SSEs.

Table 1.  Bayesian Evidence for a GWB with Different SSEs

SSE No BayesEphem Set III Keplerian
DE421 10.6 0.68 0.68
DE430 23.7 0.72 0.71
DE435 2.0 0.76 0.72
DE436 6.2 0.80 0.72
DE438 40.7 0.91 0.87

Note. Savage–Dickey Bayes factors in favor of a common γ = 13/3 red-noise process ("model 2A" of a GWB in Arzoumanian et al. 2018a) in NANOGrav's 11 yr data set, obtained by fixing the fiducial SSE shown in column 1 and omitting (column 2) or applying BayesEphem in its Set III (column 3) or Keplerian-element (column 4) formulations. The numbers shown here were drawn from newly reproduced Monte Carlo runs and differ from those of Arzoumanian et al. (2018a) by small sampling errors.

Download table as:  ASCIITypeset image

Table 2.  95% Bayesian Upper Limits for AGWB

SSE No BayesEphem Set III Keplerian
DE421 1.54 × 10−15 1.32 × 10−15 1.35 × 10−15
DE430 1.76 × 10−15 1.31 × 10−15 1.33 × 10−15
DE435 1.59 × 10−15 1.38 × 10−15 1.40 × 10−15
DE436 1.64 × 10−15 1.38 × 10−15 1.41 × 10−15
DE438 1.94 × 10−15 1.45 × 10−15 1.44 × 10−15

Note. 95% Bayesian upper limits on the amplitude of a γ = 13/3 GWB in NANOGrav's 11 yr data set, listed as in Table 1. The numbers shown here were drawn from newly reproduced Monte Carlo runs and differ from those of Arzoumanian et al. (2018a) by sampling errors ∼0.02.

Download table as:  ASCIITypeset image

This article is organized as follows. In Section 2, we summarize the SSE production, and we describe the history and stated accuracy of the JPL SSEs adopted in our work; in Section 3, we discuss the TOA delays induced by SSE errors, which are partially absorbed by the timing model, and we identify Jupiter's and Saturn's orbits as the drivers behind GW-posterior discrepancies; in Section 4, we formulate BayesEphem and give details about its implementation in our PTA data-analysis software, Enterprise (Ellis et al. 2019); in Section 5, we report on the Bayesian posteriors (for the GW amplitude and for orbital-correction parameters) obtained with BayesEphem for NANOGrav's 11 yr data set, reproducing and expanding the results of Arzoumanian et al. (2018a); in Section 6, we present simulations that probe the reduction in GW sensitivity due to BayesEphem; in Section 7, we discuss other approaches toward SSE uncertainty modeling; and last, in Section 8 we offer our brief conclusions.

All computational results presented in this article were obtained with Enterprise (Ellis et al. 2019), used in conjunction with the stochastic sampler PTMCMCSampler40 and the pulsar-timing package Tempo2 (Hobbs & Edwards 2012). Code that supports all the calculations discussed here and that produces all the figures of this paper is available as a Jupyter notebook at https://github.com/nanograv/11yr_stochastic_analysis.

2. Solar System Ephemerides

To derive the JPL SSEs41 (and similarly for the French INPOP42 and Russian EPM43 ), the orbits of the Sun, the planets, and a large number of asteroids are fit to heterogeneous data sets collected over the last few decades. Measurement techniques include spacecraft ranging and Doppler tracking, direct planetary radar ranging, very long baseline interferometry of spacecraft, and (for the Moon) the laser ranging of retroreflectors left by the Apollo missions (see Verma 2013 for a review). The masses of minor bodies are also included as fit parameters, while the masses of the planets are held fixed to values determined separately from spacecraft data for each planet (Jacobson et al. 2000, 2006; Jacobson 2014, 2009).

The parameters of the fit are the initial conditions ("epoch" positions and velocities) for all the bodies (as well as minor-body masses); from these, orbits are integrated numerically, providing a reference solution used to compute measurement residuals. The integration is repeated with minor displacements in all fit parameters, yielding variational partials for the orbits. The fit parameters are then corrected by finding the linear combination of the partials that minimizes the residuals in least-squares fashion, and the scheme is repeated until the solution converges (see, e.g., Newhall et al. 1983).

The most complex aspect of the process is the modeling of observations for data sets that are both varied in technique and unique to each planet and each new spacecraft (Moyer 2003)—a Sisyphean task. Because it is difficult to assign realistic uncertainties to many of the measurements (and, perhaps more importantly, to estimate their systematic errors), the formal errors and covariances estimated with the least-squares procedure are considered unreliable and are not published with the best-fit orbits. Instead, orbit accuracy is assessed by analyzing model residuals and by comparing estimates that use different subsets of the data (Folkner & Park 2015, 2018).

In this paper, we work with the four JPL SSEs used to analyze the NANOGrav 11 yr data set (Arzoumanian et al. 2018a) as well as the more recent DE438. While we have also investigated some of the SSEs provided by INPOP (Fienga et al. 2014; Viswanathan et al. 2018), we find that they are not qualitatively different in terms of their GW constraints.

  • DE421 (Folkner et al. 2009): published in 2009, based on data through 2007. The orbits of the inner planets are known to subkilometer accuracy—those of Jupiter and Saturn to tens of kilometers; Uranus and Neptune are not well determined. The axes of the ephemeris are oriented with the International Celestial Reference Frame (Ma et al. 1998) with accuracy ≲1 mas.
  • DE430 (Folkner et al. 2014): published in 2014, based on data through 2013. Orbit integration relies on a more sophisticated dynamical model. The Saturn orbit is more accurate thanks to the improved treatment of range measurements to the Cassini spacecraft (Hees et al. 2014). The axes of the ephemeris are oriented with the 2009 update of the International Celestial Reference Frame, ICRF2 (Fey et al. 2015) with accuracy ≲0.2 mas, which represents the limiting error source for the inner planets, corresponding to orbit uncertainties of a few hundred meters. The Jupiter and Saturn orbits are determined to tens of kilometers, those of Uranus and Neptune (which are constrained mainly by astrometric measurements) to thousands of kilometers.
  • DE435 (Folkner et al. 2016): published in 2016; it improves the Saturn orbit using Cassini data through 2015, correcting it by ∼1.5 km. The Jupiter orbit had been updated (by ∼50 km) in the 2015 DE434 ephemeris (Park et al. 2015) by reprocessing data from six spacecraft flybys and adding data from the New Horizons flyby. In DE435, Jupiter's orbit is tweaked further (by ∼20 km) by reweighting the data sets. These changes are deemed "consistent" with the estimated orbit uncertainties (Park et al. 2015; Folkner et al. 2016).
  • DE436 (Folkner & Park 2016): published at the end of 2016; it updates the Jupiter ephemeris (by ∼20 km) for use by the Juno navigation team. Saturn's orbit changes very slightly.
  • DE438 (Folkner & Park 2018): published in 2018 June; it updates the Jupiter ephemeris (by ∼10 km) with Juno measurements (six ranges and four VLBI observations near perijove), and the Saturn ephemeris (by ∼1 km) with reprocessed ranges through the end of the Cassini mission. The accuracy of Jupiter's orbits is deemed "at least a factor of four better than previous ephemerides," viz. ≲10 km.

Across these ephemerides, the orbit of Earth relative to the Sun is consistent at the 3 m (∼10 ns) level, after applying an overall rotation with respect to the International Celestial Reference Frame (within the uncertainties of that "tie") and a rotation rate about the ecliptic pole.44 However, the orbit of the Sun and therefore the orbit of Earth (both relative to the SSB) match only at the 100 m (∼300 ns) level across ephemerides (see Figure 2). This discrepancy arises from the estimated positions of Jupiter, Saturn, Uranus, and Neptune. Using a simple dynamical model of the solar system (i.e., integrating the equations of Newtonian gravity for the eight planets and the Sun), it is easy to show that if we perturb the masses or the orbits of the outer planets, we affect the Sun-to-SSB and Earth-to-SSB trajectories through the resulting redefinition of the SSB, rather than through the very minor changes in the gravitational pull of the outer planets (see also Guo et al. 2019).

Figure 2.

Figure 2. Differences in Earth–Sun position, r(3) − rSun, and in Sun–SSB position, rSun − rSSB, between pairs of JPL SSEs. Note the difference in the vertical scale of the two panels. Earth–SSB plots, not included here, would appear essentially the same as Sun–SSB plots, as the Earth–Sun differences are very small. For reference, 2.5 m ≈ 8 ns, and 0.1 km ≈ 300 ns, bracketing the range of amplitudes expected for GW signals in PTA data sets. We plot SSE equatorial-coordinate differences over the approximate span of NANOGrav's 11 yr data set, and we apply the best-fit frame rotation that minimizes the 3D norm of the difference. Dotted curves are zoomed ×10 vertically. Dotted vertical lines mark the end of the DE421 and DE430 fitted data sets. For clarity, we also remove a ∼100 m mean in each coordinate in the Sun–SSB differences; such a constant offset does not affect PTA likelihoods.

Standard image High-resolution image

The last few JPL ephemerides focus on Jupiter and Saturn, as motivated by the navigation needs of JPL missions to those planets. In DE421 to DE436, Saturn's orbit is known better than Jupiter's, because the Cassini tracking data is more complete and accurate than was possible for previous spacecraft. In particular, the high-gain antenna of the Jupiter orbiter Galileo failed to deploy, leading to low-accuracy measurements. Data from the Juno spacecraft, which has been orbiting Jupiter since 2016 July, appears to improve the Jupiter ephemeris substantially (Folkner et al. 2009).

3. SSE Errors as Systematics for Pulsar-timing Arrays

The search for stochastic GW signals with PTAs exploits the distinctive quadrupolar signature in the interpulsar correlations of timing-model residuals (Hellings & Downs 1983). Residuals are obtained after applying a chain of corrections that convert the TOA measured at the radio telescope to the notional emission time in the pulsar system (see, e.g., Edwards et al. 2006b):

Equation (1)

where ${t}_{{\rm{e}}}^{\mathrm{psr}}$ and ${t}_{{\rm{a}}}^{\mathrm{obs}}$ are the emission and arrival times, Δ captures corrections between the observatory and SSB frames, ΔIS describes corrections between the SSB frame and the pulsar-system barycenter, and ΔB the model corrections from the pulsar system barycenter to the pulsar frame (which are relevant for binary systems). Among these terms, ΔIS is very large, but changes very slowly over the duration of pulsar data sets and thus maps to a constant phase offset. Next comes the Rømer delay (Rømer 1676), corresponding to the light-travel time (∼500 s) between the observatory at ${{\boldsymbol{r}}}^{\mathrm{obs}}$ and the SSB at rSSB:

Equation (2)

where $\hat{{\boldsymbol{p}}}$ is the unit vector in the direction of the pulsar. In fact, $\hat{{\boldsymbol{p}}}$ is determined from pulsar-timing data mainly through the time dependence of Equation (2).

The errors δrobs and δrSSB induce systematic TOA delays according to Equation (2). Furthermore, the position of the observatory with respect to Earth's barycenter is known to few-centimeter (sub-nanosecond) accuracy (Edwards et al. 2006b), so we may write the systematic Rømer-delay error as

Equation (3)

where x(3) ≡ r(3) − rSSB is the terrestrial barycenter's position in the SSE frame where the SSB is identified with the origin, and δx(3)(t) is the systematic error (a function of time) in the SSE's estimate of x(3)(t). Here and below, we index the solar system planets from Mercury to Neptune from (1) through (8), so Earth is (3). We will also continue to use x to refer to vectors in the SSE frame, with origin at the SSB.

The time dependence of the SSE errors is essential to the effect of the ensuing residuals on GW searches in PTA data. Constant offsets, as well as linear and quadratic trends, are absorbed by redefinitions of the timing-model parameters that describe the intrinsic spin evolution of the pulsar; in other words, the TOA likelihood is affected only by δtR⊙ minus the best-fitting quadratic polynomial.45 Other free parameters in the timing model yield similar subtractions: most important, an angular misalignment between the SSE frame and the "true" sidereal frame can be absorbed by a slight coherent displacement in the position of all pulsars; the corresponding TOA corrections are sinusoidal with periods of one sidereal year.

PTAs are most sensitive to GWs with periods of a few to several years, on the order of the total time span of observations;46 therefore, SSE errors in the orbits of the giant planets, which have comparable periods, could be mistaken for GWs. Correcting the orbit of a planet by the time-dependent vector δr(t) corresponds to offsetting the SSB position by (mplanet/ mSS× δr(t), where mSS is the total mass in the solar system. We then estimate

Equation (4)

where the quantities in nanoseconds are light-travel times equivalent to the distances. While the uncertainties due to Uranus and Neptune are larger, their orbital periods (P(7) = 84 yr and P(8) = 165 yr) ensure that the corresponding δxSSB appear as linear or mildly quadratic TOA trends, which are absorbed by timing models (and will continue to be, until PTA data sets approach a century in duration). By contrast, Jupiter and Saturn corrections enter the residuals with timescales (P(5) = 12 yr and P(6) = 29 yr) comparable to the span of our data set—just where PTAs are most sensitive to GWs.

Likewise, the absolute location of the SSB is degenerate with the initial phase of the pulses for each pulsar, so we need not worry that the former depends strongly on the set of bodies that are included in each SSE fit. For instance, including trans-Neptunian objects relocates the SSB by ∼100 km (Folkner et al. 2014).

Champion et al. (2010) discuss pulsar timing's potential to constrain the masses of outer planets. The uncertainties in current IAU best estimates (derived from spacecraft tracking and Doppler studies: Jacobson et al. 2000, 2006; Jacobson 2014, 2009; IAU Division I Working Group on Numerical Standards for Fundamental Astronomy 2017) give rise to Rømer corrections comparable to the orbit errors. These scale as (δmplanet/mSS× r, so

Equation (5)

These corrections yield delays with the same periods as the corresponding planets, so again they may be observable in PTA data sets for Jupiter and Saturn, but not for Uranus and Neptune.

In Figure 3, we show the difference between Rømer delays for 100 simulated pulsars randomly distributed across the sky, as computed with DE421/DE430/DE435/DE438 and with DE436, which is taken as a reference. Thus, we are plotting the systematic error that we would introduce using the other SSEs if DE436 gave exact orbits. We represent timing-model fits by subtracting the best-fitting quadratics and 1 yr-period sinusoids. The plot covers the span of the NANOGrav 11 yr data set. The differences reach ∼100 ns with typical periods of ∼10–12 yr (consistent with a Jovian attribution)—within the sensitivity range and band of GW searches. Indeed, as we see in Figure 1, the GW-amplitude posteriors for the NANOGrav 11 yr stochastic-background characterization are affected significantly by the choice of SSE.

Figure 3.

Figure 3. Differences in Rømer delays computed with DE421/DE430/DE435/DE438 and taking DE436 as reference for 100 simulated pulsars randomly distributed across the sky, plotted after subtracting the best-fitting quadratic and yearly sinusoid. The darker curves with the various dashed styles allow a comparison of the same pulsars across the three differences. The plot covers the span of the NANOGrav 11 yr data set. For any given pulsar, these differences are reduced to less than 10 ns after subtracting the best-fitting linear combination of the BayesEphem partials, dotted into the pulsar position to obtain Rømer delays.

Standard image High-resolution image

4. BayesEphem: A Physical Model of Solar System Ephemeris Uncertainties for Gravitational-wave Searches in PTA Data

We set out to address the sensitivity of the NANOGrav stochastic-background analysis to SSE systematics by developing a parameterized physical model of SSE uncertainties (BayesEphem), so that we can compute robust GW-parameter posteriors by marginalizing over the SSE parameters. Motivated by the discussion of Sections 2 and 3, we include the following components:

  • 1.  
    TOA delays generated by corrections to the masses of Jupiter, Saturn, Uranus, and Neptune, modeled as Champion et al. (2010):
    Equation (6)
    We impose normal Bayesian priors on the δm(p), with standard deviation equal to the IAU-adopted mass-estimate uncertainties (IAU Division I Working Group on Numerical Standards for Fundamental Astronomy 2017). For each pulsar data set, we compute the x(p) by evaluating the DE436 SSE at the measured TOAs {ti}.
  • 2.  
    TOA delays generated by a rotation rate about the ecliptic pole (as needed to absorb Sun-to-Earth orbit differences among SSEs),
    Equation (7)
    where ${{\mathsf{R}}}^{\hat{z}}(\cdot )$ is the appropriately oriented rotation matrix and $\delta \theta ={\omega }^{\hat{z}}(t-{t}_{0})$ is the rotation angle, with ${\omega }^{\hat{z}}$ the rate and t0 set at the beginning of the NANOGrav data set. We impose uniform Bayesian priors on ${\omega }^{\hat{z}}$ that are commensurate with the rotation rates needed to reduce the difference between the JPL SSEs. Given that these rates are very small, we linearize Equation (7) with respect to ${\omega }^{\hat{z}};$ as for δtR⊙(δm(p)), we compute x(3) by evaluating the DE436 at the measured TOAs {ti} for each pulsar data set. Note that we do not expect this term to affect GW posteriors: both static and uniform rotations of the SSE frame are absorbed in the estimated positions and proper motions of the pulsars. We omit the former altogether and include ${\omega }^{\hat{z}}$ as a check.
  • 3.  
    TOA delays generated by perturbing Jupiter's average orbital elements, as given by
    Equation (8)
    where the six aμ are the six J2000 Keplerian elements (semimajor axis, eccentricity, inclination, mean longitude, longitude of the perihelion, longitude of the ascending node) and the δaμ are their perturbations. Alternatively, we can formulate the perturbations in terms of Brouwer and Clemence's "Set III" parameters (Brouwer & Clemence 1961), for which we have access to uncertainty estimates for certain JPL SSEs (Park et al. 2015; Folkner & Park 2018; see Figure 5). The two formulations are largely equivalent with respect to their effect on GW searches. To implement the Rømer-delay perturbations, we begin with approximate values for the aμ and their rates of change.47 We vary these aμ to minimize the (rms) difference between our quasi-Keplerian orbits and DE436, integrated between years 2000 and 2020. The resulting orbits are within 1% of DE436, which ensures similar accuracy for the orbit partials, more than enough for our purposes. We compute the six partials as finite differences; because they are strongly correlated (see the top panel of Figure 4) and have different scales in their natural units, we decorrelate and normalize them by computing the singular value decomposition ${\sum }_{\nu }{U}_{\mu \nu }{S}_{\nu }{V}_{\nu {ik}}$ of the matrix ${P}_{\mu {ik}}=\partial {x}_{k}^{(5)}({t}_{i})/\partial {a}^{\mu }$, and adopting $\partial {x}_{k}^{(5)}({t}_{i})/\partial {b}^{\nu }\equiv {V}_{\nu {ik}}$ as new orbit partials (see bottom panel of Figure 4), where $\delta {b}^{\nu }={\sum }_{\mu }\delta {a}^{\mu }{M}_{\mu \nu }$ with Mμν = UμνSν (no summation intended). The resulting units are mixed. We give the orthonormalized coefficients δbμ uniform priors that are broad enough to generate the range of Rømer variation seen across the JPL SSEs and to contain the support of the PTA likelihood for the NANOGrav 11 yr data set. In other words, we make the priors broad enough that the posteriors do not impinge on the boundaries. We do not include Uranus' and Neptune's orbital perturbations, which currently lead to δtR⊙ with linear time dependencies that are absorbed entirely by timing-model parameters. By contrast, we repeat the procedure that we just outlined for Saturn, but find that GW posteriors are barely affected for orbital perturbations of magnitude comparable to the differences among the JPL SSEs. Thus, in our use of BayesEphem, we usually omit Saturn orbit perturbations. These may prove more important as data sets increase in length.

Figure 4.

Figure 4. Top: perturbative partials (x, y, and z coordinates, geometrized units of seconds) for Jupiter's orbit between the years 2000 and 2020, as obtained by varying the six Keplerian elements in a quasi-Keplerian model that includes their rates. The baseline elements and rates are adjusted to fit DE436 orbits. Here, a is the semimajor axis of the orbit, e the eccentricity, ${\imath }$ the inclination, the mean longitude, ϖ the longitude of the periapsis, and Ω the longitude of the ascending node. Bottom: singular value decomposition vector time series obtained from the partials in the top panel.

Standard image High-resolution image

The components outlined above are brought together into a linear delay model δtR⊙(ca) written as the product of the 11 dimensional correction vector ${c}^{a}\equiv \{\delta {m}^{(p)},{\omega }^{\hat{z}},\delta {b}^{\nu }\}$ (with (p) = 5, 6, 7, 8 and ν = 1, ..., 6) by the (nTOAs × 11) dimensional design matrix Gia with columns defined by the equations in this section. This model, including variational partials for Keplerian and Set III parameterizations, is available as part of the open-source software package Enterprise (Ellis et al. 2019).

It is also possible to treat the linear model as a Gaussian process common to all pulsars and to marginalize analytically over the ca, by building the effective correlation matrix GiaΦabGjb, where Φab is the prior covariance of the ca (van Haasteren & Vallisneri 2014). The basis vector for each ca corresponds to the concatenation of the Rømer-delay perturbations that the parameter generates for each pulsar (i.e., the projection of the same vector time series onto different p, at the appropriate TOAs).

5. Effect of BayesEphem on the Search for Stochastic Gravitational Waves in NANOGrav's 11 yr Data Set

To derive the SSE-robust results reported in Arzoumanian et al. (2018a), we sampled the BayesEphem corrections ca alongside the hyperparameters that describe the common GW background and the noise parameters for each pulsar (see Section 3.4 of Arzoumanian et al. 2018a and Sections 3 and 4 of Arzoumanian et al. 2016), treating δtR⊙(ca) as a deterministic correction to the residuals.

In Figure 1, we show Bayesian posteriors for the stochastic-background amplitude AGWB at a fiducial frequency of yr−1, as computed by fixing the fiducial SSE to DE421, DE430, DE435, DE436, and DE438 in turn, and either disabling (dotted lines) or applying BayesEphem in the Set III formulation (solid lines). The prior on $\mathrm{log}$10 AGWB is flat in [−18, −14], and the GWB spectral slope is set to γ = 13/3, as appropriate for an ensemble of binary inspirals progressing by GW emission alone. The posteriors follow from PTA likelihoods that omit Hellings–Downs correlations (as in "model 2A" rather than "3A" of Arzoumanian et al. 2018a), but results change only modestly if we include those. (Simulations show that as PTA data sets become more sensitive to GWs, a GWB would manifest first as seemingly uncorrelated red-noise processes of comparable amplitude in multiple pulsars, and later as a Hellings–Downs-correlated process across the PTA, providing more conclusive evidence of the GW origin of the signal).

The plots in Figure 1 demonstrate the successful bridging of AGWB, our goal in mitigating the systematic effects of SSE errors. In doing so, BayesEphem removes hints that any GWB is present, as confirmed by computing model 2A Bayes factors (in favor of a common GWB process), listed in Table 1. In Table 2, we show 95% AGWB upper limits, computed as just discussed but with flat uninformative priors on AGWB. The tables (as well as plots analogous to Figure 1 not shown here) confirm that the Set III and Keplerian-element formulations are equivalent with respect to GW detection.

The posterior distributions of the BayesEphem mass perturbations are identical to their IAU priors: NANOGrav's 11 yr data set is uninformative compared to current estimates from spacecraft data. Likewise, the frame rotation rate ${\omega }^{\hat{z}}$ fills its prior, as expected. The posterior distributions of the BayesEphem corrections to Jupiter's orbital elements are shown in Figure 5, where they are compared to JPL's estimated uncertainties for DE435 and DE436, derived by comparing SSE fits that use independent subsets of the data (Park et al. 2015). The NANOGrav posteriors appear consistent with all SSEs: they have reasonably high support around the estimated-uncertainty region around δaμ = 0. The least-consistent parameters are those involving rotations of the orbital plane (DMW and EDW; see the caption of Figure 5), as well as the eccentricity (DE) for the oldest SSEs, DE421 and DE430. JPL's estimated uncertainties for DE438 are a factor of ∼4 tighter than those for DE435/6, but they do not change this picture substantially.

Figure 5.

Figure 5. Bayesian posteriors for corrections to Jupiter's "Set III" orbital elements, as obtained in BayesEphem's application to NANOGrav's 11 yr data set. The solid curves show posteriors obtained by correcting different fiducial SSEs; the dotted curve shows JPL's estimate of uncertainties in Jupiter's osculating orbital elements for DE435 and DE436 (Park et al. 2015). Parameter names follow JPL's orbit-determination package (Moyer 2003). Here, DA ≡ Δa/a, a fractional correction to the semimajor axis of the osculating orbit; DE ≡ Δe, a fractional correction to the eccentricity; DMW = ΔM0 + Δw, EDW = eΔw, DP = Δp, and DQ = Δq, where M0 is the mean anomaly at the initial epoch (conventionally MJD 2440400.5) and (Δp, Δq, Δw) encode a rigid rotation of the orbit, with Δp and Δq in the orbital plane and Δw normal to it.

Standard image High-resolution image

The large dispersion of the BayesEphem posteriors is not unexpected, given that pulsar-timing data is sensitive to Rømer delays in a rather selective fashion (see the discussion of Section 3). In terms of uncertainties on Jupiter's instantaneous position, the BayesEphem posteriors map to rms 3D errors ∼100 km, and thus are larger than the differences between SSEs. While BayesEphem can bridge NANOGrav 11 yr AGW posteriors for different SSEs, the resulting orbital-element posteriors do not identify any specific systematic offset among them.

6. Assessing Gravitational-wave-background Detection Prospects when Using BayesEphem

BayesEphem is designed to model SSE uncertainties and systematics, bridging published estimates of Earth's trajectory around the SSB. To do so, BayesEphem introduces new parameters that govern a spatially correlated process of amplitude comparable to the stochastic GWBs that we seek. While SSE corrections and the GWB have different spatial-correlation structures, the two may nevertheless remain degenerate when probed by a limited number of PTA pulsars (Roebber 2019). It is then natural to ask how the application of BayesEphem may affect GWB-detection prospects (sensitivity and time to detection) in the weak-to-intermediate signal-to-noise regime in which spatial correlations among the pulsars carry marginal information. To sketch an answer to this question, we conducted a set of simulations, which are summarized briefly in Arzoumanian et al. (2018a) and discussed further here.

We created multiple synthetic data sets meant to replicate the sensitivity of NANOGrav's 11 yr (really, 11.4 yr) data. We used actual observation epochs for the 34 analyzed pulsars and extended the time span to 15 yr by drawing observation times from distributions fit to the last three years of measured data. We simulated timing residuals by drawing random white- and red-noise deviates at the maximum a posteriori 11 yr levels, and again extrapolated to 15 yr from the last 3 of the 11 years. These choices ensure that future data taking is represented with cadence and precision comparable to the end of 2015—a very conservative option given ongoing receiver upgrades and the continuous addition of new pulsars to the PTA.

We calibrated these noise-only simulations by increasing noise levels until the 11 yr "slice" of the data matched the GWB upper limit (specifically the spatially uncorrelated model 2A, with uncorrected DE436) of Arzoumanian et al. (2018a). All simulated data sets were created by postulating that DE436 was the "truth." We then injected γ = 13/3 GWBs of various amplitudes and analyzed both the 11 yr slices and the full 15 yr data sets by taking DE430 as our fiducial SSE, both with and without BayesEphem.

Our results were as follows: first, for noise-only data sets without BayesEphem, the systematic offset between DE430 and DE436 is interpreted as a GWB for both 11 yr and 15 yr time spans, with (model 2A versus the noise-only model 1) Bayes ratios of ∼2 and ∼20 respectively. The ratios are reduced to levels consistent with noise fluctuations by the application of BayesEphem.

Second, as we increase the amplitude of injected GWBs while applying BayesEphem, the Bayes factors remain marginal for the 11 yr data set, as do model 3A versus model 2A factors, the substantive spatial-correlation test of GW presence. The latter are shown in Figure 6. The story is different for 15 yr data sets, where the scaling of Bayes factors with injected GWBs is comparable whether or not we apply BayesEphem. Indeed, the longer time span disentangles SSB and GWB correlations, enabling detection even at the astrophysically conservative levels described by Sesana and colleagues (Sesana et al. 2016).

Figure 6.

Figure 6. Correlated vs. uncorrelated GWB Bayes factors (i.e., model 3A vs. model 2A), estimated for a set of simulated 15 yr data sets with GWB injections at increasing amplitudes. The data sets were generated by adopting DE436 as the "correct" SSE, and analyzed using both the "wrong" DE430 ephemeris (dashed lines) and BayesEphem (solid). The orange and blue lines display results for the full data sets and their 11 yr "slices," respectively. Their comparison indicates that BayesEphem will not impede the ability of PTAs to make a definitive detection in the near future. Figure adapted from Arzoumanian et al. (2018a).

Standard image High-resolution image

In summary, our simulations confirm that SSE errors can produce spurious evidence for GWBs; they suggest that, in data sets similar to NANOGrav's 11 yr, BayesEphem can overcorrect these biases, suppressing the evidence for a "true" GWB, but they imply also that this effect vanishes for longer data sets. Thus, we expect that BayesEphem will not impair detection prospects in the near future—if of course nature grants us a sufficient GWB bounty.

7. Other Modeling Approaches

In our analysis of NANOGrav's 11 yr data set, we experimented with other modeling approaches for SSE uncertainties, which (in our implementation) did not satisfy our bridging criterion or were otherwise disfavored. We discuss them briefly here, as a reference for future investigations:

  • 1.  
    Planetary mass perturbations for outer-solar system planets: modeled as in Champion et al. (2010, and as in the first element of BayesEphem; see Section 4), these did not affect our AGWB posteriors significantly (and therefore did not resolve the discrepancy among SSEs), whether introduced with best-estimate priors (IAU Division I Working Group on Numerical Standards for Fundamental Astronomy 2017) or with more relaxed assumptions.
  • 2.  
    Dipole-correlated Gaussian process: as proposed in Tiburzi et al. (2016), the Rømer delays for individual pulsars may be treated as Gaussian processes with common power-law or free-spectral priors (van Haasteren & Vallisneri 2014), and with dipolar spatial correlations between pulsars (${C}_{{ij}}=\cos {\theta }_{{ij}}$, with θij the angle between pulsars). We note that this approach is equivalent to modeling the apparent motion of the SSB (or equivalently, the error in Earth's orbit) as three Gaussian processes (along the three spatial axes) with identical uncorrelated priors, and then projecting the vector time series onto the pulsar positions to obtain Rømer delays. Dipole-correlated Gaussian processes were included in models 2B, 2C, 3B, and 3C of our 11 yr analysis (Arzoumanian et al. 2018a). It is unclear to us why the approach failed to bridge AGWB posteriors across SSEs, but the reason may be related to the choice of Gaussian-process priors appropriate to describe the range and shapes of variations among SSEs.
  • 3.  
    Rømer mixture: this phenomenological model describes Earth's orbit as a linear combination of its estimate in multiple SSEs, with mixture coefficients constrained by a Dirichlet prior (see, e.g., Gelman et al. 2013). This approach achieves bridging by construction, but it is difficult to interpret physically. In our 11 yr investigation, the posteriors of the mixture coefficients indicated a moderate preference for DE435/6 over DE421 and DE430.
  • 4.  
    Gaussian process based on numerical partials: in this approach, SSE corrections are modeled as a finite Gaussian process (Williams & Rasmussen 2006) in which basis vectors are given by orbit partials (e.g., the change in Earth's orbit as we vary the initial conditions for all the planets), as computed by numerical integration for SSE fits. The priors for the basis weights are given by the formal covariance of the SSE fit parameters. The basis vectors are then projected into Rømer delays for each pulsar. It turns out that only Jupiter and Saturn partials matter to GW results; when restricted to these planets, the approach is effectively equivalent to the orbit-correction sector of BayesEphem—at least for the 11 yr time span, for which numerical partials are very close to analytical osculating-orbit perturbations.

8. Conclusion

It is striking that the abundance and precision of current PTA data sets should be such that our sensitivity to GWs is limited by the very accuracy to which we can position Earth around the SSB. In the analysis of NANOGrav's 11 yr data set (Arzoumanian et al. 2018a), we took the conservative position that robust GW upper limits and detection Bayes factors must be reproducible with all SSEs released in the last 10 years, after introducing a sufficiently descriptive model of SSE uncertainties. To this end, we developed BayesEphem, described in this article, which focuses on the SSE degrees of freedom (Jupiter's orbital elements) that measurably affect our GW search, and which produces "bridged" AGWB posteriors, upper limits, and Bayes factors. Because of this focus, the analysis of NANOGrav data does not update the JPL SSEs significantly (see Figure 5), although it provides uncertainty estimates entirely independently of those offered by SSE makers (Park et al. 2015; Folkner & Park 2018). We expect that it would be possible to concentrate instead on a sector of SSE corrections that do not affect GW results, but that would benefit from PTA constraints, thus producing PTA-enhanced SSEs useful beyond GW detection. For this purpose, we could adopt either the orbital-elements formalism of Section 4 or the numerical-partials approach of Section 7.

The conservative modeling attitude employed for NANOGrav's 11 yr analysis comes at the cost of a loss of GW sensitivity, as described in Section 6 (see also Roebber 2019). The GW statistics reported in Arzoumanian et al. (2018a; and here in Tables 1 and 2) are supported in our Bayesian setting, in which we decline to favor one SSE above others. However, these results should not be considered binding for future GW searches that rely on demonstrably accurate SSEs and that are based on longer data sets where SSE and GW correlations become disentangled: more precisely, data sets that cover a longer time span with a sufficient number of high-quality pulsars. Both conditions are now materializing: Juno's ongoing measurements are improving estimates of Jupiter's orbit (Folkner & Park 2018), which (we argue) is the limiting factor for GW searches, and the NANOGrav data set is progressing toward the 15 yr span, for which (we reckon) Jovian systematics decouple from GW statistics. The combined data sets assembled by the International Pulsar Timing Array (Perera et al. 2019) have already passed this mark and thus may already be immune to this problem.

The path toward an authoritative detection of low-frequency GWs with PTAs requires intense and persistent timing campaigns on the world's most sensitive radio telescopes; sophisticated inference techniques and clever high-performance-computing algorithms, to make sense of ever-growing, heterogeneous data sets with many unknown parameters; and a confident control of TOA systematics at the nanosecond level. Among the last, the error analysis and validation of high-precision SSEs will remain paramount.

We are grateful to Agnès Fienga, Joseph Romano, Alvin Chua, and Maria Charisi for useful comments and interactions.

The NANOGrav project receives support from National Science Foundation (NSF) Physics Frontier Center award #1430284. NANOGrav research at U.B.C. is supported by an NSERC Discovery Grant and Discovery Accelerator Supplement and by the Canadian Institute for Advanced Research. M.V. and J.S. acknowledge support from the JPL RTD program. S.R.T. was partially supported by an appointment to the NASA Postdoctoral Program at JPL, administered by Oak Ridge Associated Universities through a contract with NASA. J.A.E. was partially supported by NASA through Einstein Fellowship grants PF4-150120. Portions of this work performed at NRL are supported by the Chief of Naval Research. The Flatiron Institute is supported by the Simons Foundation. Portions of this research were carried out at the Jet Propulsion Laboratory, California Institute of Technology, under a contract with the National Aeronautics and Space Administration.

This work was supported in part by National Science Foundation grant PHYS-1066293 and by the hospitality of the Aspen Center for Physics. We are grateful for computational resources provided by the Leonard E. Parker Center for Gravitation, Cosmology and Astrophysics at the University of Wisconsin-Milwaukee, which is supported by NSF grants 0923409 and 1626190. Data for this project were collected using the facilities of the Green Bank Observatory and the Arecibo Observatory. The National Radio Astronomy Observatory and Green Bank Observatory is a facility of the National Science Foundation operated under cooperative agreement by Associated Universities, Inc. The Arecibo Observatory is a facility of the National Science Foundation operated under cooperative agreement by the University of Central Florida in alliance with Yang Enterprises, Inc. and Universidad Metropolitana.

Author contributions: this document is the result of more than a decade of work by the entire NANOGrav collaboration. We acknowledge specific contributions below. Z.A., K.C., P.B.D., M.E.D., T.D., J.A.E., R.D.F., E.C.F., E.F., P.A.G., G.J., M.L.J., M.T.L., L.L., D.R.L., R.S.L., M.A.M., C.N., D.J.N., T.T.P., S.M.R., P.S.R., R.S., I.H.S., K.S., J.K.S., and W.W.Z. developed the 11 yr data set. M.V. led this analysis and coordinated writing. M.V., S.R.T., and J.S. developed BayesEphem, managed Bayesian-inference runs, and performed data analysis, with help from C.C., J.A.E., T.J.W.L., and S.J.V. M.V., S.R.T., and J.S. wrote this article, with edits by D.J.N., M.T.L., T.J.W.L., D.L.K., J.S.H., and S.J.V. Ephemeris data and expertise were provided by W.M.F. and R.S.P.

Footnotes

  • 40 
  • 41 
  • 42 
  • 43 
  • 44 

    This accounts for differences in the estimated semimajor axis of the Earth–Moon barycenter orbit, which gives rise to a linear rate in estimated ecliptic longitude.

  • 45 

    In a Bayesian context ,there is no single best-fitting polynomial when noise parameters are included in the inference, leading to varying weights for the fit. Nevertheless, the subtracted features remain largely irrelevant to parameter posteriors.

  • 46 

    This can be understood by noticing that the dominant source of residual noise ("radiometer" measurement noise) is white, but TOAs are sensitive to time integrals of GW strain, so the effective GW noise grows with frequency.

  • 47 

    The six aμ plus the rate of change ${\dot{a}}^{4}\equiv \dot{l}$ of the mean longitude specify Jupiter's orbit as the osculating ellipse at the J2000 reference epoch (MJD 2451545); the remaining five rates encode the secular evolution of Jupiter's orbit due to SSB bodies other than the Sun and to other physical effects. See https://ssd.jpl.nasa.gov/txt/aprx_pos_planets.pdf.

Please wait… references are loading.
10.3847/1538-4357/ab7b67