
The Exoplanet Simple Orbit Fitting Toolbox (ExoSOFT): An Open-source Tool for Efficient Fitting of Astrometric and Radial Velocity Data


Published 2017 March 2 © 2017. The American Astronomical Society. All rights reserved.
Citation: Kyle Mede and Timothy D. Brandt 2017 AJ 153 135. DOI: 10.3847/1538-3881/aa5e4a


Abstract

We present the Exoplanet Simple Orbit Fitting Toolbox (ExoSOFT), a new, open-source suite to fit the orbital elements of planetary or stellar-mass companions to any combination of radial velocity and astrometric data. To explore the parameter space of Keplerian models, ExoSOFT may be operated with its own multistage sampling approach or interfaced with third-party tools such as emcee. In addition, ExoSOFT is packaged with a collection of post-processing tools to analyze and summarize the results. Although only a few systems have been observed with both radial velocity and direct imaging techniques, this number will increase, thanks to upcoming spacecraft and ground-based surveys. Providing both forms of data enables simultaneous fitting that can help break degeneracies in the orbital elements that arise when only one data type is available. The dynamical mass estimates this approach can produce are important when investigating the formation mechanisms and subsequent evolution of substellar companions. ExoSOFT was verified through fitting to artificial data and was implemented using the Python and Cython programming languages; it is available for public download at https://github.com/kylemede/ExoSOFT under GNU General Public License v3.


1. Introduction

Substellar companions exist in a wide array of masses and orbital configurations. Confirmed companions have orbital separations ranging from 0.0044 au to 2500 au (for PSR J1719–1438 b (Bailes et al. 2011) and WD 0806–661B b (Rodriguez et al. 2011), respectively), and masses as low as 0.022 M⊕ (PSR B1257+12 b (Wolszczan 1994)). This broad spectrum has highlighted the somewhat arbitrary nature of mass-based categorizations. The canonical definition of a star is an object with sufficient core pressures for stable hydrogen burning, or a minimum mass of ∼73–89 MJ depending on metallicity (Grossman et al. 1974; Chabrier & Baraffe 2000). Bodies sufficiently massive to fuse deuterium, but not hydrogen, with masses from ∼13–80 MJ, are referred to as brown dwarfs; objects of even lower mass are called planets. Deuterium burning has a negligible effect on a brown dwarf's evolution; a more physically interesting distinction between brown dwarfs and planets might be their mechanism of formation (Chabrier et al. 2014, p. 619).

Formation mechanisms are thought to fall into two general regimes: large objects form like stars, from the "top down," while planets grow from the "bottom up." Stellar-mass companions are believed to form early on through the cooling and contraction of a fragment of the primary's protostellar cloud, or slightly later from a fragment of its protoplanetary disk (Kuiper 1951; Cameron 1978; Boss 1997; Boley 2009; Vorobyov 2013). Giant planets are thought to form when a rocky planet becomes sufficiently massive to accrete gas directly from the circumstellar disk (Safronov & Zvjagina 1969; Goldreich & Ward 1973; Mizuno 1980; Pollack et al. 1996). Direct collapse and core accretion operate over different, though likely overlapping, mass and separation regimes (Bate et al. 2003; Bate 2009; Mordasini et al. 2012); they may also produce different distributions of eccentricity. Observational studies at these overlapping mass regimes would benefit from the precise and accurate planet masses obtained by fitting their dynamics.

Planets exist with a wide range of properties and their current orbits hold clues to their formation and evolutionary past. Unfortunately, no single observing technique can provide data sufficient to fully model a companion's orbit and constrain all six of its orbital elements. Two of those commonly applied for detecting substellar companions are the radial velocity method and high-contrast direct imaging. Partial solutions to the orbit can be achieved by fitting the measured radial velocity or astrometry, while the full set of orbital elements may be obtained by fitting both simultaneously for a single object. Solutions found in this manner are capable of measuring the companion's dynamical mass, rather than the ${m}_{2}\sin (i)$ value estimated with the RV method alone. Thus, the $\sin (i)$ degeneracy may be broken, enabling separate constraints on the inclination and companion mass. However, no general open-source tools to perform joint RV and astrometric data fitting currently exist.

Of the nearly 2000 confirmed exoplanets, ∼600 were found through radial velocity measurements and ∼40 with direct imaging, although none have data of both forms. While a small number of objects with simultaneous coverage and estimated dynamical masses exist, they are all in the low-mass star and high-mass brown dwarf range (e.g., Crepp et al. 2012a, 2012b, 2013a, 2013b, 2014; Ryu et al. 2016; also: J. Hagelberg et al. 2016, in preparation). Given the uncertainties in the atmospheric models of exoplanets and low-mass brown dwarfs, and the variety of orbital evolution scenarios, accurate dynamical mass values are important. The advancements in ground- and space-based instrumentation will further populate the lists of known low-mass companions observed with both techniques. With Gaia precisely measuring acceleration in the plane of the sky for ∼10⁹ stars, it is expected to lead to the discovery of thousands of planets astrometrically. In addition, high-dispersion spectrographs, such as IRD (Tamura et al. 2012), CRIRES (Käufl et al. 2008), and SPIRou (Artigau et al. 2011), will measure the radial velocity of low-mass stars with poor or no orbital coverage from previous instruments. For those systems with separations and predicted contrasts detectable with next-generation integral field spectrographs, including CHARIS (Peters et al. 2012), SPHERE (Claudi et al. 2006), and GPI (Macintosh et al. 2014), follow-up observations can help investigate their atmospheres. The higher sensitivities of all these instruments and spacecraft will increase the number of known companions and expand the overlap range of those detectable in both radial velocity and astrometric data.

In this paper, we introduce a new, open-source software package, ExoSOFT, for performing joint fitting of a system's orbital elements. We have organized the paper as follows. Section 2 covers the joint three-dimensional model used; Section 3 reviews the background on fitting models of this kind and the approach ExoSOFT utilizes; and Section 4 describes the implementation and verification of its capabilities on a synthetic example system.

ExoSOFT is available for public download from GitHub under GNU General Public License v3, and has also been placed on the Python Package Index for simple installation. The release version of the codebase used to produce the results in Section 4.2 has been archived on Zenodo (Mede & Brandt 2017).

2. Modeling the Orbit

The classical solution to the two-body problem in Newtonian gravity is given by using Kepler's laws; six parameters are needed to fully describe the orbits (e.g., Heintz 1978). We implement this approach, adding some intermediate parameters to facilitate the calculation and comparison to measured radial velocity and astrometry. Our full set of parameters is given in the top section of Table 1; these are used to calculate the additional parameters in the lower section.

Table 1.  Model Parameters

Symbol Parameter

Fitted parameters:
ϖ Parallax
P Orbital period
m1 Mass of the primary star
m2 Mass of the companion
Ω Longitude of the ascending node
ω Argument of periapsis
i Inclination
e Eccentricity
T0 Time of last periapsis
γ Instrument velocity offset

Calculated parameters:
mtot Total mass (m1 + m2)
atot Total semimajor axis (a1 + a2)
θ True anomaly
E Eccentric anomaly
M Mean anomaly
K Radial velocity semi-amplitude


The predicted radial velocity and astrometry values, to be compared to the measured data for epoch t, are calculated in the following way. The equations for the projected line-of-sight velocity, or radial velocity v, and projected locations in the plane of the sky, the astrometry, both require the anomalies M, E, and θ. Thus, the anomalies are determined first, followed by v, ending with the Thiele–Innes equations to obtain the relative right ascension, Δα, and declination, Δδ.

The eccentric anomaly E and true anomaly θ are found starting from the mean anomaly M,

$M = \frac{2\pi}{P}\,(t - T_{0}), \qquad (1)$

and its relation to E through Kepler's equation is

$M = E - e\sin E, \qquad (2)$

where e is the eccentricity. We solve Equation (2) with Newton's method and use the resulting value to calculate θ following

$\cos\theta = \frac{\cos E - e}{1 - e\cos E} \qquad (3)$

and

$\sin\theta = \frac{\sqrt{1 - e^{2}}\,\sin E}{1 - e\cos E}. \qquad (4)$

Equations (2)–(4) provide the two key time-dependent values, E and θ.
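As a concrete illustration of this step (a minimal sketch under our own naming, not ExoSOFT's actual implementation), Kepler's equation can be solved with Newton's method and the result converted to the true anomaly:

```python
import math

def solve_kepler(M, e, tol=1e-12, max_iter=100):
    """Solve Kepler's equation M = E - e*sin(E) for E via Newton's method."""
    E = M if e < 0.8 else math.pi  # standard starting guess
    for _ in range(max_iter):
        # Newton step on f(E) = E - e*sin(E) - M
        dE = (E - e * math.sin(E) - M) / (1.0 - e * math.cos(E))
        E -= dE
        if abs(dE) < tol:
            break
    return E

def true_anomaly(E, e):
    """True anomaly theta from eccentric anomaly E (Equations (3)-(4))."""
    return math.atan2(math.sqrt(1.0 - e**2) * math.sin(E),
                      math.cos(E) - e)
```

For the nearly circular orbits considered in Section 4, the iteration converges in only a few steps.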

The radial velocity as measured for the primary due to the companion's motion is

$v = K_{1}\left[\cos(\theta + \omega_{1}) + e\cos\omega_{1}\right] + \gamma, \qquad (5)$

where ω1 is the argument of periapsis for the primary, γ is the instrument velocity offset, and the semi-amplitude of the primary (K1) is

$K_{1} = \left(\frac{2\pi G}{P}\right)^{1/3} \frac{m_{2}\sin i}{m_{\mathrm{tot}}^{2/3}\sqrt{1 - e^{2}}}, \qquad (6)$

with m2 being the companion's mass, mtot the total mass of both objects, and P the period. When only radial velocity data are available, Equation (6) shows that the companion's mass and inclination cannot be independently constrained.
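For illustration, Equation (6) can be evaluated numerically; the physical constants and the Jupiter-analogue inputs below are our own assumptions for the example, not part of ExoSOFT's interface:

```python
import math

G = 6.674e-11      # gravitational constant [m^3 kg^-1 s^-2]
M_SUN = 1.989e30   # solar mass [kg]
M_JUP = 1.898e27   # Jupiter mass [kg]
YEAR = 3.15576e7   # Julian year [s]

def rv_semi_amplitude(m2_kg, mtot_kg, P_s, e, inc):
    """K1 from Equation (6); m2 and sin(i) enter only as a product,
    which is the origin of the sin(i) degeneracy for RV-only data."""
    return ((2.0 * math.pi * G / P_s) ** (1.0 / 3.0)
            * m2_kg * math.sin(inc)
            / (mtot_kg ** (2.0 / 3.0) * math.sqrt(1.0 - e ** 2)))
```

For a Jupiter analogue (P ≈ 11.86 yr, e ≈ 0.048) inclined by 45°, this gives K1 of order 9 m s⁻¹.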

The Thiele–Innes method is a standard approach to compute the predicted astrometric values given a set of input orbital elements (Van den Bos 1932; Thiele 1883), using a set of intermediate quantities:

$A = a_{\mathrm{tot}}\,(\cos\Omega\cos\omega - \sin\Omega\sin\omega\cos i) \qquad (7)$

$B = a_{\mathrm{tot}}\,(\sin\Omega\cos\omega + \cos\Omega\sin\omega\cos i) \qquad (8)$

$F = a_{\mathrm{tot}}\,(-\cos\Omega\sin\omega - \sin\Omega\cos\omega\cos i) \qquad (9)$

$G = a_{\mathrm{tot}}\,(-\sin\Omega\sin\omega + \cos\Omega\cos\omega\cos i) \qquad (10)$

These equations are in the form that is standard in the field (Aitken 1935; Heintz 1978). In Equations (7)–(10), the total semimajor axis of the apparent ellipse (atot) is calculated with Kepler's third law,

$a_{\mathrm{tot}}^{3} = \frac{G\,m_{\mathrm{tot}}\,P^{2}}{4\pi^{2}}, \qquad (11)$

with G in Equation (11) being Newton's gravitational constant.

Equations (7)–(10) give the relative location of the secondary with respect to the primary in the plane of the sky. The matching relative right ascension Δα and declination Δδ for comparison to the data are found with

$\Delta\alpha = B X + G Y \qquad (12)$

$\Delta\delta = A X + F Y, \qquad (13)$

given

$X = \cos E - e \qquad (14)$

$Y = \sqrt{1 - e^{2}}\,\sin E. \qquad (15)$

The orbits of the two bodies m1 and m2 have matching values for all the orbital elements, except for Ω and ω. To convert between them, the relations ω1 = ω2 + π and Ω1 = Ω2 + π are used. For all cases, we assume the measured astrometry to be the relative positions of the two bodies tracing out an apparent ellipse on the sky, which reduces to the orbit of the companion when m1 ≫ m2.

The equations in this section calculate a set of observables (v, Δα, and Δδ) that we then compare to the corresponding measurements.
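The astrometric half of this calculation can be sketched as follows (an illustrative sketch under our own naming, assuming atot has already been converted to angular units, e.g., mas, via the parallax):

```python
import math

def thiele_innes(a_tot_mas, Omega, omega, inc):
    """Thiele-Innes constants of Equations (7)-(10); angles in radians,
    a_tot in angular units (here mas)."""
    cO, sO = math.cos(Omega), math.sin(Omega)
    cw, sw = math.cos(omega), math.sin(omega)
    ci = math.cos(inc)
    A = a_tot_mas * (cO * cw - sO * sw * ci)
    B = a_tot_mas * (sO * cw + cO * sw * ci)
    F = a_tot_mas * (-cO * sw - sO * cw * ci)
    G_ = a_tot_mas * (-sO * sw + cO * cw * ci)
    return A, B, F, G_

def sky_position(A, B, F, G_, E, e):
    """Relative position on the sky, Equations (12)-(15)."""
    X = math.cos(E) - e
    Y = math.sqrt(1.0 - e**2) * math.sin(E)
    d_alpha = B * X + G_ * Y
    d_delta = A * X + F * Y
    return d_alpha, d_delta
```

A quick sanity check: for a face-on circular orbit the separation equals atot at every eccentric anomaly.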

3. Model Fitting

3.1. Method

Model fitting can be broken down into two main tasks: solving for a set of parameters approaching those with the maximum likelihood and estimating their uncertainties. In a Bayesian framework, the full set of information is contained within the posterior probability distribution of the model parameters, $p(\mathrm{Model}| \mathrm{Data})$. Ford (2005, 2006) discuss the use of Bayesian inference to solve for this distribution in the case of fitting observations of exoplanets to Keplerian models; for details beyond those given here, we refer the reader to those papers.

The posterior probability distribution, $p(\mathrm{Model}| \mathrm{Data})$, is computed using Bayes' theorem,

$p(\mathrm{Model}\,|\,\mathrm{Data}) = \frac{p(\mathrm{Data}\,|\,\mathrm{Model})\,p(\mathrm{Model})}{p(\mathrm{Data})}. \qquad (16)$

Here $p(\mathrm{Data}| \mathrm{Model})$ is the likelihood of the model parameters, ${\mathscr{L}}(\mathrm{Model})$, and p(Model) is the prior probability of the parameters based on previous knowledge. Assuming that the errors in the observed data are independent and Gaussian distributed, the likelihood can be expressed in terms of the usual χ2:

${\mathscr{L}}(\mathrm{Model}) \propto e^{-\chi^{2}/2}, \qquad (17)$

with χ2 being a sum over data points i,

$\chi^{2} = \sum_{i} \frac{(x_{i,\mathrm{data}} - x_{i,\mathrm{model}})^{2}}{\sigma_{i}^{2}}. \qquad (18)$

The posterior probability distribution can then be computed by finding the value of ${\mathscr{L}}(\mathrm{Model})p(\mathrm{Model})$ for all possible parameter combinations and integrating. Due to the high dimensionality of Keplerian models, however, the parameter space is large, making this brute-force approach computationally impractical. A variety of routines exist for sufficiently sampling the posterior without calculating its value everywhere. In Section 3.2 we will discuss the approaches used in ExoSOFT, particularly that of Markov Chain Monte Carlo (MCMC), along with the specific priors for each of the model parameters.
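In code, the Gaussian-error likelihood of Equations (17) and (18) reduces to a few lines (an illustrative sketch, not ExoSOFT's implementation):

```python
def chi_squared(data, model, sigma):
    """Equation (18): sum of squared, error-weighted residuals."""
    return sum((d - m) ** 2 / s ** 2 for d, m, s in zip(data, model, sigma))

def ln_likelihood(data, model, sigma):
    """Equation (17), up to an additive constant: ln L = -chi^2 / 2."""
    return -0.5 * chi_squared(data, model, sigma)
```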

3.2. MCMC

Forming the posterior distributions can be achieved by integrating Bayes' theorem over the model's parameter space. Depending on the form of the likelihood function and the dimensionality of the model space, evaluating this integral directly or through "shotgun" Monte Carlo can prove computationally impractical. One alternative is to draw statistically dependent samples with MCMC (Liu 2004). A Markov Chain is a sequence of points satisfying the dependency defined by the Markov property: each point in the chain is a function only of its immediate predecessor with no memory of the chain history. MCMC guarantees convergence to the posterior probability distributions in the limit of a large number of steps. A widely used algorithm to form such chains is that of Metropolis–Hastings, originally detailed in Metropolis et al. (1953) and further generalized by Hastings (1970). Unfortunately, the convergence rate is typically impossible to compute; we discuss convergence in more detail in Section 3.3.

There are multiple approaches to sampling the posterior distributions of the Keplerian model in Section 2 with ExoSOFT. One is to use a built-in MCMC sampler based on the Metropolis–Hastings algorithm. This implements the standard form of the rejection function,

$\alpha = \min\left[1,\ \frac{{\mathscr{L}}({{\boldsymbol{\xi}}}_{j+1})\,p({{\boldsymbol{\xi}}}_{j+1})}{{\mathscr{L}}({{\boldsymbol{\xi}}}_{j})\,p({{\boldsymbol{\xi}}}_{j})}\right], \qquad (19)$

where ${{\boldsymbol{\xi }}}_{j+1}$ is a step, a set of parameters; with this, the likelihood is ${\mathscr{L}}={\mathscr{L}}({\mathrm{Model}}_{j})={\mathscr{L}}({{\boldsymbol{\xi }}}_{j})$, and the prior probability is $p=p(\mathrm{Model})=p({{\boldsymbol{\xi }}}_{j})$. The proposed steps are drawn from normal distributions of fixed width in each parameter centered on the current step. Following the Gaussian error assumption in Section 3.1, the likelihood ratio is

$\frac{{\mathscr{L}}_{j+1}}{{\mathscr{L}}_{j}} = \exp\left[\frac{\chi_{j}^{2} - \chi_{j+1}^{2}}{2}\right]. \qquad (20)$

ExoSOFT also provides ${\mathscr{L}}$ and p in a format that easily interfaces with MCMC packages like emcee (Foreman-Mackey et al. 2013), using the combined function ${ln}({\mathscr{L}}({\boldsymbol{\xi }}))+{ln}(p({\boldsymbol{\xi }}))$. As such, emcee has been made a dependency of ExoSOFT and can be selected for use with a flag in the settings file.
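A minimal Metropolis–Hastings sampler built around the rejection function of Equation (19), working in terms of the combined $\mathrm{ln}({\mathscr{L}})+\mathrm{ln}(p)$ described above, might look like the following (an illustrative sketch; the function names are ours, not ExoSOFT's):

```python
import math
import random

def metropolis_hastings(ln_post, x0, widths, n_steps, seed=1):
    """Metropolis-Hastings with Gaussian proposals of fixed width
    centered on the current step (Equation (19))."""
    rng = random.Random(seed)
    chain = [list(x0)]
    lp = ln_post(x0)
    n_accept = 0
    for _ in range(n_steps):
        prop = [x + rng.gauss(0.0, w) for x, w in zip(chain[-1], widths)]
        lp_prop = ln_post(prop)
        # Accept with probability min(1, L(prop)p(prop) / (L(cur)p(cur)))
        if math.log(rng.random()) < lp_prop - lp:
            chain.append(prop)
            lp = lp_prop
            n_accept += 1
        else:
            chain.append(list(chain[-1]))
    return chain, n_accept / n_steps
```

The same `ln_post` callable can be passed directly to an ensemble sampler such as emcee's `EnsembleSampler`.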

We chose prior probabilities, or "priors," for each parameter with simplicity in mind. These should only be considered as suggestions, and may be easily changed in the ExoSOFT settings files to suit the user's preferences or to take into account the results of more recent work. The complete set of parameters in our full joint analysis is (m1, m2, ϖ, P, i, e, To, Ω, ω, γ). Independent of their distribution, the priors are all given wide boundary limits, each available for change within the settings. Those of Ω and ω are drawn without limits, but the stored values are then shifted into [0, 2π].

For m1, the prior can be represented by the initial mass function (IMF) or present day mass function (PDMF). Chabrier (2003) found functional forms of these for a range of stellar populations, including those of the galactic disk. For m2, however, particularly for stellar binaries, the IMF may be a poor approximation to the companion mass function (CMF); that of Metchev & Hillenbrand (2009) might be more appropriate. More recent investigations (e.g., Brandt et al. 2014) were consistent with Metchev & Hillenbrand (2009) down to a few Jupiter masses. PDMF and IMF can be found in Table 1 of Chabrier (2003), and that of the CMF in Equation (8) of Metchev & Hillenbrand (2009). By default we use the PDMF for the primary and CMF for the companion, with the IMF being an option also available to the user for either body.

The prior for the parallax (ϖ) may be derived from the assumption that stars are uniformly distributed in space. For nearby objects, within a few hundred parsecs, this prior is

$p(\varpi) \propto \frac{1}{\varpi^{4}}. \qquad (21)$

In cases where high-S/N distance measurements are available, this information can be included with a Gaussian prior with center and width equal to its distance and error, respectively. These two components would then be multiplied to form a combined prior for the parallax. For distant objects, over a few hundred parsecs, the assumption of uniform stellar distribution is invalid. Low-S/N measurements of the parallax, in combination with Equation (21), would lead to unrealistically large distances. ExoSOFT currently requires accurately measured parallaxes. In practice, these are available for nearby bright stars from Hipparcos (van Leeuwen 2007), and the sample of objects with well-measured parallaxes will soon expand dramatically, thanks to Gaia.
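The combination of Equation (21) with such a Gaussian measurement term can be sketched as follows (the function name and the choice of mas units are our assumptions):

```python
import math

def ln_prior_parallax(plx, plx_obs=None, plx_err=None):
    """Equation (21): uniform space density gives p ∝ 1/plx^4.
    Optionally multiplied by a Gaussian measurement term (plx in mas)."""
    if plx <= 0:
        return -math.inf
    lp = -4.0 * math.log(plx)
    if plx_obs is not None:
        # Gaussian prior centered on the measured parallax
        lp += -0.5 * ((plx - plx_obs) / plx_err) ** 2
    return lp
```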

The adopted prior for the period is a power law, $p(P)\propto {P}^{\gamma }$. Radial velocity surveys favor γ ≈ −0.7 (Cumming et al. 2008), while γ = −1 gives equal numbers per logarithmic period.

We assume systems to be randomly oriented with $p(i)\,\propto \sin (i)$.

Following the results of investigations into the eccentricity distributions of low-mass companions (e.g., Duquennoy & Mayor 1991; Cumming et al. 2008; Jurić & Tremaine 2008; Shen & Turner 2008; Hogg et al. 2010; Wang & Ford 2011; Kipping 2013; Kipping & Sandford 2016; Shabram et al. 2016), ExoSOFT offers multiple priors for the eccentricity. Currently, these include uniform, Rayleigh, Rayleigh+Exponential, Beta, $p(e)=2e$ (for companions with P > 1000 days), and those of Shen & Turner (2008). Additional priors can be added and modified manually by the user. This can be accomplished by placing a customized version of the priors module in the user's current working directory.

The remaining parameters (T0, Ω, ω, γ) are given uniform priors.

3.3. Tuning MCMC

Given an infinitely long Markov chain, convergence to the posterior probability distributions is guaranteed. In practice, when trying to integrate Bayes' theorem for a particular model, one of the primary goals is to minimize the number of samples necessary to achieve sufficient convergence. Therefore, before performing MCMC, a metric to measure its convergence is required along with steps to optimize the convergence rate.

It is difficult to determine if a Markov chain has converged to the posterior probability distribution. From a given set of parameter values, there is no guarantee that the likelihood function will smoothly increase toward the global maximum of the posterior. In cases with complicated likelihood topography, there may be many places where a chain could become stuck. Chains started near a local maximum could falsely appear to have converged or take a long time to diffuse into other regions. For this reason, the starting position of a Markov chain, and any subsequent samples drawn far from the likelihood peaks, must be discarded to avoid biasing the posterior distributions. This initial period is referred to as the "burn-in." Although the exact number of steps involved is not clearly defined, Tegmark et al. (2004) took it to be over when the likelihood in a particular chain first reached the median likelihood of all chains combined.

The rate at which a Markov chain explores the posteriors is related to the width, or σ, of each parameter's proposal distribution. For large σ, proposed steps tend to lie far from the current one, in regions where they are unlikely to be accepted. By contrast, a small σ leads to a high acceptance rate but a slow exploration of the posteriors. Gelman et al. (1996, p. 599) showed that the highest diffusion speed of a one-dimensional model is achieved when the probability of accepting a proposed step is 0.441. They tested a variety of multidimensional models, concluding that an acceptance rate of ∼25% is suitable for higher dimensional models and ∼50% for those with only one or two dimensions.

With a Markov sampling scheme, the dependence between neighboring samples can dramatically reduce the number of effectively independent steps taken. Unfortunately, prior to running a Markov chain, there are no direct methods to predict how many independent steps will be taken over a certain number of samples. Instead, following the completion of all Markov chains, a common practice is to estimate if enough steps were taken using either the Gelman–Rubin statistic (Gelman & Rubin 1992) or the integrated autocorrelation time (Goodman & Weare 2010). As discussed in Foreman-Mackey et al. (2013) and Goodman & Weare (2010), the integrated autocorrelation time can be represented by

$\tau = 1 + 2\sum_{T^{\prime}=1}^{T} \frac{C(T^{\prime})}{C(0)}, \qquad (22)$

where T is the time lag necessary for covariance between samples, C(T), to go to zero, ensuring their independence. With this, the number of effectively independent steps in a Markov chain of length Lc is ∼Lc/τ.

4. Implementation and Verification

In this section, we describe how the model discussed in Section 2 and types of parameter space exploration in Section 3 are implemented. The bulk of the code was written in the Python programming language, with the more computationally expensive model in Cython. Implementing the model in this way makes it run 5–30 times faster than the same code written purely in Python. ExoSOFT can be found for public download at https://github.com/kylemede/ExoSOFT.

4.1. Simulation Stages

ExoSOFT offers four of its own algorithms for exploring the parameter space, each for a different task: standard "shotgun" Monte Carlo, Simulated Annealing, Sigma Tuning, and MCMC. It is also ready to be interfaced with third-party sampling tools.

As discussed in Section 3.2, standard Monte Carlo is suitable for models with few free parameters, or when the likelihood function is not strongly peaked. In such cases, it could in fact sample the posterior probability distributions significantly faster than MCMC. With each sample being completely independent, it can also be useful for getting an idea of where the regions of minimum χ2 lie.

In cases where standard Monte Carlo is inefficient, Markov sampling techniques are suggested. To minimize the burn-in, suitable starting positions for the Markov chains are found using Simulated Annealing. As discussed in Kirkpatrick et al. (1983), Simulated Annealing also makes use of the Metropolis–Hastings algorithm with an additional "temperature" factor in the likelihood ratio, modifying this ratio in the rejection function to

$\left(\frac{{\mathscr{L}}_{j+1}}{{\mathscr{L}}_{j}}\right)^{1/T} = \exp\left[\frac{\chi_{j}^{2} - \chi_{j+1}^{2}}{2T}\right], \qquad (23)$

At high temperatures, Equation (23) is temperature dominated, and likely to accept even unfavorable steps, thereby mitigating any local topographical bumps encountered. A cooling scheme started at a sufficiently high temperature (⪆100) explores a large portion of the parameter space and is slowly "annealed" toward the parameter set with the global maximum likelihood. In our implementation, the chains are repeatedly heated and cooled, each time from a lower starting temperature, while maintaining a list of the samples with the highest likelihoods achieved so far. We stop this process when the fits have all converged to a single set of parameter peaks in the posterior probability distributions.
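A minimal sketch of such an annealing loop (the geometric cooling schedule, single cooling pass, and names here are our assumptions for illustration, not ExoSOFT's actual multi-cycle scheme):

```python
import math
import random

def simulated_annealing(chi2, x0, widths, t_start=100.0, t_end=1.0,
                        n_steps=20000, seed=2):
    """Anneal toward the global chi^2 minimum: accept with probability
    min(1, exp((chi2_j - chi2_{j+1}) / (2*T))) per Equation (23),
    with T decaying geometrically from t_start to t_end."""
    rng = random.Random(seed)
    x, c = list(x0), chi2(x0)
    best_x, best_c = list(x), c
    decay = (t_end / t_start) ** (1.0 / n_steps)
    T = t_start
    for _ in range(n_steps):
        prop = [xi + rng.gauss(0.0, w) for xi, w in zip(x, widths)]
        c_prop = chi2(prop)
        if math.log(rng.random()) < (c - c_prop) / (2.0 * T):
            x, c = prop, c_prop
            if c < best_c:
                best_x, best_c = list(x), c  # track best sample so far
        T *= decay
    return best_x, best_c
```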

After finding an initial set of parameters, ExoSOFT can sample the posterior probability distributions with either MCMC or the affine-invariant ensemble sampler built into emcee (Foreman-Mackey et al. 2013). When using MCMC, we first apply a version of the Metropolis–Hastings algorithm to determine the appropriate widths for each proposal distribution. Periodically, the acceptance rates are calculated, typically after ∼200 samples per varied parameter, and the fixed σ values of the normal proposal distributions for each parameter incremented accordingly until stable acceptance rates between 25%–35% are reached; this process is occasionally referred to as Sigma Tuning. In contrast to the earlier Simulated Annealing and later MCMC stages, Sigma Tuning only requires a single chain and has been implemented in such a way that it automatically completes once the acceptance rates stabilize. In cases where the observational data is sparse, or parameters highly correlated, the use of an ensemble sampler has been proven to increase the sampling efficiency (Goodman & Weare 2010; Foreman-Mackey et al. 2013). Thus, our recommended approach is to utilize the starting position produced by Simulated Annealing to initialize the walkers of the affine-invariant ensemble sampler built into emcee (Foreman-Mackey et al. 2013).
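The width-adjustment rule at the heart of such a Sigma Tuning stage might be sketched as follows (the 25%–35% target band matches the text above; the multiplicative adjustment factor is our assumption):

```python
def tune_sigma(acc_rate, sigma, target=(0.25, 0.35), factor=1.25):
    """One update of a proposal-distribution width from the measured
    acceptance rate: widen sigma when accepting too often, shrink it
    when accepting too rarely, leave it alone inside the target band."""
    lo, hi = target
    if acc_rate > hi:
        return sigma * factor
    if acc_rate < lo:
        return sigma / factor
    return sigma
```

Repeating this after every batch of ∼200 samples per varied parameter drives each acceptance rate into the target band, at which point the widths are frozen for the MCMC stage.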

The solutions for determining a set of initial parameters and tuning the proposal distribution widths lend themselves well to integration with other third-party sampling tools, including PyMC (Fonnesbeck et al. 2015) and PyMultiNest (Feroz et al. 2009).

4.2. Verification

Our model was verified with synthetic data produced using an independent Keplerian tool built upon the PyAstronomy package.11 The observable values had their errors realized from Gaussian distributions, using a percentage of each observable's absolute mean as the distribution widths. Such synthetic data were produced for various systems spanning the range of exoplanets and binary star systems observed to date to ensure ExoSOFT's ability to fit any system of interest.

For demonstration purposes, the approximate orbital elements of Jupiter were used assuming a host star at a distance of 20 pc, or a parallax of 50 mas, with the orbital plane inclined by 45° to Earth's line of sight. A variety of measurement errors were assumed for the same orbital elements to show the convergence of the resulting posterior distributions with improving data quality; the errors indicated are a percentage of each observable parameter's absolute mean value.

Figure 1 shows the marginalized posteriors using realizations of 10%, 5%, and 1% Gaussian noise on the measured parallax, radial velocity, and angular separation. The 68% confidence regions shrink by a factor of ∼10 from the 10% to 1% cases, shown as the green and red lines in Figure 1, respectively. The posterior distributions of T0 and ω (not shown) have larger widths and decrease faster with improving data quality; this is an artifact of our choice of a nearly circular orbit. In all of the hypothetical Jupiter demonstration cases, ExoSOFT used a uniform prior for e, appropriate for planets. This was implemented by drawing ω and e simultaneously according to the parametrization from Albrecht et al. (2012).


Figure 1. Convergence of ExoSOFT's marginalized posterior probability distributions to noisy synthetic data. The observational errors in astrometry and radial velocity are assumed to be a percentage of the absolute mean value for each observable (Δδ, Δα, v). We draw a realization of the noise for each measurement from a normalized Gaussian of given width. The lines shown are for 10%, 5%, and 1% errors, corresponding to the colors green, blue, and red. The assumed true values are indicated with black vertical lines.


Figure 2 shows the resulting marginalized posteriors for one realization of 5% measurement errors; Table 2 provides a complete summary of the fit values. The covariance between a subset of the orbital elements is represented in a corner plot in Figure 4. Here the relationship between the masses and semimajor axis can be seen, arising from Kepler's third law in Equation (11) being directly included in the model equations.


Figure 2. Marginalized posterior probability distributions for a subset of the parameters fit during verification with synthetic data of a Jupiter analogue with a 5% realization of the errors in radial velocity and astrometry, at a distance of 20 pc and inclination of 45°. The darker blue represents the 68% regions of confidence around the medians, with the lighter being that of 95% confidence. As expected, the vertical black lines representing the true values fall within our 68% confidence regions.


Table 2.  Synthetic Jupiter Fit Results

Parameter Synthetic Best Sample Median 1σ Range
m1 [M⊙] 1 0.87 0.93 0.13
m2 [MJ] 1 0.942 0.985 0.094
ϖ [mas] 50 51.4 50.3 2.3
Ω2 [°] 100.6 101.4 101.7 1.1
e 0.048 0.051 0.042 0.014
T0 [MJD] 50639 50670 50660 140
P [Years] 11.9 11.94 12.05 0.18
i [°] 45 43.7 44.2 2.2
ω2 [°] 14.8 17 16 12
atot [au] 5.21 4.98 5.13 0.25
γ [m s−1] 0 0.03 −0.06 0.16
χ2 (ν) 46.9 (35)


The best sample, defined as the set of stored parameters with the highest likelihood, has χ2 = 46.9 for 35 degrees of freedom (25 epochs of radial velocity data, 10 epochs of astrometry in R.A. and decl., and 10 fitted parameters), for a reduced χ2 = 1.34. Figure 3 shows the orbit for the best sample plotted atop the noisy astrometry and radial velocity data. The residuals, as expected, scatter about the best sample's orbit and typically lie ∼1σ away.


Figure 3. Orbit for the best sample plotted atop the synthetic astrometry (left) and radial velocity data (right) of our Jupiter analogue test case with a 5% realization of the errors. In the astrometry fit, the solid green line represents the projected semimajor axis and the dashed–dotted line the line of nodes. The red dots represent the predicted location and radial velocity values at the time of each observation.


Figure 4. Corner plot for a subset of the parameters fit during verification with synthetic data of a Jupiter analogue with a 5% realization of the errors in radial velocity and astrometry, at a distance of 20 pc and inclination of 45°. The solid blue lines represent the true values. The covariance between the masses and total semimajor axis seen here arises from the direct use of Kepler's third law in our model discussed in Section 2.


The joint model given in Section 2 mutually constrains the parameters P, mtot, ω*, i, e, and T0, where ω* represents the value for either body following the relation ω1 = ω2 + π. The other parameters (γ, m2, Ω2, ϖ) are constrained by a single observation type, or in the form of a combined parameter, as is the case for mtot = m1 + m2. We performed the final fitting with the affine-invariant ensemble sampler of emcee and investigated the mixing and convergence of each parameter using the integrated autocorrelation time, discussed in Section 3.3 and represented in Equation (22). The least convergent parameter in all three error realizations was T0: with a total of 5 × 10⁶ samples distributed over 1000 walkers, its τ was 690 for the 5% case. This slow convergence is an artifact of our choice of a nearly circular orbit (the definition of T0 is arbitrary for a circular orbit) and is much less pronounced for more eccentric systems.

Although this test case is at the edge of current instruments' sensitivity, the ∼0.″2 separation is within the expected limits of upcoming instruments like CHARIS (Peters et al. 2012), SPHERE (Claudi et al. 2006), and GPI (Macintosh et al. 2014). However, even with the improved contrast ratios expected at these separations, this hypothetical planetary system would need to be dramatically younger than Jupiter for detection through direct imaging techniques. The uncertainties in m1 and m2 also depend on the precision of the measured parallax, though this is already better than 1% for many bright, nearby stars (van Leeuwen 2007) and will further improve with the forthcoming Gaia data release.

An example of ExoSOFT applied to real data can be found in Hełminiak et al. (2016). In that paper, we introduce the first direct imaging detections of the companion to V450 And, solve for its joint orbital solution, and compare the resulting dynamical masses to those from stellar model fits based on the photometry.

5. Conclusion

We have presented ExoSOFT, a toolbox for the Keplerian orbital analysis of exoplanets and binary star systems. It is free and open source, fits any combination of astrometric and radial velocity data, and offers four of its own parameter space exploration techniques, along with an integration of the affine-invariant ensemble sampler in emcee (Foreman-Mackey et al. 2013). Our suggestion is to first use Simulated Annealing to find a set of parameters near the global maximum likelihood. This can then be followed by Sigma Tuning and MCMC to optimize and fill out the posterior probability distributions. When data are sparse or parameters are highly correlated, we instead recommend emcee to increase sampling efficiency. ExoSOFT is packaged together with a set of post-processing and plotting routines to summarize the results. Verifications were performed by fitting noisy synthetic data, produced with an independent Keplerian model, for a variety of systems spanning the parameter space. Examples of its use on real data include Hełminiak et al. (2016) for the V450 And system, and the τ Boo AB binary in Mede & Brandt (2013).
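The recommended strategy of an annealing burn-in followed by standard MCMC sampling can be illustrated with a toy Metropolis-Hastings sampler. This is a generic sketch of the idea, not ExoSOFT's Simulated Annealing or Sigma Tuning stages; the function name, cooling schedule, and defaults are our own:

```python
import numpy as np

def anneal_then_sample(log_like, x0, step, n_anneal=2000, n_mcmc=5000,
                       t_start=100.0, rng=None):
    """Toy two-stage sampler: a simulated-annealing burn-in whose
    temperature cools geometrically from t_start to 1, followed by
    standard Metropolis-Hastings sampling at T = 1."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x0, dtype=float)
    ll = log_like(x)
    samples = []
    for i in range(n_anneal + n_mcmc):
        # Geometric cooling during the annealing stage, then T = 1.
        temp = t_start ** (1.0 - i / n_anneal) if i < n_anneal else 1.0
        prop = x + step * rng.standard_normal(x.shape)
        ll_prop = log_like(prop)
        # Tempered Metropolis acceptance rule.
        if np.log(rng.uniform()) < (ll_prop - ll) / temp:
            x, ll = prop, ll_prop
        if i >= n_anneal:
            samples.append(x.copy())
    return np.array(samples)
```

The high initial temperature flattens the likelihood surface so the walker can escape local maxima far from the global solution; only the T = 1 samples drawn after cooling are kept for the posterior.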

Simultaneous fitting to both radial velocity and astrometric data helps to break the degeneracies in Keplerian model parameters. The dynamical mass estimates of the companion and the well-constrained values of the other orbital elements found in this way are important clues to substellar formation and evolution. The companion's mass and age may be used to place constraints on the input parameters of atmospheric models for brown dwarfs and low-mass stars. The initial conditions of substellar objects remain uncertain, with theoretical predictions ranging from hot-start to cold-start scenarios (Marley et al. 2007). Mass, eccentricity, and separation are also important to help distinguish between the variety of dynamical evolution possibilities in multiplanet systems.

Recent surveys targeting the high-mass end of substellar companions (Wilson et al. 2016) and the low-mass end (Hébrard et al. 2016) show how the capabilities of new instruments are expanding the range of companions detectable with the radial velocity method. Complementary to these, observations of HR 8799 with SPHERE and GPI (e.g., Ingraham et al. 2014; Apai et al. 2016; Bonnefoy et al. 2016; Zurlo et al. 2016) highlight the improved astrometric and characterization capabilities of next-generation integral field spectrographs. Combining ground-based facilities with the Gaia spacecraft will broaden not only the mass range, but also the number of objects simultaneously observed with both radial velocity and direct imaging techniques. With its capacity to simultaneously fit both of these data types, effectively explore the parameter space, and with its post-processing and plotting tools, ExoSOFT is well-suited to constrain the orbits of newly discovered systems.

K.M. gratefully acknowledges support from the Mitsubishi Corporation International Student Scholarship. This work was performed in part under contract with the Jet Propulsion Laboratory (JPL) funded by NASA through the Sagan Fellowship Program executed by the NASA Exoplanet Science Institute.

This research made use of the SIMBAD literature database, operated at CDS, Strasbourg, France, and of NASA's Astrophysics Data System. This research made use of the NASA/IPAC/NExScI Star and Exoplanet Database, which is operated by the Jet Propulsion Laboratory, California Institute of Technology, under contract with the National Aeronautics and Space Administration. This research has made use of the Exoplanet Orbit Database and the Exoplanet Data Explorer at exoplanets.org. This research has made use of the NASA Exoplanet Archive, which is operated by the California Institute of Technology, under contract with the National Aeronautics and Space Administration under the Exoplanet Exploration Program.
