INTRODUCTION

Anticancer agents used in combination are fundamental to successful cancer treatment. Even if a single drug appears sufficient at first, resistance can emerge and reduce its efficacy. Combining different therapies can therefore target tumor cells with different sensitivities and consequently improve the response to treatment. These combination therapies often involve a marketed drug and a new drug. In these associations, the characteristics of the marketed drug (pharmacokinetic–pharmacodynamic (PKPD) relationships, maximal tolerated dose (MTD)) are already known when it is administered alone, whereas the properties of the association have to be characterized. Compared to the development of a single drug, development of combination therapies is complex. Instead of a simple unidimensional dose–safety profile, both drug dosages can be changed, resulting in a continuum of potential MTDs, meaning that multiple potential Recommended Phase 2 Doses (RP2Ds) can be defined.

Different strategies are currently used in phase 1 trials to handle this added complexity. Standard trials fix the level of the marketed drug close to its RP2D, and the other drug is then escalated with the aim of reaching the RP2D in combination. A problem with this standard approach is that the fixed-dose agent can induce substantial toxicity (as it is at its RP2D) and will only permit administration of a minimal amount of the additional drug to avoid exceeding the tolerated toxicity threshold. Consequently, alternative dose finding strategies have been proposed, including alternating dose escalation, simultaneous escalation of both agents, or parallel trials with the standard approach (fixing the level of one drug and escalating the other drug, and vice versa) (1).

Model-based dose escalation designs based on the continual reassessment method (CRM) (2,3,4) have also been proposed. These approaches assume that the probability of a toxicity occurring is characterized by a function of the dose. In the case of combination therapies, the models describing the dose–toxicity relationship are more elaborate than those used for single drugs and require specific parameters when drugs are co-administered (5,6). Other strategies derived from the CRM approach consider ordering dose combinations based on toxicity (7,8). However, even in these approaches, schedule optimization is not explored (except in separate arms) and efficacy is not taken into account.

The benefits of modeling and simulation in drug development have become evident over the last decades. This approach consists of developing mathematical models to describe data as it emerges from drug development in order to allow the possibility to forecast the next step(s) in drug development and therefore provide more information for decision-making (9,10). In parallel, control theory is a concept that deals with influencing the behavior of dynamical systems and how their behavior is modified by feedback mechanisms. By monitoring a system, control theory can provide adjustments to stay within a desired output range and to get as close as possible to a desired goal (11,12). By combining these two approaches using model-based adaptive optimal design (MBAOD), one may have a solution to the aforementioned multi-dimensional problem in the development of combination therapies.

MBAOD approaches using nonlinear mixed effect models have been shown to be less sensitive to misspecification in the design stage (model structure, distribution of parameters, etc.) (13,14). In contrast to non-adaptive methods, MBAOD consists of a series of adaptive steps where the study population is divided into several cohorts. Consequently, adaptive designs allow the updating of prior information by interim analyses as more information is available (15,16,17).

In the initial cohort of an MBAOD study (Fig. 1), the design is driven by an initial guess of models and parameters and/or the investigators’ safety assessments. Once data have been collected, the fit of the models may be evaluated. In this step, the knowledge of the system is updated with new information from the collected data, and the influence of potential misspecification in the prior information may thus be reduced as the study progresses. By enrolling cohorts sequentially and updating the models and parameter estimates, MBAOD can continually improve the design during the trial, resulting in more informed design decisions as the study continues. In the context of a dose finding study, control theory can be used between cohorts to optimize the dosing regimen (dose amount and/or dosing schedule) according to a criterion based on toxicity and/or efficacy. Additionally, MBAODs have the option of ending the trial according to a stopping criterion, which can be evaluated after each cohort. With such a stopping criterion, a study may be stopped before enrolling the pre-determined total number of patients, and thus the sample size could be smaller compared to classical approaches (18).
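The adaptive cycle described above can be sketched as a generic loop. This is only an illustrative skeleton, not the MBAOD R-package interface: the function name `run_adaptive_trial` and the four callables (`collect_cohort`, `estimate`, `optimize`, `should_stop`) are hypothetical placeholders for the simulation or data collection, estimation, optimization, and stopping steps.

```python
from typing import Any, Callable, List, Tuple


def run_adaptive_trial(
    initial_design: Any,
    collect_cohort: Callable[[Any], List[Any]],
    estimate: Callable[[List[Any], Any], Any],
    optimize: Callable[[Any, List[Any]], Any],
    should_stop: Callable[[List[Any], Any], bool],
    max_cohorts: int = 15,
) -> Tuple[Any, List[Any]]:
    """Hypothetical skeleton of a cohort-wise adaptive design loop.

    Returns the last proposed design and the list of designs actually given.
    """
    design = initial_design
    data: List[Any] = []      # pooled observations from all cohorts so far
    history: List[Any] = []   # design administered to each cohort
    params: Any = None        # current model knowledge (prior guess at start)

    for _ in range(max_cohorts):
        history.append(design)
        # Real study: observed data. Methodological study: simulated data.
        data.extend(collect_cohort(design))
        # Update the model parameters with all data collected so far.
        params = estimate(data, params)
        # Choose the design for the next cohort given the updated knowledge.
        proposed = optimize(params, history)
        # Evaluate the stopping criterion after each cohort.
        if should_stop(history, proposed):
            return proposed, history
        design = proposed
    return design, history
```

Any concrete trial would supply its own four steps; the loop itself only fixes the order in which they are called.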

Fig. 1. Schematic of the model-based adaptive optimal design (MBAOD) process used in a real study and in a case of methodology (using a simulation step)

In projects aiming at studying the properties of the MBAOD methodology in general, or in the planning of a specific trial (in order to a priori optimize adaptation settings), a simulation step is included in the MBAOD process to mimic the conduct of a study (Fig. 1).

In this work, the use of MBAOD was evaluated through different simulated scenarios to assess whether the design aspects of dose escalation strategies are favorable and robust enough to identify “optimal” dosing regimen combinations of anticancer drugs.

MATERIALS AND METHODS

In this study, the use of the MBAOD concept in the context of dose finding of combination therapy was exemplified by the association of a marketed drug (paclitaxel) and a hypothetical compound in development (denoted as drug S) to be added on top of paclitaxel (the rationale for the combination being a decreased risk of drug resistance and/or the association of compounds with separate mechanisms of action). In this combination, paclitaxel is given once a week for 3 weeks as a 1-h infusion and drug S is administered as an IV bolus. It was assumed that prior knowledge on paclitaxel (19) and preclinical data on drug S indicated that myelosuppression was expected to be the main toxicity, allowing us to consider neutropenia as the only dose-limiting toxicity (DLT).

For this work, a number of different components described below were required:

  • PKPD Model

A previous model developed by Henningsson et al. (20) was used to simulate concentration time profiles of paclitaxel. In this model, the PK of paclitaxel free concentration is described by a two-compartment disposition model with first order elimination. The PK of drug S was also described by a two-compartment model with first order elimination.

The PK profiles were used to drive the hematological model developed by Friberg et al. (21), which was used to simulate neutrophil count data. This semi-physiological model consists of one compartment representing the proliferating cells in the bone marrow and three transit compartments mimicking the maturation of the non-proliferating precursors into neutrophils, which are then transferred into the circulating compartment where neutrophils are observed. The model contains a feedback function that mimics the action of a stimulating factor on the proliferation rate when the circulating cell count is below the baseline. For both compounds, the drug effects were described by linear functions (slope parameter) acting additively on the proliferative cells.
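As an illustration of the structure just described, the following sketch integrates a myelosuppression model of this type with a simple Euler scheme. It assumes the commonly published form of the model (transit rate constant ktr = 4/MTT, feedback term (Circ0/Circ)^γ); the parameter values, the function name `simulate_neutrophils`, and the `drug_effect` callable are hypothetical and are not the values of Table I. The equations actually used in this work are those of the supplementary material.

```python
from typing import Callable, List, Tuple


def simulate_neutrophils(
    drug_effect: Callable[[float], float],
    circ0: float = 5.0,    # baseline neutrophil count (10^9/L), illustrative
    mtt: float = 125.0,    # mean transit time (h), illustrative
    gamma: float = 0.2,    # feedback exponent, illustrative
    t_end: float = 672.0,  # 4 weeks in hours
    dt: float = 0.1,       # Euler step (h)
) -> List[Tuple[float, float]]:
    """Euler integration of a transit-compartment myelosuppression model.

    drug_effect(t) is assumed to return the summed linear effects, e.g.
    slope_pac * C_pac(t) + slope_S * C_S(t), acting on proliferating cells.
    Returns (time, circulating neutrophil count) pairs.
    """
    ktr = 4.0 / mtt  # (n + 1) / MTT with n = 3 transit compartments
    prol = t1 = t2 = t3 = circ = circ0  # system starts at steady state
    profile: List[Tuple[float, float]] = []
    n_steps = int(round(t_end / dt))
    for i in range(n_steps + 1):
        t = i * dt
        profile.append((t, circ))
        # Feedback increases proliferation when circ falls below baseline.
        feedback = (circ0 / max(circ, 1e-9)) ** gamma
        e_drug = drug_effect(t)
        d_prol = ktr * prol * ((1.0 - e_drug) * feedback - 1.0)
        d_t1 = ktr * (prol - t1)
        d_t2 = ktr * (t1 - t2)
        d_t3 = ktr * (t2 - t3)
        d_circ = ktr * (t3 - circ)
        prol += dt * d_prol
        t1 += dt * d_t1
        t2 += dt * d_t2
        t3 += dt * d_t3
        circ += dt * d_circ
    return profile
```

With `drug_effect` returning zero, the system stays at baseline; with a transient positive drug effect, the circulating count shows the delayed nadir that this model structure is designed to capture.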

During phase 1 trials in oncology, the main focus is on safety and PK, efficacy data often being difficult to assess given the heterogeneity of the patient population. However, several studies have shown a correlation between preclinical efficacy determined from xenograft experiments and clinical response (22,23). Wang et al. (22) demonstrated a correlation of the tumor size at the end of the study (determined by linking the human PK model to a preclinical xenograft PD model) with the probability of success of the clinical study. This approach was adapted to exemplify the integration of efficacy in the current study. The efficacy information from preclinical studies was simulated using a tumor growth inhibition model developed by Simeoni et al. (24). In this model, the natural tumor growth is described by an exponential phase followed by a linear growth phase. In treated animals, it is assumed that the anticancer drug makes some cells non-proliferating, bringing them to death through a mortality chain.
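The tumor growth inhibition structure described above can be sketched as follows, assuming the commonly published form of this model (a switch from exponential growth at rate λ0 to linear growth at rate λ1, and a three-compartment mortality chain with rate k1 fed by a drug potency term k2·c). All parameter values and the function name `simulate_tumor` are hypothetical, chosen only to make the example run.

```python
from typing import Callable


def simulate_tumor(
    conc: Callable[[float], float],
    w0: float = 0.1,       # initial tumor weight (g), illustrative
    lambda0: float = 0.2,  # exponential growth rate (1/day), illustrative
    lambda1: float = 0.5,  # linear growth rate (g/day), illustrative
    k1: float = 0.5,       # transit rate in the mortality chain (1/day)
    k2: float = 0.01,      # drug potency (per concentration unit per day)
    psi: float = 20.0,     # sharpness of the exponential-to-linear switch
    t_end: float = 28.0,   # days
    dt: float = 0.01,      # Euler step (days)
) -> float:
    """Return total tumor weight at t_end for a concentration profile conc(t)."""
    x1, x2, x3, x4 = w0, 0.0, 0.0, 0.0  # proliferating cells + 3 damage stages
    n_steps = int(round(t_end / dt))
    for i in range(n_steps):
        c = conc(i * dt)
        w = x1 + x2 + x3 + x4
        # Exponential growth for small w, approaching linear growth for large w.
        growth = lambda0 * x1 / (1.0 + (lambda0 / lambda1 * w) ** psi) ** (1.0 / psi)
        kill = k2 * c * x1  # cells made non-proliferating by the drug
        dx1 = growth - kill
        dx2 = kill - k1 * x2
        dx3 = k1 * (x2 - x3)
        dx4 = k1 * (x3 - x4)
        x1 += dt * dx1
        x2 += dt * dx2
        x3 += dt * dx3
        x4 += dt * dx4
    return x1 + x2 + x3 + x4
```

Calling the function once with a zero concentration profile and once with a treatment profile gives the untreated and treated tumor sizes at the end of the cycle.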

A schematic of the complete PKPD model is shown in Fig. 2 (differential equations are provided in a supplementary material).

Fig. 2. Representation of the PKPD models for myelosuppression and tumor growth inhibition

  • Simulation Step

Simulations used to generate the patients enrolled in the different cohorts were based on the previously described PKPD model and the parameter values shown in Table I.

Table I Parameters of the PKPD Model Used for the Simulation

Neutrophil counts were simulated twice a week for 4 weeks (DLTs being assessed only during the first cycle according to the Common Terminology Criteria for Adverse Events). All cohorts involved in the trials were composed of three patients.

  • Estimation Steps and Prior Information

The structure of the models used for estimating the simulated datasets was the same as the model used for the simulation step.

The population PK parameters (typical values and variances) for both drugs were considered known from previous clinical studies and were fixed, while the empirical Bayes estimates were estimated. In order to reduce MBAOD runtime, only the drug S drug effect (Slope_drugS), its associated interindividual variability, and the residual error component (ε) were estimated in the myelosuppression model. To bridge the gap between preclinical and clinical settings regarding toxicity and to use all the information available, the drug effect in preclinical species scaled up to human (considering species differences in protein binding and drug sensitivity) was used as prior information for drug S (25). A relative standard error of 100% of the prior parameter value was chosen to represent the twofold margin usually considered reasonable when predicting from preclinical to clinical (25). Interindividual variability on the drug effect was considered a system-specific parameter; therefore, an estimate from previous clinical studies was used as the prior parameter value (21) (as shown in Table II). In one scenario, a deviation of more than twofold between the drug effect parameter value scaled from preclinical to clinical (used as prior information) and the true drug effect parameter value in humans was investigated to explore the sensitivity to a biased preclinical prior. Prior information was implemented using the NWPRI subroutine (normal-inverse Wishart distribution) available in NONMEM (26).
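To give an intuition for how a prior with a 100% relative standard error is progressively outweighed by clinical data, the following toy example uses a scalar normal prior combined with a normally distributed data-based estimate (a conjugate update). This is a deliberately simplified analogue and not the NWPRI algorithm; the function name `normal_posterior` and all numbers are hypothetical.

```python
import math
from typing import Tuple


def normal_posterior(
    prior_mean: float,
    prior_rse: float,
    data_estimate: float,
    data_se: float,
) -> Tuple[float, float]:
    """Precision-weighted combination of a normal prior and a normal estimate.

    prior_rse is the relative standard error of the prior (1.0 means 100%).
    Returns (posterior mean, posterior standard deviation).
    """
    prior_sd = abs(prior_mean) * prior_rse
    w_prior = 1.0 / prior_sd ** 2  # precision of the prior
    w_data = 1.0 / data_se ** 2    # precision of the data-based estimate
    post_var = 1.0 / (w_prior + w_data)
    post_mean = post_var * (w_prior * prior_mean + w_data * data_estimate)
    return post_mean, math.sqrt(post_var)
```

In this simplified setting, a prior value of 1.0 with 100% RSE and an equally uncertain data-based estimate of 2.0 give a posterior mean halfway between the two; as the data-based standard error shrinks with additional cohorts, the posterior moves toward the data.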

Table II Estimated Parameters of the PKPD Model and Prior Information Used in the Estimation Step

  • Optimization

    • Optimization Process and Selection Criteria:

The optimization step to find the best dosing regimen was executed by targeting a maximum efficacy with a reasonable probability of DLT. This was done by maximizing a cost function consisting of two parts:

One part considered toxicity. A probability of DLT greater than or equal to 1/3 was considered unacceptable. This probability was determined by simulating 500 individuals from the PKPD model and deducing the proportion of individuals experiencing a grade 4 neutropenia.

The other part of the cost function accounted for the efficacy aspect. As mentioned above, the tumor size at the end of the study has been correlated with the probability of success in clinical studies in later development (22). Accordingly, a minimal tumor size value at the end of the study when simulating from the human PK models and the preclinical xenograft PD model was targeted.

The cost function maximized by the MBAOD process was given by:

$$ \xi^{\ast}=\underset{\xi \in \xi_{\mathrm{design}}}{\arg\max}\left(\frac{1-\mathrm{Tox}}{TS}\right) $$

where ξ* is the optimal dosing regimen of the combination therapy (given the design space ξ_design), TS is the population prediction of the tumor size at the end of the treatment (after 4 weeks) when human PK is driving tumor killing in the xenograft animal PD model (22), and Tox is a binary variable used to incorporate an acceptance criterion for toxicity:

$$ \mathrm{Tox}=\begin{cases}0 & \text{when the probability of DLT is } <1/3\\ 1 & \text{when the probability of DLT is } \ge 1/3\end{cases} $$

Thus, between cohorts, the MBAOD process will select the arrangement of paclitaxel dose, drug S dose, and drug S schedule that maximizes the aforementioned cost function.
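The selection step can be sketched in a few lines. The example below assumes that the probability of DLT is the fraction of simulated individuals whose neutrophil nadir falls below the grade 4 limit (taken here as 0.5 × 10^9/L, the usual Common Terminology Criteria threshold, which is not stated explicitly in the text); the function names and the candidate values are hypothetical.

```python
from typing import Dict, Hashable, Sequence, Tuple


def prob_dlt(nadirs: Sequence[float], grade4_limit: float = 0.5) -> float:
    """Fraction of simulated individuals with a nadir below the grade 4 limit."""
    return sum(1 for n in nadirs if n < grade4_limit) / len(nadirs)


def tox_flag(p_dlt: float, limit: float = 1.0 / 3.0) -> int:
    """Binary toxicity variable: 1 if P(DLT) >= 1/3, else 0."""
    return 1 if p_dlt >= limit else 0


def cost(p_dlt: float, tumor_size: float) -> float:
    """(1 - Tox) / TS: zero for unacceptable toxicity, else larger for smaller TS."""
    return (1 - tox_flag(p_dlt)) / tumor_size


def select_regimen(candidates: Dict[Hashable, Tuple[float, float]]) -> Hashable:
    """Return the candidate with the largest cost.

    candidates maps a regimen label to (P(DLT), predicted tumor size).
    If every candidate is too toxic, all costs are 0 and the first one wins.
    """
    return max(candidates, key=lambda r: cost(*candidates[r]))
```

For instance, among hypothetical candidates `{"A": (0.10, 2.0), "B": (0.30, 1.0), "C": (0.40, 0.5)}`, regimen C has the smallest tumor size but is excluded by its toxicity, so B is selected.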

  • Design Space

Paclitaxel was assumed to be administered as a weekly 1-h infusion for 3 weeks. A wide interval of paclitaxel doses was investigated in order to cover a broad range of the dose–safety space (although such low doses might not be investigated in clinical practice). The allowed dose range was between 10 and 80 mg/m2 in increments of 10 mg/m2 (eight dose levels). Drug S was assumed to be given as a bolus dose within a range of 10 to 265 mg, each level being 50% higher than the previous allowed dose level (nine dose levels). Four different dosing schedules for drug S were allowed: administration every second week, 3 weeks “on” 1 week “off,” once a week for 4 weeks, or twice a week for 4 weeks. Given the true PKPD model parameters and the design space, the maximal tumor growth inhibition (TGI%) value that can be reached is 137%, TGI% being calculated with the following formula (22):

$$ TGI\%=\frac{TS_{\mathrm{vehicle}}-TS_{\mathrm{treatment}}}{TS_{\mathrm{vehicle}}-TS_{\mathrm{initial}}}\times 100 $$

where TGI% represents the tumor growth inhibition (%), TSvehicle is the tumor size after one cycle when no drug is administered (placebo arm), TStreatment is the tumor size after one cycle of treatment, and TSinitial is the tumor size at the onset of the treatment.
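A direct transcription of this formula helps to interpret values above 100%. The tumor sizes used in the comments are hypothetical and serve only to show the three regimes (no effect, stasis, regression).

```python
def tgi_percent(ts_vehicle: float, ts_treatment: float, ts_initial: float) -> float:
    """Tumor growth inhibition (%) relative to the untreated (vehicle) arm.

    0%    -> treated tumor ends at the same size as the vehicle arm
    100%  -> treated tumor ends at its initial size (stasis)
    >100% -> treated tumor ends smaller than its initial size (regression)
    """
    return (ts_vehicle - ts_treatment) / (ts_vehicle - ts_initial) * 100.0
```

Under this definition, a TGI% of 137% corresponds to a tumor that has shrunk below its size at the onset of treatment.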

Figure 3 shows predictions of both the probability of DLT and tumor growth inhibition in relation to dose and dosing schedules after simulations from the “true” PKPD model parameters.
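The size of the design space described in this section can be checked by simple enumeration. In this sketch the drug S doses are represented by their level index (1 to 9) rather than in mg, because the rounding applied to the individual dose levels is not given; the variable names are hypothetical.

```python
from itertools import product

# Paclitaxel: 10 to 80 mg/m2 in steps of 10 mg/m2 (eight levels).
PACLITAXEL_DOSES_MG_M2 = list(range(10, 81, 10))

# Drug S: nine dose levels, identified by index (1 = lowest, 10 mg).
DRUG_S_LEVELS = list(range(1, 10))

# Drug S schedules, in the order listed in the text.
DRUG_S_SCHEDULES = [
    "every second week",
    "3 weeks on, 1 week off",
    "once a week",
    "twice a week",
]

# Every (paclitaxel dose, drug S level, drug S schedule) combination.
DESIGN_SPACE = list(product(PACLITAXEL_DOSES_MG_M2, DRUG_S_LEVELS, DRUG_S_SCHEDULES))
```

The enumeration contains 8 × 9 × 4 = 288 candidate dosing regimens, which is the set over which the cost function is evaluated before any escalation constraint is applied.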

Fig. 3. Heatmaps of the true dose–toxicity relationship (top) and of the true dose–efficacy relationship for each possible dosing schedule (amounts for each drug are given per dose)

  • Initial Design

The initial design (cohort 1) of this simulation study was selected to ensure the safety of the patients by using the lowest allowed dose level of drug S and by choosing the dosing schedule that administered the lowest possible total amount of drug S to patients (i.e., drug S administered every second week). Additionally, so as not to expose patients to a non-efficacious treatment (respecting ethical considerations) and to explore the impact of the paclitaxel starting dose on the MBAOD approach, the paclitaxel dose for the initial design was set to 60, 70, or 80 mg/m2.

  • Constraints on Dose Escalation

There are some concerns that should be taken into consideration when performing group sequential trials such as MBAOD. In first in human studies where the knowledge of the PKPD characteristics is poor, it might not be acceptable to escalate directly from one dosing regimen to another regimen while changing both dosing schedule and drug dose at the same time. Therefore, several rules were implemented to make the dose escalation process safer.

  • Dose Constraint

A constraint on the drug S dose escalation was used in all scenarios. A dose level could only be selected if the next lower dose level had been tested in a previous cohort.

  • Schedule Constraint

A constraint on the dosing schedule was also implemented in some scenarios. The four different possibilities in the design space for the dosing schedule of drug S were ranked in terms of total drug amount administered as follows:

  1. Once every 2 weeks

  2. 3 weeks on, 1 week off

  3. Once a week

  4. Twice a week

With the dosing schedule constraint, a schedule could only be selected in the optimization process if the previous schedule in the ranking had already been studied.

  • Optimization of One Variable at a Time

During the optimization process, only the change of one variable at a time (either paclitaxel dose, drug S dose, or drug S dosing schedule) was allowed in all scenarios to avoid large dose escalations.
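The three escalation rules above (dose constraint, schedule constraint, and one variable at a time) can be combined into a single filter on the candidate regimens. The sketch below encodes a regimen as a tuple of zero-based indices (paclitaxel level, drug S level, schedule rank). It assumes that a level counts as tested if it appeared in any previous regimen and that paclitaxel may move to any level when it is the only variable changed; both are interpretations, and the function name is hypothetical.

```python
from typing import Iterable, List, Tuple

Regimen = Tuple[int, int, int]  # (paclitaxel level, drug S level, schedule rank)


def allowed_next_regimens(
    current: Regimen,
    tested: Iterable[Regimen],
    n_pac: int = 8,
    n_s: int = 9,
    n_sched: int = 4,
    schedule_constraint: bool = True,
) -> List[Regimen]:
    """Regimens that may be selected for the next cohort."""
    seen = set(tested) | {current}
    s_tested = {r[1] for r in seen}      # drug S levels already given
    sched_tested = {r[2] for r in seen}  # schedule ranks already given
    pac, s, sched = current

    # One variable at a time: change paclitaxel, drug S, or schedule (or none).
    candidates = {current}
    candidates.update((p, s, sched) for p in range(n_pac))
    candidates.update((pac, k, sched) for k in range(n_s))
    candidates.update((pac, s, j) for j in range(n_sched))

    def permitted(r: Regimen) -> bool:
        _, k, j = r
        # Dose constraint: the next lower drug S level must have been tested.
        if k > 0 and (k - 1) not in s_tested:
            return False
        # Schedule constraint: the previous schedule in the ranking must have been tested.
        if schedule_constraint and j > 0 and (j - 1) not in sched_tested:
            return False
        return True

    return sorted(r for r in candidates if permitted(r))
```

Starting from the lowest drug S level and schedule, only one step up in drug S dose or (with the schedule constraint) one step up in schedule is reachable, which illustrates why the constrained scenarios need more cohorts to reach the upper part of the design space.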

  • Stopping Criterion:

    • The “3 + 3 Rule”

The 3 + 3 design is a rule-based algorithm which proceeds with cohorts of three patients. The first cohort of three patients is treated at the starting dose. If none of them experiences a DLT, then the next cohort will be treated at the next dose/dose schedule level. If one DLT is observed in the first three patients, the cohort is expanded by a further three patients. The dose escalation is stopped when at least two patients (out of the group of three or six patients) given a particular dose experience a DLT. The dosing regimen tested in the previous cohort is thus considered as the recommended dosing regimen for the phase 2 trial.
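The decision logic of the rule can be written compactly, together with the probability that escalation stops at a given regimen when the true DLT probability is known. The text does not state what happens after an expanded cohort with a single DLT; the sketch assumes the conventional behavior (escalation continues). Function names are hypothetical.

```python
def three_plus_three_decision(n_dlt: int, n_patients: int) -> str:
    """Decision after observing n_dlt DLTs among n_patients (3 or 6) at one regimen."""
    if n_patients not in (3, 6):
        raise ValueError("the rule is evaluated on 3 or 6 patients")
    if not 0 <= n_dlt <= n_patients:
        raise ValueError("n_dlt must be between 0 and n_patients")
    if n_dlt >= 2:
        return "stop"      # at least two DLTs: escalation ends
    if n_patients == 3 and n_dlt == 1:
        return "expand"    # one DLT in three: add three more patients
    return "escalate"      # 0/3 or at most 1/6 (assumed conventional behavior)


def prob_stop_at_regimen(p_dlt: float) -> float:
    """Probability that the rule stops at a regimen with true DLT probability p_dlt."""
    q = 1.0 - p_dlt
    p_zero_of_three = q ** 3
    p_one_of_three = 3.0 * p_dlt * q ** 2
    p_two_or_more = 1.0 - p_zero_of_three - p_one_of_three
    # Stop directly, or expand after one DLT and see at least one more DLT.
    return p_two_or_more + p_one_of_three * (1.0 - q ** 3)
```

As a hypothetical illustration, `prob_stop_at_regimen(0.2)` evaluates to about 0.29: a regimen whose true DLT probability is well below 1/3 can still be rejected in a sizeable fraction of trials because the decision rests on three to six patients.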

  • Stable Selected Dosing Regimen

As an alternative, a stopping criterion related to the stability of the selected design after the optimization process was implemented: if the optimization process selected the same design two times in a row, the process ended, meaning that enough information had been collected to determine the best dosing regimen.

  • Maximum Number of Patients

In addition to the two stopping criteria described above, the MBAOD process was set to enroll at most 45 patients (i.e., 15 cohorts). If after enrolling 45 patients one of the stopping criteria was not reached, the MBAOD process stopped.
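The stable-design criterion and the cap on the number of patients can be expressed as two small checks. The sketch assumes that `selected` holds the regimen chosen by the optimization after each cohort, in order; the function names are hypothetical.

```python
from typing import Hashable, Sequence


def stable_design_reached(selected: Sequence[Hashable]) -> bool:
    """True if the optimization chose the same regimen two times in a row."""
    return len(selected) >= 2 and selected[-1] == selected[-2]


def max_patients_reached(
    n_cohorts: int, cohort_size: int = 3, max_patients: int = 45
) -> bool:
    """True once the enrolled cohorts reach the maximum number of patients."""
    return n_cohorts * cohort_size >= max_patients


def should_stop(selected: Sequence[Hashable], n_cohorts: int) -> bool:
    """Stop on a stable selected regimen or when 45 patients have been enrolled."""
    return stable_design_reached(selected) or max_patients_reached(n_cohorts)
```

Replacing the comparison in `stable_design_reached` (for example, requiring three identical selections) would be one way to explore alternative stability criteria.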

Different Tested Scenarios

In order to evaluate the performances of MBAOD and to test its flexibility, different scenarios were compared where different possible combinations of stopping criterion and dose and schedule constraints in the escalation step were investigated. For each scenario (described in Table III), 300 clinical trials were simulated (100 replicates for each initial design).

Table III Description of the Different Scenarios Compared in the MBAOD Process

Also, the impact of the value used as preclinical prior for drug S drug effect in the myelosuppression model was investigated (supplemental material).

Evaluation of Design Performance

The performance of the different scenarios was compared according to different criteria: (1) the frequency with which each dosing regimen was selected for the subsequent phase 2 trial, compared to the true best dosing regimens, (2) the occurrence of DLTs per trial, (3) the number of included patients per trial, and (4) the average predicted efficacy in the trial.

Tools Used

MBAOD processes were simulated using the MBAOD R-package (27). This package currently uses PopED (28,29), NONMEM (26), PsN (30), and R (31) to handle the various tasks inherent to simulating and evaluating the MBAOD process. Parallel computational processes available in the MBAOD R-package and compilation in C-language of the scripts were also used to shorten runtimes. A schematic for the examined MBAOD process is shown in Fig. 4.

Fig. 4. Schematic of the model-based adaptive optimal design (MBAOD) process explored here

RESULTS

Selected Dosing Regimen for the Phase 2 Trial

Throughout all the tested scenarios, those using the 3 + 3 rule mainly predict non-optimal dosing regimens (with low efficacy) as the selected dosing regimen for the phase 2 trial (Table IV and Fig. 5). Indeed, in the two 3 + 3 rule scenarios, the dosing regimen most frequently selected corresponds to the dosing regimen used as starting point for the MBAOD process (10 mg every second week for drug S with either 60, 70, or 80 mg/m2 of paclitaxel every week for 3 weeks). This was true in 24.0% and 22.0% of the cases with and without schedule constraint, respectively.

Table IV Characteristics of the Selected Designs for the Phase 2 Trial with the Different Scenarios Compared in the MBAOD Process

Fig. 5. Heatmaps of the frequency of the selected dosing regimen for the phase 2 trial over 300 replicates for each scenario. Row (a) displays the results for the scenario with 3 + 3 rule, row (b) the scenario without 3 + 3 rule, row (c) the scenario with 3 + 3 rule and schedule constraint, row (d) the scenario with only schedule constraint. The black line represents the limit at which the probability of DLT is greater than or equal to 1/3 (i.e., the MTD curve). Amounts for each drug are given per dose

Furthermore, it is important to notice that for those two scenarios, the process detects in around 25% of the cases (27.0% and 23.0% with and without schedule constraint, respectively) that the first cohort is too toxic (observation of at least two patients with DLTs). This suggests either that even the first dose level of drug S (administered with the lowest dosing schedule) has to be lowered, or simply that the addition of drug S on top of paclitaxel is not worthwhile given their toxicity profiles and that the development of the combination has to be stopped (even though the true risk of DLT is much lower than 1/3 given the simulation settings). Additionally, in all simulated trials, scenarios with the 3 + 3 rule never select a dosing regimen with the highest possible efficacy while maintaining a probability of DLT lower than the acceptable threshold of 1/3. Finally, it is worth noting that with the 3 + 3 rule, an overly toxic dosing regimen was selected for the phase 2 trial in only 0.6 and 1.0% of the simulated trials. These results suggest that the 3 + 3 rule reduces the risk, but at the expense of efficacy.

The difference between the two 3 + 3 scenarios lies in the dose escalation trajectory selected by the optimization process leading to the best dosing regimen. For the scenario with the 3 + 3 rule only, no schedule constraint was set, allowing the optimization process to select the shortest path to maximum efficacy, that is, to directly select the “twice a week” dosing schedule, since it increases the total amount of drug S eightfold compared to the starting dosing schedule. For the scenario with both the 3 + 3 rule and schedule constraint, it was not possible to select the “twice a week” dosing schedule directly, and the MBAOD needed to first select intermediate schedules. This results in a more heterogeneous distribution of the selected dosing regimen compared to the scenario with only the 3 + 3 rule. With this scenario, less than 1% of the simulated trials select the schedule “3 weeks on, 1 week off” and the schedule “twice a week” is never selected. Despite this difference, performances regarding the selection of the dosing regimen to use in phase 2 were quite similar.

The other tested scenarios only considered the stable selected dosing regimen criterion (with and without schedule constraint), and consequently the trends of the search path to select the best dosing regimen were different compared to previous scenarios. With the conservative aspect introduced by the 3 + 3 rule removed, these scenarios select in 4.0 and 17.0% of the cases a dosing regimen with the highest possible efficacy (i.e., 137% of TGI%) and a probability of DLT lower than 1/3 (i.e., the best possible dosing regimen) (Table IV and Fig. 5). Moreover, a dosing regimen with at least 90% of the efficacy of the best dosing regimen was chosen in 75.7 and 67.0% of the cases, which is much higher than in scenarios considering the 3 + 3 rule (respectively 20.7 and 11.3% of the cases without and with schedule constraint). However, in 16.3 and 29.0% of the simulated trials, the dosing regimen selected for phase 2 is too toxic, showing the potentially aggressive behavior of these scenarios. Contrary to the 3 + 3 rule, those scenarios were allowed to continue to enroll patients until either the stable selected design criterion or the maximum number of cohorts was reached. Consequently, they tend to determine a dosing regimen close to the MTD curve (i.e., close to the curve where the probability of DLT is equal to 1/3) and therefore show a higher risk of selecting a dosing regimen that is too toxic.

When looking in more detail at the results of each scenario (supplemental material), it is noticeable that, regardless of the scenario settings, simulated MBAOD processes whose initial design used a paclitaxel dose lower than its RP2D selected the best dosing regimen (i.e., the dosing regimen with the best efficacy) more frequently and selected fewer dosing regimens with a probability of DLT greater than 1/3.

Occurrence of DLTs per Trial

The average number of patients enrolled when using the 3 + 3 rule was around 13 (12.7 and 13.5 for the two scenarios with the 3 + 3 rule), whereas in scenarios without the 3 + 3 rule the mean number of patients enrolled per trial was higher (22.3 and 32.3 patients per trial for scenarios without and with schedule constraint, respectively). Additionally, fewer DLTs were observed in scenarios that used the 3 + 3 rule (2.7 DLTs per trial vs. 5.2 and 7.9 DLTs per trial without it). However, the proportion of DLTs per trial remained the same (around 25%) regardless of scenario (Table V).

Table V Results of the different scenarios compared in the MBAOD process

Average Predicted Efficacy in the Trial

Throughout the simulated clinical trials, the average tumor growth inhibition (simulated from preclinical knowledge) was much higher with scenarios not considering 3 + 3 rule (111.3 and 108.0% compared to 86.3 and 72.5% with 3 + 3 rule) meaning that more patients would benefit from a more efficacious treatment during the dose escalation process (Table V).

In general, it is important to note that for all the different tested MBAOD processes, none of the replicates stopped because the maximum number of patients was reached but stopped either because of the 3 + 3 rule or because of the stable selected dosing regimen criterion.

DISCUSSION

In this work, a model-based adaptive approach was developed to optimize dose finding in the context of a phase 1 trial involving combination therapy. Indeed, with combination therapy, the dose–safety and dose–efficacy profiles are multi-dimensional since both drug dosages and dosing schedules can be changed resulting in the potential choice of different dosing regimens for the phase 2 trial. The different dose–toxicity profiles showed complex relationships due to the dynamics of the different mechanisms involved in the pharmacology of the drugs (combination of the difference in PK of the drugs with the dynamics of the neutrophils), highlighting the interest of such model-based approaches to describe these patterns.

The best dosing regimens selected in the 3 + 3 rule scenarios were found to be too conservative which was anticipated and in line with literature (1,2,32,33,34,35,36), i.e., the 3 + 3 rule reduced risk at the expense of efficacy. In studies applying the 3 + 3 rule, assessment of the best dosing regimen is mainly based on actual toxicity observations at a tested dose; therefore, the probability of underestimating the best regimen increases with increasing numbers of pre-specified dose levels below the true best dosing regimen. This work highlights the problem that observed toxicities may not be representative of true toxicity probabilities and demonstrates the questionable performance of the 3 + 3 rule to select the dosing regimen to use in a phase 2 trial. In the current study, this problem was further aggravated by the fact that the starting dose of paclitaxel was close to its MTD, as well as by the fact that many drug S dose levels were considered in the dose escalation process. Therefore, the 3 + 3 rule routinely found sub-optimal dosing regimens for these situations.

Scenarios with the 3 + 3 rule were shown to enroll fewer patients and to exhibit a lower number of DLTs compared to scenarios without the 3 + 3 rule. Nevertheless, the proportion of DLTs occurring per trial was found to be similar regardless of the use of the 3 + 3 rule (which is also in line with the literature (36)). This raises the question of how safety should be evaluated during a trial. Considering an absolute number of DLTs per dose level seems to be too restrictive for the dose escalation process and not representative of the true safety of the trial. Scenarios without the 3 + 3 stopping rule allowed for more DLTs in total, but maintained the same proportion of DLTs per trial (i.e., the same safety level) and resulted in a higher chance of reaching an optimal dosing regimen for the later phase 2 study.

Also, in this work, no comparison of trial duration was considered (unlike other publications (36)) because this metric depends on factors that are not predictable or manageable (e.g., the rate of patient enrollment) and that were therefore not taken into account in the presented simulation exercise.

Because cancer indications are typically mixed in phase 1 studies, and since there is generally a time delay before a response can be observed, clinical efficacy data such as tumor size were assumed to be unavailable, and the optimization process regarding efficacy was only performed based on preclinical knowledge. However, if a good clinical pharmacodynamic biomarker for efficacy assessment exists at the time of these dose finding studies, then those measurements and models should, ideally, be integrated into the approach presented here (in place of the preclinical tumor growth model). This would clearly allow for a refinement of the approach, considering all the components of the clinical benefit–risk profile. Also, in this example, only hematological toxicities were considered for the sake of clarity and simplicity; however, it would be possible to combine different types of toxicities in the PKPD models and to adjust the cost function accordingly.

In all the simulated MBAOD processes, a selection of the same dosing regimen twice in a row ends the process. This criterion was considered to represent the fact that enough information has been gained in the trial to assess the best dosing regimen with enough confidence. The definition of stability criteria can be discussed; however, the presented MBAOD approach is flexible enough to modify this criterion as desired and to explore other standards (selection of the same dosing regimen three times in a row, consideration of parameter uncertainty, etc.).

Usually, three patients per cohort are included in trials aiming at determining the toxicity of a drug. This might be non-optimal when the PKPD parameters are used for dose adjustment since it might lead to a toxic dosing regimen if the model parameters are poorly characterized (given the potentially small number of subjects at doses of relevance). Therefore, a reevaluation of the number of patients per cohort might be of interest when using adaptive approaches to limit this dependency.

A subtle question about the dose finding concept in oncology can be raised. Indeed, during classical dose escalation studies, the purpose is to find the highest possible dose, i.e., the MTD. However, one could think that a lower dose than the MTD (or a different association of doses in the case of combination therapies) could still be as efficacious as the MTD (in case of saturation of efficacy). In this work, the optimization process had the possibility to decrease the dose of one component of the combination (by changing either the administered dose or the dosing schedule) if the current dosing regimen was judged inappropriate (considered as leading to too much toxicity) given the collected information, or to stop the trial if a higher dose level led to similar efficacy. Also, the developed MBAOD approach allowed for both dose and schedule to be optimized in the same study which is not common in clinical trials. However, in practice, this possibility has to be left to the opinion of the clinicians.

Even if the developed MBAOD approach showed good results, the best possible dosing regimen was selected in no more than 17% of the simulated trials. One might think that this performance is not good enough for such an adaptive approach. However, it is worth mentioning that, in the present example, the capability of the developed approach was restricted by different constraints used to prevent large dose escalations. Indeed, it was not possible to change both paclitaxel and drug S dose at the same time, meaning that some “best dosing regimens” involving a low dose of paclitaxel and a higher dose of drug S, for instance, were difficult to reach given the initial design. More flexibility in the optimization process (decreasing the dose of one drug while increasing the dose of the other drug, for instance) would most likely have led to better results. This was supported by an investigation of the impact of the initial design (supplementary material). Better results were obtained when using an initial dose of paclitaxel that was not close to its MTD and thus allowed the optimization process more freedom in the search path. Those findings might be helpful in revising the commonly used methods in the development of combination therapies and suggest that the initial design and optimization settings used in the dose escalation procedure should be investigated via simulation, in addition to being discussed with investigators and in agreement with ethical considerations.

Additionally, in the presented example, MBAOD was used to improve the dose finding strategy for combination therapy; however, in a later stage of drug development or in practical use, MBAOD might be used at an individual level to achieve adaptive dose individualization.

In summary, the MBAOD approach showed satisfactory performance and might be less complex or hazardous to run in comparison to other classical protocols that are not using adaptive design. This project highlights that, in the development of a combination therapy, a phase 1 trial might be run more efficiently by using the presented methodology.

CONCLUSION

The presented application of MBAOD in the context of the development of combination therapy showed interesting performance improvements in the capability of handling the multi-dimensional challenge to reach an optimal dosing regimen while considering the benefit–risk profile. This work confirmed the conservative aspect of the widely used 3 + 3 rule. In addition, by gathering information from both preclinical and clinical studies, this adaptive approach could improve the efficiency of phase 1 trials to identify the best dosing regimen for drug combinations given efficacy information available at the current stage of development.