ALBERT

All Library Books, journals and Electronic Records Telegrafenberg

Filter
  • Journals
  • Articles  (10,983)
  • Oxford University Press  (10,983)
  • Biometrika  (404)
  • 3549
  • Biology  (10,983)
  • 1
    Oxford University Press
    Publication Date: 2015-08-21
    Description: Individualized treatment rules recommend treatments on the basis of individual patient characteristics. A high-quality treatment rule can produce better patient outcomes, lower costs and less treatment burden. If a treatment rule learned from data is to be used to inform clinical practice or provide scientific insight, it is crucial that it be interpretable; clinicians may be unwilling to implement models they do not understand, and black-box models may not be useful for guiding future research. The canonical example of an interpretable prediction model is a decision tree. We propose a method for estimating an optimal individualized treatment rule within the class of rules that are representable as decision trees. The class of rules we consider is interpretable but expressive. A novel feature of this problem is that the learning task is unsupervised, as the optimal treatment for each patient is unknown and must be estimated. The proposed method applies to both categorical and continuous treatments and produces favourable marginal mean outcomes in simulation experiments. We illustrate it using data from a study of major depressive disorder.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 2
    Oxford University Press
    Publication Date: 2015-08-21
    Description: Sufficient dimension reduction in regression aims to reduce the predictor dimension by replacing the original predictors with some set of linear combinations of them without loss of information. Numerous dimension reduction methods have been developed based on this paradigm. However, little effort has been devoted to diagnostic studies within the context of dimension reduction. In this paper we introduce methods to check goodness-of-fit for a given dimension reduction subspace. The key idea is to extend the so-called distance correlation to measure the conditional dependence relationship between the covariates and the response given a reduction subspace. Our methods require minimal assumptions, which are usually much less restrictive than the conditions needed to justify the original methods. Asymptotic properties of the test statistic are studied. Numerical examples demonstrate the effectiveness of the proposed approach.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 3
    Oxford University Press
    Publication Date: 2015-08-21
    Description: We propose a geometric framework to assess sensitivity of Bayesian procedures to modelling assumptions based on the nonparametric Fisher–Rao metric. While the framework is general, the focus of this article is on assessing local and global robustness in Bayesian procedures with respect to perturbations of the likelihood and prior, and on the identification of influential observations. The approach is based on a square-root representation of densities, which enables analytical computation of geodesic paths and distances, facilitating the definition of naturally calibrated local and global discrepancy measures. An important feature of our approach is the definition of a geometric $\epsilon$-contamination class of sampling distributions and priors via intrinsic analysis on the space of probability density functions. We demonstrate the applicability of our framework to generalized mixed-effects models and to directional and shape data.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 4
    Publication Date: 2015-08-21
    Description: With the discovery of an increasing number of causal genes for complex human disorders, it is crucial to assess the genetic risk of disease onset for individuals who are carriers of these causal mutations and to compare the distribution of the age-at-onset for such individuals with the distribution for noncarriers. In many genetic epidemiological studies that aim to estimate causal gene effect on disease, the age-at-onset of disease is subject to censoring. In addition, the mutation carrier or noncarrier status of some individuals may be unknown, due to the high cost of in-person ascertainment by collecting DNA samples or because of the death of older individuals. Instead, the probability of such individuals’ mutation status can be obtained from various other sources. When mutation status is missing, the available data take the form of censored mixture data. Recently, various methods have been proposed for risk estimation using such data, but none is efficient for estimating a nonparametric distribution. We propose a fully efficient sieve maximum likelihood estimation method, in which we estimate the logarithm of the hazard ratio between genetic mutation groups using B-splines, while applying nonparametric maximum likelihood estimation to the reference baseline hazard function. Our estimator can be calculated via an expectation-maximization algorithm which is much faster than existing methods. We show that our estimator is consistent and semiparametrically efficient and establish its asymptotic distribution. Simulation studies demonstrate the superior performance of the proposed method, which is used to estimate the distribution of the age-at-onset of Parkinson's disease for carriers of mutations in the leucine-rich repeat kinase 2, LRRK2, gene.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 5
    Publication Date: 2015-08-21
    Description: Smoothing splines provide flexible nonparametric regression estimators. However, the high computational cost of smoothing splines for large datasets has hindered their wide application. In this article, we develop a new method, named adaptive basis sampling, for efficient computation of smoothing splines in super-large samples. Except for the univariate case where the Reinsch algorithm is applicable, a smoothing spline for a regression problem with sample size $n$ can be expressed as a linear combination of $n$ basis functions and its computational complexity is generally $O(n^3)$. We achieve a more scalable computation in the multivariate case by evaluating the smoothing spline using a smaller set of basis functions, obtained by an adaptive sampling scheme that uses values of the response variable. Our asymptotic analysis shows that smoothing splines computed via adaptive basis sampling converge to the true function at the same rate as full basis smoothing splines. Using simulation studies and a large-scale deep earth core-mantle boundary imaging study, we show that the proposed method outperforms a sampling method that does not use the values of response variables.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 6
    Publication Date: 2015-08-21
    Description: We propose tests for nonlinear serial dependence in time series under the null hypothesis of general linear dependence, in contrast to the more widely studied null hypothesis of independence. The approach is based on combining an entropy dependence metric, which possesses many desirable properties and is used as a test statistic, with a suitable extension of surrogate data methods, a class of Monte Carlo distribution-free tests for nonlinearity, and a smoothed sieve bootstrap scheme. We show how, in the same way as the autocorrelation function is used for linear models, our tests can in principle be employed to detect the lags at which a significant nonlinear relationship is present. We prove the asymptotic validity of the proposed procedures and the corresponding inferences. The small-sample performance of the tests in terms of power and size is assessed through a simulation study. Applications to real datasets of different kinds are also presented.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 7
    Publication Date: 2015-08-21
    Description: Contamination caused by outliers is inevitable in data analysis, and robust statistical methods are often needed. In this paper we develop a new approach for robust data analysis on the basis of scoring rules. A scoring rule is a discrepancy measure to assess the quality of probabilistic forecasts. We propose a simple method of estimating not only parameters in the statistical model but also the contamination ratio, i.e., the ratio of outliers. The outliers are detected based on the estimated contamination ratio. For this purpose, we use scoring rules with extended statistical models called unnormalized models. Regression problems are also considered. We study complex heterogeneous contamination wherein the contamination ratio in a response variable may depend on covariate variables, and propose a simple method to estimate a robust regression function and expected contamination ratio. Simulation studies demonstrate the effectiveness of our method.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 8
    Oxford University Press
    Publication Date: 2015-08-21
    Description: The sample covariance matrix, which is well known to be highly nonrobust, plays a central role in many classical multivariate statistical methods. A popular way of making such multivariate methods more robust is to replace the sample covariance matrix with some robust scatter matrix. The aim of this paper is to point out that multivariate methods often require that certain properties of the covariance matrix hold also for the robust scatter matrix in order for the corresponding robust plug-in method to be a valid approach, but that not all scatter matrices possess the desired properties. Plug-in methods for independent components analysis, observational regression and graphical modelling are considered in more detail. For each case, it is shown that replacing the sample covariance matrix with a symmetrized robust scatter matrix yields a valid robust multivariate procedure.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 9
    Oxford University Press
    Publication Date: 2015-08-21
    Description: So-called big data are likely to have complex structure, in particular implying that estimates of precision obtained by applying standard statistical procedures are likely to be misleading, even if the point estimates of parameters themselves may be reasonably satisfactory. While this possibility is best explored in the context of each special case, here we outline a fairly general representation of the accretion of error in large systems and explore the possible implications for the estimation of regression coefficients. The discussion raises issues broadly parallel to the distinction between short-range and long-range dependence in time series theory.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 10
    Publication Date: 2015-08-21
    Description: This paper extends the classical two-regime threshold autoregressive model by introducing hysteresis to its regime-switching structure, which leads to a new model: the hysteretic autoregressive model. The proposed model enjoys the piecewise linear structure of a threshold model but has a more flexible regime switching mechanism. A sufficient condition is given for geometric ergodicity. Conditional least squares estimation is discussed, and the asymptotic distributions of its estimators and information criteria for model selection are derived. Simulation results and an example support the model.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 11
    Publication Date: 2015-08-21
    Description: This paper points out an error in Davidov and Iliopoulos's (Biometrika 100, 778–80) proof of convergence of an iterative algorithm for the proportional likelihood ratio model. It is shown that the iterative algorithm increases the likelihood in each iteration and converges under mild additional conditions when the odds ratio function is bounded.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 12
    Publication Date: 2015-08-21
    Description: Choosing the number of components in a finite mixture model is a challenging task. In this article, we study the behaviour of information criteria for selecting the mixture order, based on either the observed likelihood or the complete likelihood including component labels. We propose a new observed likelihood criterion called $\mathrm{AIC}_{\mathrm{mix}}$, which is shown to be order consistent. We further show that when there is a nontrivial level of classification uncertainty in the true model, complete likelihood criteria asymptotically underestimate the true number of components. A simulation study illustrates the potentially poor finite-sample performance of complete likelihood criteria, while $\mathrm{AIC}_{\mathrm{mix}}$ and the Bayesian information criterion perform strongly regardless of the level of classification uncertainty.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 13
    Oxford University Press
    Publication Date: 2015-08-21
    Description: Outlier detection is an integral component of statistical modelling and estimation. For high-dimensional data, classical methods based on the Mahalanobis distance are usually not applicable. We propose an outlier detection procedure that replaces the classical minimum covariance determinant estimator with a high-breakdown minimum diagonal product estimator. The cut-off value is obtained from the asymptotic distribution of the distance, which enables us to control the Type I error and deliver robust outlier detection. Simulation studies show that the proposed method behaves well for high-dimensional data.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 14
    Publication Date: 2015-08-21
    Description: This paper adopts a nonparametric Bayesian approach to testing whether a function is monotone. Two new families of tests are constructed. The first uses constrained smoothing splines with a hierarchical stochastic-process prior that explicitly controls the prior probability of monotonicity. The second uses regression splines together with two proposals for the prior over the regression coefficients. Via simulation, the finite-sample performance of the tests is shown to improve upon existing frequentist and Bayesian methods. The asymptotic properties of the Bayes factor for comparing monotone versus nonmonotone regression functions in a Gaussian model are also studied. Our results significantly extend those currently available, which chiefly focus on determining the dimension of a parametric linear model.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 15
    Publication Date: 2015-08-21
    Description: The paper develops hierarchical empirical Bayes and benchmarked hierarchical empirical Bayes estimators of positive small area means under multiplicative models. The usual benchmarking requirement is that the small area estimates, when aggregated, should equal the direct estimates for the larger geographical areas. However, while estimating positive small area parameters, the conventional squared error or weighted squared error loss subject to the usual benchmark constraint may not produce positive estimators, so it is necessary to seek other loss functions. We consider a multiplicative model for the original data for estimating positive small area means, and suggest a variant of the Kullback–Leibler divergence as a loss function. The prediction errors of the suggested hierarchical empirical Bayes estimators are investigated asymptotically, and their second-order unbiased estimators are provided. Bootstrapped estimators of these prediction errors for both hierarchical empirical Bayes and benchmarked hierarchical empirical Bayes estimators are also given. The performance of the suggested procedures is investigated through simulation as well as with an example.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 16
    Publication Date: 2015-08-21
    Description: Odds ratios can be estimated in case-control studies using standard logistic regression, ignoring the outcome-dependent sampling. In this paper we discuss an analogous result for treatment effects on the treated in matched cohort studies. Specifically, in studies where a sample of treated subjects is observed along with a separate sample of possibly matched controls, we show that efficient and doubly robust estimators of effects on the treated are computationally equivalent to standard estimators, which ignore the matching and exposure-based sampling. This is not the case for general average effects. We also show that matched cohort studies are often more efficient than random sampling for estimating effects on the treated, and derive the optimal number of matches for a given set of matching variables. We illustrate our results via simulation and in a matched cohort study of the effect of hysterectomy on the risk of cardiovascular disease.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 17
    Publication Date: 2015-08-21
    Description: Current status data occur in contexts including demographic studies and tumorigenicity experiments. In such cases, each subject is observed only once and the failure time of interest is either left- or right-censored (Kalbfleisch & Prentice, 2002). Many methods have been developed for the analysis of such data (Huang, 1996; Sun, 2006), most of which assume that the failure time and the observation time are independent completely or given covariates. In this paper, we present a sieve maximum likelihood approach for current status data when independence does not hold. A copula model and monotone I-splines are used and the asymptotic properties of the resulting estimators are established. In particular, the estimated regression parameters are shown to be semiparametrically efficient. An illustrative example is provided.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 18
    Publication Date: 2016-09-04
    Description: We provide a complete description of possible distributions consistent with any Gaussian latent tree model. This description consists of polynomial equations and inequalities involving covariances between the observed variables. Testing inequality constraints can be done using the inverse Wishart distribution and this leads to simple preliminary assessment of tree-compatibility. To test equality constraints we employ general techniques of tetrad analyses. This approach is effective even for small sample sizes and can be easily adjusted to test either entire models or just particular macrostructures of a tree. Our methods are simple to implement and do not require fitting of the model. The versatility of the techniques is illustrated by performing exploratory and confirmatory tetrad analyses in linguistic and biological settings respectively.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 19
    Oxford University Press
    Publication Date: 2015-05-24
    Description: We propose an automatic structure recovery method for additive models, based on a backfitting algorithm coupled with local polynomial smoothing, in conjunction with a new kernel-based variable selection strategy. Our method produces estimates of the set of noise predictors, the sets of predictors that contribute polynomially at different degrees up to a specified degree $M$, and the set of predictors that contribute beyond polynomially of degree $M$. We prove consistency of the proposed method, and describe an extension to partially linear models. Finite-sample performance of the method is illustrated via Monte Carlo studies and a real-data example.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 20
    Publication Date: 2015-05-24
    Description: The Davis–Kahan theorem is used in the analysis of many statistical procedures to bound the distance between subspaces spanned by population eigenvectors and their sample versions. It relies on an eigenvalue separation condition between certain population and sample eigenvalues. We present a variant of this result that depends only on a population eigenvalue separation condition, making it more natural and convenient for direct application in statistical contexts, and provide an improvement in many cases to the usual bound in the statistical literature. We also give an extension to situations where the matrices under study may be asymmetric or even non-square, and where interest is in the distance between subspaces spanned by corresponding singular vectors.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 21
    Publication Date: 2015-05-24
    Description: A crucial component of performing sufficient dimension reduction is to determine the structural dimension of the reduction model. We propose a novel information criterion-based method for this purpose, a special feature of which is that when examining the goodness-of-fit of the current model, one needs to perform model evaluation by using an enlarged candidate model. Although the procedure does not require estimation under the enlarged model of dimension $k+1$, the decision as to how well the current model of dimension $k$ fits relies on the validation provided by the enlarged model; thus we call this procedure the validated information criterion, VIC($k$). Our method is different from existing information criterion-based model selection methods; it breaks free from dependence on the connection between dimension reduction models and their corresponding matrix eigenstructures, which relies heavily on a linearity condition that we no longer assume. We prove consistency of the proposed method, and its finite-sample performance is demonstrated numerically.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 22
    Oxford University Press
    Publication Date: 2015-05-24
    Description: We propose a method of effective dimension reduction for functional data, emphasizing the sparse design where one observes only a few noisy and irregular measurements for some or all of the subjects. The proposed method borrows strength across the entire sample and provides a way to characterize the effective dimension reduction space, via functional cumulative slicing. Our theoretical study reveals a bias-variance trade-off associated with the regularizing truncation and decaying structures of the predictor process and the effective dimension reduction space. A simulation study and an application illustrate the superior finite-sample performance of the method.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 23
    Oxford University Press
    Publication Date: 2015-05-24
    Description: We incorporate the nascent idea of envelopes (Cook et al., Statist. Sinica 20, 927–1010) into reduced-rank regression by proposing a reduced-rank envelope model, which is a hybrid of reduced-rank and envelope regressions. The proposed model has total number of parameters no more than either of reduced-rank regression or envelope regression. The resulting estimator is at least as efficient as both existing estimators. The methodology of this paper can be adapted to other envelope models, such as partial envelopes (Su & Cook, Biometrika 98, 133–46) and envelopes in predictor space (Cook et al., J. R. Statist. Soc. B 75, 851–77).
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 24
    Publication Date: 2015-05-24
    Description: We study the effective degrees of freedom of a general class of reduced-rank estimators for multivariate regression in the framework of Stein's unbiased risk estimation. A finite-sample exact unbiased estimator is derived that admits a closed-form expression in terms of the thresholded singular values of the least-squares solution and hence is readily computable. The results continue to hold in the high-dimensional setting where both the predictor and the response dimensions may be larger than the sample size. The derived analytical form facilitates the investigation of theoretical properties and provides new insights into the empirical behaviour of the degrees of freedom. In particular, we examine the differences and connections between the proposed estimator and a commonly-used naive estimator. The use of the proposed estimator leads to efficient and accurate prediction risk estimation and model selection, as demonstrated by simulation studies and a data example.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 25
    Publication Date: 2015-05-24
    Description: To most applied statisticians, a fitting procedure’s degrees of freedom is synonymous with its model complexity, or its capacity for overfitting to data. In particular, the degrees of freedom is often used to parameterize the bias-variance trade-off in model selection. We argue that, on the contrary, model complexity and degrees of freedom may correspond very poorly. We exhibit and theoretically explore various fitting procedures for which the degrees of freedom is not monotonic in the model complexity parameter and can exceed the total dimension of the ambient space even in very simple settings. We show that the degrees of freedom for any nonconvex projection method can be unbounded.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 26
    Oxford University Press
    Publication Date: 2015-05-24
    Description: We propose a semiparametric method for fitting the tail of a heavy-tailed population given a relatively small sample from that population and a larger sample from a related background population. We model the tail of the small sample as an exponential tilt of the better-observed large-sample tail, using a robust sufficient statistic motivated by extreme value theory. In particular, our method induces an estimator of the small-population mean, and we give theoretical and empirical evidence that this estimator outperforms methods that do not use the background sample. We demonstrate substantial efficiency gains over competing methods in simulation and on data from a large controlled experiment conducted by Facebook.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 27
    Publication Date: 2015-05-24
    Description: Optimum designs are described for two treatments with different variances when covariates are included in the model. The designs, a generalization of Neyman allocation, are required in personalized medicine to model the effect of covariates on the choice of treatment. The use of the designs in clinical trials is indicated. D-optimality of the designs is established using results from Kiefer’s general equivalence theorem. The results are obtained with the use of surprisingly elementary algebra.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 28
    Publication Date: 2015-05-24
    Description: Model organisms and human studies have yielded increasing empirical evidence that interactions among genes contribute broadly to genetic variation of complex traits. In the presence of gene-gene interactions, the dimensionality of the feature space becomes extremely high relative to the sample size. This poses a significant methodological challenge in the identification of gene-gene interactions. In this paper, by using a Gaussian graphical model framework, we translate the problem of identifying gene-gene interactions associated with a binary trait D into an inference problem on the difference of two high-dimensional precision matrices that summarize the conditional dependence network structures of the genes. We propose a procedure for testing the differential network globally, which is particularly powerful against sparse alternatives. In addition, a multiple testing procedure with false discovery rate control is developed to infer the specific structure of the differential network. Theoretical justification is provided to ensure the validity of the proposed tests, and optimality results are derived under sparsity assumptions. Through a simulation study we demonstrate that the proposed tests maintain the desired error rates under the null hypothesis and have good power under the alternative hypothesis. The methods are applied to a breast cancer gene expression study.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 29
    Publication Date: 2015-05-24
    Description: We study how to separate signals from noisy data accurately and determine the patterns of the selected signals. Controlling the inflation of false positive errors is important in large-scale simultaneous inference but has not been addressed in the pattern recognition literature. We develop a decision-theoretic framework and formulate the sparse pattern recognition problem as a simultaneous inference problem with multiple decision trees. Oracle and adaptive classifiers are proposed for maximizing the expected number of true positives subject to a constraint on the overall false positive rate. Existing results on multiple testing are extended by allowing more than two states of nature, hierarchical decision-making and new error rate concepts.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 30
    Publication Date: 2015-05-24
    Description: When an unbiased estimator of the likelihood is used within a Metropolis–Hastings chain, it is necessary to trade off the number of Monte Carlo samples used to construct this estimator against the asymptotic variances of the averages computed under this chain. Using many Monte Carlo samples will typically result in Metropolis–Hastings averages with lower asymptotic variances than the corresponding averages that use fewer samples; however, the computing time required to construct the likelihood estimator increases with the number of samples. Under the assumption that the distribution of the additive noise introduced by the loglikelihood estimator is Gaussian with variance inversely proportional to the number of samples and independent of the parameter value at which it is evaluated, we provide guidelines on the number of samples to select. We illustrate our results by considering a stochastic volatility model applied to stock index returns.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 31
    Publication Date: 2015-05-24
    Description: We investigate information-theoretic optimality properties of the score function of the predictive likelihood as a device for updating a real-valued time-varying parameter in a univariate observation-driven model with continuous responses. We restrict our attention to models with updates of one lag order. The results provide theoretical justification for a class of score-driven models which includes the generalized autoregressive conditional heteroskedasticity model as a special case. Our main contribution is to show that only parameter updates based on the score will always reduce the local Kullback–Leibler divergence between the true conditional density and the model-implied conditional density. This result holds irrespective of the severity of model misspecification. We also show that use of the score leads to a considerably smaller global Kullback–Leibler divergence in empirically relevant settings. We illustrate the theory with an application to time-varying volatility models. We show that the reduction in Kullback–Leibler divergence across a range of different settings can be substantial compared to updates based on, for example, squared lagged observations.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 32
    Publication Date: 2015-05-24
    Description: Bivariate or multivariate recurrent event processes are often encountered in longitudinal studies in which more than one type of event is of interest. There has been much research on regression analysis for such data, but little has been done to measure the dependence between recurrent event processes. We propose a time-dependent measure, termed the rate ratio, to assess the local dependence between two types of recurrent event processes. We model the rate ratio as a parametric function of time, and leave unspecified all other aspects of the distribution. We develop a composite likelihood procedure for model fitting and parameter estimation. We show that the proposed estimator is consistent and asymptotically normal. Its finite sample performance is evaluated by simulation and illustrated by an application to a soft tissue sarcoma study.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 33
    Oxford University Press
    Publication Date: 2015-05-24
    Description: Space-filling properties are important in designing computer experiments. The traditional maximin and minimax distance designs consider only space-filling in the full-dimensional space; this can result in poor projections onto lower-dimensional spaces, which is undesirable when only a few factors are active. Restricting maximin distance design to the class of Latin hypercubes can improve one-dimensional projections but cannot guarantee good space-filling properties in larger subspaces. We propose designs that maximize space-filling properties on projections to all subsets of factors. We call our designs maximum projection designs. Our design criterion can be computed at no more cost than a design criterion that ignores projection properties.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 34
    Oxford University Press
    Publication Date: 2015-05-24
    Description: Meta-analysis is widely used to compare and combine the results of multiple independent studies. To account for between-study heterogeneity, investigators often employ random-effects models, under which the effect sizes of interest are assumed to follow a normal distribution. It is common to estimate the mean effect size by a weighted linear combination of study-specific estimators, with the weight for each study being inversely proportional to the sum of the variance of the effect-size estimator and the estimated variance component of the random-effects distribution. Because the estimator of the variance component involved in the weights is random and correlated with study-specific effect-size estimators, the commonly adopted asymptotic normal approximation to the meta-analysis estimator is grossly inaccurate unless the number of studies is large. When individual participant data are available, one can also estimate the mean effect size by maximizing the joint likelihood. We establish the asymptotic properties of the meta-analysis estimator and the joint maximum likelihood estimator when the number of studies is either fixed or increases at a slower rate than the study sizes and we discover a surprising result: the former estimator is always at least as efficient as the latter. We also develop a novel resampling technique that improves the accuracy of statistical inference. We demonstrate the benefits of the proposed inference procedures using simulated and empirical data.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 35
    Publication Date: 2015-05-24
    Description: Nonparametric regression analysis when the regression function is discontinuous has many applications. Existing methods for estimating a discontinuous regression curve usually assume that the number of jumps in the regression curve is known beforehand, which is unrealistic in some situations. Although there has been research on estimation of a discontinuous regression curve when the number of jumps is unknown, the problem remains mostly open because such research often requires assumptions on other related quantities, such as a known minimum jump size. In this paper we propose a jump information criterion which consists of a term measuring the fidelity of the estimated regression curve to the observed data and a penalty related to the number of jumps and the jump sizes. The number of jumps can then be determined by minimizing our criterion. Theoretical and numerical studies show that our method works well.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 36
    Oxford University Press
    Publication Date: 2015-05-24
    Description: We propose a five-parameter bivariate wrapped Cauchy distribution as a unimodal model for toroidal data. It is highly tractable, displays numerous desirable properties, including marginal and conditional distributions that are all wrapped Cauchy, and arises as an appealing submodel of a six-parameter distribution obtained by applying Möbius transformation to a pre-existing bivariate circular model. Method of moments and maximum likelihood estimation of its parameters are fast, and tests for independence and goodness-of-fit are available. An analysis involving dihedral angles of the proteinogenic amino acid Tyrosine illustrates the distribution’s application. A Markov process for circular data is also explored.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 37
    Publication Date: 2013-02-24
    Description: Gaussian processes are widely used in nonparametric regression, classification and spatiotemporal modelling, facilitated in part by a rich literature on their theoretical properties. However, one of their practical limitations is expensive computation, typically on the order of n^3 where n is the number of data points, in performing the necessary matrix inversions. For large datasets, storage and processing also lead to computational bottlenecks, and numerical stability of the estimates and predicted values degrades with increasing n. Various methods have been proposed to address these problems, including predictive processes in spatial data analysis and the subset-of-regressors technique in machine learning. The idea underlying these approaches is to use a subset of the data, but this raises questions concerning sensitivity to the choice of subset and limitations in estimating fine-scale structure in regions that are not well covered by the subset. Motivated by the literature on compressive sensing, we propose an alternative approach that involves linear projection of all the data points onto a lower-dimensional subspace. We demonstrate the superiority of this approach from a theoretical perspective and through simulated and real data examples.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 38
    Oxford University Press
    Publication Date: 2013-02-24
    Description: Karl Pearson edited Biometrika for the first 35 years of its existence. Not only did he shape the journal, he also contributed over 200 pieces and inspired, more or less directly, most of the other contributions. The journal could not be separated from the man.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 39
    Publication Date: 2013-02-24
    Description: In the modelling of longitudinal data from several groups, appropriate handling of the dependence structure is of central importance. Standard methods include specifying a single covariance matrix for all groups or independently estimating the covariance matrix for each group without regard to the others, but when these model assumptions are incorrect, these techniques can lead to biased mean effects or loss of efficiency, respectively. Thus, it is desirable to develop methods for simultaneously estimating the covariance matrix for each group that will borrow strength across groups in a way that is ultimately informed by the data. In addition, for several groups with covariance matrices of even medium dimension, it is difficult to manually select a single best parametric model among the huge number of possibilities given by incorporating structural zeros and/or commonality of individual parameters across groups. In this paper we develop a family of nonparametric priors using the matrix stick-breaking process of Dunson et al. (2008) that seeks to accomplish this task by parameterizing the covariance matrices in terms of their modified Cholesky decompositions (Pourahmadi, 1999). We establish some theoretical properties of these priors, examine their effectiveness via a simulation study, and illustrate the priors using data from a longitudinal clinical trial.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 40
    Publication Date: 2013-02-24
    Description: Full Bayesian computational inference for model determination in undirected graphical models is currently restricted to decomposable graphs or other special cases, except for small-scale problems, say up to 15 variables. In this paper we develop new, more efficient methodology for such inference, by making two contributions to the computational geometry of decomposable graphs. The first of these provides sufficient conditions under which it is possible to completely connect two disconnected complete subsets of vertices, or perform the reverse procedure, yet maintain decomposability of the graph. The second is a new Markov chain Monte Carlo sampler for arbitrary positive distributions on decomposable graphs, taking a junction tree representing the graph as its state variable. The resulting methodology is illustrated with numerical experiments on three models.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 41
    Oxford University Press
    Publication Date: 2013-02-24
    Description: In longitudinal data analysis, statistical inference for sparse data and dense data could be substantially different. For kernel smoothing, the estimate of the mean function, the convergence rates and the limiting variance functions are different in the two scenarios. This phenomenon poses challenges for statistical inference, as a subjective choice between the sparse and dense cases may lead to wrong conclusions. We develop methods based on self-normalization that can adapt to the sparse and dense cases in a unified framework. Simulations show that the proposed methods outperform some existing methods.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 42
    Publication Date: 2013-02-24
    Description: The problem of testing smooth components of an extended generalized additive model for equality to zero is considered. Confidence intervals for such components exhibit good across-the-function coverage probabilities if based on the approximate result f̂(i) ~ N{f(i), V_f(i, i)}, where f is the vector of evaluated values for the smooth component of interest and V_f is the covariance matrix for f according to the Bayesian view of the smoothing process. Based on this result, a Wald-type test of f = 0 is proposed. It is shown that care must be taken in selecting the rank used in the test statistic. The method complements previous work by extending applicability beyond the Gaussian case, while considering tests of zero effect rather than testing the parametric hypothesis given by the null space of the component’s smoothing penalty. The proposed p-values are routine and efficient to compute from a fitted model, without requiring extra model fits or null distribution simulation.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 43
    Publication Date: 2013-02-24
    Description: Motivated by analysis of genetical genomics data, we introduce a sparse high-dimensional multivariate regression model for studying conditional independence relationships among a set of genes adjusting for possible genetic effects. The precision matrix in the model specifies a covariate-adjusted Gaussian graph, which presents the conditional dependence structure of gene expression after the confounding genetic effects on gene expression are taken into account. We present a covariate-adjusted precision matrix estimation method using a constrained ℓ1 minimization, which can be easily implemented by linear programming. Asymptotic convergence rates in various matrix norms and sign consistency are established for the estimators of the regression coefficients and the precision matrix, allowing both the number of genes and the number of the genetic variants to diverge. Simulation shows that the proposed method results in significant improvements in both precision matrix estimation and graphical structure selection when compared to the standard Gaussian graphical model assuming constant means. The proposed method is applied to yeast genetical genomics data for the identification of the gene network among a set of genes in the mitogen-activated protein kinase pathway.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 44
    Publication Date: 2013-02-24
    Description: This paper introduces, constructs and studies a new class of arrays, called strong orthogonal arrays, as suitable designs for computer experiments. A strong orthogonal array of strength t enjoys better space-filling properties than a comparable orthogonal array in all dimensions lower than t while retaining the space-filling properties of the latter in t dimensions. Latin hypercubes based on strong orthogonal arrays of strength t are more space-filling than comparable orthogonal array-based Latin hypercubes in all g dimensions for any 2 ≤ g ≤ t – 1.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 45
    Publication Date: 2013-02-24
    Description: This paper considers the construction of blocked two-level regular designs with weak minimum aberration. We first obtain the minimum value of the number of two-factor interactions which are aliased with the block effects. Based on this result, two methods are then proposed in two different scenarios to construct weak minimum aberration blocked two-level designs with respect to some existing combined wordlength patterns.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 46
    Publication Date: 2013-02-24
    Description: Rathbun et al. (2007) and Waagepetersen (2008) propose estimating functions for parameters of Poisson point process intensity that may be applied when space- and/or time-varying covariates are sampled from a probability-based sampling design. This paper demonstrates that Waagepetersen’s estimating function is optimal in a class of weighted estimating functions.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 47
    Publication Date: 2013-02-24
    Description: We show that the proportional likelihood ratio model proposed recently by Luo & Tsai (2012) enjoys model-invariant properties under certain forms of nonignorable missing mechanisms and randomly double-truncated data, so that target parameters in the population can be estimated consistently from those biased samples. We also construct an alternative estimator for the target parameters by maximizing a pseudolikelihood that eliminates a functional nuisance parameter in the model. The corresponding estimating equation has a U-statistic structure. As an added advantage of the proposed method, a simple score-type test is developed to test a null hypothesis on the regression coefficients. Simulations show that the proposed estimator has a small-sample efficiency similar to that of the nonparametric likelihood estimator and performs well for certain nonignorable missing data problems.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 48
    Oxford University Press
    Publication Date: 2013-02-24
    Description: This paper considers benchmarking issues in the context of small area estimation. We find optimal estimators within the class of benchmarked linear estimators under linear constraints. This extends existing results for external and internal benchmarking, and also links the two. Necessary and sufficient conditions for self-benchmarking are found for an augmented model. Most results of this paper are found using ideas of orthogonal projection.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 49
    Publication Date: 2013-02-24
    Description: Suppose we are interested in the effect of a binary treatment on an outcome where that relationship is confounded by an ordinal confounder. We assume that the true confounder is not observed but, rather, we observe a nondifferentially mismeasured version of it. We show that, under certain monotonicity assumptions about its effect on the treatment and on the outcome, an effect measure controlling for the mismeasured confounder will fall between the corresponding crude and true effect measures. We also present results for coarsened and, under further assumptions, multiple misclassified confounders.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 50
    Publication Date: 2013-02-24
    Description: Applying concepts from partial identification to the domain of finite population sampling, we propose a method for interval estimation of a population mean when the probabilities of sample selection lie within a posited interval. The interval estimate is derived from sharp bounds on the Hajek (1971) estimator of the population mean. We demonstrate the method’s utility for sensitivity analysis by applying it to a sample of needles collected as part of a syringe tracking and testing programme in New Haven, Connecticut.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 51
    Publication Date: 2013-02-24
    Description: Since many environmental processes are spatial in extent, a single extreme event may affect several locations, and the spatial dependence must be taken into account in an appropriate way. This paper proposes a framework for conditional simulation of max-stable processes and gives closed forms for the regular conditional distributions of Brown–Resnick and Schlather processes. We test the method on simulated data and present applications to extreme rainfall around Zurich and extreme temperatures in Switzerland. The proposed framework provides accurate conditional simulations and can handle problems of realistic size.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 52
    Publication Date: 2013-02-24
    Description: We study the nonparametric estimation of the cumulative incidence function and the cause-specific hazard function for current status data with competing risks via kernel smoothing. A smoothed naive nonparametric maximum likelihood estimator and a smoothed full nonparametric maximum likelihood estimator are shown to have pointwise asymptotic normality and faster convergence rates than the corresponding unsmoothed nonparametric likelihood estimators. Using the smoothed estimators and the plug-in principle, we can estimate the cause-specific hazard function, which has not been studied previously. We also propose semi-smoothed estimators of the cause-specific hazard as an alternative to the smoothed estimator and demonstrate that neither is uniformly more efficient than the other. Numerical studies show that a smoothed bootstrap method works well for selecting the bandwidths in the smoothed nonparametric estimation. The use of the estimators is exemplified by an application to cumulative incidence and hazard of subtype-specific HIV infection from a sero-prevalence study in injecting drug users in Thailand.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 53
    Oxford University Press
    Publication Date: 2013-02-24
    Description: Highlights, trends and influences are identified associated with the pages of Biometrika subsequent to the editorship of Karl Pearson.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 54
    Publication Date: 2013-02-24
    Description: Copy number variant is an important type of genetic structural variation appearing in germline DNA, ranging from common to rare in a population. Both rare and common copy number variants have been reported to be associated with complex diseases, so it is important to identify both simultaneously based on a large set of population samples. We develop a proportion adaptive segment selection procedure that automatically adjusts to the unknown proportions of the carriers of the segment variants. We characterize the detection boundary that separates the region where a segment variant is detectable by some method from the region where it cannot be detected. Although the detection boundaries are very different for the rare and common segment variants, it is shown that the proposed procedure can reliably identify both whenever they are detectable. Compared with methods for single-sample analysis, this procedure gains power by pooling information from multiple samples. The method is applied to analyse neuroblastoma samples and identifies a large number of copy number variants that are missed by single-sample methods.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 55
    Publication Date: 2013-02-24
    Description: We derive sufficient conditions for the cross-correlation coefficient of a multivariate spatial process to vary with location when the spatial model is augmented with nugget effects. The derived class is valid for any choice of covariance functions, and yields substantial flexibility between multiple processes. The key is to identify the cross-correlation coefficient matrix with a contraction matrix, which can be either diagonal, implying a parsimonious formulation, or a fully general contraction matrix, yielding greater flexibility but added model complexity. We illustrate the approach with a bivariate minimum and maximum temperature dataset in Colorado, allowing the two variables to be positively correlated at low elevations and nearly independent at high elevations, while still yielding a positive definite covariance matrix.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 56
    Publication Date: 2013-02-24
    Description: Variable screening techniques have been proposed to mitigate the impact of high dimensionality in classification problems, including t-test marginal screening (Fan & Fan, 2008) and maximum marginal likelihood screening (Fan & Song, 2010). However, these methods rely on strong modelling assumptions that are easily violated in real applications. To circumvent the parametric modelling assumptions, we propose a new variable screening technique for binary classification based on the Kolmogorov–Smirnov statistic. We prove that this so-called Kolmogorov filter enjoys the sure screening property under much weakened model assumptions. We supplement our theoretical study by a simulation study.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 57
    Publication Date: 2013-02-24
    Description: Cumulative sum or cusum charts are typically used to detect a change in the distribution of a sequence of observations, e.g., shifts in the mean. Usually, after signalling, the chart is restarted by setting it to some value below the signalling threshold. We propose a non-restarting cusum chart which is able to detect periods during which the stream is out of control. Further, we advocate an upper boundary to prevent the cusum chart rising too high, which helps to detect a change back into control. We present an algorithm to control the false discovery rate when considering cusum charts based on multiple streams of data. We consider two definitions of a false discovery: signalling out-of-control when the observations have been in control since the start and signalling out-of-control when the observations have been in control since the last time the chart was at zero. We prove that the false discovery rate is controlled under both these definitions simultaneously. Simulations reveal the difference in false discovery rate control when using these and other desirable definitions of a false discovery.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 58
    Oxford University Press
    Publication Date: 2012-11-16
    Description: In a matched observational study of treatment effects, a sensitivity analysis asks about the magnitude of the departure from random assignment that would need to be present to alter the conclusions of an analysis that assumes that matching for measured covariates removes all bias. The reported degree of sensitivity to unmeasured biases depends on both the process that generated the data and the chosen methods of analysis, so a poor choice of method may lead to an exaggerated report of sensitivity to bias. This suggests the possibility of performing more than one analysis with a correction for multiple inference, say testing one null hypothesis using two or three different tests. In theory and in an example, it is shown that, in large samples, the gains from testing twice will often be large, because testing twice has the larger of the two design sensitivities of the component tests, and the losses due to correcting for two tests will often be small, because two tests of one hypothesis will typically be highly correlated, so a correction for multiple testing that takes this into account will be small. An illustration uses data from the U.S. National Health and Nutrition Examination Survey concerning lead in the blood of cigarette smokers.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 59
    Publication Date: 2012-11-16
    Description: Inferences related to the second-order properties of functional data, as expressed by covariance structure, can become unreliable when the data are non-Gaussian or contain unusual observations. In the functional setting, it is often difficult to identify atypical observations, as their distinguishing characteristics can be manifold but subtle. In this paper, we introduce the notion of a dispersion operator, investigate its use in probing the second-order structure of functional data, and develop a test for comparing the second-order characteristics of two functional samples that is resistant to atypical observations and departures from normality. The proposed test is a regularized M-test based on a spectrally truncated version of the Hilbert–Schmidt norm of a score operator defined via the dispersion operator. We derive the asymptotic distribution of the test statistic, investigate the behaviour of the test in a simulation study and illustrate the method on a structural biology dataset.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 60
    Publication Date: 2012-11-16
    Description: Linear classifiers are very popular, but can have limitations when classes have distinct subpopulations. General nonlinear kernel classifiers are very flexible, but do not give clear interpretations and may not be efficient in high dimensions. We propose the bidirectional discrimination classification method, which generalizes linear classifiers to two or more hyperplanes. This new family of classification methods gives much of the flexibility of a general nonlinear classifier while maintaining the interpretability, and much of the parsimony, of linear classifiers. They provide a new visualization tool for high-dimensional, low-sample-size data. Although the idea is generally applicable, we focus on the generalization of the support vector machine and distance-weighted discrimination methods. The performance and usefulness of the proposed method are assessed using asymptotics and demonstrated through analysis of simulated and real data. Our method leads to better classification performance in high-dimensional situations where subclusters are present in the data.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 61
    Publication Date: 2012-11-16
    Description: Two transformations are proposed that give orthogonal components with a one-to-one correspondence between the original vectors and the components. The aim is that each component should be close to the vector with which it is paired, orthogonality imposing a constraint. The transformations lead to a variety of new statistical methods, including a unified approach to the identification and diagnosis of collinearities, a method of setting prior weights for Bayesian model averaging, and a means of calculating an upper bound for a multivariate Chebychev inequality. One transformation has the property that duplicating a vector has no effect on the orthogonal components that correspond to nonduplicated vectors, and is determined using a new algorithm that also provides the decomposition of a positive-definite matrix in terms of a diagonal matrix and a correlation matrix. The algorithm is shown to converge to a global optimum.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 62
    Oxford University Press
    Publication Date: 2012-11-16
    Description: Scaled sparse linear regression jointly estimates the regression coefficients and noise level in a linear model. It chooses an equilibrium with a sparse regression method by iteratively estimating the noise level via the mean residual square and scaling the penalty in proportion to the estimated noise level. The iterative algorithm costs little beyond the computation of a path or grid of the sparse regression estimator for penalty levels above a proper threshold. For the scaled lasso, the algorithm is a gradient descent in a convex minimization of a penalized joint loss function for the regression coefficients and noise level. Under mild regularity conditions, we prove that the scaled lasso simultaneously yields an estimator for the noise level and an estimated coefficient vector satisfying certain oracle inequalities for prediction, the estimation of the noise level and the regression coefficients. These inequalities provide sufficient conditions for the consistency and asymptotic normality of the noise-level estimator, including certain cases where the number of variables is of greater order than the sample size. Parallel results are provided for least-squares estimation after model selection by the scaled lasso. Numerical results demonstrate the superior performance of the proposed methods over an earlier proposal of joint convex minimization.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
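The alternation described in this abstract (estimate the noise level from the mean residual square, then rescale the penalty in proportion to it) can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the base penalty level `lam0` and the plain coordinate-descent inner solver are choices made for the example, not the authors' path-based implementation.

```python
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding operator used by the lasso coordinate update."""
    return np.sign(z) * max(abs(z) - t, 0.0)

def lasso_cd(X, y, lam, beta, n_sweeps=200):
    """Coordinate descent for (1/(2n))||y - X b||^2 + lam * ||b||_1."""
    n, p = X.shape
    col_sq = (X ** 2).sum(axis=0) / n
    beta = beta.copy()
    resid = y - X @ beta
    for _ in range(n_sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the partial residual (beta_j removed).
            rho = X[:, j] @ resid / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            resid += X[:, j] * (beta[j] - new)
            beta[j] = new
    return beta

def scaled_lasso(X, y, lam0, n_iter=50, tol=1e-8):
    """Alternate between a lasso fit and a noise-level update.

    lam0 is an assumed base penalty level; the penalty actually used
    in each lasso fit is sigma * lam0.
    """
    n, p = X.shape
    beta = np.zeros(p)
    sigma = np.sqrt(np.mean(y ** 2))  # initial noise level from the null model
    for _ in range(n_iter):
        beta = lasso_cd(X, y, sigma * lam0, beta)
        # Noise level as the root of the mean residual square.
        new_sigma = np.sqrt(np.mean((y - X @ beta) ** 2))
        converged = abs(new_sigma - sigma) < tol
        sigma = new_sigma
        if converged:
            break
    return beta, sigma
```

A common illustrative choice is `lam0 = np.sqrt(2 * np.log(p) / n)`; the appropriate level in practice depends on the theory in the paper itself.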
  • 63
    Publication Date: 2012-11-16
    Description: In this article, we propose a regression method for simultaneous supervised clustering and feature selection over a given undirected graph, where homogeneous groups or clusters are estimated as well as informative predictors, with each predictor corresponding to one node in the graph and a connecting path indicating a priori possible grouping among the corresponding predictors. The method seeks a parsimonious model with high predictive power through identifying and collapsing homogeneous groups of regression coefficients. To address computational challenges, we present an efficient algorithm integrating the augmented Lagrange multipliers, coordinate descent and difference convex methods. We prove that the proposed method not only identifies the true homogeneous groups and informative features consistently but also leads to accurate parameter estimation. A gene network dataset is analysed to demonstrate that the method can make a difference by exploring dependency structures among the genes.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 64
    Oxford University Press
    Publication Date: 2012-11-16
    Description: This article proposes a method of moments technique for estimating the sparsity of signals in a random sample. This involves estimating the largest eigenvalue of a large Hermitian trigonometric matrix under mild conditions. As illustration, the method is applied to two well-known problems. The first focuses on the sparsity of a large covariance matrix and the second investigates the sparsity of a sequence of signals observed with stationary, weakly dependent noise. Simulation shows that the proposed estimators can have significantly smaller mean absolute errors than their main competitors.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 65
    Publication Date: 2012-11-16
    Description: We introduce a doubly stochastic marked point process model for supervised classification problems. Regardless of the number of classes or the dimension of the feature space, the model requires only 2–3 parameters for the covariance function. The classification criterion involves a permanental ratio for which an approximation using a polynomial-time cyclic expansion is proposed. The approximation is effective even if the feature region occupied by one class is a patchwork interlaced with regions occupied by other classes. An application to DNA microarray analysis indicates that the cyclic approximation is effective even for high-dimensional data. It can employ feature variables in an efficient way to reduce the prediction error significantly. This is critical when the true classification relies on nonreducible high-dimensional features.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 66
    Publication Date: 2012-11-16
    Description: Researchers in the biological sciences nowadays often encounter the curse of dimensionality. To tackle this, sufficient dimension reduction aims to estimate the central subspace, in which all the necessary information supplied by the covariates regarding the response of interest is contained. Subsequent statistical analysis can then be made in a lower-dimensional space while preserving relevant information. Many studies are concerned with the transformed response rather than the original one, but they may have different central subspaces. When estimating the central subspace of the transformed response, direct methods will be inefficient. In this article, we propose a more efficient two-stage estimator of the central subspace of a transformed response. This approach is extended to censored responses and is applied to combining multiple biomarkers. Simulation studies and data examples support the superiority of the procedure.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 67
    Publication Date: 2012-11-16
    Description: Transient semi-Markov processes have traditionally been used to describe the transitions of a patient through the various states of a multistate survival model. A survival distribution in this context is a sojourn through the states until passage to a fatal absorbing state or certain endpoint states. Using complete sojourn data, this paper shows how such survival distributions and associated hazard functions can be estimated nonparametrically and also how nonparametric bootstrap pointwise confidence bands can be constructed for them when patients are subject to independent right censoring from each state during the sojourn. Limitations to the estimability of such survival distributions that result from random censoring with bounded support are clarified. The methods are applicable to any sort of sojourn through any finite state process of arbitrary complexity involving feedback into previously occupied states.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 68
    Publication Date: 2012-11-16
    Description: In some problems involving functional data, it is desired to undertake prediction or classification before the full trajectory of a function is observed. In such cases, it is often preferable to suffer somewhat greater error in return for making a decision relatively early. The prediction and classification problems can be treated similarly, using mean squared prediction error, or classification error, respectively, as the means for quantifying performance, so in this paper we focus principally on classification. We introduce a method for determining when an early decision can reasonably be made, using only part of the trajectory, and we show how to use the method to choose among data types. Our approach is fully nonparametric, and no specific model is required. Properties of error-rate are studied as functions of time and data type. The effectiveness of the proposed method is illustrated in both theoretical and numerical terms. The classification referred to in this paper would be termed supervised classification in machine learning, to distinguish it from unsupervised classification, or clustering.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 69
    Oxford University Press
    Publication Date: 2012-11-16
    Description: Linear mixed models cover a wide range of statistical methods, which have found many uses in the estimation for complex surveys. The purpose of this work is to consider methods by which linear mixed models may be used at the design stage of a survey to incorporate available auxiliary information. This paper reviews the ideas of balanced sampling and the cube algorithm, and proposes an implementation of the latter by which penalized balanced samples can be selected. Such samples can reduce or eliminate the need for linear mixed model weight adjustments, a result demonstrated theoretically and via simulation. Horvitz–Thompson estimators for such samples will be highly efficient for any responses well approximated by a linear mixed model in the auxiliary information. In Monte Carlo experiments using nonparametric and temporal linear mixed models, the strategy of penalized balanced sampling with Horvitz–Thompson estimation dominates a variety of standard strategies.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 70
    Publication Date: 2012-11-16
    Description: Monte Carlo algorithms are commonly used to identify a set of models for Bayesian model selection or model averaging. Because empirical frequencies of models are often zero or one in high-dimensional problems, posterior probabilities calculated from the observed marginal likelihoods, renormalized over the sampled models, are often employed. Such estimates are the only recourse in several newer stochastic search algorithms. In this paper, we prove that renormalization of posterior probabilities over the set of sampled models generally leads to bias that may dominate mean squared error. Viewing the model space as a finite population, we propose a new estimator based on a ratio of Horvitz–Thompson estimators that incorporates observed marginal likelihoods, but is approximately unbiased. This is shown to lead to a reduction in mean squared error compared to the empirical or renormalized estimators, with little increase in computational cost.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 71
    Oxford University Press
    Publication Date: 2012-11-16
    Description: Many proper scoring rules such as the Brier and log scoring rules implicitly reward a probability forecaster relative to a uniform baseline distribution. Recent work has motivated weighted proper scoring rules, which have an additional baseline parameter. To date two families of weighted proper scoring rules have been introduced, the weighted power and pseudospherical scoring families. These families are compatible with the log scoring rule: when the baseline maximizes the log scoring rule over some set of distributions, the baseline also maximizes the weighted power and pseudospherical scoring rules over the same set. We characterize all weighted proper scoring families and prove a general property: every proper scoring rule is compatible with some weighted scoring family, and every weighted scoring family is compatible with some proper scoring rule.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 72
    Publication Date: 2012-11-16
    Description: Projective shape consists of the information about a configuration of points that is invariant under projective transformations. It is an important tool in machine vision to pick out features that are invariant to the choice of camera view. The simplest example is the cross ratio for a set of four collinear points. Recent work involving ideas from multivariate robustness enables us to introduce here a natural preshape on projective shape space. This makes it possible to adapt the Procrustes analysis that forms the basis of much methodology in the simpler setting of similarity shape space.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
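The cross ratio mentioned in this abstract can be checked numerically. The sketch below represents four collinear points by their scalar positions along the line and applies a one-dimensional projective (Möbius) map; the function names and the particular ordering convention for the cross ratio are assumptions made for illustration, since several equivalent conventions are in use.

```python
from fractions import Fraction

def cross_ratio(a, b, c, d):
    """Cross ratio (a, b; c, d) of four distinct positions on a line."""
    return Fraction((c - a) * (d - b), (c - b) * (d - a))

def mobius(x, alpha, beta, gamma, delta):
    """Projective map x -> (alpha*x + beta) / (gamma*x + delta).

    Requires alpha*delta - beta*gamma != 0 and gamma*x + delta != 0.
    """
    return Fraction(alpha * x + beta, gamma * x + delta)

points = [0, 1, 3, 7]
images = [mobius(x, 2, 1, 1, 3) for x in points]

before = cross_ratio(*points)
after = cross_ratio(*images)
```

Exact rational arithmetic is used so that the invariance shows up as equality of `before` and `after` rather than agreement up to rounding.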
  • 73
    Publication Date: 2012-11-16
    Description: To study disease association with risk factors in epidemiologic studies, cross-sectional sampling is often more focused and less costly for recruiting study subjects who have already experienced initiating events. For time-to-event outcome, however, such a sampling strategy may be length biased. Coupled with censoring, analysis of length-biased data can be quite challenging, due to induced informative censoring in which the survival time and censoring time are correlated through a common backward recurrence time. We propose to use the proportional mean residual life model of Oakes & Dasu (Biometrika 77, 409–10, 1990) for analysis of censored length-biased survival data. Several nonstandard data structures, including censoring of onset time and cross-sectional data without follow-up, can also be handled by the proposed methodology.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 74
    Publication Date: 2012-11-16
    Description: Several two-stage multiple testing procedures have been proposed to detect gene-environment interaction in genome-wide association studies. In this article, we elucidate general conditions that are required for validity and power of these procedures, and we propose extensions of two-stage procedures using the case-only estimator of gene-treatment interaction in randomized clinical trials. We develop a unified estimating equation approach to proving asymptotic independence between a filtering statistic and an interaction test statistic in a range of situations, including marginal association and interaction in a generalized linear model with a canonical link. We assess the performance of various two-stage procedures in simulations and in genetic studies from Women’s Health Initiative clinical trials.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 75
    Publication Date: 2012-11-16
    Description: Resampling-based methods for multiple hypothesis testing often lead to long run times when the number of tests is large. This paper presents a simple rule that substantially reduces computation by allowing resampling to terminate early on a subset of tests. We prove that the method has a low probability of obtaining a set of rejected hypotheses different from those rejected without early stopping, and obtain error bounds for multiple hypothesis testing. Simulation shows that our approach saves more computation than other available procedures.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 76
    Publication Date: 2012-11-16
    Description: We explore the use of estimating equations for efficient statistical inference in case of missing data. We propose a semiparametric efficient empirical likelihood approach, and show that the empirical likelihood ratio statistic and its profile counterpart asymptotically follow central chi-square distributions when evaluated at the true parameter. The theoretical properties and practical performance of our approach are demonstrated through numerical simulations and data analysis.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 77
    Oxford University Press
    Publication Date: 2012-08-22
    Description: Consider parametric models that are too complicated to allow calculation of a likelihood but from which observations can be simulated. We examine parameter estimators that are linear functions of a possibly large set of candidate features. A combination of simulations based on a fractional design and sets of discriminant analyses is then used to find an optimal estimator of the vector parameter and its covariance matrix. The procedure is an alternative to the approximate Bayesian computation scheme.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 78
    Oxford University Press
    Publication Date: 2012-08-22
    Description: Prior information or background knowledge may suggest that interactions arise only within certain factors. When such knowledge is available, we propose using a new class of designs: designs of variable resolution. Several constructions are presented. Statistical justifications for using such designs from minimum $G_2$-aberration and design efficiency perspectives are provided.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 79
    Publication Date: 2012-08-22
    Description: Merging data from multiple studies has been widely adopted in biomedical research. In this paper, we consider two major issues related to merging longitudinal datasets. We first develop a rigorous hypothesis testing procedure to assess the validity of data merging, and then propose a flexible joint estimation procedure that enables us to analyse merged data and to account for different within-subject correlations and follow-up schedules in different studies. We establish large sample properties for the proposed procedures. We compare our method with meta analysis and generalized estimating equations and show that our test provides robust control of Type I error against both misspecification of working correlation structures and heterogeneous dispersion parameters. Our joint estimating procedure leads to an improvement in estimation efficiency on all regression coefficients after data merging is validated.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 80
    Publication Date: 2016-03-01
    Description: This paper introduces a new method for performing computational inference on log-Gaussian Cox processes. The likelihood is approximated directly by making use of a continuously specified Gaussian random field. We show that for sufficiently smooth Gaussian random field prior distributions, the approximation can converge with arbitrarily high order, whereas an approximation based on a counting process on a partition of the domain achieves only first-order convergence. The results improve upon the general theory of convergence for stochastic partial differential equation models introduced by Lindgren et al. (2011). The new method is demonstrated on a standard point pattern dataset, and two interesting extensions to the classical log-Gaussian Cox process framework are discussed. The first extension considers variable sampling effort throughout the observation window and implements the method of Chakraborty et al. (2011). The second extension constructs a log-Gaussian Cox process on the world's oceans. The analysis is performed using integrated nested Laplace approximation for fast approximate inference.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 81
    Publication Date: 2016-03-01
    Description: We propose a regression model for data spatially distributed over general two-dimensional Riemannian manifolds. This is a generalized additive model with a roughness penalty term involving a differential operator computed over the non-planar domain. By virtue of a semiparametric framework, the model allows inclusion of space-varying covariate information. Estimation can be performed by conformally parameterizing the non-planar domain and then generalizing existing models for penalized spatial regression over planar domains. The conformal coordinates and the estimation problem are both computed with a finite element approach.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 82
    Publication Date: 2016-03-01
    Description: Seneta & Chen (2005) tightened the familywise error rate control of Holm's procedure by sharpening its critical values using pairwise dependencies of the $p$-values. In this paper we further sharpen these critical values in the case where the distribution functions of the pairwise maxima of null $p$-values are convex, a property shown to hold in some applications of Holm's procedure. The newer critical values are uniformly larger, providing tighter familywise error rate control than the approach of Seneta & Chen (2005), significantly so under high pairwise positive dependencies. The critical values can be further improved under exchangeable null $p$-values.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
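For orientation, the baseline that this abstract sharpens is Holm's step-down procedure with its standard critical values alpha / (m - rank). The sketch below implements only that baseline; the sharpened critical values of Seneta & Chen (2005) and of this paper depend on pairwise dependence information and are not reproduced. The function name `holm` is an assumption made for the example.

```python
def holm(pvalues, alpha=0.05):
    """Holm's step-down procedure with the standard critical values.

    Returns a list of booleans, True where the hypothesis is rejected,
    in the same order as the input p-values.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    reject = [False] * m
    for rank, i in enumerate(order):
        # The rank-th smallest p-value (rank starting at 0) is compared
        # with alpha / (m - rank); stop at the first failure.
        if pvalues[i] <= alpha / (m - rank):
            reject[i] = True
        else:
            break
    return reject
```

Larger critical values, as proposed in the abstract, would replace `alpha / (m - rank)` in the comparison and could only increase the number of rejections.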
  • 83
    Oxford University Press
    Publication Date: 2016-03-01
    Description: An unknown prior density $g(\theta)$ has yielded realizations $\Theta_1,\ldots,\Theta_N$. They are unobservable, but each $\Theta_i$ produces an observable value $X_i$ according to a known probability mechanism, such as $X_i\sim {\rm Po}(\Theta_i)$. We wish to estimate $g(\theta)$ from the observed sample $X_1,\ldots,X_N$. Traditional asymptotic calculations are discouraging, indicating very slow nonparametric rates of convergence. In this article we show that parametric exponential family modelling of $g(\theta)$ can give useful estimates in moderate-sized samples. We illustrate the approach with a variety of real and artificial examples. Covariate information can be incorporated into the deconvolution process, leading to a more detailed theory of generalized linear mixed models.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 84
    Publication Date: 2016-03-01
    Description: In modern experiments, functional and nonfunctional data are often encountered simultaneously when observations are sampled from random processes and high-dimensional scalar covariates. It is difficult to apply existing methods for model selection and estimation. We propose a new class of partially functional linear models to characterize the regression between a scalar response and covariates of both functional and scalar types. The new approach provides a unified and flexible framework that simultaneously takes into account multiple functional and ultrahigh-dimensional scalar predictors, enables us to identify important features, and offers improved interpretability of the estimators. The underlying processes of the functional predictors are considered to be infinite-dimensional, and one of our contributions is to characterize the effects of regularization on the resulting estimators. We establish the consistency and oracle properties of the proposed method under mild conditions, demonstrate its performance with simulation studies, and illustrate its application using air pollution data.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 85
    Publication Date: 2016-03-01
    Description: To estimate unknown population parameters based on data having nonignorable missing values with a semiparametric exponential tilting propensity, Kim & Yu (2011) assumed that the tilting parameter is known or can be estimated from external data, in order to avoid the identifiability issue. To remove this serious limitation on the methodology, we use an instrument, i.e., a covariate related to the study variable but unrelated to the missing data propensity, to construct some estimating equations. Because these estimating equations are semiparametric, we profile the nonparametric component using a kernel-type estimator and then estimate the tilting parameter based on the profiled estimating equations and the generalized method of moments. Once the tilting parameter is estimated, so is the propensity, and then other population parameters can be estimated using the inverse propensity weighting approach. Consistency and asymptotic normality of the proposed estimators are established. The finite-sample performance of the estimators is studied through simulation, and a real-data example is also presented.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 86
    Publication Date: 2016-03-01
    Description: Sufficient dimension reduction has been extensively explored in the context of independent and identically distributed data. In this article we generalize sufficient dimension reduction to longitudinal data and propose an estimating equation approach to estimating the central mean subspace. The proposed method accounts for the covariance structure within each subject and improves estimation efficiency when the covariance structure is correctly specified. Even if the covariance structure is misspecified, our estimator remains consistent. In addition, our method relaxes distributional assumptions on the covariates and is doubly robust. To determine the structural dimension of the central mean subspace, we propose a Bayesian-type information criterion. We show that the estimated structural dimension is consistent and that the estimated basis directions are root-$n$ consistent, asymptotically normal and locally efficient. Simulations and an analysis of the Framingham Heart Study data confirm the effectiveness of our approach.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 87
    Publication Date: 2016-03-01
    Description: A theoretical analysis is made of the properties of various methods for comparing two distributions of survival time. The results are intended primarily to guide the choice of method of analysis for such simple comparisons as of a treatment versus a control, but the main implications are fairly general, illustrating the performance of different models in a range of conditions. For most of the models there is a parameter specifying the comparison of interest and the Fisher information per observation can be calculated for that parameter, and provides a succinct basis for comparison. Two of the models are semiparametric and the others are based on exponential distributions with or without extra sources of variability.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 88
    Publication Date: 2016-03-01
    Description: We consider generalized linear regression with a covariate left-censored at a lower detection limit. Complete-case analysis, where observations with values below the limit are eliminated, yields valid estimates for regression coefficients but loses efficiency; ad hoc substitution methods are biased; and parametric maximum likelihood estimation relies on parametric models for the unobservable tail probability distribution and may suffer from model misspecification. To obtain robust and more efficient results, we propose a semiparametric likelihood-based approach using an accelerated failure time model for the covariate subject to the detection limit. A two-stage estimation procedure is developed, where the conditional distribution of this covariate given other variables is estimated prior to maximizing the likelihood function. The proposed method outperforms complete-case analysis and substitution methods in simulation studies. Technical conditions for desirable asymptotic properties are provided.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 89
    Oxford University Press
    Publication Date: 2016-03-01
    Description: Inversion formulae are derived that express the density and distribution function of a ratio of random variables in terms of the joint characteristic function of the numerator and denominator. The resulting expressions are amenable to numerical evaluation and lead to simple asymptotic expansions. The expansions reduce to known results when the denominator is almost surely positive. Their accuracy is demonstrated with numerical examples.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 90
    Publication Date: 2016-03-01
    Description: We propose new nonparametric empirical Bayes methods for high-dimensional classification. Our classifiers are designed to approximate the Bayes classifier in a hypothesized hierarchical model, where the prior distributions for the model parameters are estimated nonparametrically from the training data. As is common with nonparametric empirical Bayes, the proposed classifiers are effective in high-dimensional settings even when the underlying model parameters are in fact nonrandom. We use nonparametric maximum likelihood estimates of the prior distributions, following the elegant approach studied by Kiefer & Wolfowitz in the 1950s. However, our implementation is based on a recent convex optimization framework for approximating these estimates that is well suited to large-scale problems. We derive new theoretical results on the accuracy of the approximate estimator, which help control the misclassification rate of one of our classifiers. We show that our methods outperform several existing methods in simulations and perform well when gene expression microarray data are used to classify cancer patients.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 91
    Publication Date: 2016-03-01
    Description: In many application areas, a primary focus is on assessing evidence in the data refuting the assumption of independence of $Y$ and $X$ conditionally on $Z$, with $Y$ response variables, $X$ predictors of interest, and $Z$ covariates. Ideally, one would have methods available that avoid parametric assumptions, allow $Y, X, Z$ to be random variables on arbitrary spaces with arbitrary dimension, and accommodate rapid consideration of different candidate predictors. As a formal decision-theoretic approach has clear disadvantages in this context, we instead rely on an encompassing nonparametric Bayes model for the joint distribution of $Y$, $X$ and $Z$, with conditional mutual information used as a summary of the strength of conditional dependence. We construct a functional of the encompassing model and empirical measure for estimation of conditional mutual information. The implementation relies on a single Markov chain Monte Carlo run under the encompassing model, with conditional mutual information for candidate models calculated as a byproduct. We provide an asymptotic theory supporting the approach, and apply the method to variable selection. The methods are illustrated through simulations and criminology applications.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 92
    Publication Date: 2016-03-01
    Description: We address the problem of testing for a parametric function of fixed effects in mixed models. We propose a test based on the distance between two empirical error distribution functions, which are constructed from residuals calculated under the opposing hypotheses. The proposed test statistic has power against all alternatives, and its asymptotic distribution is derived. A simulation study shows that the test outperforms others in the literature. The test is applied to longitudinal data from an AIDS clinical trial and a growth study.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 93
    Oxford University Press
    Publication Date: 2016-03-01
    Description: This article develops a unified framework to study the asymptotic properties of all periodic spline-based estimators, that is, of regression, penalized and smoothing splines. The explicit form of the periodic Demmler–Reinsch basis in terms of exponential splines allows the derivation of an expression for the asymptotic equivalent kernel on the real line for all spline estimators simultaneously. The corresponding bandwidth, which drives the asymptotic behaviour of spline estimators, is shown to be a function of the number of knots and the smoothing parameter. Strategies for the selection of the optimal bandwidth and other model parameters are discussed.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 94
    Publication Date: 2016-03-01
    Description: For multivariate functional data recorded from a sample of subjects on a common domain, one is often interested in the covariance between pairs of the component functions, extending the notion of a covariance matrix for multivariate data to the functional case. A straightforward approach is to integrate the pointwise covariance matrices over the functional time domain. We generalize this approach by defining the Fréchet integral, which depends on the metric chosen for the space of covariance matrices, and demonstrate that ordinary integration is a special case where the Frobenius metric is used. As the space of covariance matrices is nonlinear, we propose a class of power metrics as alternatives to the Frobenius metric. For any such power metric, the calculation of Fréchet integrals is equivalent to transforming the covariance matrices with the chosen power, applying the classical Riemann integral to the transformed matrices, and finally using the inverse transformation to return to the original scale. We also propose data-adaptive metric selection with respect to a user-specified target criterion, such as fastest decline of the eigenvalues, establish consistency of the proposed procedures, and demonstrate their effectiveness in a simulation. The proposed functional covariance approach through Fréchet integration is illustrated by a comparison of connectivity between brain voxels for normal subjects and Alzheimer's patients based on fMRI data.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
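The three-step recipe in the abstract above (transform with the chosen power, integrate, transform back) can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the function names `sym_power` and `frechet_integral_power`, the trapezoidal rule, and the division by the domain length (which has no effect on a unit-length domain) are all assumptions, and only positive powers are handled.

```python
import numpy as np

def sym_power(mat, power):
    """Matrix power of a symmetric positive semi-definite matrix via its eigendecomposition."""
    vals, vecs = np.linalg.eigh(mat)
    vals = np.clip(vals, 0.0, None)  # guard against tiny negative eigenvalues from rounding
    return (vecs * vals**power) @ vecs.T

def frechet_integral_power(cov_path, grid, alpha):
    """Hypothetical helper: power-metric Fréchet integral of covariance matrices on a time grid.

    cov_path: sequence of (p, p) covariance matrices, one per grid point.
    grid:     increasing time points.
    alpha:    positive power defining the metric (alpha = 1 gives ordinary averaging).
    """
    # Step 1: transform each covariance matrix with the chosen power.
    transformed = np.stack([sym_power(c, alpha) for c in cov_path])
    # Step 2: integrate the transformed matrices (trapezoidal rule, an assumption).
    widths = np.diff(grid)
    integral = 0.5 * np.einsum("i,ijk->jk", widths, transformed[:-1] + transformed[1:])
    # Assumption: normalise by the domain length so a constant path is returned unchanged.
    average = integral / (grid[-1] - grid[0])
    # Step 3: apply the inverse transformation to return to the original scale.
    return sym_power(average, 1.0 / alpha)
```

With `alpha = 1` the function reduces to the ordinary time-average of the covariance matrices, matching the abstract's remark that ordinary integration corresponds to the Frobenius metric.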
  • 95
    Oxford University Press
    Publication Date: 2016-03-01
    Description: We define mechanistic interaction between the effects of two variables on an outcome in terms of departure of these effects from a generalized noisy-OR model in a stratum of the population. We develop a fully probabilistic framework for the observational identification of this type of interaction via excess risk or superadditivity, one novel feature of which is its applicability when the interacting variables have been generated by arbitrarily dichotomizing continuous exposures. The method allows for stochastic mediators of the interacting effects. The required assumptions are provided in the form of conditional independencies between the problem variables, which may relate to a causal-graph representation of the problem. We also develop a theory of mechanistic interaction between effects associated with specific paths of the causal graph.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 96
    Publication Date: 2016-03-01
    Description: An adjustment for marginal composite likelihoods is derived to match the second-order theory of the likelihood when inference is for a vector-valued parameter in the absence of nuisance components. The adjustment overcomes the failure of Bartlett identities for marginal composite likelihoods and leads to a Bartlett-correctable marginal composite likelihood ratio statistic.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 97
    Publication Date: 2016-03-01
    Description: The Clayton–Oakes bivariate failure time model is extended to dimensions $m > 2$ in a manner that allows unspecified marginal survivor functions for all dimensions less than $m$. Special cases that allow unspecified marginal survivor functions of dimension $q$ or less with $q < m$, while making some provisions for dependencies of dimension greater than $q$, are also described.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 98
    Oxford University Press
    Publication Date: 2016-03-01
    Description: Multiple imputation is widely used for estimation in situations where there are missing data. Rubin (1987) provided an easily applicable formula for multiple imputation variance estimation, but its validity requires the congeniality condition of Meng (1994), which may not be satisfied for method of moments estimation. We give the asymptotic bias of Rubin's variance estimator when method of moments estimation is used in the complete-sample analysis for each imputed dataset. A new variance estimator based on over-imputation is proposed to provide asymptotically valid inference in this case.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
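For orientation, the formula of Rubin (1987) mentioned in the abstract above combines the per-dataset estimates as total variance = average within-imputation variance + (1 + 1/m) × between-imputation variance. The sketch below shows only this standard rule. It is not the over-imputation estimator the article proposes, and the function name `rubin_pool` and its arguments are hypothetical.

```python
from statistics import mean, variance

def rubin_pool(estimates, within_variances):
    """Pool m multiply-imputed point estimates with Rubin's combining rule.

    estimates:        point estimate from each imputed dataset.
    within_variances: estimated variance of each point estimate.
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or m != len(within_variances):
        raise ValueError("need at least two imputations and matching lengths")
    pooled = mean(estimates)           # average of the m point estimates
    within = mean(within_variances)    # average within-imputation variance
    between = variance(estimates)      # between-imputation variance, divisor m - 1
    total = within + (1 + 1 / m) * between
    return pooled, total
```

As the abstract notes, this variance estimate can be biased when the complete-sample analysis uses method of moments estimation, so the sketch should be read as the baseline being criticised rather than as a recommended procedure.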
  • 99
    Publication Date: 2015-11-28
    Description: An effect modifier is a pretreatment covariate that affects the magnitude of the treatment effect or its stability. When there is effect modification, an overall test that ignores an effect modifier may be more sensitive to unmeasured bias than a test that combines results from subgroups defined by the effect modifier. If there is effect modification, one would like to identify specific subgroups for which there is evidence of effect that is insensitive to small or moderate biases. In this paper, we propose an exploratory method for discovering effect modification, and combine it with a confirmatory method of simultaneous inference that strongly controls the familywise error rate in a sensitivity analysis, despite the fact that the groups being compared are defined empirically. A new form of matching, strength-$k$ matching, permits a search through more than $k$ covariates for effect modifiers, in such a way that no pairs are lost, provided that at most $k$ covariates are selected to group the pairs. In a strength-$k$ match, each set of $k$ covariates is exactly balanced, although a set of more than $k$ covariates may exhibit imbalance. We apply the proposed method to study the effects of the earthquake that struck Chile in 2010.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine
  • 100
    Publication Date: 2015-11-28
    Description: Several authors have investigated the challenges of statistical analyses and inference in the presence of early treatment termination, including a loss of efficiency in randomized controlled trials and a connection to dynamic regimes in observational studies. Popular estimation strategies for causal estimands in dynamic regimes lend themselves to studies where treatment is assigned at a finite number of points, and the extension to continuous treatment assignment is nontrivial. We re-examine this from a different perspective and propose a new estimator for the mean outcome of a target treatment length policy that does not involve a treatment model. Because this strategy avoids modelling the treatment assignment mechanism, the estimator works for both discrete and continuous treatment length data and eschews bias and imprecision that arise as a result of coarsening continuous time data into intervals. We show how the competition between treatment length assignment and the terminating event leads to a competing risks problem. We exemplify the direct estimator through numerical studies and the analysis of two real datasets. When all modelling assumptions for both the direct and inverse weighted estimators are correct, our simulation studies suggest that the direct estimator is more precise.
    Print ISSN: 0006-3444
    Electronic ISSN: 1464-3510
    Topics: Biology , Mathematics , Medicine