Semester Schedule: Statistics - Spring 2008

Seminars are on Mondays
Time: 12:00 - 1:00 PM

Location: Room 903, 1255 Amsterdam Avenue

Tea and coffee will be served before the seminar at 11:30 AM, Room 1025

*Tuesday, January 22

*Time: 4:00 PM

Dr. Per Kragh Andersen


Department of Biostatistics, University of Copenhagen

(Joint seminar with the department of statistics & PI)

Title: Checking hazard regression models using pseudo-observations

Graphical methods for model diagnostics are an essential part of the model fitting procedure. However, in survival analysis the plotting is always hampered by the presence of censoring. While model-specific solutions do exist and are commonly used, mainly for the Cox regression model, we present a more general approach that covers all models using the same framework. The pseudo-observations enable us to calculate residuals for each individual at each time point regardless of censoring and provide methods for simultaneously checking all the assumptions of both the Cox and the additive model.

We introduce methods for single as well as multiple covariate cases and complement them with corresponding goodness of fit tests. The methods are illustrated on simulated as well as real data examples.

The talk is based on joint work with Maja Pohar Perme, Department of Biomedical Informatics, University of Ljubljana, Slovenia.

*Location: the PI multipurpose room

 

January 28

Dr. Eric Tchetgen, Harvard University

"Higher order influence functions for causal and missing data models"

We describe a novel approach for making nonparametric valid inferences on nonlinear functionals on high dimensional models. Our class of estimators equally applies to the regular case where the functional of interest is root-n estimable, and to the nonregular case where much slower nonparametric rates prevail. We illustrate our approach in two important examples: the construction of an honest confidence interval for a treatment effect in the presence of many confounders and/or under very low smoothness conditions, and an optimal model selection procedure for nonparametric regression with outcome missing at random given a large vector of observed auxiliary covariates.

 

February 4

 

TBA

February 11

TBA

February 18

Dr. Robert Adler, Technion Israel

"FROM STATISTICS TO TOPOLOGY AND BACK AGAIN"

 

We shall start by briefly discussing some statistical problems related to the structure of the primordial universe, as seen through the Nobel Prize-winning cosmic microwave background (COBE) data.


The next step will be to turn this into an abstract problem related to the (integral and differential) geometry generated by Gaussian random processes on manifolds. Out of this will come extensions to Riemannian manifolds of the famous Kinematic Fundamental Formula of classical, Euclidean, integral geometry, as well as the related Crofton Formula. In the end we shall see how these results shed new light on excursion probabilities for smooth Gaussian processes, and even how they are relevant to analysing the COBE data.

 

 

February 25

TBA

March 3

Dr. Matthew Schofield, University of Otago

Capture-recapture models have long been used to give biologists information about population dynamics. In the past 10-20 years there has been a proliferation of capture-recapture methods based primarily around increasingly complex data. We show how thinking of the capture-recapture experiment in terms of a missing data problem: (i) makes use of all available data, (ii) helps to unify the abundance of models into a common framework and (iii) makes estimation of models that describe complex population dynamics more accessible to biologists.

March 10

Dr. Siddhartha Dalal, RAND

"Information Mining and Services Research: it's not computing prowess alone."

There is a dramatic shift in the world economy towards services. Because the services transcend across heterogeneous organizations, technologies, processes and people, there is an associated need for a vast amount of information processing. In spite of impressive gains, there are still many technological challenges that cannot be tackled by computational prowess alone. I will describe some of these challenges through specific examples from widely different domains like search engines, detection of illicit nuclear material at ports, and software engineering. On the surface, traditional information theoretic considerations do not offer solutions. Accordingly, researchers looking for conventional solutions would have difficulty in solving these problems. I will describe how alternative formulations based on statistical underpinnings including Bayesian methods, sequential stopping and combinatorial designs have played a critical role in addressing these challenges.

Biographical Sketch: Siddhartha Dalal is the Senior Technology Adviser to the President at RAND. Sid’s industrial research career began at the Math Research Center at Bell Labs, followed by Bellcore/Telcordia Technologies. Most recently he was a vice president of research at Xerox. He has co-authored seventy publications, several patents and two NRC reports covering the areas of risk analysis, econometrics modeling, image processing, stochastic optimization, data/document mining, software engineering and Bayesian methods.

 

March 17

Spring Break

March 24

Dr. Douglas G. Simpson, University of Illinois at Urbana-Champaign

"Semiparametric Analysis of Heterogeneous Data Using Varying-Scale Generalized Linear Models"

A class of heteroscedastic generalized linear regression models is developed in which a subset of the regression parameters are scaled nonparametrically. Efficient semiparametric inferences are derived for the parametric components of the models. Bootstrap tests for scale heterogeneity are also developed. The models provide an approach to adapt for heterogeneity in the data due to factors such as varying exposures and varying levels of aggregation. The methodology is illustrated with simulations, published data and data from collaborative research on ultrasound safety.

 

March 31

Dr. Hernando Ombao, Brown University

"Spectral Analysis of Brain Signals"

In many neuroscience experiments, one of the key goals is to investigate the oscillatory behavior of brain signals as quantified by spectral analysis. First, we review some basic ideas of Fourier analysis of stationary time series and highlight its connection to analysis of variance. Second, we discuss current models and methods for analyzing non-stationary processes (i.e., processes whose spectral decomposition changes over time). Stochastic representations using localized basis functions will be discussed. The talk will conclude with some current investigations including spatio-temporal-spectral analysis and classification of biological signals. These methods will be illustrated using electroencephalograms (EEGs) and magnetoencephalograms (MEGs).

This talk is based on collaborations with P. Fryzlewicz (University of Bristol, UK), R. Ho (Nanyang Technological University, Singapore) and C. Edgar (University of Pennsylvania).

 

April 7

Dr. Wei Biao Wu, University of Chicago

"Construction of simultaneous confidence bands in time series"

I will talk about statistical inference of trends in mean non-stationary models, and mean regression and conditional variance (or volatility) functions in nonlinear stochastic regression models. Simultaneous confidence bands are constructed and the coverage probabilities are shown to be asymptotically correct. The simultaneous confidence bands are useful for model specification problems in nonlinear time series. The results are applied to environmental and financial data sets.

The talk will be based on joint papers with Zhibiao Zhao.

 

 

April 14

Dr. Andrew Lawson, University of South Carolina

"Space-time modeling of small area health data via latent structures."

Abstract:

In the assessment of the linkage between environmental risk gradients and health outcomes there is often a need to consider the possibility that risk is multi-faceted. Many models for disease risk in the spatial or spatio-temporal domain regard the map as being defined by global parameters with single underlying components. However, it is commonly true that unknown risk structures (multiple risk gradients, or population subgroups) could interweave to result in a single realization of disease. In this talk I will first discuss the context of spatio-temporal modeling of disease. I will then describe two novel Bayesian approaches to the analysis of latent structure in space-time disease maps: 1) time-dependent latent mixture components with spatial weights; 2) covariate spline interaction models with mixture prior distributions. A comparison will be made with a standard Bayesian space-time model. Issues of identifiability and the appropriateness of DIC will be discussed.

 

 

Reference:

 

Lawson, A. B. (2008). Bayesian Disease Mapping: Hierarchical Modeling in Spatial Epidemiology. CRC Press (ch. 11).

 

April 21

 

Dr. Ed Ionides, Ann Arbor, Michigan

"Time series analysis via mechanistic models"

The purpose of time series analysis via mechanistic models is to reconcile the known or hypothesized structure of a dynamical system with observations collected over time. We develop a framework for constructing nonlinear mechanistic models and carrying out inference. Our framework permits the consideration of implicit dynamic models, meaning statistical models for stochastic dynamical systems which are specified by a simulation algorithm to generate sample paths. Inference procedures that operate on implicit models are said to have the plug-and-play property. Our work builds on recently developed plug-and-play inference methodology for partially observed Markov models. We introduce a class of implicitly specified Markov chains with stochastic transition rates, and we demonstrate its applicability to open problems in statistical inference for biological systems. As one example, these models are shown to give a fresh perspective on measles transmission dynamics. As a second example, we present a mechanistic analysis of cholera incidence data, involving interaction between two competing strains of the pathogen Vibrio cholerae.

In collaboration with Carles Breto, Daihai He and Aaron King

 

April 28

 

May 5

Dr. Katherine Pollard, UC Davis Genome Center & Department of Statistics

"Rapid evolution in the human genome"

Comparative genomics is a powerful approach to investigating the genetic basis for what makes us human. I will describe three different methods we have developed for identifying lineage-specific evolution: a phylogenetic hidden Markov model (phylo-HMM), an empirical Bayes phylogenetic p-value (phyloP), and a likelihood ratio test (LRT). With a reasonable number of sequenced species, phyloP and the LRT have sufficient power to detect evolutionary forces operating on single bases of DNA in a specific clade of interest (e.g. primates). In contrast, the phylo-HMM is useful for learning which clade has experienced lineage-specific evolution, but requires this evolution to have acted on a region tens of bases long. I will describe simulations comparing the power of these and several other published methods in a variety of evolutionary scenarios. Then, I will describe results of applying our methods to multiple sequence alignments of the human and other vertebrate genomes. I will focus on our discovery of 202 Human Accelerated Regions (HARs) that were extensively changed in the last ~6 million years since divergence from our common ancestor with chimpanzee.
