|
Semester Schedule: Statistics - Spring 2008
Seminars are on Mondays
Time: 12:00 - 1:00 PM
Location: Room 903, 1255 Amsterdam Avenue
Tea and coffee will be served before the seminar at 11:30 AM in Room 1025
|
|
Dr. Per Kragh Andersen
Department of Biostatistics, University of Copenhagen
(Joint seminar with the department of statistics & PI)
Title:
Checking hazard regression models using pseudo-observations
Graphical methods for model diagnostics are an essential part of the model fitting procedure. However, in survival analysis the plotting is always hampered by the presence of censoring. While model-specific solutions do exist and are commonly used, mainly for the Cox regression model, we present a more general approach that covers all models using the same framework. The pseudo-observations enable us to calculate residuals for each individual at each time point regardless of censoring and provide methods for simultaneously checking all the assumptions of both the Cox and the additive model. We introduce methods for single as well as multiple covariate cases and complement them with corresponding goodness-of-fit tests. The methods are illustrated on simulated as well as real data examples.
The talk is based on joint work with Maja Pohar Perme, Department of Biomedical Informatics, University of Ljubljana, Slovenia.
*Location: the PI multipurpose room
|
|
Dr. Eric Tchetgen, Harvard University
"Higher order influence functions for causal and missing data models"
We describe a novel approach for making nonparametric valid inferences on nonlinear functionals on high-dimensional models. Our class of estimators equally applies to the regular case where the functional of interest is root-n estimable, and to the nonregular case where much slower nonparametric rates prevail. We illustrate our approach in two important examples: the construction of an honest confidence interval for a treatment effect in the presence of many confounders and/or under very low smoothness conditions, and an optimal model selection procedure for nonparametric regression with outcome missing at random given a large vector of observed auxiliary covariates.
|
| TBA |
TBA
|
|
Dr. Robert Adler, Technion Israel
"FROM STATISTICS TO TOPOLOGY AND BACK AGAIN"
We shall start by briefly discussing some statistical problems related to the structure of the primordial universe, as seen through the Nobel Prize-winning cosmic microwave background (COBE) data.
The next step will be to turn this into an abstract problem related to the (integral and differential) geometry generated by Gaussian random processes on manifolds. Out of this will come extensions to Riemannian manifolds of the famous Kinematic Fundamental Formula of classical, Euclidean, integral geometry, as well as the related Crofton Formula. In the end we shall see how these results shed new light on excursion probabilities for smooth Gaussian processes, and even how they are relevant to analysing the COBE data.
|
| TBA |
|
Dr. Matthew Schofield, University of Otago
Capture-recapture models have long been used to give biologists information about population dynamics. In the past 10-20 years there has been a proliferation of capture-recapture methods based primarily around increasingly complex data. We show how thinking of the capture-recapture experiment in terms of a missing data problem: (i) makes use of all available data, (ii) helps to unify the abundance of models into a common framework and (iii) makes estimation of models that describe complex population dynamics more accessible to biologists.
|
|
Dr. Siddhartha Dalal, RAND
"Information Mining and Services Research: it's not computing prowess alone."
There is a dramatic shift in the world economy towards services. Because services span heterogeneous organizations, technologies, processes and people, there is an associated need for a vast amount of information processing. In spite of impressive gains, there are still many technological challenges that cannot be tackled by computational prowess alone. I will describe some of these challenges through specific examples from widely different domains like search engines, detection of illicit nuclear material at ports, and software engineering. On the surface, traditional information theoretic considerations do not offer solutions. Accordingly, researchers looking for conventional solutions would have difficulty in solving these problems. I will describe how alternative formulations based on statistical underpinnings, including Bayesian methods, sequential stopping and combinatorial designs, have played a critical role in addressing these challenges.
Biographical Sketch: Siddhartha Dalal is the Senior Technology Adviser to the President at RAND. Sid’s industrial research career began at the Math Research Center at Bell Labs, followed by Bellcore/Telcordia Technologies. Most recently he was a vice president of research at Xerox. He has co-authored seventy publications, several patents and two NRC reports covering the areas of risk analysis, econometrics modeling, image processing, stochastic optimization, data/document mining, software engineering and Bayesian methods.
|
|
Spring Break
|
|
Dr. Douglas G. Simpson, University of Illinois at Urbana-Champaign
"Semiparametric Analysis of Heterogeneous Data Using Varying-Scale Generalized Linear Models"
A class of heteroscedastic generalized linear regression models is developed in which a subset of the regression parameters are scaled nonparametrically. Efficient semiparametric inferences are derived for the parametric components of the models. Bootstrap tests for scale heterogeneity are also developed. The models provide an approach to adapt for heterogeneity in the data due to factors such as varying exposures and varying levels of aggregation. The methodology is illustrated with simulations, published data and data from collaborative research on ultrasound safety.
|
|
Dr. Hernando Ombao, Brown University
"Spectral Analysis of Brain Signals"
In many neuroscience experiments, one of the key goals is to investigate the oscillatory behavior of brain signals as quantified by spectral analysis. First, we review some basic ideas of Fourier analysis of stationary time series and highlight its connection to analysis of variance. Second, we discuss current models and methods for analyzing non-stationary processes (i.e., processes whose spectral decomposition changes over time). Stochastic representations using localized basis functions will be discussed. The talk will conclude with some current investigations including spatio-temporal-spectral analysis and classification of biological signals. These methods will be illustrated using electroencephalograms (EEGs) and magnetoencephalograms (MEGs).
This talk is based on collaborations with P. Fryzlewicz (University of Bristol, UK), R. Ho (Nanyang Technological University, Singapore) and C. Edgar (University of Pennsylvania).
|
|
Dr. Wei Biao Wu, University of Chicago
"Construction of simultaneous confidence bands in time series"
I will talk about statistical inference of trends in mean non-stationary models, and mean regression and conditional variance (or volatility) functions in nonlinear stochastic regression models. Simultaneous confidence bands are constructed and the coverage probabilities are shown to be asymptotically correct. The simultaneous confidence bands are useful for model specification problems in nonlinear time series. The results are applied to environmental and financial data sets.
The talk will be based on joint papers with Zhibiao Zhao.
|
|
Dr. Andrew Lawson, University of South Carolina
"Space-time modeling of small area health data via latent structures."
Abstract:
In the assessment of the linkage between environmental risk
gradients and health outcomes there is often a need to consider the possibility
that risk is multi-faceted. Many models for disease risk in the spatial or
spatio-temporal domain regard the map as being defined by global parameters
with single underlying components. However, it is commonly true that unknown
risk structures (multiple risk gradients, or population subgroups) could
interweave to result in a single realization of disease. In this talk I will
first discuss the context of spatio-temporal modeling of disease. I will then
describe two novel Bayesian approaches to the analysis of latent structure in space-time
disease maps: 1) time-dependent latent mixture components with spatial weights;
2) covariate spline interaction models with mixture prior distributions. A comparison will be made with a standard
Bayesian space-time model. Issues of identifiability and the appropriateness of
DIC will be discussed.
Reference:
Lawson, A. B. (2008). Bayesian Disease Mapping: Hierarchical Modeling in Spatial Epidemiology. CRC Press (ch. 11).
|
|
Dr. Ed Ionides, Ann Arbor, Michigan
"Time series analysis via mechanistic models"
The purpose of time series analysis via mechanistic models is to reconcile the known or hypothesized structure of a dynamical system with observations collected over time. We develop a framework for constructing nonlinear mechanistic models and carrying out inference. Our framework permits the consideration of implicit dynamic models, meaning statistical models for stochastic dynamical systems which are specified by a simulation algorithm to generate sample paths. Inference procedures that operate on implicit models are said to have the plug-and-play property. Our work builds on recently developed plug-and-play inference methodology for partially observed Markov models. We introduce a class of implicitly specified Markov chains with stochastic transition rates, and we demonstrate its applicability to open problems in statistical inference for biological systems. As one example, these models are shown to give a fresh perspective on measles transmission dynamics. As a second example, we present a mechanistic analysis of cholera incidence data, involving interaction between two competing strains of the pathogen Vibrio cholerae.
In collaboration with Carles Breto, Daihai He and Aaron King.
|
| |
|
Dr. Katherine Pollard, UC Davis Genome Center & Department of Statistics
"Rapid evolution in the human genome"
Comparative genomics is a powerful approach to investigating the genetic basis for what makes us human. I will describe three different methods we have developed for identifying lineage-specific evolution: a phylogenetic hidden Markov model (phylo-HMM), an empirical Bayes phylogenetic p-value (phyloP), and a likelihood ratio test (LRT). With a reasonable number of sequenced species, phyloP and the LRT have sufficient power to detect evolutionary forces operating on single bases of DNA in a specific clade of interest (e.g. primates). In contrast, the phylo-HMM is useful for learning which clade has experienced lineage-specific evolution, but requires this evolution to have acted on a region tens of bases long. I will describe simulations comparing the power of these and several other published methods in a variety of evolutionary scenarios. Then, I will describe results of applying our methods to multiple sequence alignments of the human and other vertebrate genomes. I will focus on our discovery of 202 Human Accelerated Regions (HARs) that were extensively changed in the last ~6 million years since divergence from our common ancestor with chimpanzee.
|
 |
|