# Selected Research Projects

We develop classification methods and their corresponding theory in various contexts.

We study statistical aspects of network data, including modeling, estimation, and algorithms.

We study nonparametric estimation methods with applications.

We investigate the tuning parameter selection problem in various contexts.

We propose different methods for performing variable selection in various contexts.

# Selected Publications

(2017). How Many Communities Are There? Journal of Computational and Graphical Statistics.

(2017). Model Selection for High Dimensional Quadratic Regression via Regularization. Journal of the American Statistical Association.

(2016). Neyman-Pearson Classification under High-Dimensional Settings. Journal of Machine Learning Research.

(2011). Nonparametric Independence Screening in Sparse Ultra-High Dimensional Additive Models. Journal of the American Statistical Association.

# Recent Manuscripts

(2018). Large-Scale Model Selection with Misspecification. arXiv preprint arXiv:1803.07418.

(2018). Partial Distance Correlation Screening for High Dimensional Time Series. arXiv preprint arXiv:1802.09116.

(2018). Sparse Linear Discriminant Analysis under the Neyman-Pearson Paradigm. arXiv preprint arXiv:1802.02557.

(2017). A note on estimation in a simple probit model under dependency. arXiv preprint arXiv:1712.09694.

# Software

• nproc: given samples from class 0 and class 1 and a classification method, the package generates the corresponding Neyman-Pearson classifier with pre-specified type-I error control, along with Neyman-Pearson Receiver Operating Characteristic (NP-ROC) bands. Relevant paper.
• SIS: an R package for implementing different Sure Independence Screening methods. Relevant paper.
• RAMP: an R package for fitting the entire solution path for high-dimensional regularized generalized linear models with interaction effects under the strong heredity constraint. Relevant paper.
• FANS: MATLAB code for implementing the FANS (Feature Augmentation via Nonparametrics and Selection) classification method for high-dimensional data. Relevant paper.
• CLBIC: R code for implementing Composite Likelihood BIC for selecting the number of communities. Relevant paper.
• apple: an R package for calculating the Approximate Path for Penalized Likelihood Estimators for Generalized Linear Models. Relevant paper.
• ROAD: a MATLAB package designed for the Regularized Optimal Affine Discriminant method for high-dimensional classification. Relevant paper.
• xtab: an R function for generating LaTeX tables from a data matrix.

# Recent Posts

Random thoughts and notes

# Teaching

I have been teaching the following courses at Columbia University.

• GR6102: Statistical Modeling and Data Analysis (II)
• GR6101: Statistical Modeling and Data Analysis (I)
• W2024: Applied Linear Regression Analysis
• W4315: Linear Regression Models
• G8325: Advanced Topics in Statistics (Statistical Analysis for Network Data)
• G8325: Advanced Topics in Statistics (High-dimensional Variable Selection)
• W1211: Introduction to Statistics (with calculus)