Statistics G6101: Statistical Modeling for Data Analysis
Project
The default project concerns the Federalist essays. Take a look at this material. The goal of the project is to use logistic regression or tree-based methods to assign the disputed essays to either Hamilton or Madison. You must make a convincing case as to why you believe your approach leads to the correct attribution. I tend to favor out-of-sample predictive accuracy as measured by 10-fold cross-validation, but this is certainly not the only possibility.
I would like you to complete your project report by the end of the semester so that I can assign grades.
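For anyone unsure what 10-fold cross-validation of a logistic regression involves, here is a minimal sketch. It is written in plain Python rather than the R we use in class, and it runs on synthetic stand-in data, not the Federalist essays. The two features, the helper names (`fit_logistic`, `kfold_indices`, `cv_accuracy`), and the learning-rate settings are all assumptions made for illustration. In a real project the features might be something like word-usage rates, and you would use a proper fitting routine.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, lr=0.1, epochs=200):
    """Fit weights (intercept first) by batch gradient ascent on the log-likelihood."""
    w = [0.0] * (len(X[0]) + 1)
    n = len(X)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
            err = yi - p
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj + lr * g / n for wj, g in zip(w, grad)]
    return w

def predict(w, x):
    # Classify as 1 when the fitted probability is at least 0.5.
    p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))
    return 1 if p >= 0.5 else 0

def kfold_indices(n, k, seed=0):
    # Shuffle the indices once, then deal them into k disjoint folds.
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cv_accuracy(X, y, k=10, seed=0):
    """Fraction of observations classified correctly while held out."""
    folds = kfold_indices(len(X), k, seed)
    correct = 0
    for held_out in folds:
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        # Refit on the other k-1 folds only, so the held-out fold is unseen.
        w = fit_logistic([X[i] for i in train], [y[i] for i in train])
        correct += sum(predict(w, X[i]) == y[i] for i in held_out)
    return correct / len(X)

# Synthetic placeholder data (NOT the Federalist essays): two classes whose
# first feature is shifted apart; the second feature is pure noise.
rng = random.Random(1)
X, y = [], []
for i in range(100):
    label = i % 2
    center = 1.5 if label else -1.5
    X.append([rng.gauss(center, 1.0), rng.gauss(0.0, 1.0)])
    y.append(label)

folds = kfold_indices(len(X), 10)
acc = cv_accuracy(X, y)
print(f"10-fold CV accuracy: {acc:.2f}")
```

The point to notice is that the model is refit from scratch for each fold, so every essay is predicted by a model that never saw it. The same loop works unchanged if you swap the logistic fit for a tree-based classifier.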
Reading: Chapter 11
The set of R examples I started in class on Monday, Sept. 12th is here. This is from the Venables and Ripley book. I strongly recommend that you go through these yourself. Here is a nice brief introduction to R.
Reading: We are working through Chapters 5 and 6 now. The basic regression material I went through on Monday is summarized nicely here.