Project 03 (due 12/13/10)

NYCStat is the City's one-stop shop for all essential data, reports, and statistics related to City services. Analyze the city data using the statistical techniques you are learning in this course.

At NYC Data Mine, you can download city data in numerous categories (Education, Media, Community Service, etc.). For example, you can download the results of a 2009 school survey with ratings in categories such as parental academic expectations, teacher academic expectations, student engagement, and safety. As an example analysis, you could select a random sample of schools and compare the mean academic expectations score of parents with that of teachers. Or you could select random samples of elementary schools and high schools, construct a 2x2 table (school type by whether the parents' academic expectations score is greater or less than a certain score), and compare the two types of schools.
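The 2x2 comparison described above can be sketched in a few lines of Python. This is only an illustration, assuming a Pearson chi-squared test without a continuity correction; the function name `chi_squared_2x2` and the counts are hypothetical and are not taken from the survey data.

```python
import math

def chi_squared_2x2(table):
    """Pearson chi-squared test (no continuity correction) for a 2x2 table.

    table is [[a, b], [c, d]] of observed counts.
    Returns (statistic, p_value) for 1 degree of freedom.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if school type and score category were independent
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, P(chi2 > x) equals erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value

# Hypothetical counts, NOT real survey results.
# Rows: school type. Columns: parent score above the cutoff / at or below it.
counts = [[18, 12],   # elementary schools
          [10, 20]]   # high schools
stat, p = chi_squared_2x2(counts)
print(f"chi-squared = {stat:.3f}, p-value = {p:.4f}")
```

A small p-value would suggest that the proportion of schools above the cutoff differs between the two school types; the cutoff itself is a choice you should state and justify.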

At My Neighborhood Statistics, you can view statistics and data for your community district. You could select a random sample of districts and look at the relationship between, say, "Major felony crimes" and "unemployment rate" (unemployment rate data can be downloaded from the NYC Data Mine site under Demographic, Social, Economic, and Housing Profiles by Community District).
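As a rough sketch of how such a relationship could be summarized, the following Python fits a least-squares regression line and computes the correlation coefficient. The function name `least_squares` and all of the numbers are hypothetical placeholders; substitute the values for your own sampled districts.

```python
import math

def least_squares(xs, ys):
    """Return (slope, intercept, r) for the least-squares line y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)  # correlation coefficient
    return slope, intercept, r

# Hypothetical values for five sampled districts, NOT real data.
unemployment_rate = [5.0, 7.0, 9.0, 11.0, 13.0]   # percent
major_felonies = [10.0, 14.0, 15.0, 21.0, 25.0]   # crime measure of your choice

slope, intercept, r = least_squares(unemployment_rate, major_felonies)
print(f"felonies = {intercept:.2f} + {slope:.2f} * unemployment, r = {r:.3f}")
```

A scatter plot of the same pairs, with the fitted line drawn through it, is the natural graphical companion to these numbers.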

You can generate the random numbers needed to select your random sample by using this online random number generator.
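If you prefer to generate the numbers yourself, a minimal Python sketch follows. It assumes the rows of your spreadsheet are numbered 1 through the population size; the helper name `pick_rows`, the population size of 1200, the sample size of 30, and the seed are all made-up values for illustration.

```python
import random

def pick_rows(population_size, sample_size, seed=None):
    """Choose sample_size distinct row numbers from 1..population_size."""
    rng = random.Random(seed)  # a fixed seed makes the selection reproducible
    return sorted(rng.sample(range(1, population_size + 1), sample_size))

# Hypothetical sizes: 1200 rows in the spreadsheet, sample of 30.
rows = pick_rows(population_size=1200, sample_size=30, seed=2010)
print(rows)
```

Recording the seed in your write-up lets a reader regenerate exactly the same sample.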

Complete three pieces of analysis including the use of two of the following statistical techniques: linear regression (chapter 10), comparison of means using a confidence interval OR hypothesis testing (chapters 21 and 23), and a chi-squared test for a 2x2 table (chapter 13). Comment on and interpret the results, including graphical presentation of the data and the results where appropriate. Where your analysis involves sample means or sample proportions computed from samples drawn from population data, also compute the true population means or proportions (for example, the true population mean academic expectations score of parents). Assess the accuracy of your results by comparing them with the true population parameters.
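The comparison between a sample estimate and the true population value might look like the sketch below. It uses a synthetic population generated in the code as a stand-in for a full data column, and an approximate confidence interval based on the normal distribution; for small samples the t-based interval from your textbook would be somewhat wider. The function name `mean_confidence_interval` and every number here are assumptions for illustration only.

```python
import math
import random
import statistics

def mean_confidence_interval(sample, confidence=0.95):
    """Approximate (normal-based) confidence interval for a population mean.

    Returns (sample_mean, lower_bound, upper_bound).
    """
    n = len(sample)
    mean = statistics.mean(sample)
    std_err = statistics.stdev(sample) / math.sqrt(n)
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean, mean - z * std_err, mean + z * std_err

rng = random.Random(3)
# Synthetic stand-in for the full column of scores (NOT real survey data)
population = [round(rng.gauss(7.5, 1.0), 1) for _ in range(1000)]
true_mean = statistics.mean(population)

sample = rng.sample(population, 40)  # simple random sample of 40 "schools"
mean, low, high = mean_confidence_interval(sample)

print(f"sample mean = {mean:.2f}, 95% CI = ({low:.2f}, {high:.2f})")
print(f"true population mean = {true_mean:.2f}")
print("interval covers the true mean:", low <= true_mean <= high)
```

Over many repeated samples, roughly 95% of such intervals should cover the true mean, so an occasional miss is expected rather than a sign of error.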

Note: Here are some Excel tips for calculating the regression line equation, means and standard deviations, and the results of a chi-squared test.

Other Datasets downloaded from NYC Data Mine (only attempt to download datasets with file type XLS)

Directory of Parks Disability Accessibility Facilities and Programs

Nature Preserves

Department of Youth and Community Development (DYCD) Programs

Graffiti