Statistics W1111
Columbia University
Instructions for lab 2
Part 1:
Long-term interest rates drive much of the economic activity in the
U.S. When interest rates are low, people and establishments are
more likely to borrow money for purchasing homes or growing their
businesses. When interest rates are high, people and
establishments are less likely to do so. In this lab, we'll work
with economic data from the U.S. from 1980 to 1998 to explore
relationships involving interest rates.
Open the data file USeconstat.
The data are culled from the 1997 Organization
for Economic Cooperation and Development economic outlook
report. There are dozens of variables, but we'll only use a small
number. For convenience, all the relevant variables are among the
first few columns in the Stata file.
IMPORTANT CAVEAT FOR ALL ANALYSES:
One can look at many relationships with economic (or any) data.
It is tempting to assign causal explanations to those
relationships. This is risky. Just because there is
(or is not) a relationship between two variables, it does not mean
there is (or is not) a causal relationship between those
variables. There could
be many other factors that affect both variables, and these could
explain what is seen in the graphs.
Questions:
1) Describe the distribution of long-term interest rates.
That is, say where most values are, note any outliers, and say
whether the distribution is tightly packed around its mean or is spread
out. Also, report the mean and standard deviation.
2) Long-term interest rates right now are around 4.5%. Describe how 4.5%
compares to the historical record of long-term interest rates from 1980
to 1998.
3) Using the data, describe the relationship between long-term
and short-term interest rates. Include in your description
a one-number summary of the strength of the association between the two
variables.
4) Using the data, describe the relationship between long-term interest
rates and unemployment rates. Include in your description a
one-number summary of the strength of the association between the two
variables.
5) Using the data, describe the relationship between long-term
interest rates and gross domestic product (market prices).
Include in your description a one-number summary of the strength
of the association between the two variables.
6) Of the following two variables, which one has the weaker linear
association with long-term interest rates: (i) wage rate in business
sector; or (ii) net lending, government? Explain your choice in one
sentence.
7) Suppose you had a model that gave reasonable predictions about
long-term interest rates in the next year. (This is fantasy: interest
rates are notoriously hard to predict. You'd be a billionaire
many times over if you came up with a good prediction model.
Believe me, there are many statisticians and economists trying to do
so!) Suppose you predict that interest rates next year will be
6.0%. Use a regression line to predict gross domestic product
(market prices) for next year.
To fit a regression line, go to Statistics--Regression and Related--Linear
Regression. Select "Gross Domestic Product (Market Prices), Value" as the
dependent variable and "Interest rate, Long-term" as the X (independent)
variable. The intercept and slope of the regression line are the values in
the column of the table labeled Coef.: the slope is in the row for the
interest rate variable, and the intercept is in the row labeled _cons.
We'll talk about the values in the other columns, as well as the values
in the other tables, later in the course.
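If you're curious what the Coef. column is computing, here is a minimal sketch of a least-squares fit done by hand in Python. Python is not part of this lab, and the numbers are made up purely for illustration (they are NOT from USeconstat); the names fit_line, rates, and gdp are hypothetical. Use Stata's output, not this sketch, for your answers.

```python
# Minimal least-squares sketch. The numbers below are made up for
# illustration; they are NOT taken from USeconstat.
rates = [4.0, 5.0, 6.0, 7.0, 8.0]   # hypothetical X: long-term interest rate (%)
gdp = [9.0, 8.0, 7.5, 6.0, 5.5]     # hypothetical Y: GDP in arbitrary units


def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sum of squares, both about the means.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = fit_line(rates, gdp)

# A prediction is just the line evaluated at the chosen X value.
predicted = intercept + slope * 6.0

print(f"intercept = {intercept:.3f}, slope = {slope:.3f}")
print(f"predicted Y at X = 6.0: {predicted:.3f}")
```

The prediction step is the part that matters for question 7: plug the X value into intercept + slope * X.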
8) What's the slope of the regression line? Intercept?
Write down both the numerical values and their interpretations.
9) Does the scatter plot suggest any clearly non-linear relationships
in the data? Justify your answer in at most two sentences.
10) If interest rates were 1%, could you use the regression equation to
predict the corresponding gross domestic product (market prices)? If
you think so, write down the predicted value of GDP. If you think not,
explain why not in at most one sentence. ONLY WRITE ONE ANSWER:
WRITING BOTH ANSWERS GETS NO CREDIT.
11) Fit another model to predict long-term interest rates for the
next year. You can choose your independent (X) variable from any of the
other variables in the dataset. Write down the regression
equation, the slope, the intercept, and the interpretation of each.
12) Most of the time, there is more than one reasonable regression
model. How could you compare your model to the model from
questions (8) and (9)? What criteria could you use to say that
your model is "superior" to the model in the previous questions?
How could you test it? (Note: We'll talk about some methods
towards the end of the semester, so no need to actually do any
comparison now--just think about it and write down your thoughts.)
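For questions 1 and 3 through 6, it can help to see how the summary numbers are defined. The sketch below computes a mean, a sample standard deviation, and a correlation coefficient by hand in Python. Again, Python is not part of this lab, the values are made up for illustration (NOT from USeconstat), and the names long_rate, short_rate, sample_sd, and correlation are hypothetical.

```python
import math

# Made-up numbers for illustration only; NOT taken from USeconstat.
long_rate = [6.0, 7.0, 8.0, 9.0, 10.0]    # hypothetical long-term rates (%)
short_rate = [4.0, 5.5, 6.0, 8.0, 8.5]    # hypothetical short-term rates (%)


def mean(values):
    """Arithmetic mean."""
    return sum(values) / len(values)


def sample_sd(values):
    """Sample standard deviation (divides by n - 1)."""
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))


def correlation(x, y):
    """Pearson correlation coefficient r, always between -1 and 1."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


print(f"mean = {mean(long_rate):.3f}, sd = {sample_sd(long_rate):.3f}")
print(f"r = {correlation(long_rate, short_rate):.3f}")
```

Values of r near 1 or -1 indicate a strong linear association; values near 0 indicate a weak one. Remember the caveat at the top of the lab: a strong r says nothing about causation.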
Part 2: The Correlation Challenge.
Click here for the Guess the Correlations game.
Play it at least three
times against a classmate. Don't forget to talk trash if you win.
You don't have to write anything down for this part of the lab.
If you're feeling really cocky, challenge the TA.
And if you feel like you need to be humbled, come challenge me.
If you beat me at the correlation game, I will buy you coffee.
If I win, I will do some serious gloating.