p*******0 (posts: 420)
The file Housing.txt contains a random sample of recently-sold houses in
two quadrants of a community.
The houses were very similar, differing in only two key respects: some
were located in the northwest quadrant of the
city (along the bordering river), while others were located in the
southeast quadrant of the city (along the bordering
mountains); and some were two-level structures while others were single-
level structures. The square footage (and
other key characteristics typically thought to be associated with sale
price) were, for all intents and purposes, identical
for the houses in the sample (and in the underlying population).
Each record in the file corresponds to one house in the sample, and
contains three pieces of information:
sales price (in $1,000s: Columns 1-4); quadrant (Column 6: S=Southeast,
N=Northwest); and number of levels
(Column 8: 1 or 2). Note that quadrant is specified in the file as a
capital letter.
Your task is to use these data to answer several questions. Please
justify your answer to each, and identify
where on the printout you looked to reach the conclusion regarding each
one:
Using a model without an interaction term:
(1) Is there reasonable basis to suspect that houses which border the
river differ in predicted sales
price from houses which border the mountains? If so, what is your best
estimate of the dollar value
of this difference?
(2) Is there reasonable basis to suspect that houses which have 1 level
differ in predicted sales price
from houses which have 2 levels? If so, what is your best estimate of
the dollar value of this
difference?
(3) What is the predicted difference in sales price between 1-level
houses bordering the river and 2-
level houses bordering the mountains? (For this question, a minimal
amount of hand computation
may be involved.)
Using a model with an interaction term:
(4) Is there reasonable basis to suspect that houses which border the
river differ in predicted sales
price from houses which border the mountains? If so, what is your best
estimate of the dollar value
of this difference?
(5) Is there reasonable basis to suspect that houses which have 1 level
differ in predicted sales price
from houses which have 2 levels? If so, what is your best estimate of
the dollar value of this
difference?
(6) Is there reasonable basis to suspect that the answer to Question (4)
differs depending on number of
levels? If so, what is your best estimate of the difference in
differences?
(7) Is there reasonable basis to suspect that the answer to Question (5)
differs depending on quadrant?
If so, what is your best estimate of the difference in differences?
(8) What is the predicted difference in sales price between 1-level
houses bordering the river and 2-
level houses bordering the mountains? Can we be reasonably confident
that this difference is
anything other than zero in the population?
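I wrote the SAS below following the instructor's handout. I think a t test
should be used, but this is the method the instructor covered in class, and
I don't know how to interpret the output it produces. Could some kind soul
help?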
/* Column input: price in columns 1-4, quadrant in column 6, levels in column 8 */
data lab5_data_Housing;
infile "D:\Documents\Dropbox\Books\lab\lab5.txt";
input price 1-4 quadrant $ 6 levels 8;
run;
/* Model without an interaction term (main effects only) */
proc glm data = lab5_data_Housing;
classes quadrant levels;
model price= quadrant levels/solution;
means quadrant levels ;
/* Predicted cell means; class levels sort as N, S and 1, 2 */
estimate "N, 1" intercept 1 quadrant 1 0 levels 1 0;
estimate "N, 2" intercept 1 quadrant 1 0 levels 0 1;
estimate "S, 1" intercept 1 quadrant 0 1 levels 1 0;
estimate "S, 2" intercept 1 quadrant 0 1 levels 0 1;
run;
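Questions (1)-(3) ask about differences rather than cell means, so ESTIMATE statements on the differences may be easier to read. A minimal sketch, assuming the data set lab5_data_Housing from the data step above and the default sorted class-level order (N before S, 1 before 2); the labels are made up for illustration:

```sas
proc glm data = lab5_data_Housing;
  class quadrant levels;
  model price = quadrant levels / solution;
  /* Question (1): N (river) minus S (mountains), number of levels held fixed */
  estimate "N - S" quadrant 1 -1;
  /* Question (2): 1 level minus 2 levels, quadrant held fixed */
  estimate "1 - 2 levels" levels 1 -1;
  /* Question (3): 1-level river house minus 2-level mountain house */
  estimate "N,1 - S,2" quadrant 1 -1 levels 1 -1;
run;
quit;
```

Each ESTIMATE row in the printout has the columns Estimate, Standard Error, t Value, and Pr > |t|, so the t test comes with the estimate. Price is in $1,000s, so the Estimate column has to be multiplied by 1,000 to get dollars.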
/* Model with an interaction term */
proc glm data = lab5_data_Housing;
classes quadrant levels;
model price= quadrant levels quadrant*levels/solution;
means quadrant levels quadrant*levels;
/* Predicted cell means; quadrant*levels coefficients are ordered N1, N2, S1, S2 */
estimate "N, 1" intercept 1 quadrant 1 0 levels 1 0 quadrant*levels 1 0 0 0;
estimate "N, 2" intercept 1 quadrant 1 0 levels 0 1 quadrant*levels 0 1 0 0;
estimate "S, 1" intercept 1 quadrant 0 1 levels 1 0 quadrant*levels 0 0 1 0;
estimate "S, 2" intercept 1 quadrant 0 1 levels 0 1 quadrant*levels 0 0 0 1;
run;
quit;
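For Questions (4)-(8) the same idea can be carried over to the interaction model. A minimal sketch, again assuming the data set lab5_data_Housing and the sorted cell order N1, N2, S1, S2 for quadrant*levels; the labels are hypothetical, and reading Questions (4) and (5) as differences averaged over the other factor is my assumption, not something the assignment states:

```sas
proc glm data = lab5_data_Housing;
  class quadrant levels;
  model price = quadrant levels quadrant*levels / solution;
  /* Question (4): N minus S, averaged over the two level counts */
  estimate "N - S (avg)" quadrant 1 -1 quadrant*levels 0.5 0.5 -0.5 -0.5;
  /* Question (5): 1 level minus 2 levels, averaged over the two quadrants */
  estimate "1 - 2 (avg)" levels 1 -1 quadrant*levels 0.5 -0.5 0.5 -0.5;
  /* Questions (6) and (7): difference in differences, (N1 - N2) - (S1 - S2) */
  estimate "diff in diff" quadrant*levels 1 -1 -1 1;
  /* Question (8): 1-level river house minus 2-level mountain house */
  estimate "N,1 - S,2" quadrant 1 -1 levels 1 -1 quadrant*levels 1 0 0 -1;
run;
quit;
```

The "diff in diff" row answers Questions (6) and (7) with a single number, since (N1 - S1) - (N2 - S2) and (N1 - N2) - (S1 - S2) are the same contrast. For Question (8), the Pr > |t| column on the last row is the place to check whether the difference is distinguishable from zero.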