Consider the following simple linear regression model:
PRICE = β0 + β1 POOL + ε
In the data set, verify that POOL is 1 if a house has a swimming pool, 0 otherwise. Using the proc ttest output, complete the table below. Use only the proc ttest output.
I was trained to use PROC REG on a question like this, but I was asked to use PROC TTEST. I think this should be my code, since "1" means there is a pool and "0" means there isn't a pool:
proc ttest data=sarah.homeprices;
var price;
class pool;
run;
The problem is that when I run PROC TTEST, I don't see the model parameters or point estimates that PROC REG reports.
Yes. Most estimates are "point estimates" with the exception of confidence intervals, which we call "interval estimates."
The equivalent regression code is not PROC REG because PROC REG does not support a CLASS statement. The equivalent regression procedure is PROC GLM. Compare your TTEST output to the output of this call to PROC GLM:
proc GLM data=sarah.homeprices plots=none;
class pool;
model price = pool / solution;
quit;
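When POOL=1, the mean is 163050. When POOL=0, the mean is 169672. Therefore, the difference (POOL=1 minus POOL=0) is -6622. This is the point estimate of β1. PROC TTEST reports the difference in the opposite order (POOL=0 minus POOL=1), so when you read the confidence interval from the table, change the sign of each limit and swap the lower and upper limits.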
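As a rough check on the arithmetic above, here is a minimal sketch that maps the two group means onto the model parameters. It assumes the usual result for a regression on a single 0/1 indicator: the intercept is the mean of the POOL=0 group, and the slope is the POOL=1 mean minus the POOL=0 mean. The names mean_pool0, mean_pool1, b0, and b1 are made up for illustration; the two means are the values quoted above.

```sas
/* Sketch only: turn the two group means from PROC TTEST into b0 and b1. */
data _null_;
   mean_pool1 = 163050;            /* mean PRICE when POOL=1 (from the output above) */
   mean_pool0 = 169672;            /* mean PRICE when POOL=0 (from the output above) */
   b0 = mean_pool0;                /* intercept: mean of the POOL=0 group            */
   b1 = mean_pool1 - mean_pool0;   /* slope: POOL=1 mean minus POOL=0 mean           */
   put b0= b1=;                    /* writes b0=169672 b1=-6622 to the log           */
run;
```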
Neither procedure has syntax for the term β2(SIZE − 2200). You would need to create a new variable (SIZE2200 = SIZE - 2200) and then use that variable in the model.
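One way to create the centered variable might look like the sketch below. It assumes that SIZE is a numeric variable in sarah.homeprices, and it writes the result to a new data set in the WORK library (called NEWPRICES here) so that the original data set is left untouched.

```sas
/* Sketch only: copy the original data and add the centered size variable. */
data newprices;                 /* new WORK data set; sarah.homeprices is not modified */
   set sarah.homeprices;        /* keeps all of the original variables                 */
   size2200 = size - 2200;      /* assumes SIZE is numeric                             */
run;

/* Confirm that NEWPRICES has the original variables plus SIZE2200. */
proc contents data=newprices;
run;
```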
In general, PROC GLM can fit all the linear models that PROC REG fits. GLM is more useful when you have classification variables that contain more than two levels or when you have character variables.
> for some reason it's making a new dataset with just the variable size2200
??? The NEWPRICES data set should contain all the original variables, plus the new SIZE2200 variable.
Run
PROC CONTENTS data=sarah.homeprices; run;
Is it possible that you accidentally overwrote sarah.homeprices? Sometimes novices try to modify a data set in place like this:
/* DON'T DO THIS. A MISTAKE CAN DELETE THE DATA */
data sarah.homeprices;
set sarah.homeprices;
/* more lines go here....
but an ERROR might result in overwriting the data with 0 obs */
run;
95% confidence limits for what? The regression coefficients? If so, you can use the CLB option on the MODEL statement in PROC REG:
model price = size2200 pool / CLB;
The documentation for PROC REG discusses other options that produce different confidence intervals. Please read about the options on the MODEL statement.
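For context, a complete call might look like the following sketch. It assumes the model is fit to a data set named NEWPRICES that already contains the SIZE2200 variable and a numeric 0/1 POOL variable. ALPHA=0.05 is spelled out only to make the 95% level explicit.

```sas
/* Sketch only: parameter estimates with 95% confidence limits. */
proc reg data=newprices plots=none;
   model price = size2200 pool / clb alpha=0.05;   /* CLB requests limits for the coefficients */
run;
quit;
```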