I would like to obtain fitted values from a GLM regression model, excluding the class variable. I am familiar with getting fitted values for a full statistical model in general, which can be done with something of the form
proc glm data=this.data;
class categorical_var;
model y = x1 x2 categorical_var / solution;
output out=this.outdata predicted=yhat;
run;
What I want is for yhat to contain *just* the fitted values from the x1 and x2 terms, excluding the contribution of categorical_var. Is anyone aware of a way (preferably simple!) to do this without viewing the regression output and hard-coding the parameter estimates for x1 and x2 myself? I am using SAS 9.4.
Thanks!
First method:
Remove the categorical variable from the model. That will give you estimates that are weighted by the size of each class in your data.
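As a minimal sketch of the first method, assuming the same sashelp.cars variables used in the second method (the output data set name carsPredNoType is just a placeholder), the reduced model could be fitted like this:

```sas
/* Sketch only: the same regression with the class variable dropped. */
/* carsPredNoType is a hypothetical output data set name.            */
proc glm data=sashelp.cars plots=none;
    model MSRP = horsepower weight;
    output out=carsPredNoType predicted=MSRP_pred;
run;
quit;
```

Note that the slopes for horsepower and weight are re-estimated here, so they will generally differ from the slopes in the model that includes type.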
Second method:
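If you want the classes to have equal weight in your estimates, you could use a variant of the technique of adding observations with a missing response to the data. Those observations are not used to fit the model, but they still get predicted values: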
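data carsReg;
    set sashelp.cars;
    id = _n_;
    output;               /* original row, used to fit the model */
    /* Extra rows: same predictors, missing response, one row per level of type */
    call missing(MSRP);
    extra = 1;
    do type = "Hybrid", "SUV", "Sedan", "Sports", "Truck", "Wagon";
        output;
    end;
    keep id MSRP horsepower weight type extra;
run;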
proc glm data=carsReg plots=none;
    class type;
    model MSRP = horsepower weight type;
    /* Rows with a missing MSRP are scored but do not affect the fit */
    output out=carsPred predicted=MSRP_pred stdp=MSRP_STDP;
run;
quit;
proc sql;
    /* Average the per-type predictions for each original observation */
    create table carsAvgType as
    select
        id, horsepower, weight,
        mean(MSRP_pred) as MSRP_pred,
        sqrt(mean(MSRP_STDP**2)) as MSRP_STDP
    from carsPred
    where extra
    group by id, horsepower, weight;
quit;
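To spell out what the averaged MSRP_pred represents: because the model is additive, averaging the predictions over the six levels of type at fixed horsepower and weight keeps the intercept and slope terms and replaces the type effect with its unweighted mean. The symbols below are my own notation, not names from the SAS output:

```latex
% \hat\mu: intercept, \hat\beta_1, \hat\beta_2: slopes, \hat\tau_k: effect of type level k, K = 6
\bar{\hat{y}}_i
  = \frac{1}{K}\sum_{k=1}^{K}\left(\hat\mu + \hat\beta_1\,\mathrm{horsepower}_i + \hat\beta_2\,\mathrm{weight}_i + \hat\tau_k\right)
  = \hat\mu + \hat\beta_1\,\mathrm{horsepower}_i + \hat\beta_2\,\mathrm{weight}_i + \frac{1}{K}\sum_{k=1}^{K}\hat\tau_k
```

So the result is the x1/x2 part of the fit plus a constant that treats every class equally, regardless of how many observations each class has.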