Hi,
I'm trying to run a fixed effects linear regression with effects at the participant level.
My data look like this:
ID Var1 Var2 Var3 Var4
1 5 2 3 23
2 2 1 4 37
3 1 3 2 42
Where ID is a unique identifier for each participant, Var1-Var3 are categorical, and Var4 is continuous.
I want to check the effects of all variables (including the ID variable) on Var4.
Thanks!
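For anyone who wants to try the code in the replies, here is a minimal sketch of a DATA step that reads the three rows shown above. The dataset name yourdata is only a placeholder, and three rows are far too few to estimate anything; this is just enough to check that a step runs.

```sas
/* Placeholder dataset name; values are the three example rows from the post */
data yourdata;
  input ID Var1 Var2 Var3 Var4;
  datalines;
1 5 2 3 23
2 2 1 4 37
3 1 3 2 42
;
run;
```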
I don't think you can test the effect of the ID variable on Var4. It doesn't make sense given the data you have.
With regard to the rest of the variables:
PROC GLM DATA=whatever;
CLASS VAR1-VAR3;
MODEL VAR4=VAR1-VAR3;
RUN;
Perhaps an even better idea is to use PROC PLS instead of PROC GLM, which isn't exactly a linear regression, but it might be a model that fits and predicts better.
As an additional suggestion (which you may have left out because this is such a simple example), I would give meaningful names to VAR1 through VAR4.
How do you expect ID to interact with the response variable? Does it index enrollment in the study or something of that nature? Ordinarily, I would consider ID as a random effect, which in this context is the source of residual error. Thus, the regression would look like:
proc reg data=yourdata;
model var4=var1-var3;
quit;
run;
From this you can explore collinearity, influential observations and other problems related to regression.
Steve Denham
SteveDenham wrote:
How do you expect ID to interact with the response variable? Does it index enrollment in the study or something of that nature? Ordinarily, I would consider ID as a random effect, which in this context is the source of residual error. Thus, the regression would look like:
proc reg data=yourdata;
model var4=var1-var3;
quit;
run;
Does this really work if var1 to var3 are categorical, as stated in the original post?
Shouldn't the run; go before the quit;?
Well, the code will work, but it's not optimal with categorical variables.
So after reading all of the post, this isn't really a regression, as you can't regress on categories (unless you make some assumptions about equal spacing and so forth). It's a simple multi-factor analysis of variance, with no interactions. Your GLM code is ideal.
I always put the quit; before the run; for interactive PROCs (GLM, REG, SQL). Maybe I've had it wrong all this time.
Steve Denham
Yes, I have to admit that the original post stated that this was about linear regression, and that threw me off for a few minutes, as this really isn't linear regression without a single continuous predictor.
I'm pretty sure run; goes before quit;. That's how I've always done it, but I was asking because maybe you had a good reason for your order. In any case, run; is completely useless in PROC SQL; typing those 4 characters there is a waste of time.
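To illustrate the ordering, here is a rough sketch of how run; and quit; behave in an interactive procedure such as PROC REG. The dataset name yourdata and the second model are assumptions made up for the example, and the variables are treated as numeric predictors here purely to show the statement flow.

```sas
proc reg data=yourdata;
  model var4 = var1-var3;   /* first model */
run;                        /* executes the statements so far; PROC REG stays active */
  model var4 = var1 var2;   /* a second model submitted to the same PROC REG step */
run;                        /* executes the second run group */
quit;                       /* ends the procedure */
```

Each run; closes one group of statements, and quit; is what finally ends the step, which is why run; is normally written first.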
A lot of times I just type, and run; gets stuck in there when I know I have reached a "border" in what I'm doing. Old habits die hard.
Steve Denham
To estimate a fixed effects model with (fixed) effects at the participant level, you will need to have more than one observation per person - which you don't seem to have. Think of it as adding in a dummy variable for each person (though the software does this for you automatically).
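As a sketch only, assuming a hypothetical dataset long_data with several rows per ID (which the posted data do not have), participant-level fixed effects could be fitted in PROC GLM with the ABSORB statement. ABSORB removes the ID effect without printing a coefficient for every participant, and it expects the data to be sorted by the absorbed variable.

```sas
/* long_data is hypothetical: it must contain more than one row per ID */
proc sort data=long_data;
  by id;
run;

proc glm data=long_data;
  absorb id;                          /* participant-level fixed effects */
  class var1-var3;
  model var4 = var1-var3 / solution;  /* print parameter estimates */
run;
quit;
```

Putting id in the CLASS and MODEL statements instead should give the same fit with one dummy per participant. Either way, any predictor that never changes within a participant cannot be separated from the ID effect.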