Hi all!
I'm confused about how to run GLM with a macro.
I have this data:
A | B1 | B2 | B3 | B4 |
0.1 | 0 | 2 | 1 | 1 |
0.9 | 1 | 1 | 0 | 2 |
0.78 | 2 | 1 | 1 | 1 |
0.64 | 0 | 2 | 0 | 0 |
and I need to run several regressions on the variable A, one per predictor. For example: A ~ B1, A ~ B2, A ~ B3, etc. (I have ~300 variables.)
Afterwards, I need the p-value of each association to evaluate whether any of the associations holds.
But I don't have a clue how to do it...
I really appreciate your help, as always!
Thanks in advance!
All explained here: https://blogs.sas.com/content/iml/2017/02/13/run-1000-regressions.html
No macros needed.
Also, in my opinion, it is not a good idea to run regressions like this, on hundreds or thousands of variables individually.
Maybe I didn't explain it very well, but my X variable is categorical (only 3 values are possible) and my Y variable is continuous.
So I think it's not possible to use that code, or am I wrong?
Use PROC GLM instead of PROC REG.
Show us the code you tried.
I'm trying with something simple like this:
proc glm data=exp;
class b1;
model a=b1;
run;
but I need to incorporate a macro because I have B1-B300, and I also need to modify it to retain just the p-value for each association.
You don't need a macro. The link I gave explains how to do this without a macro. Scroll down to the section entitled "The BY way for many models". Give that a try. If you get stuck, show us your code.
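As a minimal sketch of the restructuring step that the BY approach relies on, here is the toy table from the question converted from wide to long form. The data set names exp and long are only illustrative, and the B1-B4 list would become B1-B300 for the real data.

```sas
/* Toy data copied from the question: A is the response, B1-B4 the predictors */
data exp;
  input A B1-B4;
  datalines;
0.1 0 2 1 1
0.9 1 1 0 2
0.78 2 1 1 1
0.64 0 2 0 0
;
run;

/* Reshape wide to long: one row per (observation, predictor) pair */
data long;
  set exp;
  array B[*] B1-B4;
  do varNum = 1 to dim(B);
    VarName = vname(B[varNum]);   /* predictor name as a character value */
    Value   = B[varNum];          /* that predictor's value for this obs  */
    output;
  end;
  drop B1-B4 varNum;
run;

/* BY-group processing requires the data to be sorted by the BY variable */
proc sort data=long;
  by VarName;
run;
```

After this step every predictor is a BY group, so a single procedure call with a BY VarName statement fits one model per predictor.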
I modified the code, but the problem now is that it doesn't give the p-value for each association evaluated. What PROC PRINT shows as "Value" is not the p-value of the association.
data try1;
set exp; /* <== specify data set name HERE */
array new_SNP [*] new_SNP1 - new_SNP280; /* <== specify explanatory variables HERE */
do varNum = 1 to dim(new_SNP);
VarName = vname(new_SNP[varNum]); /* variable name in char var */
Value = new_SNP[varNum]; /* value for each variable for each obs */
output;
end;
drop new_SNP:;
run;
/* 2. Sort by BY-group variable */
proc sort data=try1; by VarName; run;
/* 3. Call PROC GLM and use BY statement to compute all regressions */
proc glm data=try1 noprint outest=PE;
by VarName;
model south_ref = Value;
quit;
/* Look at the results */
proc print data=PE(obs=5);
var VarName Intercept Value;
run;
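I think you need to add a CLASS statement to your PROC GLM, since your variable named Value is categorical, not continuous.
If you want the overall p-value for each model, you also need an ODS OUTPUT statement. Drop the NOPRINT option, because NOPRINT also suppresses the ODS tables, so the output data set would not be created; use ODS EXCLUDE to turn off the printed output instead, like this:
ods exclude all;              /* suppress printed output but keep ODS tables */
ods output overallanova=pe;   /* overall ANOVA table, one per BY group */
proc glm data=try1;
by VarName;
class value;
model south_ref = Value;
quit;
ods exclude none;             /* restore normal output */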
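To get from the PE data set to a list of p-values, something like the following sketch may help. It assumes PE was created by the ODS OUTPUT statement above and that the table has the usual Source, FValue and ProbF columns; check the actual names with PROC CONTENTS before relying on them. The data set name pvalues is just an illustration.

```sas
/* Assumes PE exists and holds the OverallANOVA table for every BY group. */
/* Column names Source, FValue and ProbF are assumptions; verify them.    */
data pvalues;
  set pe;
  where Source = 'Model';        /* keep only the model row of each table */
  keep VarName FValue ProbF;     /* ProbF is the overall model p-value    */
run;

/* Smallest p-values first */
proc sort data=pvalues;
  by ProbF;
run;

proc print data=pvalues(obs=10);
  format ProbF pvalue6.4;
run;
```

Because each model has a single effect, the overall model p-value is also the p-value for that predictor.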