Multiple regressions using arrays

New Contributor
Posts: 4

Hi,

 

I would like to run a number of univariate linear regressions of the form Y = aXi + e, and I have been trying to figure out how to use arrays for that instead of writing out 20 regressions. I managed to create an array of independent variables in the data step, but I cannot figure out how to access it in PROC REG. There must be a simple way to do this, but after a day of looking I haven't found it. I would be grateful for any help.

 

Thank you!

Super User
Posts: 23,224

Re: Multiple regressions using arrays

I suspect your best approach is to restructure your data and use a BY statement.

If you want further suggestions, please provide more detail, including what your data currently look like and the type of PROC REG statements you're looking to develop.

 


 

New Contributor
Posts: 4

Re: Multiple regressions using arrays

Thanks a lot for the reply!

 

The data are a large administrative database with a number of diagnostic codes, which I use to create disease variables such as diabetes, influenza, etc. Cost is the dependent variable, and I want to run a series of regressions in which each disease in turn is the independent variable, adjusting for other factors such as length of stay. I also have age and province as categorical variables, so I run these regressions BY age category and province.

 

In Stata I would have done it like this:

local diseases "diabetes influenza"
foreach a of local diseases {
    regress cost `a'
}

 

That's basically it. Thank you very much for your help!

Respected Advisor
Posts: 3,845

Re: Multiple regressions using arrays


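/* Stack the X variables: one row per observation per X, with the variable name in IndVar */
proc transpose name=IndVar data=sashelp.class out=class2(rename=col1=X);
   by name age;
   var height weight;
   run;
/* BY-group processing needs the data sorted by the grouping variable */
proc sort data=class2;
   by indvar;
   run;
/* One simple regression per original X variable */
proc reg data=class2;
   by indvar;
   model age = x;
   attrib _all_ label='';
   run;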
Respected Advisor
Posts: 2,794

Re: Multiple regressions using arrays

Posted in reply to data_null__

This entire thread fits under the category of:

 

Just because you CAN do the regressions this way, it doesn't mean you SHOULD do the regressions this way.

 

Instead of ordinary least squares regression, I recommend partial least squares regression (PROC PLS) which has better statistical properties (smaller root mean square error of predictions and of regression coefficients) than doing many regressions.

 

If you want to determine which variables are important in predicting the response, and you do many regressions, you are not accounting for possible confounding of one x variable with another x variable. PLS handles this better.
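
For illustration, a minimal PROC PLS sketch of this suggestion might look like the following. It assumes sashelp.cars as a stand-in data set, with MSRP as the response and an arbitrary set of numeric predictors; none of these names come from the original poster's data. CV=ONE requests leave-one-out cross validation to help choose the number of factors.

```sas
/* Sketch only: sashelp.cars and the variable choices are placeholders */
proc pls data=sashelp.cars cv=one;
    /* all candidate predictors enter one model, not one regression each */
    model MSRP = EngineSize Cylinders Horsepower MPG_City MPG_Highway
                 Weight Wheelbase Length;
run;
```

The point of the sketch is that every candidate X is fitted jointly, so correlation among the predictors is handled inside a single model.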

--
Paige Miller
New Contributor
Posts: 4

Re: Multiple regressions using arrays

Posted in reply to PaigeMiller
Thank you. I will check out proc pls.
New Contributor
Posts: 4

Re: Multiple regressions using arrays

Posted in reply to data_null__

Thank you very much! I tried the code, but my adaptation of it to my data didn't work...

PROC Star
Posts: 2,305

Re: Multiple regressions using arrays

It's a bit hard to reply without seeing the start and end points.

Please provide a small example of data and the desired procedure calls.

 

Esteemed Advisor
Posts: 5,474

Re: Multiple regressions using arrays

Use a variable list. In proc reg you may specify

 

model a -- z = myVar;

 

to regress each variable from a to z (in data set order) on myVar, one model per dependent variable.

PG
Respected Advisor
Posts: 3,845

Re: Multiple regressions using arrays


I thought the OP said there are many independent (X) variables.  

Esteemed Advisor
Posts: 5,474

Re: Multiple regressions using arrays

Posted in reply to data_null__

Oops!

PG
Esteemed Advisor
Posts: 5,474

Re: Multiple regressions using arrays

You can automate this with CALL EXECUTE:

 

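data _null_;
length reg $200;
set sashelp.cars;            /* read one row so the array variables exist */
array x Invoice -- length;   /* all variables positioned from Invoice to Length */
do i = 1 to dim(x);
    /* build one PROC REG step per X variable, each with its own OUTEST= data set */
    reg = cats(
        "proc reg data=sashelp.cars plots=none outest=out_",
        vname(x{i}),
        "; model MSRP=", 
        vname(x{i}),
        "; run;" );
    call execute (reg);      /* queued code runs after this data step ends */
    end;
stop;                        /* generate the steps once, not once per row */
run;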

data est_all;
set out_: ;
run;

proc print data=est_all; run;

 

 

 

PG
Esteemed Advisor
Posts: 5,474

Re: Multiple regressions using arrays

Another way is to trick PROC REG into fitting every one-variable model through best-subset selection (SELECTION=RSQUARE with STOP=1 limits the subsets to a single regressor):

 


proc reg data=sashelp.cars outest=all_est;
model MSRP = Invoice -- length / selection=RSQUARE stop=1;
run;
quit;

proc print data=all_est; run;
PG
Super User
Posts: 23,224

Re: Multiple regressions using arrays

https://stats.idre.ucla.edu/sas/seminars/sas-macros-introduction/

 

Here's a full write-up on macros if you choose to go down that route.
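
As a rough idea of what the macro route could look like, here is a minimal sketch that mirrors the Stata foreach loop from earlier in the thread. The macro name run_regs and its parameters (data=, y=, xvars=, covars=) are made up for illustration, and sashelp.cars stands in for the real data.

```sas
/* Hypothetical macro: one PROC REG per variable named in xvars */
%macro run_regs(data=, y=, xvars=, covars=);
    %local i x;
    %do i = 1 %to %sysfunc(countw(&xvars, %str( )));
        %let x = %scan(&xvars, &i, %str( ));   /* i-th name in the list */
        proc reg data=&data plots=none outest=est_&x;
            model &y = &x &covars;             /* covars may be left empty */
        run;
        quit;
    %end;
%mend run_regs;

/* Example call on a stand-in data set */
%run_regs(data=sashelp.cars, y=MSRP,
          xvars=Invoice EngineSize Horsepower, covars=Weight);

/* Stack the estimates from all runs */
data est_all;
    set est_: ;
run;
```

To run the regressions within groups, such as age category and province, a BY statement could be added inside the PROC REG step, provided the input data are sorted by those variables first.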
