Multiple regressions using arrays

01-25-2018 03:39 PM

Hi,

I would like to run a number of simple linear regressions of the form Y = a*Xi + e, one for each independent variable Xi, and have been trying to figure out how to use arrays for that instead of writing out 20 regressions. I managed to create an array of independent variables in the data step, but I cannot figure out a way to access it in PROC REG. There must be a simple way to do this, but having spent a day looking I haven't found it. I would be grateful if you could help.

Thank you!

Posted in reply to AbuYusuf

01-25-2018 03:44 PM - edited 01-25-2018 03:51 PM

I suspect your best approach is to reformat your data and use a BY statement.

If you want further suggestions, please provide more detailed information, including how your data currently looks and the type of PROC REG statements you're looking to develop.

Posted in reply to Reeza

01-25-2018 03:52 PM

Thanks a lot for the reply!

Data is a large administrative database that has a number of diagnostic codes which I use to create a number of disease variables, say, diabetes, influenza, etc. I have cost as the dependent variable and I want to run a number of regressions where each disease is an independent variable, and adjust for a number of other factors, such as length of stay, etc. I also have age and province as categorical variables, so I run these regressions BY age category and province.

In Stata I would have done it like this:

local diseases "diabetes influenza"

foreach a of local diseases {

regress cost `a'

}

That's basically it. Thank you very much for your help!

Posted in reply to AbuYusuf

01-25-2018 03:54 PM

Posted in reply to AbuYusuf

01-25-2018 03:49 PM

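```
/* Reshape to long form: one row per student per predictor. */
/* IndVar holds the predictor name; X holds its value.      */
proc transpose name=IndVar data=sashelp.class out=class2(rename=(col1=X));
   by name age;
   var height weight;
run;

/* BY-group processing needs the data sorted by the BY variable. */
proc sort data=class2;
   by indvar;
run;

/* One simple regression per predictor. */
proc reg data=class2;
   by indvar;
   model age = x;
   attrib _all_ label='';   /* drop variable labels from the output */
run;
quit;
```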
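
A sketch of how the same transpose-and-BY pattern might extend to the situation described earlier in the thread, where each regression also adjusts for a covariate and is run within groups. This is an illustration, not the poster's code: `sashelp.cars` stands in for the administrative data, with MSRP playing the role of cost, Origin the grouping variable, and EngineSize the adjustment covariate. The data set names `cars_id`, `cars_long`, and `est_by` and the variable `row_id` are made up for the example.

```sas
/* Add a row identifier so every original row forms its own BY group. */
data cars_id;
   set sashelp.cars;
   row_id = _n_;
run;

/* Long form: one row per car per candidate predictor.            */
/* MSRP, Origin and EngineSize are carried along as BY variables. */
proc transpose data=cars_id name=IndVar
               out=cars_long(rename=(col1=X));
   by row_id msrp origin enginesize;
   var horsepower weight length;
run;

proc sort data=cars_long;
   by origin indvar;
run;

/* One regression per Origin and predictor, adjusted for EngineSize. */
proc reg data=cars_long outest=est_by plots=none;
   by origin indvar;
   model msrp = x enginesize;
run;
quit;

proc print data=est_by;
   var origin indvar intercept x enginesize _rmse_;
run;
```

Each row of `est_by` holds the estimates for one Origin and predictor combination, with that predictor's coefficient in the column `X`.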

Posted in reply to data_null__

01-25-2018 04:00 PM

This entire thread fits under the category of:

Just because you CAN do the regressions this way, it doesn't mean you SHOULD do the regressions this way.

Instead of ordinary least squares regression, I recommend partial least squares regression (PROC PLS) which has better statistical properties (smaller root mean square error of predictions and of regression coefficients) than doing many regressions.

If you want to determine which variables are important in predicting the response, and you do many regressions, you are not accounting for possible confounding of one x variable with another x variable. PLS handles this better.

--

Paige Miller
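
A minimal sketch of what a PROC PLS call could look like, using `sashelp.cars` as stand-in data. The predictors and the choice of two factors (`nfac=2`) are arbitrary assumptions made for illustration; in practice the number of factors would normally be chosen by cross-validation (the `CV=` option).

```sas
/* Partial least squares with all predictors in a single model.   */
/* NFAC=2 is an arbitrary choice for this sketch.                 */
proc pls data=sashelp.cars method=pls nfac=2;
   model MSRP = Horsepower Weight Length EngineSize / solution;  /* SOLUTION prints the coefficients */
run;
```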

Posted in reply to PaigeMiller

01-29-2018 11:06 AM

Thank you. I will check out proc pls.

Posted in reply to data_null__

01-29-2018 11:05 AM

Thank you very much! I tried the code, but my adaptation of it to my data didn't work...

Posted in reply to AbuYusuf

01-25-2018 04:00 PM

It's a bit hard to reply without seeing the start and end points.

Please provide a small example of data and the desired procedure calls.

Posted in reply to AbuYusuf

01-25-2018 05:02 PM

Use a variable list. In proc reg you may specify

**model a -- z = myVar;**

to regress each variable from a to z (in the dataset's variable order) on myVar.

PG
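
For illustration, a sketch of that statement on `sashelp.cars` (the data set and the output name `est_dep` are assumptions for the example). Note that the variable list sits on the left-hand side, so this fits many dependent variables against a single predictor, not one dependent variable against many predictors.

```sas
/* Fits Invoice = MSRP, EngineSize = MSRP, ..., Length = MSRP. */
proc reg data=sashelp.cars outest=est_dep plots=none;
   model Invoice -- Length = MSRP;
run;
quit;

/* One row of estimates per dependent variable (see _DEPVAR_). */
proc print data=est_dep; run;
```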

Posted in reply to PGStats

01-25-2018 05:05 PM

PGStats wrote:

Use a variable list. In proc reg you may specify

model a -- z = myVar;

to regress all variables in your dataset variable list from a to z against myVar.

I thought the OP said there are many independent (X) variables.

Posted in reply to data_null__

01-25-2018 05:29 PM

Oops!

PG

Posted in reply to AbuYusuf

01-25-2018 05:55 PM - edited 01-25-2018 05:56 PM

You can automate with **call execute()**

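```
/* Generate and run one PROC REG step per predictor with CALL EXECUTE. */
data _null_;
   length reg $200;
   set sashelp.cars;             /* read one row so the array variables exist */
   array x Invoice -- length;    /* every variable from Invoice through Length */
   do i = 1 to dim(x);
      /* Build the step as text; CATS trims blanks at the ends of each piece. */
      reg = cats(
         "proc reg data=sashelp.cars plots=none outest=out_",
         vname(x{i}),
         "; model MSRP=",
         vname(x{i}),
         "; run;" );
      call execute(reg);         /* queued code runs after this DATA step ends */
   end;
   stop;                         /* only the first row is needed */
run;

/* Stack every data set whose name starts with OUT_. */
data est_all;
   set out_: ;
run;

proc print data=est_all; run;
```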

PG

Posted in reply to AbuYusuf

01-25-2018 06:16 PM

Another way is to trick proc reg into testing every variable for best subset selection

```
/* SELECTION=RSQUARE with STOP=1 fits every one-variable model. */
proc reg data=sashelp.cars outest=all_est;
   model MSRP = Invoice -- length / selection=RSQUARE stop=1;
run;
quit;

proc print data=all_est; run;
```

PG

Posted in reply to AbuYusuf

01-25-2018 06:19 PM

https://stats.idre.ucla.edu/sas/seminars/sas-macros-introduction/

Here's a full write-up on macros if you choose to go down that route.
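
As a rough idea of what the macro route could look like, here is a minimal sketch that loops over a space-separated list of predictors and runs one PROC REG per variable. The macro name `reg_each`, its parameters, and the output data set names are hypothetical, and `sashelp.cars` is used only as stand-in data.

```sas
/* Hypothetical helper: one simple regression per variable in XVARS. */
%macro reg_each(data=, y=, xvars=, outprefix=est_);
   %local i x;
   %do i = 1 %to %sysfunc(countw(&xvars, %str( )));
      %let x = %scan(&xvars, &i, %str( ));
      proc reg data=&data outest=&outprefix.&x plots=none;
         model &y = &x;
      run;
      quit;
   %end;
%mend reg_each;

%reg_each(data=sashelp.cars, y=MSRP, xvars=Horsepower Weight Length)

/* Combine the per-variable estimates into one table. */
data est_macro;
   set est_Horsepower est_Weight est_Length;
run;

proc print data=est_macro; run;
```

This is close in spirit to the Stata `foreach` loop earlier in the thread: the list of predictors is passed in once and the macro writes out each PROC REG step.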