How to iterate procedures (probit)

Posted 03-23-2017 09:29 PM

Hello, I am working on my Master's thesis and have hit something of a snag. I want to run a very simple probit regression over a wide range of columns (several hundred), each using the same model and the same dependent variable. I am working with real-time data, so each column represents the same series estimated at a different date. I essentially want to test how well every single date performs when used in a very simple probit regression. The columns are all in the same dataset, but I'm not sure how to automate the process efficiently, as doing it by hand would be extremely time-consuming.

So my question is how would I create a macro that iterates over this worksheet and runs the probit regression on every single variable/column in it in a given range? If a macro is not an efficient way to do this, what would be? I understand the basics of writing macros, of SQL, of iteration etc but I'm not sure how to put them all together to get what I need. Furthermore, how would I accomplish this if I wanted to do the same regression but using 2-3 variables at a time?

Any help with this dilemma would be greatly appreciated!

1 ACCEPTED SOLUTION


In reply to @vonkraush:

The answer to your first question is to transpose and use BY groups in your regression instead.
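A minimal sketch of the transpose-and-BY approach, under assumptions that are not in the original post: the wide dataset is called `have`, it has an observation id `obs`, a 0/1 outcome `y`, and vintage columns `x1`-`x5` (simulated here so the step runs on its own; the real data would have several hundred). PROC LOGISTIC with LINK=PROBIT is used as one way to fit a probit; PROC PROBIT would be another.

```sas
/* Hypothetical wide data: one row per observation, a binary outcome y,
   and one column per data vintage (x1-x5). All names are assumptions. */
data have;
  call streaminit(2017);
  array x{5} x1-x5;
  do obs = 1 to 200;
    do j = 1 to 5;
      x{j} = rand('normal');
    end;
    y = (0.8*x1 + rand('normal') > 0);
    output;
  end;
  drop j;
run;

proc sort data=have;
  by obs y;
run;

/* Wide to long: one row per observation per vintage.
   _NAME_ holds the original column name, COL1 its value. */
proc transpose data=have out=long(rename=(_name_=vintage col1=x));
  by obs y;
  var x1-x5;
run;

/* BY-group processing needs the data sorted by the BY variable. */
proc sort data=long;
  by vintage obs;
run;

/* One probit fit per vintage; estimates and fit statistics are
   collected into datasets via ODS OUTPUT. */
ods output ParameterEstimates=pe FitStatistics=fit;
proc logistic data=long;
  by vintage;
  model y(event='1') = x / link=probit;
run;
```

The `pe` dataset then holds the estimates for every vintage in one table, identified by the `vintage` column, which makes the fits easy to compare.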

The answer to your second question is to create a macro and then call it as needed. There was a question on here earlier this week about creating all possible combinations of 2 or 3 variables from a list (CALL ALLCOMB/LEXCOMB + CALL EXECUTE).

http://stats.idre.ucla.edu/sas/seminars/sas-macros-introduction/
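A rough sketch of the macro route for several regressors at a time, assuming a hypothetical dataset `have` with a 0/1 outcome `y` and regressors `x1`-`x5` (simulated below), and a hypothetical macro named `fit_probit`. CALL ALLCOMB generates each combination of `k` names, and CALL EXECUTE queues one macro call per combination; set `k = 3` for triples.

```sas
/* Hypothetical data so the sketch runs on its own. */
data have;
  call streaminit(2017);
  array x{5} x1-x5;
  do obs = 1 to 200;
    do j = 1 to 5;
      x{j} = rand('normal');
    end;
    y = (0.8*x1 - 0.5*x2 + rand('normal') > 0);
    output;
  end;
  drop j;
run;

/* Hypothetical macro: one probit of y on the listed regressors.
   Estimates are saved to pe_<id>. */
%macro fit_probit(data=have, xvars=, id=);
  ods output ParameterEstimates=pe_&id;
  proc logistic data=&data;
    model y(event='1') = &xvars / link=probit;
  run;
%mend fit_probit;

/* Enumerate all combinations of k names and queue a macro call for each. */
data _null_;
  array v{5} $32 ('x1' 'x2' 'x3' 'x4' 'x5');
  length xlist $200;
  n = dim(v);
  k = 2;                          /* use k = 3 for triples */
  ncomb = comb(n, k);
  do j = 1 to ncomb;
    call allcomb(j, k, of v{*});  /* first k elements hold the j-th combination */
    xlist = '';
    do i = 1 to k;
      xlist = catx(' ', xlist, v{i});
    end;
    /* %NRSTR delays macro execution until after this DATA step ends. */
    call execute(cats('%nrstr(%fit_probit(xvars=', xlist, ', id=', j, '))'));
  end;
run;
```

With several hundred columns the number of combinations grows quickly (for example, 300 columns give 44,850 pairs), so restricting the list of names first is probably worthwhile.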

4 REPLIES


Thanks for the follow-up. I am trying to use the BY statement as you suggested, but I'm a little confused about how it works, and I can't find a good example of sample code anywhere. What exactly do I need to make it work properly?

In the example provided, you include the following line of code:

```
PROC GLM DATA=sample;
BY Y_Index;
model y= x1 x2 x3;
run;
```

Sample is obviously just the dataset you want to run the regression over, but I need some clarification as to the nature of Y_Index. Is it just a dataset with one row ('y') which contains the name of every dependent variable I want to run the regression over?

ALSO: in this regression the dependent variable is constant, it's the explanatory variables that change. Could I still use BY in the same manner, but with an index of the X variables instead, something like:

```
PROC GLM DATA=sample;
BY X_Index;
model y= x;
run;
```
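For reference, a BY variable is a column in the input dataset itself, not a separate dataset: each distinct value of that column defines one group, and the procedure is run once per group. A small sketch with made-up numbers, assuming the data are already in long form with the hypothetical column `X_Index` naming which original column each row came from:

```sas
/* Made-up long-form data: X_Index identifies the source column,
   x holds that column's value, y is the common dependent variable. */
data sample;
  input X_Index $ y x;
  datalines;
x1 1.0 0.5
x1 2.1 1.1
x1 2.9 1.4
x1 4.2 2.2
x2 1.0 3.0
x2 2.1 2.5
x2 2.9 1.0
x2 4.2 0.2
;
run;

/* BY-group processing requires sorting by the BY variable. */
proc sort data=sample;
  by X_Index;
run;

/* The same model is fitted separately for X_Index = x1 and X_Index = x2. */
proc glm data=sample;
  by X_Index;
  model y = x;
run;
quit;
```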


I'm not following your question.

The first link 🙂 has a worked example. They included their data in the question and my code has the rest of the solution, so you could work through the exercise if desired.


Sorry for not explaining myself clearly! I went over that example and understand a few things better, but am still confused on some points. I attempted to go through the exercise by myself to get a feel for BY, but ran into problems early on, mainly at this step:

```
*Create Returns and Squared Returns;
data data2;
set data;
vars(*) a1--a50;
array r(50);
do i=1 to dim(vars);
end;
drop i;
var(*) r1--r50;
array rsq(50);
do i=1 to dim(var);
end;
drop i;
run;
```

vars(*) and var(*) returned multiple errors, mostly relating to undeclared arrays. I tried toying around with this by explicitly declaring them as arrays, but the end result was still just 50 r and 50 rsq variables consisting entirely of blank values, which, based on later steps, doesn't seem correct either. What was the intent of this step, and why wouldn't it be working properly for me?
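Two general points about SAS arrays bear on the symptoms described: every array must be declared with the ARRAY keyword before it is referenced, and an ARRAY statement that creates new variables leaves them missing until a statement inside the loop assigns values. The sketch below shows the shape such a step might take. The loop bodies are an assumption, since they do not appear in the step as posted: a simple return relative to the previous row, and its square. The dataset `prices` and its three columns are made up.

```sas
/* Made-up input: three price series, one row per period. */
data prices;
  input a1 a2 a3;
  datalines;
100 50 20
102 49 21
101 51 21
;
run;

data data2;
  set prices;
  array vars(*) a1-a3;          /* ARRAY keyword is required            */
  array r(3);                   /* creates r1-r3, initially missing     */
  array rsq(3);                 /* creates rsq1-rsq3, initially missing */
  array prev(3) _temporary_;    /* previous row's values, retained      */
  do i = 1 to dim(vars);
    if _n_ > 1 then do;
      r(i)   = vars(i) / prev(i) - 1;   /* assumed definition of a return */
      rsq(i) = r(i)**2;
    end;
    prev(i) = vars(i);
  end;
  drop i;
run;
```

In this sketch the first row has missing `r` and `rsq` values because there is no earlier row to compare against.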

In case it is relevant: I am doing most of this in SAS Studio. I have access to regular SAS at my university, but I've mostly been using Studio to test and develop code.
