☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
Quartz | Level 8

## How to add subset data step by step

Hi all,

I wanted to examine data with subset cohort included step by step. For example, there are five cohorts, A-E, and I wanted to run regression with the first cohort, and then I would run regression with an additional cohort each time. How can I conduct this using MACRO or iteration efficiently?

proc reg data=mydata;

model dependent=independents;

where cohort in ('A');

run;

proc reg data=mydata;

model dependent=independents;

where cohort in ('A','B');

run;

proc reg data=mydata;

model dependent=independents;

where cohort in ('A','B','C');

run;

proc reg data=mydata;

model dependent=independents;

where cohort in ('A','B','C','D');

run;

proc reg data=mydata;

model dependent=independents;

where cohort in ('A','B','C','D','E');

run;

I would think I can probably start with creating the list like below:

proc sql noprint;
select distinct quote(cohort) into :st_list separated by ','
from mydata;
quit;

1 ACCEPTED SOLUTION

Accepted Solutions
PROC Star

## Re: How to add subset data step by step

OK, here's a bit more of what a macro could look like.  All of this would follow the code you originally posted.  It assumes you remove the comma and use a blank as the "separated by" character.

``````%macro regloop;
%local n subset;
%do n=1 to &sqlobs;
%let subset = &subset %scan(&stlist, &n);
proc reg data=mydata;
model dependent = independents;       where cohort in (&subset);      run;
%end;%mend region;%regloop``````

This will generate 5 PROC REGs, just like you have.  But the cohort values (and the actual number of PROC REGs) will depend on the data set you are processing.

7 REPLIES 7
PROC Star

## Re: How to add subset data step by step

As part of your plan, note that the IN operator does not require commas. You could shoot for (for example):

where cohort in ('A' 'B' 'C')

That will simplify subsequent programming.

Also note, SQL automatically creates &sqlobs which would be 5 in this case (number of items extracted). So your program could continue by writing a macro along these lines:

%macro regloop;
%local n;
%do n=1 %to &sqlobs;

Then have the macro generate PROC REG, using the first &n items from &stlist.

Definitely needs some details worked out, but that should give you a way to approach the problem.
Diamond | Level 26

## Re: How to add subset data step by step

I assume that the code you showed works. What would a macro do that this code does not do? You mention "iteration", what would be iterating?

I also question the statistical methodology here about adding cohort after cohort into the data in sequential fashion like you are doing. I would imagine a better way to see the impact of cohort is to add it into the model so that you can have different intercepts and/or different slopes for the different cohorts.

--
Paige Miller
Quartz | Level 8

## Re: How to add subset data step by step

Thank you both! I'm testing out my code, but cannot make it work. Just not good at iteration, especially when it's in macro.
Diamond | Level 26

## Re: How to add subset data step by step

@lichee wrote:
I'm testing out my code, but cannot make it work. Just not good at iteration, especially when it's in macro.

Its still not clear to me what "iteration" you mean, where in this problem is there "iteration"? Please describe in detail.

--
Paige Miller
Quartz | Level 8

## Re: How to add subset data step by step

I'd like to see how to add cohorts one by one given the list of cohorts. Thanks!
Quartz | Level 8

## Re: How to add subset data step by step

Thank you Paige! I know it's statistically not correct way to add cohort after cohort into regression analysis. But I'm just giving a quick and simple example.
PROC Star

## Re: How to add subset data step by step

OK, here's a bit more of what a macro could look like.  All of this would follow the code you originally posted.  It assumes you remove the comma and use a blank as the "separated by" character.

``````%macro regloop;
%local n subset;
%do n=1 to &sqlobs;
%let subset = &subset %scan(&stlist, &n);
proc reg data=mydata;
model dependent = independents;       where cohort in (&subset);      run;
%end;%mend region;%regloop``````

This will generate 5 PROC REGs, just like you have.  But the cohort values (and the actual number of PROC REGs) will depend on the data set you are processing.

Discussion stats
• 7 replies
• 948 views
• 0 likes
• 3 in conversation