create multiple data sets use loop

Accepted Solution Solved
Reply
Contributor
Posts: 30
Accepted Solution

create multiple data sets use loop

Hi -,

I want to create sub data sets from main data sets using loop, my data looks like this:

id  group       subgroup 

1   patient care   physical contact

1   patient care   family contact

1   patient care   other

2   education      conference

2   education      read

2   education      teaching 

I want to sub set the data by each group and each subgroup:

data patient_care;
    set data;
     if group = 'patient care';
run;

data edu;
   set data;
   if group = 'Education';
run;

data patient_care1;
    set data;
    if subgroup = 'physical contact';
run;

data patient_care2;
   set data;
   if subgroup = 'family contact';
run;

data patient_care3;
   set data;
   if subgroup = 'other';
run;

data edu1;
   set data;
   if subgroup = 'conference';
run;

data edu2;
   set data;
   if subgroup = 'read';
run;

data edu3;
   set data;
   if subgroup  = 'teaching';
run;

The code is very redundant, I think looping will be helpful but not sure exactly what to do, any idea?

Thanks!!!


Accepted Solutions
Solution
‎07-11-2017 09:55 AM
Super User
Posts: 7,422

Re: create multiple data sets use loop

You can do hierarchical sums in a data step:

data have;
infile cards dlm=',';
input id :$1. group :$12. subgroup :$16. value;
cards;
1,patient care,physical contact,2
1,patient care,physical contact,3
1,patient care,family contact,1
1,patient care,family contact,6
1,patient care,other,3
1,patient care,other,3
2,education,conference,2
2,education,conference,4
2,education,read,1
2,education,read,5
2,education,teaching,7
2,education,teaching,2
;
run;

data want (keep=id group subgroup value);
set have;
by group notsorted subgroup notsorted;
retain groupsum sum;
if first.group then groupsum = 0;
if first.subgroup then sum = 0;
groupsum + value;
sum + value;
if last.subgroup
then do;
  value = sum;
  output;
end;
if last.group
then do;
  subgroup = '';
  value = groupsum;
  output;
end;
run;

If you just want a report, proc report is perfectly capable of doing that.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers

View solution in original post


All Replies
Super User
Posts: 19,093

Re: create multiple data sets use loop

This is rarely a good way to work with your data. Why do you want to split it in the first place?

 

FYI - the best answer to this question is usually don't do it. Because many people insist anyways see these posts:

 

http://blogs.sas.com/content/sasdummy/2015/01/26/how-to-split-one-data-set-into-many/

http://www.sascommunity.org/wiki/Split_Data_into_Subsets

Contributor
Posts: 30

Re: create multiple data sets use loop

Hi - 

I want to split the data because I am not sure how to sum all observations across rows by group and subgroup and generate new observations as sums of those groups, right now I am using proc print and copy and paste the output in excel:

proc print data = data;
var inter0-inter23;
run;

The original data is like this, each id has one unique group:

id  group    inter0  inter1  inter2  inter3   ...   inter23

1   patient care

1   education

2   patinet care

2    education

each id has one unique subgroup:

id subgroup       inter0 inter1 inter2 inter3 ... inter23

1   patient interaction

1   family interaction

1   conference

1    read

1    teaching

2    patient interaction

2    family interaction

2    conference

2    read

2   teaching 

 

 

I want to sum all ids by group and subgroup, the ideal ouput looks like this

group            inter0   inter1 inter2 ..... inter23

patient care

education

subgroup

patient interaction

family interaction

conference

read 

teaching

Not sure how to do that, any idea?

 

Solution
‎07-11-2017 09:55 AM
Super User
Posts: 7,422

Re: create multiple data sets use loop

You can do hierarchical sums in a data step:

data have;
infile cards dlm=',';
input id :$1. group :$12. subgroup :$16. value;
cards;
1,patient care,physical contact,2
1,patient care,physical contact,3
1,patient care,family contact,1
1,patient care,family contact,6
1,patient care,other,3
1,patient care,other,3
2,education,conference,2
2,education,conference,4
2,education,read,1
2,education,read,5
2,education,teaching,7
2,education,teaching,2
;
run;

data want (keep=id group subgroup value);
set have;
by group notsorted subgroup notsorted;
retain groupsum sum;
if first.group then groupsum = 0;
if first.subgroup then sum = 0;
groupsum + value;
sum + value;
if last.subgroup
then do;
  value = sum;
  output;
end;
if last.group
then do;
  subgroup = '';
  value = groupsum;
  output;
end;
run;

If you just want a report, proc report is perfectly capable of doing that.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Contributor
Posts: 30

Re: create multiple data sets use loop

Don't know why but notsorted did not work well, after I remove notsorted command, I can get the result I want (need to sort by group and subgroup first):

proc sort have;
by group subgroup;
run;
data want (keep=id group subgroup value); set have; by group subgroup; retain groupsum sum; if first.group then groupsum = 0; if first.subgroup then sum = 0; groupsum + value; sum + value; if last.subgroup then do; value = sum; output; end; if last.group then do; subgroup = ''; value = groupsum; output; end; run;
Super User
Posts: 7,422

Re: create multiple data sets use loop

I used the notsorted option so that the code works when there is no ascending/descending order. One has to make sure that the groups come in blocks, though (no identical by values scattered across the dataset). If sorting does not interfere with other logic, then it is to be preferred.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Super User
Posts: 11,121

Re: create multiple data sets use loop

First would be to explain what you are doing that requires all of those sets. The request for many sets is usually immediately followed by questions on how to use them to do the same processing.

 

You could extend this logic:

data patient_care
     edu
;
    set data;
     if group = 'patient care' then output Patient_care;
     if group = 'Education' then output edu;
run;

each condition can send output to the specified output set.

 

Super User
Posts: 7,422

Re: create multiple data sets use loop

Put your control data in a dataset, and let a data step create the split code with call execute:

data lookup;
infile cards dlm=',';
input id :$1. group :$12. subgroup :$16.;
cards;
1,patient care,physical contact
1,patient care,family contact
1,patient care,other
2,education,conference
2,education,read
2,education,teaching
;
run;

data _null_;
call execute('data');
do until (eof1);
  set lookup end=eof1;
  by id;
  if first.id then call execute(' ' !! translate(trim(group),'_',' '));
  call execute(' ' !! translate(trim(subgroup),'_',' '));
end;
call execute('; set data;');
do until (eof2);
  set lookup end=eof2;
  by id;
  if first.id then call execute('if group = "' !! trim(group) !! '" then output ' !! translate(trim(group),'_',' ') !! ';');
  call execute('if subgroup = "' !! trim(subgroup) !! '" then output ' !! translate(trim(subgroup),'_',' ') !! ';');
end;
call execute('run;');
stop;
run;

Note that the created code will make just one pass through the original dataset.

 

The question remains, what will you do with the resulting datasets? If you perform the same analytics/reports on all of them, you're better off doing it in one pass with by-group processing.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 7 replies
  • 140 views
  • 5 likes
  • 4 in conversation