Hi -,
I want to create sub data sets from main data sets using loop, my data looks like this:
id group subgroup
1 patient care physical contact
1 patient care family contact
1 patient care other
2 education conference
2 education read
2 education teaching
I want to sub set the data by each group and each subgroup:
data patient_care;
set data;
if group = 'patient care';
run;
data edu;
set data;
if group = 'Education';
run;
data patient_care1;
set data;
if subgroup = 'physical contact';
run;
data patient_care2;
set data;
if subgroup = 'family contact';
run;
data patient_care3;
set data;
if subgroup = 'other';
run;
data edu1;
set data;
if subgroup = 'conference';
run;
data edu2;
set data;
if subgroup = 'read';
run;
data edu3;
set data;
if subgroup = 'teaching';
run;
The code is very redundant, I think looping will be helpful but not sure exactly what to do, any idea?
Thanks!!!
You can do hierarchical sums in a data step:
data have;
infile cards dlm=',';
input id :$1. group :$12. subgroup :$16. value;
cards;
1,patient care,physical contact,2
1,patient care,physical contact,3
1,patient care,family contact,1
1,patient care,family contact,6
1,patient care,other,3
1,patient care,other,3
2,education,conference,2
2,education,conference,4
2,education,read,1
2,education,read,5
2,education,teaching,7
2,education,teaching,2
;
run;
data want (keep=id group subgroup value);
set have;
by group notsorted subgroup notsorted;
retain groupsum sum;
if first.group then groupsum = 0;
if first.subgroup then sum = 0;
groupsum + value;
sum + value;
if last.subgroup
then do;
value = sum;
output;
end;
if last.group
then do;
subgroup = '';
value = groupsum;
output;
end;
run;
If you just want a report, proc report is perfectly capable of doing that.
This is rarely a good way to work with your data. Why do you want to split it in the first place?
FYI - the best answer to this question is usually don't do it. Because many people insist anyways see these posts:
http://blogs.sas.com/content/sasdummy/2015/01/26/how-to-split-one-data-set-into-many/
Hi -
I want to split the data because I am not sure how to sum all observations across rows by group and subgroup and generate new observations as sums of those groups, right now I am using proc print and copy and paste the output in excel:
proc print data = data;
var inter0-inter23;
run;
The original data is like this, each id has one unique group:
id group inter0 inter1 inter2 inter3 ... inter23
1 patient care
1 education
2 patinet care
2 education
each id has one unique subgroup:
id subgroup inter0 inter1 inter2 inter3 ... inter23
1 patient interaction
1 family interaction
1 conference
1 read
1 teaching
2 patient interaction
2 family interaction
2 conference
2 read
2 teaching
I want to sum all ids by group and subgroup, the ideal ouput looks like this
group inter0 inter1 inter2 ..... inter23
patient care
education
subgroup
patient interaction
family interaction
conference
read
teaching
Not sure how to do that, any idea?
You can do hierarchical sums in a data step:
data have;
infile cards dlm=',';
input id :$1. group :$12. subgroup :$16. value;
cards;
1,patient care,physical contact,2
1,patient care,physical contact,3
1,patient care,family contact,1
1,patient care,family contact,6
1,patient care,other,3
1,patient care,other,3
2,education,conference,2
2,education,conference,4
2,education,read,1
2,education,read,5
2,education,teaching,7
2,education,teaching,2
;
run;
data want (keep=id group subgroup value);
set have;
by group notsorted subgroup notsorted;
retain groupsum sum;
if first.group then groupsum = 0;
if first.subgroup then sum = 0;
groupsum + value;
sum + value;
if last.subgroup
then do;
value = sum;
output;
end;
if last.group
then do;
subgroup = '';
value = groupsum;
output;
end;
run;
If you just want a report, proc report is perfectly capable of doing that.
Don't know why but notsorted did not work well, after I remove notsorted command, I can get the result I want (need to sort by group and subgroup first):
proc sort have;
by group subgroup;
run;
data want (keep=id group subgroup value);
set have;
by group subgroup;
retain groupsum sum;
if first.group then groupsum = 0;
if first.subgroup then sum = 0;
groupsum + value;
sum + value;
if last.subgroup
then do;
value = sum;
output;
end;
if last.group
then do;
subgroup = '';
value = groupsum;
output;
end;
run;
I used the notsorted option so that the code works when there is no ascending/descending order. One has to make sure that the groups come in blocks, though (no identical by values scattered across the dataset). If sorting does not interfere with other logic, then it is to be preferred.
First would be to explain what you are doing that requires all of those sets. The request for many sets is usually immediately followed by questions on how to use them to do the same processing.
You could extend this logic:
data patient_care edu ; set data; if group = 'patient care' then output Patient_care; if group = 'Education' then output edu; run;
each condition can send output to the specified output set.
Put your control data in a dataset, and let a data step create the split code with call execute:
data lookup;
infile cards dlm=',';
input id :$1. group :$12. subgroup :$16.;
cards;
1,patient care,physical contact
1,patient care,family contact
1,patient care,other
2,education,conference
2,education,read
2,education,teaching
;
run;
data _null_;
call execute('data');
do until (eof1);
set lookup end=eof1;
by id;
if first.id then call execute(' ' !! translate(trim(group),'_',' '));
call execute(' ' !! translate(trim(subgroup),'_',' '));
end;
call execute('; set data;');
do until (eof2);
set lookup end=eof2;
by id;
if first.id then call execute('if group = "' !! trim(group) !! '" then output ' !! translate(trim(group),'_',' ') !! ';');
call execute('if subgroup = "' !! trim(subgroup) !! '" then output ' !! translate(trim(subgroup),'_',' ') !! ';');
end;
call execute('run;');
stop;
run;
Note that the created code will make just one pass through the original dataset.
The question remains, what will you do with the resulting datasets? If you perform the same analytics/reports on all of them, you're better off doing it in one pass with by-group processing.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.