turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- SAS Programming
- /
- General Programming
- /
- create multiple data sets use loop

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

07-10-2017 04:38 PM

Hi -,

I want to create sub data sets from main data sets using loop, my data looks like this:

id group subgroup

1 patient care physical contact

1 patient care family contact

1 patient care other

2 education conference

2 education read

2 education teaching

I want to sub set the data by each group and each subgroup:

```
data patient_care;
set data;
if group = 'patient care';
run;
data edu;
set data;
if group = 'Education';
run;
data patient_care1;
set data;
if subgroup = 'physical contact';
run;
data patient_care2;
set data;
if subgroup = 'family contact';
run;
data patient_care3;
set data;
if subgroup = 'other';
run;
data edu1;
set data;
if subgroup = 'conference';
run;
data edu2;
set data;
if subgroup = 'read';
run;
data edu3;
set data;
if subgroup = 'teaching';
run;
```

The code is very redundant, I think looping will be helpful but not sure exactly what to do, any idea?

Thanks!!!

Accepted Solutions

Solution

07-11-2017
09:55 AM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

07-11-2017 09:29 AM

You can do hierarchical sums in a data step:

```
data have;
infile cards dlm=',';
input id :$1. group :$12. subgroup :$16. value;
cards;
1,patient care,physical contact,2
1,patient care,physical contact,3
1,patient care,family contact,1
1,patient care,family contact,6
1,patient care,other,3
1,patient care,other,3
2,education,conference,2
2,education,conference,4
2,education,read,1
2,education,read,5
2,education,teaching,7
2,education,teaching,2
;
run;
data want (keep=id group subgroup value);
set have;
by group notsorted subgroup notsorted;
retain groupsum sum;
if first.group then groupsum = 0;
if first.subgroup then sum = 0;
groupsum + value;
sum + value;
if last.subgroup
then do;
value = sum;
output;
end;
if last.group
then do;
subgroup = '';
value = groupsum;
output;
end;
run;
```

If you just want a report, proc report is perfectly capable of doing that.

---------------------------------------------------------------------------------------------

Maxims of Maximally Efficient SAS Programmers

Maxims of Maximally Efficient SAS Programmers

All Replies

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

07-10-2017 04:46 PM

This is rarely a good way to work with your data. Why do you want to split it in the first place?

FYI - the best answer to this question is usually don't do it. Because many people insist anyways see these posts:

http://blogs.sas.com/content/sasdummy/2015/01/26/how-to-split-one-data-set-into-many/

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

07-11-2017 09:19 AM

Hi -

I want to split the data because I am not sure how to sum all observations across rows by group and subgroup and generate new observations as sums of those groups, right now I am using proc print and copy and paste the output in excel:

```
proc print data = data;
var inter0-inter23;
run;
```

The original data is like this, each id has one unique group:

id group inter0 inter1 inter2 inter3 ... inter23

1 patient care

1 education

2 patinet care

2 education

each id has one unique subgroup:

id subgroup inter0 inter1 inter2 inter3 ... inter23

1 patient interaction

1 family interaction

1 conference

1 read

1 teaching

2 patient interaction

2 family interaction

2 conference

2 read

2 teaching

I want to sum all ids by group and subgroup, the ideal ouput looks like this

group inter0 inter1 inter2 ..... inter23

patient care

education

subgroup

patient interaction

family interaction

conference

read

teaching

Not sure how to do that, any idea?

Solution

07-11-2017
09:55 AM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

07-11-2017 09:29 AM

You can do hierarchical sums in a data step:

```
data have;
infile cards dlm=',';
input id :$1. group :$12. subgroup :$16. value;
cards;
1,patient care,physical contact,2
1,patient care,physical contact,3
1,patient care,family contact,1
1,patient care,family contact,6
1,patient care,other,3
1,patient care,other,3
2,education,conference,2
2,education,conference,4
2,education,read,1
2,education,read,5
2,education,teaching,7
2,education,teaching,2
;
run;
data want (keep=id group subgroup value);
set have;
by group notsorted subgroup notsorted;
retain groupsum sum;
if first.group then groupsum = 0;
if first.subgroup then sum = 0;
groupsum + value;
sum + value;
if last.subgroup
then do;
value = sum;
output;
end;
if last.group
then do;
subgroup = '';
value = groupsum;
output;
end;
run;
```

If you just want a report, proc report is perfectly capable of doing that.

---------------------------------------------------------------------------------------------

Maxims of Maximally Efficient SAS Programmers

Maxims of Maximally Efficient SAS Programmers

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

07-11-2017 09:57 AM

Don't know why but notsorted did not work well, after I remove notsorted command, I can get the result I want (need to sort by group and subgroup first):

`proc sort have;`

by group subgroup;

run;

data want (keep=id group subgroup value);
set have;
by group subgroup;
retain groupsum sum;
if first.group then groupsum = 0;
if first.subgroup then sum = 0;
groupsum + value;
sum + value;
if last.subgroup
then do;
value = sum;
output;
end;
if last.group
then do;
subgroup = '';
value = groupsum;
output;
end;
run;

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

07-11-2017 10:00 AM

I used the notsorted option so that the code works when there is no ascending/descending order. One has to make sure that the groups come in blocks, though (no identical by values scattered across the dataset). If sorting does not interfere with other logic, then it is to be preferred.

---------------------------------------------------------------------------------------------

Maxims of Maximally Efficient SAS Programmers

Maxims of Maximally Efficient SAS Programmers

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

07-10-2017 04:47 PM

First would be to explain what you are doing that requires all of those sets. The request for many sets is usually immediately followed by questions on how to use them to do the same processing.

You could extend this logic:

data patient_care edu ; set data; if group = 'patient care' then output Patient_care; if group = 'Education' then output edu; run;

each condition can send output to the specified output set.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

07-11-2017 02:08 AM

Put your control data in a dataset, and let a data step create the split code with call execute:

```
data lookup;
infile cards dlm=',';
input id :$1. group :$12. subgroup :$16.;
cards;
1,patient care,physical contact
1,patient care,family contact
1,patient care,other
2,education,conference
2,education,read
2,education,teaching
;
run;
data _null_;
call execute('data');
do until (eof1);
set lookup end=eof1;
by id;
if first.id then call execute(' ' !! translate(trim(group),'_',' '));
call execute(' ' !! translate(trim(subgroup),'_',' '));
end;
call execute('; set data;');
do until (eof2);
set lookup end=eof2;
by id;
if first.id then call execute('if group = "' !! trim(group) !! '" then output ' !! translate(trim(group),'_',' ') !! ';');
call execute('if subgroup = "' !! trim(subgroup) !! '" then output ' !! translate(trim(subgroup),'_',' ') !! ';');
end;
call execute('run;');
stop;
run;
```

Note that the created code will make just one pass through the original dataset.

The question remains, what will you do with the resulting datasets? If you perform the same analytics/reports on all of them, you're better off doing it in one pass with by-group processing.

Maxims of Maximally Efficient SAS Programmers