Solved
New Contributor
Posts: 4

# Splitting a dataset into multiple dataset

I have a dataset and I need to split this dataset based on a variable (say deptid) in it . The number of splits depends on the number of distinct departments.The name of the output dataset comes from the deptdescription.

For eg:
This is how my data looks now

deptid deptdescription etc....... ;
1 Maths
2 Astronomy
1 Maths
3 Physics
4 Public Health
3 Physics
.....

This is how I want my output

Dataset named Maths
1 Maths
1 Maths

Dataset named Astronomy
2 Astronomy

and Etc

Thank You.

Accepted Solutions
Solution
‎09-29-2017 03:38 PM
SAS Employee
Posts: 11

## Re: Splitting a dataset into multiple dataset

This is an approach to put all observations of the same Deptid value into separate data sets named from the Datadescription variable.

data have;

infile datalines truncover;

input @1 deptid 1. deptdescription \$ & 3-20;

datalines;

1 Maths

2 Astronomy

1 Maths

3 Physics

4 Public Health

3 Physics

;

run;

/* Sort by Deptid so first./last. logic can be used in the next step */

proc sort data=have;

by deptid;

run;

/* CALL SYMPUTX creates macro variables &Subj1, &Subj2, etc from the values of Deptdescription.*/

/* The total unique values of Deptdescription are stored in &Total                             */

data _null_;

set have;

by deptid;

if first.deptid then do;

i+1;

call symputx('subj'|| left(put(i,2.)),deptdescription);

end;

if last.deptid then call symputx('total',i);

run;

/* Within a macro %DO loop the data sets are created and those observations with the same */

/* Deptdescription values are read with a WHERE clause                                    */

%macro test;

%do i=1 %to &total;

data &&subj&i;

set have;

where deptdescription="&&subj&i";

run;

%end;

%mend test;

/* invoke the macro */

%test

All Replies
Frequent Contributor
Posts: 82

## Re: Splitting a dataset into multiple dataset

data ds1 ds2 ds3;
set sashelp.class;
if weight < 85 then output ds1;
else if weight >= 85 and weight <= 110 then output ds2;
else if weight > 110 then output ds3;
run;

Nancy
New Contributor
Posts: 4

## Re: Splitting a dataset into multiple dataset

This will work if I know the class labels(i.e. weight 85 or 110 etc).
There are over 100 classes (departments) in my dataset.

I am thinking of something that will read these different classes from the dataset and split it accordingly.
PROC Star
Posts: 8,163

## Re: Splitting a dataset into multiple dataset

Annu,

Take a look at the method suggested in the following tip:
http://www.listserv.uga.edu/cgi-bin/wa?A2=ind1101C&L=sas-l&P=R29917&D=1&H=0&O=D&T=1

It provides a general method, using hash, that only requires that you first sort and index the file (and shows you how to do that as well).

HTH,
Art
> This will work if I know the class labels(i.e. weight
> 85 or 110 etc).
> There are over 100 classes (departments) in my
> dataset.
>
> I am thinking of something that will read these
> different classes from the dataset and split it
> accordingly.
Valued Guide
Posts: 653

## Re: Splitting a dataset into multiple dataset

For a third method take a look at
http://www.sascommunity.org/wiki/Automatically_Separating_Data_into_Excel_Sheets
The article is similar to what you are doing.
Posts: 4,736

## Re: Splitting a dataset into multiple dataset

data have;
infile datalines truncover;
input deptid 1-1 deptdescription \$ 3-20;
datalines;
1 Maths
2 Astronomy
1 Maths
3 Physics
4 Public Health
3 Physics
;
run;

proc sql noprint;
select DISTINCT
catx(' ',compress(deptdescription)),
cat('when (',deptid,') output ',compress(deptdescription),';')
into Snames separated by ' '
,:WhenStatements separated by ' '
from have
order by deptid
;
quit;

%put &DSNames;
%put %bquote(&WhenStatements);

data &DSNames;
set have;
select(deptid);
&WhenStatements
otherwise;
end;
run;

HTH
Patrick
Solution
‎09-29-2017 03:38 PM
SAS Employee
Posts: 11

## Re: Splitting a dataset into multiple dataset

This is an approach to put all observations of the same Deptid value into separate data sets named from the Datadescription variable.

data have;

infile datalines truncover;

input @1 deptid 1. deptdescription \$ & 3-20;

datalines;

1 Maths

2 Astronomy

1 Maths

3 Physics

4 Public Health

3 Physics

;

run;

/* Sort by Deptid so first./last. logic can be used in the next step */

proc sort data=have;

by deptid;

run;

/* CALL SYMPUTX creates macro variables &Subj1, &Subj2, etc from the values of Deptdescription.*/

/* The total unique values of Deptdescription are stored in &Total                             */

data _null_;

set have;

by deptid;

if first.deptid then do;

i+1;

call symputx('subj'|| left(put(i,2.)),deptdescription);

end;

if last.deptid then call symputx('total',i);

run;

/* Within a macro %DO loop the data sets are created and those observations with the same */

/* Deptdescription values are read with a WHERE clause                                    */

%macro test;

%do i=1 %to &total;

data &&subj&i;

set have;

where deptdescription="&&subj&i";

run;

%end;

%mend test;

/* invoke the macro */

%test

☑ This topic is solved.