If I have a dataset with a variable called "Color"
From this dataset, I would like to create separate datasets based on color
(but I might not always know how many different colors are in the dataset) - meaning, I don't want to have to hardcode a list of all possible colors rather I want to sort the file by color, then as the color changes, I want to create a new dataset and name it the new the new /next value of the variable.
So if I had a dataset with records containing the "color" Red. Blue. Green. - I would want to output 3 datasets
DSN1 = Color_Blue
DSN2 = Color_Green
DSN3 = Color_Red
If there were more colors, I want to output a dataset for each based on their "Color"
DSN.n - Color_<var-value>
Thanks
BY group processing within ODS EXCELXP or ODS EXCEL.
https://support.sas.com/rnd/base/ods/odsmarkup/excelxp_demo.html#data
You could write a macro where you cycle through a unique list of the colors in the data set (untested).
proc sql;
create table color_list as
select distinct color
from your_data;
quit;
%macro output_color_sets;
proc sql;
select color into :color1-:color99 from color_list;
quit;
%do i = 1 %to &sqlObs;
data color_&&color&i;
set your_data;
where color = "&&color&i";
run;
%end;
%mend output_color_sets;
%output_color_sets
This might do what you want without a macro:
Awesome, thanks for the quick response - I will try these out.
One twist - what if I wanted to create a single excel spreadsheet of the same with each color as a different tab/sheet?
BY group processing within ODS EXCELXP or ODS EXCEL.
https://support.sas.com/rnd/base/ods/odsmarkup/excelxp_demo.html#data
Awesome, exactly what I needed. Thank you
Or as a variant code which can output tables to any destination:
data have;
length color $20;
do color='Blue', 'Red', 'Green';
do i=1 to 10;
output;
end;
end;
stop;
run;
proc sort data=have;
by color;
run;
libname mywb xlsx 'c:\temp\demo_wb.xlsx';
data _null_;
if _n_=1 then
do;
dcl hash h1(dataset:'have(obs=0)', multidata:'y');
h1.defineKey('color');
h1.defineData(all:'y');
h1.defineDone();
end;
do until(last.color);
set have;
by color;
h1.add();
end;
h1.output(dataset:cats('mywb.want_',color));
h1.clear();
run;
libname mywb clear;
All the solutions you've been provided so far require reading the dataset twice. The solution below needs to read it only once, followed by renaming dummy dataset names to names based on the values encountered. The example use the sashelp.stocks data set, and it doesn't need a sorted data set, nor does it need to have enough memory to hold data temporarily in hash objects:
data d1 d2 d3 d4 d5 d6 d7 d8;
length st_list $500 proc_ds $3000;
retain st_list ' '
proc_ds 'proc datasets library=work nolist; change ';
set sashelp.stocks end=eos;
d=findw(st_list,trim(stock),' ','E');
if d=0 then do;
d=countw(st_list)+1;
st_list=catx(' ',st_list,stock);
proc_ds=catx(' ',proc_ds,cats('d',d,'=',stock));
end;
select (d);
when (1) output d1;
when (2) output d2;
when (3) output d3;
when (4) output d4;
when (5) output d5;
when (6) output d6;
when (7) output d7;
when (8) output d8;
end;
if eos then do;
proc_ds=cats(proc_ds,'; delete d0');
if countw(st_list)<8 then do d=countw(st_list)+1 to 8;
proc_ds=catx(' ',proc_ds,cats('d',d));
end;
proc_ds=catx(' ',proc_ds,';quit;');
call execute(trim(proc_ds));
end;
drop d st_list proc_ds;
run;
The benefits of this approach are
bb
Still another method, using call execute to create a dynamic data step:
proc sort data=have (keep=color) out=lookup nodupkey;
by color;
run;
data _null_;
call execute('data');
do until (eof1);
set lookup end=eof1;
call execute(' ' !! trim(color));
end;
call execute('; set have; select (color); ');
do until (eof2);
set lookup end=eof2;
call execute('when ("' !! trim(color) !! '") output ' !! trim(color) !! '; ');
end;
call execute('run;');
run;
This will require two passes through your dataset, but be completely flexible; the only limit is the number of distinct color values, which could break the data statement created by the call execute if it exceeds 32767 characters. Or if it exceeds the system limit for concurrently open filehandles.
A requirement is of course that the values in color can be used as valid SAS names.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.