Hi All,
I'm trying to split dataset using product id , but seems it's not working. Can someone have a look and correct me please?
data sample;
input custid prdid prdname $30.;
cards;
222 10 aaaa
333 20 bbbb
444 10 aaaa
444 20 bbbb
555 30 cccc
;
run;
data _null_;
set sample end=last;
If last then call symputx("n",_n_);
run;
%PUT &n.;
%macro split(dsn=);
proc sql;
select prdid into :xx from &dsn.;
run;
%do i = 1 %to &n;
data &dsn.&i.;
set &dsn.;
where prdid = &xx.;
run;
%end;
%mend split;
%split(dsn=sample);
Your logic is a little confused. You don't care how large the original dataset is, just how many datasets you need to create.
%macro split(dsn=);
%if not %symexist(nds) %then %global nds;
%local xx i;
proc sql noprint;
select prdid into :xx separated by ' ' from &dsn.;
%let nds=&sqlobs;
run;
%do i = 1 %to &nds;
data &dsn.&i.;
set &dsn.;
where prdid = %scan(&xx.,&i,%str( ));
run;
%end;
%mend split;
%split(dsn=sample);
If your PRDID variable is character (why would you use a numeric variable for an ID?) then add quotes into the macro variable.
select quote(trim(prdid),"'") into :xx separated by ' ' from &dsn.;
Note unless your source dataset is indexed on the split variable it is probably faster to make them all in the same data step.
data
%do i = 1 %to &nds;
&dsn.&i.
%end;
;
set &dsn.;
%do i = 1 %to &nds;
if prdid = %scan(&xx.,&i,%str( )) then output &dsn.&i;
%end;
run;
Why?
What can you do with multiple data sets that you can't do with one?
Typically "splitting" data this way adds more complication than any benefit and often means that some one does not understand what is possible with SAS BY group processing.
Hi,
I've business requirement to split multiple datasets based on the product codes which is input to another team, so thought of using macro to split datasets but seems not working. Can u please correct me with the code?
Do those other teams actually get SAS datasets, or do they need the data in some other form?
Hi,
Yeah, they need SAS datasets.
@AshokD wrote:
Hi,
Yeah, they need SAS datasets.
Unless there are supposed to be restrictions of data for security or similar issues I would suggest placing the "big" data set in a shared storage location and let the users select data based on a Where statement as needed.
If your data changes frequently then you can cause availability due to timing of code execution issues with this split data set approach.
Your logic is a little confused. You don't care how large the original dataset is, just how many datasets you need to create.
%macro split(dsn=);
%if not %symexist(nds) %then %global nds;
%local xx i;
proc sql noprint;
select prdid into :xx separated by ' ' from &dsn.;
%let nds=&sqlobs;
run;
%do i = 1 %to &nds;
data &dsn.&i.;
set &dsn.;
where prdid = %scan(&xx.,&i,%str( ));
run;
%end;
%mend split;
%split(dsn=sample);
If your PRDID variable is character (why would you use a numeric variable for an ID?) then add quotes into the macro variable.
select quote(trim(prdid),"'") into :xx separated by ' ' from &dsn.;
Note unless your source dataset is indexed on the split variable it is probably faster to make them all in the same data step.
data
%do i = 1 %to &nds;
&dsn.&i.
%end;
;
set &dsn.;
%do i = 1 %to &nds;
if prdid = %scan(&xx.,&i,%str( )) then output &dsn.&i;
%end;
run;
Thanks @Tom
It works now, You've saved me.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.