R_Win
Calcite | Level 5
Hi, I will receive a dataset and I don't know how many observations it has. I want to divide it into 5 datasets
and name them d1 to d5 dynamically. How can I do this with a loop? All observations with the same value of the id variable must stay in one dataset; an id must not be spread across datasets.
ex:
data test;
input id;
cards;
1
1
2
2
2
3
3
4
5
6
6
7
7
8
;
run;
output
d1
1
1
1
d2
2
2
3
3
d3
4
5
d4
6
6
d5
7
7
8


The data should be divided into 5 datasets, and all observations for a given id should be in exactly one dataset. For example, d3 holds id 4, so observations with id 4 should appear only in d3 and in no other dataset. The five datasets do not need to have the same number of observations; the counts can vary. Can anyone help me with this?
6 REPLIES
DF
Fluorite | Level 6
Not sure if there's a better way, but as a quick and dirty approach you could try something like this:


data test;
format i 8.;
do i = 1 to 10;
output;
end;
run;

data a b c d e error;
set test;
select (mod(_n_, 5));
when (0) output a;
when (1) output b;
when (2) output c;
when (3) output d;
when (4) output e;
otherwise output error;
end;
run;
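
The step above deals rows out one at a time, so rows with the same id can land in different datasets. A minimal sketch of the same round-robin idea applied per id instead of per row follows. It assumes the question's WORK.TEST dataset, already sorted by id; the variable name grp is made up for illustration.

```sas
/* Sketch only: assumes WORK.TEST from the question, sorted by id. */
data d1 d2 d3 d4 d5;
  set test;
  by id;
  if first.id then grp + 1;        /* grp = running count of distinct ids (hypothetical name) */
  select (mod(grp - 1, 5) + 1);    /* cycle 1..5 over id groups, not over rows */
    when (1) output d1;
    when (2) output d2;
    when (3) output d3;
    when (4) output d4;
    when (5) output d5;
    otherwise;
  end;
  drop grp;
run;
```

Because the target dataset changes only when the id changes, every id ends up in exactly one dataset. The datasets will generally differ in size, which the question allows.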
Ksharp
Super User
Hi.
I have to leave now.
Here is a clue: the dictionary table dictionary.tables has a column 'nobs' that contains the table's physical number of observations. You can use it to calculate the number of observations for each of the datasets d1 - d5 yourself.


Ksharp
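
As a rough sketch of that clue, the query below reads nobs from dictionary.tables into a macro variable and derives a per-dataset row count. It assumes the data sits in WORK.TEST; the macro variable names total and per_ds are made up for illustration.

```sas
/* Sketch only: assumes the input data set is WORK.TEST. */
proc sql noprint;
  select nobs into :total trimmed     /* total and per_ds are hypothetical names */
    from dictionary.tables
    where libname = 'WORK' and memname = 'TEST';   /* names are stored in upper case */
quit;

/* Rows per output data set, rounded up so that 5 * per_ds covers every row */
%let per_ds = %sysevalf(&total / 5, ceil);
%put NOTE: total=&total per_ds=&per_ds;
```

This only gives a row budget. Keeping each id inside one dataset still needs BY-group logic on top of it.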
SAS333
Calcite | Level 5
I am not sure why the complete post is not visible. Please edit to view the complete post.
I think this works:

data a b c d e;
set act.talent nobs=t;
/*In my case I am interested in 5 data sets */
/* ceil instead of round, so that t1*5 >= t and no observation is dropped */
t1=ceil(t/5);
if _n_<=t1 then output a;
else if _n_<=t1*2 then output b;
else if _n_<=t1*3 then output c;
else if _n_<=t1*4 then output d;
else if _n_<=t1*5 then output e;
run;

The above code is useful when you want to create a small number of datasets from the input data set.
For more datasets, it is better to use the macro code below:

%macro numofdatasets(no);
%do i=1 %to &no;
data _&i;
set act.talent nobs=t;
/* no = number of data sets requested in the macro call; ceil so that no observation is dropped */
t1=ceil(t/&no);
if t1*(&i-1) < _n_ <=t1*&i then output _&i;
run;
%end;
%mend;
%numofdatasets(5);

You can easily change the number of data sets you want to create in the macro call. But the last data set will have fewer observations if the number of observations is not a multiple of the number of data sets required.
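
The macro above cuts by row number, so an id can straddle two datasets. A possible variant that cuts by id instead is sketched below. It assumes the question's WORK.TEST dataset sorted by id; the macro name split_by_id, the macro variable nid, and the variables grp and bucket are all made up for illustration.

```sas
/* Sketch only: assumes WORK.TEST from the question, sorted by id. */
proc sql noprint;
  select count(distinct id) into :nid trimmed from test;   /* number of distinct ids */
quit;

%macro split_by_id(n=5);
  %local i;
  data %do i = 1 %to &n; d&i %end; ;       /* generates: data d1 d2 ... dn ; */
    set test;
    by id;
    if first.id then grp + 1;              /* index of the current id: 1..&nid */
    bucket = ceil(grp * &n / &nid);        /* map id index to a data set number 1..&n */
    select (bucket);
      %do i = 1 %to &n;
        when (&i) output d&i;
      %end;
      otherwise;
    end;
    drop grp bucket;
  run;
%mend split_by_id;

%split_by_id(n=5)
```

This reads the input once rather than once per output dataset, and each id goes to exactly one dataset. If there are fewer distinct ids than requested datasets, some of the outputs will be empty.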

Message was edited by: SAS333

RickM
Fluorite | Level 6
The forum doesn't like the symbols for "less than or equal to" etc., so it is best to use SAS's mnemonic abbreviations (lt, le, gt, ge, eq, ne). Only you can edit your own posts.
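
For example, a range test like the one in the macro above can be written with the mnemonics only. This is a small standalone sketch; the variable n and the bounds 2 and 4 are made up for illustration.

```sas
/* Sketch only: same kind of range test, written with lt/le instead of symbols. */
data _null_;
  do n = 1 to 6;
    if 2 lt n le 4 then put n= 'is greater than 2 and at most 4';
  end;
run;
```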
SPR
Quartz | Level 8
Hello,

Use &lt; instead of < and &gt; instead of > (that is, &lt and &gt, each followed by a semicolon).

SPR

Message was edited by: SPR

Ksharp
Super User
Hmm.
It is rather more complicated.
Suppose your id variable is numeric.


[pre]
data test;
input id;
cards;
1
1
2
2
2
3
3
4
5
6
6
7
7
8
;
run;
ods output nlevels=level;
proc freq data=test nlevels ;
table id /out=id(keep=id) nopercent nofreq nocum ;
run;
data temp(keep=id flag);
set id;
if _n_ eq 1 then set level;
retain flag 0;
div=floor(nlevels/5);
if div eq 1 then do;
if mod(_n_,div) eq 0 and flag le 4 then flag+1;
end;
else do;
if mod(_n_,div) eq 1 and flag le 4 then flag+1;
end;
run;
data op;
set temp;
by flag;
length level $ 200;
retain level;
if first.flag then call missing(level);
level=catx(' ',level,id);
if last.flag then output;
run;
data _null_;
set op;
call symputx(cats('d',flag),level);
run;
data d1 d2 d3 d4 d5;
set test;
select;
when (id in (&d1)) output d1;
when (id in (&d2)) output d2;
when (id in (&d3)) output d3;
when (id in (&d4)) output d4;
when (id in (&d5)) output d5;
otherwise;
end;
run;



[/pre]



Ksharp

Message was edited by: Ksharp

