I have a dataset and I want one fourth of observation to go in one dataset and another one fouth to a different dataset and another one fourth to different dataset and so on .How can i do this
Hi,
If there is no particular criterion to select which data goes where, the following program should do what you want.
data want1 want2 want3 want4;
set sashelp.class;
if mod(_N_,4)=0 then output want1;
if mod(_N_,4)=1 then output want2;
if mod(_N_,4)=2 then output want3;
if mod(_N_,4)=3 then output want4;
run;
what i want first four observation should go to want1 dataset and then next four to want2 and so on
I will go with a macro (sorry RW9)
%macro create_data;
proc sql;
SELECT ceil(count(*)/4)
INTO :nbdatasets
FROM sashelp.class;
quit;
%let datasets=;
%do i=1 %to &nbdatasets.;
%let datasets=&datasets want&i.;
%end;
data &datasets.;
set sashelp.class;
%do i=1 %to &nbdatasets.;
if ceil(_N_/4)=&i. then output want&i.;
%end;
run;
%mend;
%create_data;
Seeing as you invoked my macros aversions
data _null_; set sashelp.vtable (where=(libname="WORK" and memname="<yourds>")); do i=1 to nobs by 4; call execute(cats('data want',put(i,best.),'; set work.<yourds> point=',put(i,best.),' obs=4; run;')); end; run;
That being said however, I would really question the need to split your data up in the first place. The question is why? SAS is built around the concept of by group processing, each of the functions are optimised to do thing over groups. Read/write to dataset is one of the biggest resource hogs in most proceses. So why not in your one dataset assign a group value, e.g. a number which starts at 1 and then increment by one each 4 observations. You can then do:
proc means data=have; by group; ...
For example to get a means for each group. Its also far simpler code.
Nah, I meant sashelp.vtable (corrected now) or you could use dictionary.tables.
Sorry, hard to tell if thats a response to my code, so will add. If you only want four tables then @Patrick has a good solution, you just need to fiddle with the select part. However still better to assign groups in the one dataset, and then use by group processing.
A variation of @gamotte initial post which splits up the data as per your requirement.
data want1 want2 want3 want4;
set sashelp.class;
select ( mod(floor((_N_-1)/4),4) );
when (0) output want1;
when (1) output want2;
when (2) output want3;
when (3) output want4;
otherwise;
end;
run;
%let dsid=%sysfunc(open(sashelp.class)); %let nobs=%sysfunc(attrn(&dsid,nlobs)); %let n=%sysevalf(&nobs/4,i); %let dsid=%sysfunc(close(&dsid)); data want1 want2 want3 want4; set sashelp.class; if _n_ le &n then output want1; else if _n_ le 2*&n then output want2; else if _n_ le 3*&n then output want3; else output want4; run;
If you want randomly select obs ,try proc surveyselect.
proc surveyselect data=sashelp.class groups=4 out=temp;
run;
data want1 want2 want3 want4;
set temp;
if groupid=1 then output want1;
else if groupid=2 then output want2;
else if groupid=3 then output want3;
else output want4;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.