Dear All:
I have a large file and want to split it into several smaller files using the Variable ID. The ID's are in numerical format and there are at least 100 of them.
I also want to split these files by ID to the CSV format.
Thank you.
Randy
Splitting data sets into many different files is generally a sub-optimal thing to do, and requires more programming to get the desired results. Why do you need separate data sets? What does that allow you to do compared to keeping everything in one data set?
It is simple to write to CSV file using a data step. You can use the FILENAME= option on the FILE statement dynamically write to a different file based on the ID value.
data _null_;
set mydata ;
length filename $200 ;
filename = cats('/somepath/basename_',id,'.csv');
file csv dsd filename=filename ;
put var1 var2 .... ;
run;
If you want to include a header row then you will need to process the dataset BY ID and include an additional put to write the header row.
data _null_;
set mydata ;
by id;
length filename $200 ;
filename = cats('/somepath/basename_',id,'.csv');
file csv dsd filename=filename ;
if first.id then put 'var1,var2,....';
put var1 var2 .... ;
run;
You do not need to split the dataset before you create the separate CSV's:
proc sort data=have;
by id;
run;
data _null_;
set have;
by id;
length fname $254;
fname = cats("/path/",id,".csv");
file dummy filevar=fname dlm="," dsd;
if first.id then put "ID,othervar";
put
id
othervar
;
run;
Dear All
Just to clarify I want to split the data into files both in SAS and CSV format
Thank you
@RandyStan wrote:
Dear All
Just to clarify I want to split the data into files both in SAS and CSV format
Thank you
What is the justification for creating separate SAS datasets for each value of ID? In general it is not worth the effort as it so easy to just add a WHERE statement to limit any work you are doing to the ID of interest. (Plus when the data is all together you can do analysis on multiple ID values at once.)
@RandyStan wrote:
Dear All
Just to clarify I want to split the data into files both in SAS and CSV format
Thank you
What for? You can always use BY processing to do analysis at once.
I have done a split along a variable only once, when processing a whole dataset at once would run out of disk space. In 20+ years of SAS work and in one of 1000+ batch programs.
data have;
set sashelp.class;
run;
proc freq data=have noprint;
table age/out=age nopercent;
run;
data _null_;
set age;
call execute(catt('data _',age,';set have;if age=',age,';run;
proc export data=_',age,' outfile="c:\temp\',age,'.csv" dbms=csv replace;run;'));
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.