Hello,
I'm struggling with figuring out a solution for this. I have a large dataset that can contain several million records (it will vary each week). I need to export this dataset to multiple .txt files, each containing 250,000 records (with the remainder also being placed into a .txt file). So, for example, if I have 2,001,000 records in the original dataset then I would need 8 .txt files of 250,000 records and 1 .txt file with 1,000 records. Also, the .txt files should have incremented names e.g. texfile1, textfile2, textfile3, etc...
Any help would be appreciated.
Create som pre code that calculates how many txt you need to create.
Then create a %do loop that will generate dynamically output statements, which are using the value _n_.
Unfortunately I'm fairly new to SAS and I'm not sure how to do that. I found an example online that sounds like it does what I need and I'm trying to tweak it to fit my needs but I'm a bit lost...
%MACRO makecsv (dsn, nvials=300)
* Determine number of records in &DSN *;
%LET dsid=%SYSFUNC(OPEN(&dsn));
%LET numobs=%SYSFUNC(ATTRN(&dsid, NLOBS));
%LET rc=%SYSFUNC(CLOSE(&dsid));
%LET k = 0;
%DO i = 1 %TO &numobs %BY &nvials ;
%LET k = %EVAL(&k+1);
PROC EXPORT DATA=&dsn (FIRSTOBS=&i OBS=%EVAL(&i +&nvials -1))
OUTFILE = "batch&dsn&k..&SYSDATE..csv"
DBMS = CSV REPLACE;
RUN;
%END;
%MEND makecsv;
I'm thinking you are very close.
proc sql;
select nobs/250000 into :amax
from DICTIONARY.TABLES
where (memtype='DATA')
and memname='CARS'
and libname='SASHELP';
quit;
OR
dsid=%sysfunc(open(SASHELP.CARS));
num=%sysfunc(attrn(&dsid,nlobs));
rc=%sysfunc(close(&dsid));
Then just iterate thru with a do loop
%put &amax #
%macro do_loop;
%do i = 1 %to &amax+1;
data tmp;
set work.mybigdsn;
if 1=%eval(&1) and _n_ le 250K;
else (&I-1)*250K le _n_ le (&i*250k);
run;
proc export data=tmp outfile="C:\output\mytextfile&i..txt"
run;
proc datasets
delete tmp;
quit;
%end;
%mend do_loop;
do_loop;
If you want to do this in a single data step without macro programming, look up the FILEVAR= option in the FILE statement.
Hi dfanella,
The easy way to do this is to use multiple output statements to create multiple datasets. Let me give you a really simple example.
Using a Data statement you can:
Data USA Australia Other; Creates three data sets that you will fill with data
set Kathy.AllSavings; from the AllSavings data set
select(upcase(Country)); This just capitalizes the Country code
If Country="US" then output USA; This looks for the variable I chose to define the dataset USA
else if Country="AU" then output Austrialia; This looks for the variable I chose to define the dataset Austrialia
else output other; This is the dataset that data goes to if it does not land in any other dataset.
run;
You should always have an "Other" dataset, and you can break it out to as
many datasets as you like.
Maybe this could this be an variant using the filevar option.
filename myfile 'tmp'; /*File name or path does not matter. Only virtual and no used*/
data _null_;
retain rec_per_file 100;
new_file_count = _N_ - 1;
if mod( new_file_count, rec_per_file) = 0 then do;
out_file = '/data3/sas01/workarea/svein/tmp_' || strip(( new_file_count+rec_per_file)/100) || '_file.txt';
put out_file;
end;
file myfile filevar=out_file dlm=";";
set sashelp.cars;
put _N_ make model type;
run;
Why not just use DO loops?
Here is an example using SASHELP.CLASS dataset splitting with 5 observations per output file.
%let dir=~/temp ;
%let nobs=5 ;
data _null_;
do fileno=1 by 1 ;
filevar = catx('/',"&dir",cats('file',fileno,'.txt'));
file out filevar=filevar dsd dlm='|' lrecl=2000000 ;
do _n_=1 to &nobs ;
set sashelp.class ;
put (name -- weight) (:);
end;
end;
run;
/* Create text files for Whatever Purpose */
/* Specify the input dataset and the number of rows per file */
%let input_dataset = your_dataset_name;
%let rows_per_file = 250000; /* you can change this if your file size needs change */
/* Create a macro variable for the total number of records in the dataset */
proc sql noprint;
select count(*) into :total_records
from &input_dataset;
quit;
%put &total_records;
/* Calculate the number of files needed */
%let num_files = %sysevalf((&total_records/&rows_per_file),integer);
%put &num_files;
%macro export_to_text_files;
%do i = 1 %to &num_files+1;
/* Calculate the starting and ending record numbers for each file */
%let start_record = %eval((&i - 1) * &rows_per_file + 1);
%let end_record = %eval(&i * &rows_per_file);
/* Create a data step to export a subset of the dataset to a text file */
data _null_;
file "output_file_&i..txt"; /** ad your directory here in front of the file name **/
set &input_dataset (firstobs=&start_record obs=&end_record);
put (_all_)(=); /* this makes sure only the variables you created are part of the output file */
run;
%end;
%mend;
/* Call the macro to export the dataset to text files */
%export_dataset_to_text_files;
Save $250 on SAS Innovate and get a free advance copy of the new SAS For Dummies book! Use the code "SASforDummies" to register. Don't miss out, May 6-9, in Orlando, Florida.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.