01-30-2014 07:48 AM
I'm struggling with figuring out a solution for this. I have a large dataset that can contain several million records (it will vary each week). I need to export this dataset to multiple .txt files, each containing 250,000 records (with the remainder also being placed into a .txt file). So, for example, if I have 2,001,000 records in the original dataset then I would need 8 .txt files of 250,000 records and 1 .txt file with 1,000 records. Also, the .txt files should have incremented names e.g. texfile1, textfile2, textfile3, etc...
Any help would be appreciated.
01-30-2014 08:05 AM
Create som pre code that calculates how many txt you need to create.
Then create a %do loop that will generate dynamically output statements, which are using the value _n_.
01-30-2014 08:41 AM
Unfortunately I'm fairly new to SAS and I'm not sure how to do that. I found an example online that sounds like it does what I need and I'm trying to tweak it to fit my needs but I'm a bit lost...
%MACRO makecsv (dsn, nvials=300)
* Determine number of records in &DSN *;
%LET numobs=%SYSFUNC(ATTRN(&dsid, NLOBS));
%LET k = 0;
%DO i = 1 %TO &numobs %BY &nvials ;
%LET k = %EVAL(&k+1);
PROC EXPORT DATA=&dsn (FIRSTOBS=&i OBS=%EVAL(&i +&nvials -1))
OUTFILE = "batch&dsn&k..&SYSDATE..csv"
DBMS = CSV REPLACE;
01-30-2014 10:22 AM
I'm thinking you are very close.
select nobs/250000 into :amax
Then just iterate thru with a do loop
%put &amax #
%do i = 1 %to &amax+1;
if 1=%eval(&1) and _n_ le 250K;
else (&I-1)*250K le _n_ le (&i*250k);
proc export data=tmp outfile="C:\output\mytextfile&i..txt"
01-30-2014 12:16 PM
The easy way to do this is to use multiple output statements to create multiple datasets. Let me give you a really simple example.
Using a Data statement you can:
Data USA Australia Other; Creates three data sets that you will fill with data
set Kathy.AllSavings; from the AllSavings data set
select(upcase(Country)); This just capitalizes the Country code
If Country="US" then output USA; This looks for the variable I chose to define the dataset USA
else if Country="AU" then output Austrialia; This looks for the variable I chose to define the dataset Austrialia
else output other; This is the dataset that data goes to if it does not land in any other dataset.
You should always have an "Other" dataset, and you can break it out to as
many datasets as you like.
01-31-2014 06:57 PM
Maybe this could this be an variant using the filevar option.
filename myfile 'tmp'; /*File name or path does not matter. Only virtual and no used*/
retain rec_per_file 100;
new_file_count = _N_ - 1;
if mod( new_file_count, rec_per_file) = 0 then do;
out_file = '/data3/sas01/workarea/svein/tmp_' || strip(( new_file_count+rec_per_file)/100) || '_file.txt';
file myfile filevar=out_file dlm=";";
put _N_ make model type;
01-31-2014 10:31 PM
Why not just use DO loops?
Here is an example using SASHELP.CLASS dataset splitting with 5 observations per output file.
%let dir=~/temp ;
%let nobs=5 ;
do fileno=1 by 1 ;
filevar = catx('/',"&dir",cats('file',fileno,'.txt'));
file out filevar=filevar dsd dlm='|' lrecl=2000000 ;
do _n_=1 to &nobs ;
set sashelp.class ;
put (name -- weight) (;