DATA Step, Macro, Functions and more

Export large dataset to multiple .txt files

Reply
New Contributor
Posts: 2

Export large dataset to multiple .txt files

Hello,

I'm struggling with figuring out a solution for this. I have a large dataset that can contain several million records (it will vary each week). I need to export this dataset to multiple .txt files, each containing 250,000 records (with the remainder also being placed into a .txt file). So, for example, if I have 2,001,000 records in the original dataset then I would need 8 .txt files of 250,000 records and 1 .txt file with 1,000 records. Also, the .txt files should have incremented names e.g. texfile1, textfile2, textfile3, etc...

Any help would be appreciated.

Super User
Posts: 5,431

Re: Export large dataset to multiple .txt files

Create som pre code that calculates how many txt you need to create.

Then create a %do loop that will generate dynamically output statements, which are using the value _n_.

Data never sleeps
New Contributor
Posts: 2

Re: Export large dataset to multiple .txt files

Unfortunately I'm fairly new to SAS and I'm not sure how to do that. I found an example online that sounds like it does what I need and I'm trying to tweak it to fit my needs but I'm a bit lost...

%MACRO makecsv (dsn, nvials=300)

* Determine number of records in &DSN *;

%LET dsid=%SYSFUNC(OPEN(&dsn));

%LET numobs=%SYSFUNC(ATTRN(&dsid, NLOBS));

%LET rc=%SYSFUNC(CLOSE(&dsid));

%LET k = 0;

%DO i = 1 %TO &numobs %BY &nvials ;

%LET k = %EVAL(&k+1);

PROC EXPORT DATA=&dsn (FIRSTOBS=&i OBS=%EVAL(&i +&nvials -1))

OUTFILE = "batch&dsn&k..&SYSDATE..csv"

DBMS = CSV REPLACE;

RUN;

%END;

%MEND makecsv;

Occasional Contributor
Posts: 14

Re: Export large dataset to multiple .txt files

I'm thinking you are very close.

proc sql;

  select nobs/250000 into :amax            

  from DICTIONARY.TABLES 

  where (memtype='DATA')

    and memname='CARS'

    and libname='SASHELP';

quit;

OR

dsid=%sysfunc(open(SASHELP.CARS));

num=%sysfunc(attrn(&dsid,nlobs));

rc=%sysfunc(close(&dsid));


Then just iterate thru with a do loop

%put &amax   #

%macro do_loop;

  %do i = 1 %to &amax+1;

    data tmp;

      set work.mybigdsn;

      if 1=%eval(&1) and _n_ le 250K;

      else (&I-1)*250K le _n_ le (&i*250k);

    run;

    proc export data=tmp outfile="C:\output\mytextfile&i..txt"

    run;

    

    proc datasets

       delete tmp;

    quit;

   %end;

%mend do_loop;

do_loop;

         

Contributor
Posts: 22

Re: Export large dataset to multiple .txt files

If you want to do this in a single data step without macro programming, look up the FILEVAR= option in the FILE statement.

N/A
Posts: 1

Re: Export large dataset to multiple .txt files

Hi dfanella,

The easy way to do this is to use multiple output statements to create multiple datasets. Let me give you a really simple example.

Using a Data statement you can:

Data USA Australia Other;                              Creates three data sets that you will fill with data

set Kathy.AllSavings;                                      from the AllSavings data set

select(upcase(Country));                                 This just capitalizes the Country code
If Country="US" then output USA;                    This looks for the variable I chose to define the dataset USA

else if Country="AU" then output Austrialia;      This looks for the variable I chose to define the dataset Austrialia

else output other;                                           This is the dataset that data goes to if it does not land in any other dataset.

run;

                                                                    You should always have an "Other" dataset, and you can break it out to as

                                                                    many datasets as you like.

Occasional Contributor
Posts: 5

Re: Export large dataset to multiple .txt files

Maybe this could this be an variant using the filevar option.

filename myfile 'tmp'; /*File name or path does not matter. Only virtual and no used*/

data _null_;
  retain rec_per_file 100;
  new_file_count = _N_ - 1;
  if mod( new_file_count, rec_per_file) = 0 then do;
    out_file = '/data3/sas01/workarea/svein/tmp_' || strip(( new_file_count+rec_per_file)/100) || '_file.txt';
    put out_file;
  end;
  file myfile filevar=out_file  dlm=";";
  set sashelp.cars;
  put _N_ make model type;
run;

Super User
Super User
Posts: 7,060

Re: Export large dataset to multiple .txt files

Why not just use DO loops?

Here is an example using SASHELP.CLASS dataset splitting with 5 observations per output file.

%let dir=~/temp ;

%let nobs=5 ;

data _null_;

  do fileno=1 by 1 ;

    filevar = catx('/',"&dir",cats('file',fileno,'.txt'));

    file out filevar=filevar dsd dlm='|' lrecl=2000000 ;

    do _n_=1 to &nobs ;

      set sashelp.class ;

      put (name -- weight) (Smiley Happy;

    end;

  end;

run;

Ask a Question
Discussion stats
  • 7 replies
  • 2049 views
  • 0 likes
  • 7 in conversation