paparock
Fluorite | Level 6

Hi, I need help splitting one text file into multiple text files so that each stays within a file size limit (e.g. not exceeding 30 MB per file).

Any help is much appreciated. TQ

20 REPLIES
fja
Lapis Lazuli | Level 10
Hello!
Do you need to do that in SAS? This is more conveniently done at the OS level.
--FJa
paparock
Fluorite | Level 6
- Currently the file that I'm working with doesn't have a header. I just need to split the text file so that no part exceeds 30 MB.
- File names would be e.g. filenameV1, filenameV2, etc.
TQ
Kurt_Bremser
Super User

See this code, successfully tested on SAS On Demand:

/* let's make a text file */

%let textfile = ~/test;

data _null_;
file "&textfile.";
do _n_ = 1 to 10000;
  do i = 1 to nobs;
    set sashelp.class point=i nobs=nobs;
    put _all_;
  end;
end;
stop;
run;

/* now, split it */

%let chunk = %eval(1024*1024); * size of individual output files, 1MB here;

%let outfile = ~/testv; * base name of output;

data _null_;
retain count 1 size 0;
length fname $200;
fname = cats("&outfile.",put(count,z3.)); * use a z. format to keep proper order of files;
infile "&textfile.";
file dummy filevar=fname; * filevar= option creates dynamic filenames;
input;
put _infile_;
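/* track the bytes written; length() counts the text only, not the line terminator */
size + length(_infile_);
/* the check runs after the write, so a file can exceed the limit by up to one line */
if size > &chunk.
then do;
  count + 1;
  size = 0;
end;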
run;
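
If the limit has to be a hard one, one possible variant (a sketch only, not the code tested above) decides whether to switch files before writing each line. It assumes the LENGTH= option of INFILE to get the record length, and a hypothetical macro variable EOL for the bytes taken by each line terminator (1 for LF, 2 for CRLF). A single line longer than the limit would still produce an oversized file.

```sas
%let textfile = ~/test;              /* input file, as in the post above */
%let outfile  = ~/testv;             /* base name of output, as above */
%let chunk    = %eval(1024*1024);    /* hard limit per file, in bytes */
%let eol      = 1;                   /* assumption: bytes per line terminator (LF=1, CRLF=2) */

data _null_;
retain count 1 size 0;
length fname $200;
infile "&textfile." length=reclen;   /* reclen receives the length of the current record */
input;
/* start a new file if this line would push the current one past the limit */
if size > 0 and size + reclen + &eol. > &chunk.
then do;
  count + 1;
  size = 0;
end;
fname = cats("&outfile.",put(count,z3.));
file dummy filevar=fname;            /* a changed fname closes the old file and opens a new one */
put _infile_;
size + reclen + &eol.;
run;
```

The FILE statement is placed after the check so that the line is already written to the new file when a switch happens.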

 

paparock
Fluorite | Level 6
Sorry Sir, one more question.
Can the code run on the desktop version of SAS? TQ for any feedback.
fja
Lapis Lazuli | Level 10
Again, let me draw your attention to the OS. In Unix it is a one line command:
split --bytes=30M filename
Kurt_Bremser
Super User

@fja wrote:
Again, let me draw your attention to the OS. In Unix it is a one line command:
split --bytes=30M filename

which one can run in a data step (the macro variable FILENAME must hold the path of the file to split):

data _null_;
infile pipe "split --bytes=30M &filename. 2>&1";
input;
put _infile_;
run;

so that all diagnostics are written to the SAS log.
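
Since --bytes cuts at an exact byte count, a line can end up divided between two output files. As a sketch, assuming GNU split (which provides --line-bytes and --numeric-suffixes) and a SAS session where the XCMD system option is enabled, the same data step can keep lines whole. The path in FILENAME and the output prefix are placeholders.

```sas
%let filename = ~/test;   /* placeholder path; replace with your own file */

data _null_;
/* --line-bytes puts whole lines into each piece, at most 30M per file */
/* --numeric-suffixes names the pieces <prefix>00, <prefix>01, ...     */
infile pipe "split --line-bytes=30M --numeric-suffixes &filename. &filename.v 2>&1";
input;
put _infile_;             /* echo any messages from split to the SAS log */
run;
```

With the placeholder above, the pieces would be written as ~/testv00, ~/testv01, and so on.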

paparock
Fluorite | Level 6
Thank you, Sir, for your reply.
I'm still a new SAS user. If my file is at d:\test\abcd.csv, where do I put the path for this specific file? TQ
Kurt_Bremser
Super User

@paparock wrote:
Thank you, Sir, for your reply.
I'm still a new SAS user. If my file is at d:\test\abcd.csv, where do I put the path for this specific file? TQ

You only need to adapt the TEXTFILE and OUTFILE macro variables to your needs.
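
For the file mentioned in the question, the adapted settings might look like the following sketch. The output base name abcd_v is only an illustration; CHUNK is raised to 30 MB to match the stated limit.

```sas
%let textfile = d:\test\abcd.csv;       /* the file to split */
%let outfile  = d:\test\abcd_v;         /* illustrative base name; gives abcd_v001, abcd_v002, ... */
%let chunk    = %eval(30*1024*1024);    /* 30 MB per file, in bytes */
```

The splitting data step itself stays unchanged. Note that the output files get no .csv extension unless the CATS() call that builds FNAME is extended to append one.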

fja
Lapis Lazuli | Level 10

which one can run in a data step:
data _null_;
infile pipe "split --bytes=30M &filename. 2>&1";
input;
put _infile_;
run;

Awesome! I didn't think of that one ... But you are absolutely right!

Kurt_Bremser
Super User

PS: My code is just there to show how to tackle such an issue in SAS. See it as a thought experiment and an opportunity to learn SAS coding options that are not widely known.

Using the right tool (Maxim 14), as @fja shows, is the proper way to go. The UNIX command executes much faster and requires far less code.