SAS Programming

DATA Step, Macro, Functions and more
BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Tom
Super User Tom
Super User

CSV files are trivial to write with a SAS datastep.  If you ignore the header record you just need:

data _null_;

  set MYDATA ;

  file MYFILE dsd dlm=',' lrecl=32000 ;

  put (_all_) (:) ;

run;

To make multiple files you can use the FILEVAR option on the FILE statement to control the name of the target file with data step logic.

You can build macro variables with the list of variable names from the metadata (SASHELP.VCOLUMN or use PROC CONTENTS) to drive the put statement.  You can also build a macro variable with valid code to put the variable names (or labels) as the header.

You can use a double DOW loop to count how many records you have per by group before you start the group so that you can decide if you should start a new file before the current BY value.

So something like this: (UNTESTED)

%let varlist=A B C;

%let titles="A" ',' "B" ',' "C" ;

%let maxobs=10000 ;

data _null_;

  do nobs=1 by 1 until (last.group);

     set mydata ;

     by group ;

  end;

  do until (last.group) ;

     set mydata ;

     by group ;

     if nrecs=0 or (nrecs + nobs > &maxobs) then do;

        filenumber+1;

        nrecs=0;

        filename = cats('myfile',filenumber,'.csv');

     end;

     file dummy filevar=filename dsd dlm=',' lrecl=32000 ;

     nrecs+1;

     if nrecs=1 then put &titles;

     put &varlist ;

  end;

run;

jdmarino
Fluorite | Level 6

This blows my mind.  I don't know how to think about iterating over set and by statements.  Are you looping over the input multiple times?

Tom
Super User Tom
Super User

It is reading the input dataset twice. But it does it in one pass with two separate pointers (one for each SET statement).  Unless your BY groups have a lot of records disk cache or operating system buffering will make it use the same number of physical disk reads as a single pass.

Peter_C
Rhodochrosite | Level 12

it is worth getting over the apparent complexity (of Tom's method), when you have 47M rows that cause a lot of difficulty for the apparently "simpler" ODS approach

jdmarino
Fluorite | Level 6

Yep.

I've got the basics working (had problems when the BY group was larger than the row threshold for a file).  I'm about to research dataset introspection so I can build a list of the fields to output.  _all_ is no good for me because of the extra fields I added.

UrvishShah
Fluorite | Level 6

Hi,

If you split the data in logical way so that it won't misinterpreat any values, it will be tedious task as in a big data user can't easily make the group based on values...

So my try is to use the attached SAS Macro which creates the one CSV file with a different sheets correspond to each unique values of the variable...It will work for any variable (Charcter or Numeric)...

If this one is not meeting requirement, please let us know the approx observation so that we can think about new logic...

-Urvish

jdmarino
Fluorite | Level 6

Thanks, Urvish; it's generous of you to share.  I don't yet know the split logic (I'm helping a friend and he hasn't yet provided me the spec).  But more importantly, I'm not exporting to excel files, but to actual .CSVs, thus the multisheet approach does not apply.  (A .csv is a pure text file; there are no sheets.)

sas-innovate-wordmark-2025-midnight.png

Register Today!

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.


Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 21 replies
  • 12936 views
  • 4 likes
  • 8 in conversation