BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Tom
Super User Tom
Super User

CSV files are trivial to write with a SAS datastep.  If you ignore the header record you just need:

data _null_;

  set MYDATA ;

  file MYFILE dsd dlm=',' lrecl=32000 ;

  put (_all_) (:) ;

run;

To make multiple files you can use the FILEVAR option on the FILE statement to control the name of the target file with data step logic.

You can build macro variables with the list of variable names from the metadata (SASHELP.VCOLUMN or use PROC CONTENTS) to drive the put statement.  You can also build a macro variable with valid code to put the variable names (or labels) as the header.

You can use a double DOW loop to count how many records you have per by group before you start the group so that you can decide if you should start a new file before the current BY value.

So something like this: (UNTESTED)

%let varlist=A B C;

%let titles="A" ',' "B" ',' "C" ;

%let maxobs=10000 ;

data _null_;

  do nobs=1 by 1 until (last.group);

     set mydata ;

     by group ;

  end;

  do until (last.group) ;

     set mydata ;

     by group ;

     if nrecs=0 or (nrecs + nobs > &maxobs) then do;

        filenumber+1;

        nrecs=0;

        filename = cats('myfile',filenumber,'.csv');

     end;

     file dummy filevar=filename dsd dlm=',' lrecl=32000 ;

     nrecs+1;

     if nrecs=1 then put &titles;

     put &varlist ;

  end;

run;

jdmarino
Fluorite | Level 6

This blows my mind.  I don't know how to think about iterating over set and by statements.  Are you looping over the input multiple times?

Tom
Super User Tom
Super User

It is reading the input dataset twice. But it does it in one pass with two separate pointers (one for each SET statement).  Unless your BY groups have a lot of records disk cache or operating system buffering will make it use the same number of physical disk reads as a single pass.

Peter_C
Rhodochrosite | Level 12

it is worth getting over the apparent complexity (of Tom's method), when you have 47M rows that cause a lot of difficulty for the apparently "simpler" ODS approach

jdmarino
Fluorite | Level 6

Yep.

I've got the basics working (had problems when the BY group was larger than the row threshold for a file).  I'm about to research dataset introspection so I can build a list of the fields to output.  _all_ is no good for me because of the extra fields I added.

UrvishShah
Fluorite | Level 6

Hi,

If you split the data in logical way so that it won't misinterpreat any values, it will be tedious task as in a big data user can't easily make the group based on values...

So my try is to use the attached SAS Macro which creates the one CSV file with a different sheets correspond to each unique values of the variable...It will work for any variable (Charcter or Numeric)...

If this one is not meeting requirement, please let us know the approx observation so that we can think about new logic...

-Urvish

jdmarino
Fluorite | Level 6

Thanks, Urvish; it's generous of you to share.  I don't yet know the split logic (I'm helping a friend and he hasn't yet provided me the spec).  But more importantly, I'm not exporting to excel files, but to actual .CSVs, thus the multisheet approach does not apply.  (A .csv is a pure text file; there are no sheets.)

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 21 replies
  • 11154 views
  • 4 likes
  • 8 in conversation