BookmarkSubscribeRSS Feed
JasonNC
Quartz | Level 8

Hi,

I have a data set with 5 million records.I need to pull them into spread sheets.When i splitted the data set into 6 data sets and created CSV it took a long time.It took about 5 hours for each CSV file.

Our back end is unix.I am using SAS EG.The CSV files will be created on our Server.

Is there any better way to do this.

ods csv file="&path.&name..csv";

   proc print data=&dataset noobs label;

   run;

7 REPLIES 7
ballardw
Super User

The time was probably caused by using proc print which may have been attempting lots of formatting in the background.

Try using

proc export data=&dataset

     outfile="&path.&name..csv"

     dbms=CSV

     Replace;

run;

if you want variable labels instead of names as column headers add Label to the proc statement.

Tal
Pyrite | Level 9 Tal
Pyrite | Level 9

so how does this code split the set into 6 csv files? I dont understand. Can you please explain? Smiley Happy

EDIT: Oh i see.Your code actually  works for each of the splitted sets

Tom
Super User Tom
Super User

A simple data step can create a CSV file.  Only tricky part is adding the header row.

proc contents data=&dataset noprint out=contents ; run;

proc sql noprint ;

  select quote(trim(coalesce(label,name)))

    into :labels separated by " ',' "

  from contents

  order by varnum

  ;

quit;

data _null_;

  file "&path.&name..csv" dsd lrecl=100000 ;

  set &dataset ;

  if _n_=1 then put &labels ;

  put (_all_) (:) ;

run;


SASKiwi
PROC Star

In my experience, using ODS for writing very large files is very resource-intensive. The techniques mentioned above are much more efficient.

JasonNC
Quartz | Level 8

Hey Guys,

I will try the above methods.Is splitting the file and creating them is correct process or is there any other process?

SASKiwi
PROC Star

The only reason I can see for splitting your files is to stay within the current Excel limit of 1 million rows. You can use the FIRSTOBS = and OBS = dataset options to do this:

data split1;

  set large (firstobs = 1 obs = 1000000);

run;

Tom
Super User Tom
Super User

You can create multiple files in one pass of the data by using the FILEVAR option on the FILE statment.

FILEVAR=variable

defines a variable whose change in value causes the FILE statement to close the current output file and open a new one the next time the FILE statement executes. The next PUT statement that executes writes to the new file that is specified as the value of the FILEVAR= variable.

For example try this program.

%let dataset=sashelp.class ;

%let n=10 ;

%let prefix=~/temp/split ;

proc contents data=&dataset noprint out=contents ; run;

proc sql noprint ;

  select varnum

       , name

       , quote(trim(coalesce(label,name)))

    into names

       , names

       , :labels separated by " ',' "

  from contents

  order by varnum

  ;

quit;

data _null_;

  set &dataset ;

  _group = 1 + int((_n_-1)/&n) ;

  _file = cats("&prefix",_group,'.csv');

  file csvfile filevar=_file  dsd lrecl=100000 ;

  if mod(_n_,&n)=1 then put &labels ;

  put (&names) (:) ;

run;


sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 3020 views
  • 2 likes
  • 5 in conversation