Hi,
I have a data set with 5 million records.I need to pull them into spread sheets.When i splitted the data set into 6 data sets and created CSV it took a long time.It took about 5 hours for each CSV file.
Our back end is unix.I am using SAS EG.The CSV files will be created on our Server.
Is there any better way to do this.
ods csv file="&path.&name..csv";
proc print data=&dataset noobs label;
run;
The time was probably caused by using proc print which may have been attempting lots of formatting in the background.
Try using
proc export data=&dataset
outfile="&path.&name..csv"
dbms=CSV
Replace;
run;
if you want variable labels instead of names as column headers add Label to the proc statement.
so how does this code split the set into 6 csv files? I dont understand. Can you please explain?
EDIT: Oh i see.Your code actually works for each of the splitted sets
A simple data step can create a CSV file. Only tricky part is adding the header row.
proc contents data=&dataset noprint out=contents ; run;
proc sql noprint ;
select quote(trim(coalesce(label,name)))
into :labels separated by " ',' "
from contents
order by varnum
;
quit;
data _null_;
file "&path.&name..csv" dsd lrecl=100000 ;
set &dataset ;
if _n_=1 then put &labels ;
put (_all_) (:) ;
run;
In my experience, using ODS for writing very large files is very resource-intensive. The techniques mentioned above are much more efficient.
Hey Guys,
I will try the above methods.Is splitting the file and creating them is correct process or is there any other process?
The only reason I can see for splitting your files is to stay within the current Excel limit of 1 million rows. You can use the FIRSTOBS = and OBS = dataset options to do this:
data split1;
set large (firstobs = 1 obs = 1000000);
run;
You can create multiple files in one pass of the data by using the FILEVAR option on the FILE statment.
defines a variable whose change in value causes the FILE statement to close the current output file and open a new one the next time the FILE statement executes. The next PUT statement that executes writes to the new file that is specified as the value of the FILEVAR= variable.
For example try this program.
%let dataset=sashelp.class ;
%let n=10 ;
%let prefix=~/temp/split ;
proc contents data=&dataset noprint out=contents ; run;
proc sql noprint ;
select varnum
, name
, quote(trim(coalesce(label,name)))
into names
, names
, :labels separated by " ',' "
from contents
order by varnum
;
quit;
data _null_;
set &dataset ;
_group = 1 + int((_n_-1)/&n) ;
_file = cats("&prefix",_group,'.csv');
file csvfile filevar=_file dsd lrecl=100000 ;
if mod(_n_,&n)=1 then put &labels ;
put (&names) (:) ;
run;
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.