I'm creating a CSV output file using ODS CSV and it takes a surprisingly long time to create. It's large, but while it takes less than 17 minutes to read the data from disk and create a subset, it takes 94 minutes to write it out. Before I start investigating whether there are any hardware-related issues that could account for this, is there any SAS-related issues that I should be aware of? Is there an alternative approach that would accomplish the same thing? I chose Proc Print because I need the variables a specific order and I can easily do that. Any other solution would need to be able to the same thing.
My log is below. Thanks for your help.
Steve
109 rsubmit ;
NOTE: Remote submit to TASK1 commencing.
63 libname spj1 "\\cdc\csp_private\m103\spj1" ;
NOTE: Libref SPJ1 was successfully assigned as follows:
Engine: V9
Physical Name: \\cdc\csp_private\m103\spj1
64 options obs=max ;
65 Data spj1.VaxData ;
66 set &Grantee..Vaccination
67 (Keep= CDCID VaxGroup Vaxdate MVX CVX CPT TradeName LotNum Route Site RecordType
68 VaxComplete VaxRefusal IISValidity ProvID V_Eligibility VaxEntDate ProvChar) ;
69 where VaxGroup = 6 ;
70 retain VaxRecordNum 0 ;
71 VaxRecordNum + 100 ;
72 MatchNum = 1 ;
73 Length _PatientID $15 _VaxDate $8 _VaxEntDate $8 LotNum1-LotNum3 $1 ;
74 _PatientID = scan(CDCID,3,'-') ;
75 _VaxDate = put(VaxDate,yymmddN8.) ;
76 _VaxEntDate = put(VaxEntDate,yymmddn8.) ;
77 call missing(lotNum1, LotNum2, LotNum3) ;
78 run ;
NOTE: There were 4850003 observations read from the data set BAA.VACCINATION.
WHERE VaxGroup=6;
NOTE: The data set SPJ1.VAXDATA has 4850003 observations and 26 variables.
NOTE: DATA statement used (Total process time):
real time 16:56.65
cpu time 3:02.39
81 ods _all_ close ;
82 ods tagsets.csv file="\\cdc\csp_private\m103\spj1\VaxData.CSV" ;
NOTE: Writing TAGSETS.CSV Body file: \\cdc\csp_private\m103\spj1\VaxData.CSV
83
84 Proc Print data = spj1.VaxData noobs ;
85 Var _PatientID VaxRecordNum MatchNum _Vaxdate MVX CVX CPT TradeName LotNum1-LotNum3 Route
85 ! Site RecordType VaxComplete
86 VaxRefusal IISValidity ProvID V_Eligibility _VaxEntDate ProvChar ;
87 run ;
NOTE: There were 4850003 observations read from the data set SPJ1.VAXDATA.
NOTE: PROCEDURE PRINT used (Total process time):
real time 1:34:00.66
cpu time 1:32:46.20
88
89 ods _all_ close ;
NOTE: Remote submit to TASK1 complete.
Use a data _null_ step to manually write the CSV file.
There's an answer on here in the past two weeks related to this question.
Reeza's suggestion is a good one and probably the fastest. However if you want to stick with ODS then there is a tagset called SIMPLECSV that may be a bit faster than the standard CSV tagset. The difference between the standard CSV tagset and SIMPLECSV is that SIMPLECSV is treats currency and date values as strings instead of numbers and may quote other kinds of strings that are actually numbers. YMMV.
ods tagsets.simplecsv file="whatever";
As Reeza and Tim said it's better to use data _null_ step to create csv files . I am also using the same to create. Here is my code:
data _null_;
set result_&CIRC ;
file "/sasdata/test.csv"
dlm="^" lrecl=20000;
length head1 $2000.;
head1 = "&variable_names";
if _n_=1 then do;
put head1
/*head2 head3 head4 head5*/
;
end;
put @1
&variable_names;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.