09-17-2014 04:06 AM
I have performance issue with my report code.
My code takes around 69 minutes to complete the report. One “Proc report” procedure takes 66 minutes to complete the step.
The data has 104 columns and 168,745records, the physical size of my data is 450 MBs.
Below is my proc report code and fulltimer options log:
Proc report Data=SMALL_MIS_FINAL center style(header)=[background=cx113388 color=cxffffff]
style(summary)=[font_style=roman font_weight=bold] missing split= '~' spanrows;
Column column1 to column_103 row_count;
Define row_count / "Row Count" computed noprint;
If int(row_count/2)=row_count/2 Then Do;
CALL DEFINE (_row_,'STYLE','STYLE=[BACKGROUNDCOLOR=CXEFEFEF]');
NOTE: There were 168745 observations read from the data set WORK.SMALL_MIS_FINAL.
NOTE: At least one W.D format was too small for the number to be printed. The decimal may be shifted by the "BEST" format.
NOTE: PROCEDURE REPORT used (Total process time):
real time 1:06:14.57
user cpu time 1:06:22.96
system cpu time 7.24 seconds
OS Memory 903724.00k
Timestamp 09/16/2014 06:10:33 PM
Page Faults 0
Page Reclaims 68733
Page Swaps 0
Voluntary Context Switches 9113
Involuntary Context Switches 13645
Block Input Operations 976
Block Output Operations 1554480
Please let me know how to reduce cpu/ real time in this step.
Thanking you in advance.
09-17-2014 04:27 AM
What ODS destination are you using? Do you send to a file?
I am not sure what you are attempting to ouput without test output as you appear to be computing OBS each record which is a big straing on resources.
Also the in statement with the function calls may take a while as has to happen each time.
I would suggest you look at what you want out and sort you data out first, e.g. put calculations in a datastep prior to the report procedure, and just have a simple proc report. If you provide test data and desired output I could suggest further.
Edit: Maybe something like:
if mod(_n_,2)=0 then skip=1;
proc report data=want;
column make model type skip=skipn row_count;
define row_count / "row count" computed noprint;
define skipn / analysis noprint;
if skipn=1 then call define (_row_,'style','style=[backgroundcolor=cxefefef]');
09-17-2014 10:32 AM
Thank You for your thoughts on this.
We are using Excel as ODS destination. we create excel file as output file (infact through XML file).
I have removed the compute statment in my proc report code. but still it takes more than 60 minutes to create excel file for 168,000 records.
Please let me know if you have any other questions/ suggestion on this.
Again Thank you for your valuable time!.
09-17-2014 10:44 AM
Well, I wouldn't recommend exporting that much data to Excel. Firstly only the most recent releases of Excel will even handle that kind of data, and you won't be able to use it in any meaningful way in Excel. As for creating the file, well its possible that it does take that long. Remember the ods system is running through a tagset conditionally outputting a lot more text per "cell" as XML is very verbose. So your 168000 records, depending on column count could run into hundreds of mb of XML.
Really, go back and asses what you are trying to do with the data, if its transfer of data from one person to another use CSV not XML. If someone wants to review 168,000 rows * x number of columns - well best of luck. If its a datalisting to go in a report, maybe use PDF.
09-17-2014 11:08 AM
Hi, you said that you are using "Excel" as the ODS destination, but you do NOT show the ODS code. Are you using ODS CSV? ODS HTML? ODS MSOFFICE2K? ODS TAGSETS.EXCELXP? or the experimental ODS EXCEL in SAS 9.4? When you use many of the ODS methods, you are NOT creating a true, binary Excel file, you are merely creating an ASCII text file that Excel knows how to open and render.
I agree with DATA _NULL_ that you are outputting a lot of rows and it seems like a lot of columns to your output with PROC REPORT and, so it is hard to debug performance issues over the Forums without the complete code and the some data to test. Given the size of your data and the incomplete information about your ODS destination, other factors that could be complicating your running time would be the operating system you're using and the version of SAS you're using, as well as the version of Office you're using to render the file.
To me, all of this points to opening a track with Tech Support, where they can look at ALL of your code (including ODS statements), a sample of your real data, and they can collect the necessary information to really do debugging and see what is causing the file to take so long to be created.
09-17-2014 11:27 AM
Hi Cynthia / Data _null_ /RW9,
Thank you for your input on this.
I use ODS TAGSETS.EXCELXP Example "ODS TAGSETS.EXCELXP options(sheet_name='Parameter' absolute_column_width='20,40' width_fudge='1.0' autofit_height ='yes' embedded_titles='yes' orientation='landscape' sheet_interval='PROC' )", using SAS v9.3 in Linux machine.
The excel/xml file output size is around 300 MBs. The number of column is 103 and number of row is 168,000.
My complete report code takes 69 minutes, out of this, only proc report procedure takes 66 minutes to write it in excel/xml file. the remaining part (Joining different tables, formating etc.) takes only 3 minutes.I will check with SAS tech support. let me k now if you guys have any other solutions to reduce proc report run time.
09-17-2014 11:29 AM
You also have not given any specs on the system that you are running on. Obviously
running on a z/80 is not a fast as running on a z/OS system.
I create a 550 Excel spreadsheets from about 980K records and it still takes me about
30 minutes on a 8 way server.