Dear Sir,
I have tried to retrieve same data from an identical table in MySQL and Sas Enterprise guide which is using sas 9.4 TS1M3 in windows 10 using intel 4 core and 12G RAM.
MySQL return result in 44 sec but Sas Enterprise guide returns in 2 min 41 sec. 3 times slower.
What memory allocation do I need to set to improve the Sas speed?
Thank you
QF
It is unlikely that memory is the cause of your performance difference. Firstly how are you querying MySQL directly - what client tool are you using? Secondly how are you querying MySQL with SAS EG? Do you have an identical query you are running in both?
Dear Kiwi,
I have the same dataset stored in mysql database(xx1) and as xx2.sas7bdat in a windows directory.
I use MySql workbench: select * from XX1 where columnA = YY and Date = 1234.
For Sas enterprise guide: proc print data= XX2 (where=(columnA = YY and Date = 1234))
So i record the timing it returns the value.
I believe there are some configuration I can modify to allocate resources to Sas. Maybe memsize or ?
Thanks
You are not comparing apples with apples. You are comparing an SQL query on a MySQL table with a PROC PRINT of the same table converted to a SAS dataset. Try an SQL query in SAS. How does that perform?
proc sql;
select * from XX2 where columnA = YY and Date = 1234;
quit;
Dear Kiwi,
in MySQL: 38 sec
In Sas: Old method: using proc print : 2min 35 sec
In Sas: using your suggestion: 2 min 33 sec
Is there other way to speed up to performance? I know there are some default configuration for installing Sas in windows. I once set the memsize to max to speed up the transfer of data from sas to mysql.
Thank you
Hi Linus,
MySQL database and Enterprise Guide Sas are both on my desktop, same local windows machine.
It seems that there are many configurations to be changed and none of the website gives a complete picture. I have a doubt, how come we need to specific many details configurations. Usually these are done automatically? There are only CPUs and RAMs to be used to improve efficency. Once we limit the cores and rams to a software. The rest of the performance are scaled according and done automatically.
Maybe there are very drastic configurations that I missed out?
I attached the log for the query.
NOTE: Writing TAGSETS.SASREPORT13(EGSR) Body file: EGSR
22
23 GOPTIONS ACCESSIBLE;
24 proc sql;
25 select * from XXX where id='XXX' and date=XXX;
INFO: Data file XXX.DATA is in a format that is native to another host, or the file encoding does not match the
session encoding. Cross Environment Data Access will be used, which might require additional CPU resources and might reduce
performance.
26 quit;
NOTE: PROCEDURE SQL used (Total process time):
real time 2:36.57
cpu time 2:15.14
27
28 Options msglevel = I sastrace = ',,,d' SASTRACELOC = saslog nostsuffix;
29
30 GOPTIONS NOACCESSIBLE;
31 %LET _CLIENTTASKLABEL=;
32 %LET _CLIENTPROCESSFLOWNAME=;
33 %LET _CLIENTPROJECTPATH=;
34 %LET _CLIENTPROJECTNAME=;
35 %LET _SASPROGRAMFILE=;
36
37 ;*';*";*/;quit;run;
38 ODS _ALL_ CLOSE;
39
40
41 QUIT; RUN;
42
Your intent is to run the PROC SQL against a SAS dataset, right? If so, isn't the answer to your question in the INFO note from your log fragment (italicized and bolded below)?
NOTE: Writing TAGSETS.SASREPORT13(EGSR) Body file: EGSR
24 proc sql;
25 select * from XXX where id='XXX' and date=XXX;
INFO: Data file XXX.DATA is in a format that is native to another host, or the file encoding does not match the
session encoding. Cross Environment Data Access will be used, which might require additional CPU resources and might reduce performance.
26 quit;
NOTE: PROCEDURE SQL used (Total process time):
real time 2:36.57
cpu time 2:15.1427
The note is telling you that dataset XXX may be a sas dataset, but in a format that is native to another host, (or the file encoding does not match encoding in your SAS session). We get that for instance when a sas dataset created in a unix environment is downloaded unchanged and then accessed using Windows SAS. Make a native copy of SAS dataset XXX and rerun the PROC SQL.
Hi Linus,
I have the same data but residing in MySQL and as sas7bdat in local drive separately. So I try to print the result using MySQL and Sas EG and discovered the great difference in performance. Both are in the same local desktop machine.
I already changed memsize Max in my sasv9.cfg.
What could go wrong?
There are many ways to speed up such an extract.
Storage format (SPDE engine vs V9 engine, compression type) has a huge influence.
For example, if your query is I/O-bound and not CPU bound, an SPDE library with binary compression could speed things up dramatically.
Other issues since you have a where clause is the table sort order and the indexes.
Another critical factor is the way you retrieve your data in terms of memory utilisation. The data set's buffer size, the number of buffers used, whether or not Windows's Direct-IO is used all have a huge effect. To illustrate this point, here is a benchmark example taken from
https://www.amazon.com/High-Performance-SAS-Coding-Christian-Graffeuille/dp/1512397490
Table 5.5: Run times for various values of BUFSIZE, BUFNO and SGIO for a 10,000 MB data set
10,000,000 kB
|
BUFNO |
|||||
1 |
5 |
25 |
100 |
500 |
||
BUFSIZE |
SGIO |
195.45 |
. |
. |
. |
. |
0 |
no |
|||||
4k |
no |
235.25 |
236.25 |
235.56 |
239.78 |
235.39 |
yes |
1177.55 |
572.32 |
157.28 |
93.60 |
93.96 |
|
8k |
no |
163.77 |
162.05 |
164.22 |
164.05 |
161.34 |
yes |
542.85 |
268.86 |
95.45 |
81.07 |
80.84 |
|
16k |
no |
138.03 |
140.26 |
136.16 |
137.49 |
136.11 |
yes |
265.46 |
142.55 |
75.28 |
74.61 |
81.70 |
|
32k |
no |
143.70 |
148.85 |
137.96 |
147.67 |
143.98 |
yes |
140.56 |
93.84 |
73.35 |
73.33 |
75.48 |
|
64k |
no |
180.47 |
171.39 |
166.63 |
151.69 |
173.16 |
yes |
88.62 |
70.14 |
71.62 |
72.00 |
. |
|
128k |
no |
235.25 |
236.25 |
235.56 |
239.78 |
235.39 |
yes |
70.72 |
71.30 |
71.34 |
70.33 |
. |
The default run time is 195 seconds, which can be reduced to 70s by simply changing the memory usage settings.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 16. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.