BookmarkSubscribeRSS Feed
TanQF
Fluorite | Level 6

Dear Sir,

 

I have tried to retrieve same data from an identical table in MySQL and Sas Enterprise guide which is using sas 9.4 TS1M3 in windows 10 using intel 4 core and 12G RAM.

 

MySQL return result in 44 sec but Sas Enterprise guide returns in 2 min 41 sec. 3 times slower.

 

What memory allocation do I need to set to improve the Sas speed?

 

Thank you

QF

10 REPLIES 10
SASKiwi
PROC Star

It is unlikely that memory is the cause of your performance difference. Firstly how are you querying MySQL directly - what client tool are you using? Secondly how are you querying MySQL with SAS EG? Do you have an identical query you are running in both?

TanQF
Fluorite | Level 6

Dear Kiwi,

 

I have the same dataset stored in mysql database(xx1) and as xx2.sas7bdat in a windows directory.

 

I use MySql workbench: select * from XX1 where columnA = YY and Date = 1234.

 

For Sas enterprise guide: proc print data= XX2 (where=(columnA = YY and Date = 1234))

 

So i record the timing it returns the value.

 

I believe there are some configuration I can modify to allocate resources to Sas. Maybe memsize or ?

 

Thanks

SASKiwi
PROC Star

You are not comparing apples with apples. You are comparing an SQL query on a MySQL table with a PROC PRINT of the same table converted to a SAS dataset. Try an SQL query in SAS. How does that perform?

proc sql;
  select * from XX2 where columnA = YY and Date = 1234;
quit;
TanQF
Fluorite | Level 6

Dear Kiwi,

 

in MySQL: 38 sec

In Sas: Old method: using proc print : 2min 35 sec

In Sas: using your suggestion: 2 min 33 sec

 

Is there other way to speed up to performance? I know there are some default configuration for installing Sas in windows. I once set the memsize to max to speed up the transfer of data from sas to mysql.

 

Thank you

 

LinusH
Tourmaline | Level 20
First verify that the SAS/ACCESS engine transforms your query the way you expect :
Options msglevel = I sastrace = ',,,d' SASTRACELOC = saslog nostsuffix;

Is the MySQL database on the same server as your EG server (local or remote workspace)? If not, investigate any network constraints.

There are some libname options thatcan speed up query performance, like readbuf. Check the online doc. Also, there are lots of papers out there regarding optimize SAS/ACCESS connections.
Data never sleeps
TanQF
Fluorite | Level 6

Hi Linus,

 

MySQL database and Enterprise Guide Sas are both on my desktop, same local windows machine. 

 

It seems that there are many configurations to be changed and none of the website gives a complete picture. I have a doubt, how come we need to specific many details configurations. Usually these are done automatically? There are only CPUs and RAMs to be used to improve efficency. Once we limit the cores and rams to a software. The rest of the performance are scaled according and done automatically.

 

Maybe there are very drastic configurations that I missed out?

 

I attached the log for the query.

 

NOTE: Writing TAGSETS.SASREPORT13(EGSR) Body file: EGSR
22
23 GOPTIONS ACCESSIBLE;
24 proc sql;
25 select * from XXX where id='XXX' and date=XXX;
INFO: Data file XXX.DATA is in a format that is native to another host, or the file encoding does not match the
session encoding. Cross Environment Data Access will be used, which might require additional CPU resources and might reduce
performance.
26 quit;
NOTE: PROCEDURE SQL used (Total process time):
real time 2:36.57
cpu time 2:15.14

27
28 Options msglevel = I sastrace = ',,,d' SASTRACELOC = saslog nostsuffix;
29
30 GOPTIONS NOACCESSIBLE;
31 %LET _CLIENTTASKLABEL=;
32 %LET _CLIENTPROCESSFLOWNAME=;
33 %LET _CLIENTPROJECTPATH=;
34 %LET _CLIENTPROJECTNAME=;
35 %LET _SASPROGRAMFILE=;
36
37 ;*';*";*/;quit;run;
38 ODS _ALL_ CLOSE;
39
40
41 QUIT; RUN;
42

mkeintz
PROC Star

Your intent is to run the PROC SQL against a SAS dataset, right?  If so, isn't the answer to your question in the INFO note from your log fragment (italicized and bolded below)? 

 

NOTE: Writing TAGSETS.SASREPORT13(EGSR) Body file: EGSR
24 proc sql;
25 select * from XXX where id='XXX' and date=XXX;
INFO: Data file XXX.DATA is in a format that is native to another host, or the file encoding does not match the
session encoding. Cross Environment Data Access will be used, which might require additional CPU resources and might reduce performance.
26 quit;
NOTE: PROCEDURE SQL used (Total process time):
real time 2:36.57
cpu time 2:15.14

27


The note is telling you that dataset XXX may be a sas dataset, but in a format that is native to another host, (or the file encoding does not match encoding in your SAS session).  We get that for instance when a sas dataset created in a unix environment is downloaded unchanged and then accessed using Windows SAS.  Make a native copy of SAS dataset XXX and rerun the PROC SQL.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
LinusH
Tourmaline | Level 20
1. The options statement must be issued prior to the query.
2. The query doesn't seem to involve your MySQL data.

CPU settings is "automatic" by default.
Memory options usually isn't. Check the documentation for memsize and sortsize options.
If you have a output defined in your query you need to consider write performance as well.
To rule that out, try to do a data _null_ step that reads your MySQL table.
Data never sleeps
TanQF
Fluorite | Level 6

Hi Linus,

 

I have the same data but residing in MySQL and as sas7bdat in local drive separately. So I try to print the result using MySQL and Sas EG and discovered the great difference in performance. Both are in the same local desktop machine. 

 

I already changed memsize Max in my sasv9.cfg. 

 

What could go wrong?

ChrisNZ
Tourmaline | Level 20

There are many ways to speed up such an extract.

 

Storage format (SPDE engine vs V9 engine, compression type) has a huge influence.

For example, if your query is I/O-bound and not CPU bound, an SPDE library with binary compression could speed things up dramatically.

 

Other issues since you have a where clause is the table sort order and the indexes.

 

Another critical factor is the way you retrieve your data in terms of memory utilisation. The data set's buffer size, the number of buffers used, whether or not Windows's Direct-IO is used all have a huge effect. To illustrate this point, here is a benchmark example taken from

 

https://www.amazon.com/High-Performance-SAS-Coding-Christian-Graffeuille/dp/1512397490

 

 

Table 5.5: Run times for various values of BUFSIZE, BUFNO and SGIO for a 10,000 MB data set

10,000,000 kB

 

BUFNO

1

5

25

100

500

BUFSIZE

SGIO

195.45

.

.

.

.

0

no

4k

no

235.25

236.25

235.56

239.78

235.39

yes

1177.55

572.32

157.28

93.60

93.96

8k

no

163.77

162.05

164.22

164.05

161.34

yes

542.85

268.86

95.45

81.07

80.84

16k

no

138.03

140.26

136.16

137.49

136.11

yes

265.46

142.55

75.28

74.61

81.70

32k

no

143.70

148.85

137.96

147.67

143.98

yes

140.56

93.84

73.35

73.33

75.48

64k

no

180.47

171.39

166.63

151.69

173.16

yes

88.62

70.14

71.62

72.00

.

128k

no

235.25

236.25

235.56

239.78

235.39

yes

70.72

71.30

71.34

70.33

.

 

 The default run time is 195 seconds, which can be reduced to 70s by simply changing the memory usage settings.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 10 replies
  • 4969 views
  • 1 like
  • 5 in conversation