Dear Sir,
I have a large sas dataset which I want to upload to MySql. I have verified that I can write a small sas dataset to it. However, when I try to write the big dataset to it, it takes forever.
So I try to use bulkload but encountered the following error. I have tried a few ways to use bulkload but no success. I have increased my memsize to max. How much more memory do I need?
Hope you can help or advice me
Tan
NOTE: Writing TAGSETS.SASREPORT13(EGSR) Body file: EGSR
22
23 GOPTIONS ACCESSIBLE;
24 proc append data=dp base=mysqllib.dp(bulkload=yes);
25 run;
NOTE: Appending WORK.DP to MYSQLLIB.dp.
WARNING: Variable barrid has different lengths on BASE and DATA files (BASE 10 DATA 8).
MYSQL: Can't create bulkload temporary file: bl_dp_5
ERROR: The SAS System stopped processing this step because of insufficient memory.
NOTE: There were 2 observations read from the data set WORK.DP.
NOTE: There were 2 observations read from the data set MYSQLLIB.dp.
26
27 data mysqllib.dp5 (BULKLOAD=yes BL_DEFAULT_DIR="c:\temp\");
28 set dp;
29 run;
NOTE: SAS variable labels, formats, and lengths are not written to DBMS tables.
MYSQL: Can't create bulkload temporary file: c:/temp//bl_dp5_6
ERROR: The SAS System stopped processing this step because of insufficient memory.
NOTE: There were 2 observations read from the data set WORK.DP.
WARNING: The data set MYSQLLIB.dp5 may be incomplete. When this step was stopped there were 1 observations and 4 variables.
30
31 data mysqllib.dp6 (BULKLOAD=yes);
32 set dp;
33 run;
NOTE: SAS variable labels, formats, and lengths are not written to DBMS tables.
MYSQL: Can't create bulkload temporary file: bl_dp6_7
2 The SAS System 18:29 Friday, June 3, 2016
ERROR: The SAS System stopped processing this step because of insufficient memory.
NOTE: There were 2 observations read from the data set WORK.DP.
WARNING: The data set MYSQLLIB.dp6 may be incomplete. When this step was stopped there were 1 observations and 4 variables.
34
35 proc options group=memory; run;
SAS (r) Proprietary Software Release 9.4 TS1M3
Group=MEMORY
SORTSIZE=1073741824
Specifies the amount of memory that is available to the SORT procedure.
SUMSIZE=0 Specifies a limit on the amount of memory that is available for data summarization procedures when class
variables are active.
MAXMEMQUERY=0 Specifies the maximum amount of memory that is allocated for procedures.
MEMBLKSZ=16777216 Specifies the memory block size for Windows memory-based libraries.
MEMMAXSZ=2147483648
Specifies the maximum amount of memory to allocate for using memory-based libraries.
LOADMEMSIZE=0 Specifies a suggested amount of memory that is needed for executable programs loaded by SAS.
MEMSIZE=9854127360
Specifies the limit on the amount of virtual memory that can be used during a SAS session.
REALMEMSIZE=0 Specifies the amount of real memory SAS can expect to allocate.
NOTE: PROCEDURE OPTIONS used (Total process time):
I suggest you try without the BULKLOAD option to start with and experiment with the INSERTBUFF and DBCOMMIT options:
Also try loading to a new temporary first as this should be the least problematic.
Hi Sir,
1. Could you teach me how to load to a new temporary first?
I have used the following:
proc append data=dp base=mysqllib.dp (dbcommit=1000 insertbuff=100000);
run;
It is very fast.
proc append data=dp base=mysqllib.dp(bulkload=yes BL_METHOD=CLILOAD);
run;
--> this is very slow but bulkload is working.
2. How do I check the data in mysql is same as Sas dataset?
Thank you
Dear Linus,
I am building my own database. Hence there is no IT department to discuss MySQL with.
I am scratching my head with the previous comments about using temporary files. Certainly his suggestions on DBcommit and buffer helps overcome bulkload.
Thanks for the suggestions on using samples to check.
First thing I will preface my comment with: I am not a MySql admin or use the program so specific details (options and parameters) I do not have on hand.
A friend of my was recently working with SAS connected to MySql and they had some very poor performance issues with time of program running. After an amount of sleuthing they found out that there were apparently 3 data engines involved and the "default" installation picked one that optimizes for some other type of operation. When they switched to use a different MySql engine the process times went from 10 hours to 30 minutes for some SAS related operations.
I can't say which to look for or where to find it, but it may be default MySql setting that is making things run slow.
Hi Linus
1. Sas dataset has a limitation. If a dataset is more than 5G, it becomes very slow in reading and writing to it.
2. I would be using other programming languages to process the data because I am looking at their toolboxes. Hence, ideally, I would need a database to store the data and retrieve it from sas dataset.
Hi Ballard
I will take note of engines when it becomes too slow.
Just that I read bulkload is very good but I cant use it. Its disappointing
Thanks for your great advices
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.