Good morning. A proc append in a SQL data library takes a very long time to append data. However, the same data on a basic sas library takes a few seconds. I report the 2 example logs.
I'm asking for suggestions on which method is better to hang data on a Sql Server data library from sas:
In my experience, when appending SAS data to SQL Server tables you need to experiment with the INSERTBUFF data connection option. Changing this option can make a huge difference in table load times as the default value is not optimal. Try something like this, then experiment by increasing or reducing the value to see the impact:
LIBNAME EDWHDSST SQLSVR Datasrc=EDWHDSST SCHEMA=DBO AUTHDOMAIN="SQLSRVAuth" insertbuff = 10000;
Also to connect to SQL Server in PROC SQL you can use your LIBNAME:
proc sql;
connect using EDWHDSST;
create table MyTable as
select * from connection to EDWHDSST
(SELECT * FROM TABSQL) ;
disconnect from EDWHDSST;
quit;
I do not know much about the environment you are working in. I assume that proc append just take a "few" observations and gives it to SQLServer who appends them. After doing so maybe the SQLServer starts re-creating an index or so, waits for completion and after that accepts the next package.
I would try to get the data to be appended in the SQLServers database and let the SQLServer doing the append. Something like this :
proc upload data = backup.WXGC21K out = LIB_SQL.WXGC21K;run;
proc sql;
connect to ODBC (dsn="MySQLServerConnectString" schema=dbo);
execute (INSERT INTO TAB1
SELECT *
FROM dbo.WXGC21K) by ODBC;
disconnect from ODBC;
quit;
proc sql;
connect to ODBC (dsn="MySQLServerConnectString" schema=dbo);
execute (Drop Table WXGC21K) by ODBC;
disconnect from ODBC;
quit;;
or if you have licenced SAS/Interface for Access to SQLServer you may try this:
proc upload data = backup.WXGC21K out = LIB_SQL.WXGC21K;run;
proc sql;
connect to sqlsvr (user="Domain\UserID" password="MyPassword" datasrc="MySQLServerConnectString" schema=dbo);;
execute (INSERT INTO TAB1
SELECT *
FROM WXGC21K) by sqlsvr;
disconnect from sqlsvr;
quit;
proc sql;
connect to sqlsvr (user="Domain\UserID" password="MyPassword" datasrc="MySQLServerConnectString" schema=dbo);
execute (Drop Table WXGC21K) by sqlsvr;
disconnect from sqlsvr;
quit;;
Thank you. But I can't connect from proc sql.
LIBNAME EDWHDSST SQLSVR Datasrc=EDWHDSST SCHEMA=DBO AUTHDOMAIN="SQLSRVAuth" ;
NOTE: Libref EDWHDSST was successfully assigned as follows:
Engine: SQLSVR
Physical Name: EDWHDSST
But this proc sql:
proc sql;
connect to SQLSVR as mydb (Datasrc="EDWHDSST" SCHEMA="DBO" AUTHDOMAIN="SQLSRVAuth" );
execute (SELECT * FROM TABSQL) by sqlsvr;
disconnect from sqlsvr;
returns the following error:
27 proc sql;
28 connect to SQLSVR as mydb (Datasrc="EDWHDSST" SCHEMA="DBO" AUTHDOMAIN="SQLSRVAuth" );
NOTE: Credential obtained from SAS metadata server.
ERROR: Invalid option name SCHEMA.
NOTE: PROC SQL set option NOEXEC and will continue to check the syntax of statements.
or using username and password:
proc sql;
connect to SQLSVR as mydb (Datasrc="DBO" user=user password=xxxx);
execute (SELECT * FROM TABSQL) by sqlsvr;
disconnect from sqlsvr;
quit;
ERROR: CLI error trying to establish connection: [Microsoft][ODBC Driver Manager] Data source name not found and no default driver
specified
The schema is needed in LIBNAME statement. For pass through the tablespace follows the FROM statement and then a period + table name:
proc sql;
connect to SQLSVR as mydb (Datasrc="EDWHDSST" AUTHDOMAIN="SQLSRVAuth" );
execute (SELECT * FROM dbo.TABSQL) by sqlsvr;
disconnect from sqlsvr;
Okay. Thank you.
What could I do instead of the proc upload?
It seems that the possibility is not installed because I get the following error:
ERROR: Procedure UPLOAD not found.
You may use a datastep or proc datasets / proc copy instead:
data LIB_SQL.WXGC21K;
set backup.WXGC21K ;
run;
3 hours of real time vs. 7 minutes of CPU time points to bandwidth issues between the SAS and DB server.
And 7 minutes vs. 15 seconds also means that the whole setup of the connection is inefficient.
Get in touch with your SAS and DB admins, and study the documentation for your specific database connection (ODBC).
Also see if you have ACCESS to SQL server licensed, it provides a more direct path (SAS - DB client - server instead of SAS - ODBC OS component - ODBC driver - server).
Thank you.
But assuming PROC UPLOAD worked, could it greatly improve data transfer times from sas to sql? ie i mean doing a dry proc upload of a table from sas to sql
Another question: can there not be a "SQL Pass-Through" code that makes data transfer efficient?
PROC UPLOAD is part of SAS/CONNECT, which is used to communicate between two SAS processes. It is not used to connect to a database server.
@mariopellegrini wrote:
Good morning. A proc append in a SQL data library takes a very long time to append data. However, the same data on a basic sas library takes a few seconds. I report the 2 example logs.
I'm asking for suggestions on which method is better to hang data on a Sql Server data library from sas:
Better in what sense? If the criterion for "better" is the amount of time it takes, you have already found the answer.
In my experience, when appending SAS data to SQL Server tables you need to experiment with the INSERTBUFF data connection option. Changing this option can make a huge difference in table load times as the default value is not optimal. Try something like this, then experiment by increasing or reducing the value to see the impact:
LIBNAME EDWHDSST SQLSVR Datasrc=EDWHDSST SCHEMA=DBO AUTHDOMAIN="SQLSRVAuth" insertbuff = 10000;
Also to connect to SQL Server in PROC SQL you can use your LIBNAME:
proc sql;
connect using EDWHDSST;
create table MyTable as
select * from connection to EDWHDSST
(SELECT * FROM TABSQL) ;
disconnect from EDWHDSST;
quit;
Thank you very much! Thanks to this option the elaboration went to 3'39"!
Is 10,000 usually what works best? Do you still advise me to try other values? Changing by how much?
@mariopellegrini - I don't think there is an ideal number for everyone. I usually try a few different values, like 1,000, 2,000, 5,000, 10,000, 15,000, 20,000 to see which gives the best performance. We currently use 10,000 for most of our loads and that works well. There is no point using a larger number than the number of rows you are appending though.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.