<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Upload to Databricks via ODBC is very slow in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Upload-to-Databricks-via-ODBC-is-very-slow/m-p/974539#M377920</link>
    <description>&lt;P&gt;In my experience, tuning INSERTBUFF works for SQL Server, but that does not appear to be the case with Databricks. Have you talked to the Databricks DBAs to see if they have any ideas about the slowness? It might be worth trying bulk loading to see if that helps.&lt;/P&gt;</description>
    <pubDate>Mon, 08 Sep 2025 21:04:56 GMT</pubDate>
    <dc:creator>SASKiwi</dc:creator>
    <dc:date>2025-09-08T21:04:56Z</dc:date>
    <item>
      <title>Upload to Databricks via ODBC is very slow</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Upload-to-Databricks-via-ODBC-is-very-slow/m-p/974231#M377844</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;We have been experimenting with uploading results generated with SAS to a Databricks SQL warehouse. The code looks as follows:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=""&gt;libname dbx odbc prompt="Driver={Simba Spark ODBC Driver};
Host=foo.cloud.databricks.com;
Port=443;
HTTPPath=/sql/1.0/warehouses/bar;
SSL=1;
ThriftTransport=2;
AuthMech=3;
UID=token;
PWD=baz;
Catalog=foofoo;
Schema=barbar;
DefaultStringColumnLength=32767"
dbcommit=10000
insertbuff=10000
readbuff=1000
dbcreate_table_opts="TBLPROPERTIES('delta.columnMapping.mode' = 'name', 'delta.checkpoint.writeStatsAsStruct' = 'false', 'delta.autoOptimize.optimizeWrite' = 'true')"
preserve_col_names=yes;

proc sql;
create table dbx.test as
	select * from test;
quit;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;Unfortunately, the performance leaves a lot to be desired. As an example, a dataset with 122 variables and 127 thousand observations takes between five and six minutes to upload, whereas it takes less than 20 seconds to upload to a Microsoft SQL Server DB (also via ODBC).&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Is this expected? According to the documentation, Databricks can also be accessed via the SAS/ACCESS modules for JDBC and Spark, but we unfortunately do not have them licensed. Are there any options that could improve ODBC upload performance? I tried increasing insertbuff and dbcommit further, but I get the following error when going above 12k:&lt;/P&gt;&lt;PRE&gt;ERROR: CLI execute error: [Simba][Hardy] (130) An error occurred while an INSERT statement which causes the driver to reconnect to the server.&lt;/PRE&gt;&lt;P&gt;Thanks for your help in advance!&lt;/P&gt;</description>
      <pubDate>Fri, 05 Sep 2025 11:46:27 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Upload-to-Databricks-via-ODBC-is-very-slow/m-p/974231#M377844</guid>
      <dc:creator>js5</dc:creator>
      <dc:date>2025-09-05T11:46:27Z</dc:date>
    </item>
    <item>
      <title>Re: Upload to Databricks via ODBC is very slow</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Upload-to-Databricks-via-ODBC-is-very-slow/m-p/974236#M377846</link>
      <description>&lt;P&gt;Many parameters can affect performance in a use case like this.&lt;/P&gt;
&lt;P&gt;A few questions:&lt;/P&gt;
&lt;P&gt;- Where does your comparison SQL Server reside (vs Databricks)?&lt;/P&gt;
&lt;P&gt;- Have you tried to load this data by other means into Databricks? Locally (to rule out actual load problems), remotely (compare with SAS).&lt;/P&gt;
&lt;P&gt;Activating tracing might give some hints:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;options sastrace=',,,d' sastraceloc=saslog nostsuffix msglevel=i;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 05 Sep 2025 12:39:51 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Upload-to-Databricks-via-ODBC-is-very-slow/m-p/974236#M377846</guid>
      <dc:creator>LinusH</dc:creator>
      <dc:date>2025-09-05T12:39:51Z</dc:date>
    </item>
    <item>
      <title>Re: Upload to Databricks via ODBC is very slow</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Upload-to-Databricks-via-ODBC-is-very-slow/m-p/974315#M377880</link>
      <description>&lt;P&gt;Have you tried using PROC COPY instead of PROC SQL?&lt;/P&gt;
&lt;PRE&gt;proc sql;
create table dbx.test as
	select * from test;
quit;
------------&amp;gt;
proc copy in=work out=dbx;
select test;
run;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sat, 06 Sep 2025 08:42:24 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Upload-to-Databricks-via-ODBC-is-very-slow/m-p/974315#M377880</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2025-09-06T08:42:24Z</dc:date>
    </item>
    <item>
      <title>Re: Upload to Databricks via ODBC is very slow</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Upload-to-Databricks-via-ODBC-is-very-slow/m-p/974331#M377883</link>
      <description>&lt;P&gt;What is the download performance for the same table? If both are taking a similar amount of time then I suggest that it is your network connection to Databricks that is the problem.&lt;/P&gt;</description>
      <pubDate>Sat, 06 Sep 2025 22:32:22 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Upload-to-Databricks-via-ODBC-is-very-slow/m-p/974331#M377883</guid>
      <dc:creator>SASKiwi</dc:creator>
      <dc:date>2025-09-06T22:32:22Z</dc:date>
    </item>
    <item>
      <title>Re: Upload to Databricks via ODBC is very slow</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Upload-to-Databricks-via-ODBC-is-very-slow/m-p/974375#M377894</link>
      <description>There is no meaningful difference when using proc datasets. This is likely explained by the fact that it still uses SQL in the background according to the ODBC trace.</description>
      <pubDate>Mon, 08 Sep 2025 06:21:45 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Upload-to-Databricks-via-ODBC-is-very-slow/m-p/974375#M377894</guid>
      <dc:creator>js5</dc:creator>
      <dc:date>2025-09-08T06:21:45Z</dc:date>
    </item>
    <item>
      <title>Re: Upload to Databricks via ODBC is very slow</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Upload-to-Databricks-via-ODBC-is-very-slow/m-p/974379#M377898</link>
      <description>&lt;P&gt;Download performance is not an issue. If it does not fail due to insufficient memory (I need to investigate why this happens), the entire table is downloaded in around 10 seconds.&lt;/P&gt;</description>
      <pubDate>Mon, 08 Sep 2025 06:41:47 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Upload-to-Databricks-via-ODBC-is-very-slow/m-p/974379#M377898</guid>
      <dc:creator>js5</dc:creator>
      <dc:date>2025-09-08T06:41:47Z</dc:date>
    </item>
    <item>
      <title>Re: Upload to Databricks via ODBC is very slow</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Upload-to-Databricks-via-ODBC-is-very-slow/m-p/974380#M377899</link>
      <description>&lt;P&gt;The architecture is somewhat complex due to historical reasons:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;the library I am copying from is located on an on-prem network share&lt;/LI&gt;&lt;LI&gt;SAS runs on AWS EC2&lt;/LI&gt;&lt;LI&gt;SQL Server runs on an on-prem VM&lt;/LI&gt;&lt;LI&gt;Databricks runs on AWS&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;Creating the table directly in Databricks (CTAS from the SQL Server table connected as catalog) takes two minutes. That is still much longer than creating a SQL Server table, but nevertheless two to three times faster than creating it via ODBC.&lt;/P&gt;</description>
      <pubDate>Mon, 08 Sep 2025 06:51:33 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Upload-to-Databricks-via-ODBC-is-very-slow/m-p/974380#M377899</guid>
      <dc:creator>js5</dc:creator>
      <dc:date>2025-09-08T06:51:33Z</dc:date>
    </item>
    <item>
      <title>Re: Upload to Databricks via ODBC is very slow</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Upload-to-Databricks-via-ODBC-is-very-slow/m-p/974539#M377920</link>
      <description>&lt;P&gt;In my experience, tuning INSERTBUFF works for SQL Server, but that does not appear to be the case with Databricks. Have you talked to the Databricks DBAs to see if they have any ideas about the slowness? It might be worth trying bulk loading to see if that helps.&lt;/P&gt;</description>
      <pubDate>Mon, 08 Sep 2025 21:04:56 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Upload-to-Databricks-via-ODBC-is-very-slow/m-p/974539#M377920</guid>
      <dc:creator>SASKiwi</dc:creator>
      <dc:date>2025-09-08T21:04:56Z</dc:date>
    </item>
    <item>
      <title>Re: Upload to Databricks via ODBC is very slow</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Upload-to-Databricks-via-ODBC-is-very-slow/m-p/974708#M377967</link>
      <description>&lt;P&gt;With lower insertbuff values the upload is even slower as DBX ends up creating one parquet file per row. I am following this up with our DBX contacts in parallel. Bulkload with ODBC is unfortunately only supported for SQL Server. We would need SAS/Access to Spark in order to be able to bulkload the data to DBX.&lt;/P&gt;</description>
      <pubDate>Wed, 10 Sep 2025 06:25:23 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Upload-to-Databricks-via-ODBC-is-very-slow/m-p/974708#M377967</guid>
      <dc:creator>js5</dc:creator>
      <dc:date>2025-09-10T06:25:23Z</dc:date>
    </item>
    <item>
      <title>Re: Upload to Databricks via ODBC is very slow</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Upload-to-Databricks-via-ODBC-is-very-slow/m-p/974770#M377978</link>
      <description>&lt;P&gt;Have you tried raising this issue with Tech Support? It would be worth doing so if you haven't already.&lt;/P&gt;</description>
      <pubDate>Wed, 10 Sep 2025 20:19:55 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Upload-to-Databricks-via-ODBC-is-very-slow/m-p/974770#M377978</guid>
      <dc:creator>SASKiwi</dc:creator>
      <dc:date>2025-09-10T20:19:55Z</dc:date>
    </item>
    <item>
      <title>Re: Upload to Databricks via ODBC is very slow</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Upload-to-Databricks-via-ODBC-is-very-slow/m-p/974807#M377987</link>
      <description>I have now.</description>
      <pubDate>Thu, 11 Sep 2025 12:02:45 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Upload-to-Databricks-via-ODBC-is-very-slow/m-p/974807#M377987</guid>
      <dc:creator>js5</dc:creator>
      <dc:date>2025-09-11T12:02:45Z</dc:date>
    </item>
    <item>
      <title>Re: Upload to Databricks via ODBC is very slow</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Upload-to-Databricks-via-ODBC-is-very-slow/m-p/974835#M377988</link>
      <description>&lt;P&gt;When we migrated from Teradata to Redshift we found similarly poor file transfer performance.&amp;nbsp; In that environment it was easier to dump the data into CSV files, move them via S3 buckets, and load or export them using Redshift's COPY and UNLOAD commands.&amp;nbsp; We made SAS macros to automate the process.&amp;nbsp; Essentially, the upload macro would create an empty "table" in the remote database, dump the data into a gzipped CSV file (trivial with a generic data step), move that file into S3, and then use pass-through to issue the Redshift command that ingests the CSV file into the new table.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Does Databricks have a similar facility for ingesting files you could leverage for better performance?&lt;/P&gt;</description>
      <pubDate>Thu, 11 Sep 2025 15:41:37 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Upload-to-Databricks-via-ODBC-is-very-slow/m-p/974835#M377988</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2025-09-11T15:41:37Z</dc:date>
    </item>
    <item>
      <title>Re: Upload to Databricks via ODBC is very slow</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Upload-to-Databricks-via-ODBC-is-very-slow/m-p/974901#M377993</link>
      <description>&lt;P&gt;We could probably get this set up, but it would require considerably more adjustments than just changing which ODBC driver is used by our ODBC libname. Right now, we have the SQL Server set up as an external data catalogue in DBX, and we can just copy the data to the corresponding Databricks schemas using CTAS statements.&lt;/P&gt;&lt;P&gt;The goal of using ODBC to load to DBX directly would be to potentially eliminate the need for the SQL Server altogether. Going the S3 route would need to be analysed in terms of how many resources it would take to set up.&lt;/P&gt;</description>
      <pubDate>Fri, 12 Sep 2025 06:14:45 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Upload-to-Databricks-via-ODBC-is-very-slow/m-p/974901#M377993</guid>
      <dc:creator>js5</dc:creator>
      <dc:date>2025-09-12T06:14:45Z</dc:date>
    </item>
    <item>
      <title>Re: Upload to Databricks via ODBC is very slow</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Upload-to-Databricks-via-ODBC-is-very-slow/m-p/975115#M378046</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/223838"&gt;@js5&lt;/a&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Have a look at this Databricks Blog&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;A href="https://www.databricks.com/blog/2022/03/16/how-to-speed-up-data-flow-between-databricks-and-sas.html" target="_blank"&gt;https://www.databricks.com/blog/2022/03/16/how-to-speed-up-data-flow-between-databricks-and-sas.html&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 15 Sep 2025 17:58:19 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Upload-to-Databricks-via-ODBC-is-very-slow/m-p/975115#M378046</guid>
      <dc:creator>AhmedAl_Attar</dc:creator>
      <dc:date>2025-09-15T17:58:19Z</dc:date>
    </item>
    <item>
      <title>Re: Upload to Databricks via ODBC is very slow</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Upload-to-Databricks-via-ODBC-is-very-slow/m-p/975117#M378047</link>
      <description>&lt;P&gt;One additional piece of information:&lt;/P&gt;
&lt;P&gt;&lt;A href="https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/acreldb/n0j1e5bcfyjygxn1av4lu5yshd3s.htm" target="_blank"&gt;SAS Help Center: BULKLOAD= LIBNAME Statement Option&lt;/A&gt;&lt;/P&gt;
&lt;TABLE class="xisDoc-summary"&gt;
&lt;TBODY&gt;
&lt;TR&gt;
&lt;TH class="xisDoc-summaryNote" rowspan="2"&gt;Notes:&lt;/TH&gt;
&lt;TD class="xisDoc-summaryText"&gt;Support for Microsoft SQL Server, Spark in HDFS, and Yellowbrick was added in SAS 9.4M7.&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class="xisDoc-summaryText"&gt;Support for Informix and for &lt;STRONG&gt;Spark in Databricks was added in SAS 9.4M9.&lt;/STRONG&gt;&lt;/TD&gt;
&lt;/TR&gt;
&lt;/TBODY&gt;
&lt;/TABLE&gt;</description>
      <pubDate>Mon, 15 Sep 2025 18:44:27 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Upload-to-Databricks-via-ODBC-is-very-slow/m-p/975117#M378047</guid>
      <dc:creator>AhmedAl_Attar</dc:creator>
      <dc:date>2025-09-15T18:44:27Z</dc:date>
    </item>
    <item>
      <title>Re: Upload to Databricks via ODBC is very slow</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Upload-to-Databricks-via-ODBC-is-very-slow/m-p/979615#M378825</link>
      <description>&lt;P&gt;Since we did not have SAS/Access to Spark licensed, I ended up implementing the bulk load myself. Performance is great in comparison. In brief:&lt;/P&gt;&lt;OL&gt;&lt;LI&gt;Create an empty table using implicit SQL pass-through&lt;/LI&gt;&lt;LI&gt;Export a CSV, formatting datetime and date values in the default Spark formats&lt;/LI&gt;&lt;LI&gt;Upload the CSV to S3&lt;/LI&gt;&lt;LI&gt;Execute a COPY INTO statement on the Databricks warehouse using explicit SQL pass-through&lt;/LI&gt;&lt;/OL&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
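The four steps in this post might look roughly like the sketch below. It is not the poster's actual code: the dataset work.test, the variables event_dt and event_date, the local path, the bucket name my-bucket, and the target table foofoo.barbar.test are all hypothetical placeholders, and the upload step assumes the AWS CLI is installed on the SAS host with XCMD enabled. It also assumes the dbx libref from the first post is already assigned and that the warehouse is allowed to read the S3 location.

```sas
/* Sketch only; every name below is a placeholder, not taken from the thread. */

/* 1. Create an empty target table through the dbx libref (same columns, zero rows). */
data dbx.test;
    set work.test(obs=0);
run;

/* 2. Export to CSV. Replace event_dt and event_date with your own datetime and date
      variables; the formats write ISO 8601 text such as 2025-09-05T11:46:27 and
      2025-09-05. */
data work.test_csv;
    set work.test;
    format event_dt e8601dt19. event_date yymmdd10.;
run;

proc export data=work.test_csv
    outfile="/tmp/test.csv"
    dbms=csv
    replace;
run;

/* 3. Upload the file to S3 (assumes the AWS CLI and the XCMD system option). */
x 'aws s3 cp /tmp/test.csv s3://my-bucket/staging/test.csv';

/* 4. Load the file with COPY INTO via explicit pass-through, reusing the dbx connection.
      The FORMAT_OPTIONS shown are an assumption; type handling may need more options
      or explicit casts. */
proc sql;
    connect using dbx;
    execute (
        COPY INTO foofoo.barbar.test
        FROM 's3://my-bucket/staging/test.csv'
        FILEFORMAT = CSV
        FORMAT_OPTIONS ('header' = 'true')
    ) by dbx;
    disconnect from dbx;
quit;
```

The point of this layout is that the row data never travels through ODBC INSERT statements: ODBC only carries the table definition and one COPY INTO command, while the warehouse reads the file itself.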
      <pubDate>Wed, 26 Nov 2025 14:33:24 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Upload-to-Databricks-via-ODBC-is-very-slow/m-p/979615#M378825</guid>
      <dc:creator>js5</dc:creator>
      <dc:date>2025-11-26T14:33:24Z</dc:date>
    </item>
    <item>
      <title>Re: Upload to Databricks via ODBC is very slow</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Upload-to-Databricks-via-ODBC-is-very-slow/m-p/983329#M379473</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I see the thread is recent, so just in case:&lt;/P&gt;&lt;P&gt;If you don't have SAS 9.4M9 _and_ the Spark connector, I would use this approach, but there is a possible improvement: use the Parquet format.&lt;/P&gt;&lt;P&gt;And yes, you can use Parquet even with old SAS versions, via DuckDB -- look up duckdb + ODBC.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Once you have set up the ODBC DSN you can use:&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql;
    connect to odbc(dsn='DuckDB_Parquet');
    /* This command tells DuckDB to write the SAS-linked table directly to a Parquet file */
    execute (
        COPY (SELECT * FROM main.final_output)
        TO 'C:\exports\data_output.parquet' (FORMAT 'PARQUET')
    ) by odbc;
    disconnect from odbc;
quit;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Wed, 11 Feb 2026 12:03:13 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Upload-to-Databricks-via-ODBC-is-very-slow/m-p/983329#M379473</guid>
      <dc:creator>belgeric</dc:creator>
      <dc:date>2026-02-11T12:03:13Z</dc:date>
    </item>
  </channel>
</rss>

