I am trying to bulk load data from SAS (9.4M4) to Greenplum (6.4) using pipes. The documentation seems to indicate that it should be possible. However, I keep getting an error saying "Named pipes not currently available." SAS is running on Windows Server 2016, while Greenplum is running on Linux. I am able to load the data to Greenplum using gpfdist and the flat file method, but I am trying to improve the speed. The documentation indicates that I could get about 30% more speed using pipes.
My SAS code looks like this:
libname mydatalib "c:\sasfiles\";
libname mydblib greenplm DSN='<name of odbc dsn' schema=<name of schema> user=<gp user name> password='<gp user password>';
proc sql;
connect using mydblib;
create table mydblib.my_test_table
(BULKLOAD=YES
BL_USE_PIPE=YES
BL_HOST='<ip address of greenplum server>'
BL_PORT='<port of greenplum server>'
)
as
select * from mydatalib.sasfile;
quit;
You are right. And it's in the documentation too.
Restrictions: Applies to UNIX environments only [Greenplum, HAWQ]
I am a bit confused as to why you use the greenplm engine, but use ODBC connection parameters.
Do have SAS/ACCESS to greenplum? You can run proc setinit to verify that.
I don't think ODBC supports the BL_USE_PIPE option.
The doc states
DSN=<'>Greenplum-data-source<'>
not what you wrote:
DSN='<name of odbc dsn'
Sorry, no other idea. You might want to ask tech support.
@ChrisNZ thanks for the thoughts offered. By the way, I did some further digging and it seems that for Greenplum, SAS uses the same DataDirect ODBC driver that I was using in the first instance.
Note that I did find the following text in a 2012 SAS Global Forum document. Any way to find out if that is still the case (since my load is from a Windows environment to Greenplum which is on a Linux environment).
GREENPLUM
Bulkload Support Using Pipes on UNIX
Bulkloading capabilities have been extended on UNIX platforms to loadtables load pipes. To leverage this new feature, you must add the BL_USE_PIPE=YES option when starting the bulkload facility:
data user.dbtab (bulkload=yes
bl_format='CSV'
bl_quote='"'
bl_host='mypc.na.acme.com'
bl_port='8080'
bl_protocol='gpfdist'
BL_USE_PIPE=YES);
set work.worktab;
run;
We have seen a 30 percent average performance gain in load speed when using pipes rather than traditional flat files. Greenplum does not currently support pipes on Windows platforms.
You are right. And it's in the documentation too.
Restrictions: Applies to UNIX environments only [Greenplum, HAWQ]
I missed it too. 🙂
Considering the number of Greenplum customers, and even lower numbers accessing it from Windows, I would not expect an update.
Unless maybe a very large new sale is at stake. (Making many assumptions here).
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.