I need to call a C++ hashing function from the Boost library because it is not available in our current SAS environment (SAS 9.4M3). This is part of a project to implement LSH (locality-sensitive hashing). I am a C++ novice, so what I am missing might be easily resolved by someone more familiar with this technique.
I followed a good example, https://support.sas.com/kb/40/562.html, which implements a user-defined C function using PROC PROTO and PROC FCMP. I got it working, but it is only a C example.
proc proto package=work.proto_ds.cfcns;
link '/Data/FCP/2100_STG_RTTM/STG_RT_DS/myfactorial.so';
int myfactorial(int n) ;
run;
proc fcmp inlib=work.proto_ds outlib=work.fcmp_ds.sasfcns;
function sas_myfactorial(x);
return (myfactorial(x));
endsub;
quit;
options cmplib=(work.proto_ds work.fcmp_ds);
data _null_;
x=sas_myfactorial(6);
put x=;
run;
I then tried to compile the same function as C++, with one small change (the function name):
int myfactorial2(int n){
int i;
int p;
for(p=1, i=2; i<=n; p=p*i, i++);
return p;
}
This is the compiler version being used, in case it has something to do with it, followed by the commands I used to build the shared object:
g++ --version
g++ (GCC) 4.4.7 20120313 (Red Hat 4.4.7-23)
g++ -L/Data/FCP/2100_STG_RTTM/STG_RT_DS -c -Wall -ansi -g -fPIC myfactorial2.cpp
g++ -shared -o myfactorial2.so -fPIC myfactorial2.o
But when I run PROC PROTO I get ERROR: Unable to find function myfactorial2. I have tried more basic functions but still get the same error. It would be good to see a C++ example similar to the C example. The Boost hash function is a bit more complicated but should work. This link documents the PROTO procedure: https://support.sas.com/documentation/onlinedoc/base/91/proto.pdf
proc proto package=work.proto_ds.cfcns;
link '/Data/FCP/2100_STG_RTTM/STG_RT_DS/myfactorial2.so';
int myfactorial2(int n) ;
run;
NOTE: '/Data/FCP/2100_STG_RTTM/STG_RT_DS/myfactorial2.so' loaded from specified path.
ERROR: Unable to find function myfactorial2
NOTE: Prototypes saved to WORK.PROTO_DS.CFCNS.
NOTE: PROCEDURE PROTO used (Total process time):
real time 0.00 seconds
This is down to the symbol name the compiler produced. The symbol inside the shared object for that function is not named myfactorial2. A C++ compiler produces a different (mangled) name, and nm(1) will print it out.
$ nm fact.so
0000000000200df0 d _DYNAMIC
0000000000201000 d _GLOBAL_OFFSET_TABLE_
w _ITM_deregisterTMCloneTable
w _ITM_registerTMCloneTable
00000000000005a9 T _Z12myfactorial2i
0000000000000668 r __FRAME_END__
00000000000005ec r __GNU_EH_FRAME_HDR
0000000000201020 d __TMC_END__
0000000000201020 B __bss_start
w __cxa_finalize@@GLIBC_2.2.5
0000000000000560 t __do_global_dtors_aux
0000000000200de0 t __do_global_dtors_aux_fini_array_entry
0000000000200de8 d __dso_handle
0000000000200dd8 t __frame_dummy_init_array_entry
w __gmon_start__
0000000000201020 D _edata
0000000000201028 B _end
00000000000005dc t _fini
00000000000004b0 t _init
0000000000201020 b completed.7294
00000000000004f0 t deregister_tm_clones
00000000000005a0 t frame_dummy
0000000000000520 t register_tm_clones
The compiler ended up with the symbol _Z12myfactorial2i for your function. You need some additional syntax inside your C++ program to instruct the compiler to produce a symbol with a specific name, so that SAS can find the function when it loads the shared object into your session.
See this page for some tips.
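To see what a mangled name such as _Z12myfactorial2i actually encodes, it can be demangled. The following is a minimal sketch, assuming a GCC or Clang toolchain that ships <cxxabi.h>; the helper name demangle is made up for illustration and is not part of SAS or Boost.

```cpp
#include <cxxabi.h>   // abi::__cxa_demangle (GCC/Clang specific)
#include <cstdlib>    // std::free, NULL
#include <string>

// Hypothetical helper: returns the readable form of a mangled C++ symbol,
// or the input unchanged if the demangler rejects it.
std::string demangle(const char* mangled) {
    int status = 0;
    // With a NULL output buffer, the result is allocated with malloc.
    char* readable = abi::__cxa_demangle(mangled, NULL, NULL, &status);
    if (status != 0 || readable == NULL) {
        return std::string(mangled);
    }
    std::string result(readable);
    std::free(readable);  // the caller owns the buffer
    return result;
}
```

Passing "_Z12myfactorial2i" gives back myfactorial2(int): the 12 is the length of the name and the trailing i is the int parameter. The parameter types are part of the symbol, which is why the plain name myfactorial2 is not found.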
Hi Simon, thanks for the reply. As stated, I am a C++ novice, so what changes would you make to this function to resolve the issue? Below are the only lines of code in myfactorial2.cpp. The only difference between this and the C example is the 2 in myfactorial2.
int myfactorial2(int n){
int i;
int p;
for(p=1, i=2; i<=n; p=p*i, i++);
return p;
}
Not ideal, but this worked: declaring the mangled name directly in PROC PROTO.
proc proto package=work.proto_ds.cfcns;
link '/Data/FCP/2100_STG_RTTM/STG_RT_DS/myfactorial2.so';
int _Z12myfactorial2i(int n) ;
run;
/* this bit is working */
proc fcmp inlib=work.proto_ds outlib=work.fcmp_ds.sasfcns;
function sas_myfactorial(x);
return (_Z12myfactorial2i(x));
endsub;
quit;
options cmplib=(work.proto_ds work.fcmp_ds);
data _null_;
x=sas_myfactorial(6);
put x=;
run;
Using the function names as given by the compiler, I could get my hash value back from the C++ Boost function 🙂 Happy days.
proc proto package=work.proto_ds.cfcns;
link '/Data/FCP/2100_STG_RTTM/STG_RT_DS/hash_string.so';
int _Z11hash_stringPc(char * ) ;
run;
proc fcmp inlib=work.proto_ds outlib=work.fcmp_ds.sasfcns;
function sas_hash_string(x $);
OUTARGS x;
return (_Z11hash_stringPc(x));
endsub;
quit;
options cmplib=(work.proto_ds work.fcmp_ds);
data _null_;
x=sas_hash_string("Robert");
put x=;
run;
Last bit of the log:
25 data _null_;
26 x=sas_hash_string("Robert");
NOTE: '/Data/FCP/2100_STG_RTTM/STG_RT_DS/hash_string.so' loaded from specified path.
27 put x=;
28 run;
x=-1899591635
If you want symbols without all the C++ extra stuff (the name mangling), you can do it with an extern "C" declaration. Something like this, when compiled, will give you the myfactorial2 symbol you are after, which you can then load and link to inside your SAS session.
extern "C" int myfactorial2(int);

int myfactorial2(int n){
    int i;
    int p;
    for(p=1, i=2; i<=n; p=p*i, i++);
    return p;
}
When you compile that you'll have the myfactorial2 symbol and you won't need to do the swap around.
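The same approach carries over to the string hashing function. Below is a minimal sketch, assuming the signature int hash_string(char *) used in the PROC PROTO declaration earlier in this thread. The actual hash_string source was not posted, so the body is a stand-in: it uses a simple 32-bit FNV-1a hash (the helper name fnv1a is hypothetical) instead of the Boost call, purely to keep the example self-contained.

```cpp
#include <cstddef>  // std::size_t, NULL
#include <string>

// Stand-in for the real hashing call (for example a Boost hash of a
// std::string). This is 32-bit FNV-1a, NOT the Boost algorithm.
static unsigned int fnv1a(const std::string& s) {
    unsigned int h = 2166136261u;               // FNV offset basis
    for (std::size_t i = 0; i < s.size(); ++i) {
        h ^= static_cast<unsigned char>(s[i]);  // mix in one byte
        h *= 16777619u;                         // FNV prime
    }
    return h;
}

// extern "C" keeps the exported symbol as plain "hash_string", so
// PROC PROTO can declare it as: int hash_string(char *);
// Ordinary C++ (std::string and so on) is still fine inside the body.
extern "C" int hash_string(char* text) {
    if (text == NULL) {
        return 0;  // assumption: treat a missing string as hash 0
    }
    // Narrowing an unsigned hash to int can yield negative values.
    return static_cast<int>(fnv1a(std::string(text)));
}
```

Only the exported entry point needs extern "C"; the helper can stay an ordinary C++ function. The narrowing to int at the end might also be why a hash comes back negative on the SAS side, though that depends on how the real function is written.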
OK, that worked a treat. Thanks for your help, I really appreciate it, as this was looking difficult.
proc proto package=work.proto_ds.cfcns;
link '/Data/FCP/2100_STG_RTTM/STG_RT_DS/hash_string.so';
int hash_string(char * ) ;
run;
proc fcmp inlib=work.proto_ds outlib=work.fcmp_ds.sasfcns;
function sas_hash_string(x $);
OUTARGS x;
return (hash_string(x));
endsub;
quit;
options cmplib=(work.proto_ds work.fcmp_ds);
data _null_;
x=sas_hash_string("Robert");
put x=;
run;
37 data _null_;
38 x=sas_hash_string("Robert");
NOTE: '/Data/FCP/2100_STG_RTTM/STG_RT_DS/hash_string.so' loaded from specified path.
39 put x=;
40 run;
x=-1899591635
@Rob_Turvey While this is on my mind, please take note of this link and put it in a comment block in the code that calls PROC PROTO:
https://support.sas.com/kb/63/373.html
Once SAS is upgraded there to SAS 9.4M6 or later, some configuration updates will be needed to ensure that your code keeps working.