BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Suja
Fluorite | Level 6

Hi,

 

My below query is taking around 40 minutes to run for 1million records in All_account. my subacc1 has 80 records, subacc2 has 47 records, subacc3 has 68 records and subacc4 has 75 records. I want to reduce the processing time. Any help would be much appreciated. Thanks for your help.

 

 

proc sql;
create table account_new as
select a.*,
(case when exists (select 1 from subacc1 b where a.AccountNo = b.account_no)
then 1 else 0
end) as subacc1,
(case when exists (select 1 from subacc2 c where a.AccountNo = c.account_no)
then 1 else 0
end) as subacc2,
(case when exists (select 1 from subacc3 d where a.AccountNo = d.account_no)
then 1 else 0
end) as subacc3,
(case when exists (select 1 from subacc4 e where a.AccountNo = e.account_no)
then 1 else 0
end) as subacc4
from All_account a;
quit;

1 ACCEPTED SOLUTION

Accepted Solutions
kiranv_
Rhodochrosite | Level 12

this is good case for using hash technique. I made data using Sashelp.class so that you can try out. here we are making flags based on name present in particular dataset 

 

 

data class_female;
 set sashelp.class(where=(sex='F'));
 run;
 
  data class_male;
 set sashelp.class(where=(sex='M'));
 run;
 
 
   data want;
if _n_=1 then do;
if _n_ = 0 then set sashelp.class;
dcl hash h1(dataset: 'class_female');
 h1.definekey("Name");
 h1.definedone();
 dcl hash h2(dataset: 'class_male');
 h2.definekey("Name");
 h2.definedone();
 end;
 set sashelp.class;
 call missing(female_flag);
 if h1.find() = 0 then female_flag =1;
 else female_flag = 0 ;
 if h2.find() = 0 then male_flag =1;
 else male_flag = 0 ;
 run;
 

 

View solution in original post

7 REPLIES 7
kiranv_
Rhodochrosite | Level 12

this is good case for using hash technique. I made data using Sashelp.class so that you can try out. here we are making flags based on name present in particular dataset 

 

 

data class_female;
 set sashelp.class(where=(sex='F'));
 run;
 
  data class_male;
 set sashelp.class(where=(sex='M'));
 run;
 
 
   data want;
if _n_=1 then do;
if _n_ = 0 then set sashelp.class;
dcl hash h1(dataset: 'class_female');
 h1.definekey("Name");
 h1.definedone();
 dcl hash h2(dataset: 'class_male');
 h2.definekey("Name");
 h2.definedone();
 end;
 set sashelp.class;
 call missing(female_flag);
 if h1.find() = 0 then female_flag =1;
 else female_flag = 0 ;
 if h2.find() = 0 then male_flag =1;
 else male_flag = 0 ;
 run;
 

 

Suja
Fluorite | Level 6

Thanks kiranv. I think I can't use the hash table in one dataset like your code. Because I'm joining with four different datasets.

I have used the hash table logic but calling the macro 4 times as below. Please let me know if there is any better solution.

 

Thanks in advance!


%macro hash_macro (hash_table, in_table, out_table, keyvar, renamevar);

data &out_table. (drop=&renamevar.);
set &in_table.;
if _n_ = 0 then set &hash_table.;
if _n_ = 1 then do;
declare hash ht (dataset: "&hash_table. (rename = (&renamevar. = &keyvar. ))",
ordered: "A");
ht.definekey ("&keyvar.");
ht.definedata ("&keyvar.");
ht.definedone ( );
end;
if ht.find() = 0 then do; &hash_table. = 1; output; end;
else do; &hash_table. = 0; output; end;
run;

%mend;
/* call hash_macro for account table*/
%hash_macro(subacc1, all_account, account1, AccountNo, account_no);
%hash_macro(subacc2, account1, account2, AccountNo, account_no);
%hash_macro(subacc3, account2, account3, AccountNo, account_no);
%hash_macro(subacc4, account3, account_new, AccountNo, account_no);

kiranv_
Rhodochrosite | Level 12

if you see, I am using 2 datasets(lookup/small datasets) and you can use 4. you are creating four tables in your macro code, which is unlike your first table/intial code.

Suja
Fluorite | Level 6

Thanks for correcting me. The below code is working fine.

 

data account_new;
if _n_=1 then do;
if _n_ = 0 then set All_account;
dcl hash h1(dataset: "subacc1 (rename = (account_no = AccountNo )");
h1.definekey("AccountNo");
h1.definedone();
dcl hash h2(dataset: "subacc2 (rename = (account_no = AccountNo )");
h2.definekey("AccountNo");
h2.definedone();
dcl hash h3(dataset: "subacc3 (rename = (account_no = AccountNo )");
h3.definekey("AccountNo");
h3.definedone();
dcl hash h4(dataset: "subacc4 (rename = (account_no = AccountNo )");
h4.definekey("AccountNo");
h4.definedone();
end;
set All_account;
if h1.find() = 0 then subacc1 =1;
else subacc1 = 0 ;
if h2.find() = 0 then subacc2 =1;
else subacc2 = 0 ;
if h3.find() = 0 then subacc3 =1;
else subacc3 = 0 ;
if h4.find() = 0 then subacc4 =1;
else subacc4 = 0 ;
run;

kiranv_
Rhodochrosite | Level 12

Just curious to know, how much time the hash solution took

Suja
Fluorite | Level 6

My old query took 47 minutes and hash table took less than a minute (3.16 seconds).

novinosrin
Tourmaline | Level 20

I don't think you need 2 hash tables nor is hash the best solution?I'm not certain but can't  confirm until I pplay around

 

Would it be possible to provide some representative samples and the required output for us to work ?

 

Let me admit, I'm too lazy to assume, simulate to be totally honest.

 

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 2057 views
  • 1 like
  • 3 in conversation