BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Swapnil_21
Obsidian | Level 7

Hi,

 

I need to update only one column in master  based on transaction table. However, during proc sql update, its taking huge time. 

 

Below is Master table 

Account_IDAPPL_LOAN_PCT_GROSS_INCAPPL_LVRAPPL_NO_SPLITAPPLICATION_IDBASEL_RETAIL_CORP_CODEBROKER_FLG
RAM30000071.5384307690.6667400001700400001700CONSN
RAM30000154.29220.7368400004000400004000CONSN
RAM30000234.29220.7368400004000400004001CONSN

 

 

 

And Here is the Transaction Table 

 

ORIGINAL_VALUEMASKED_VALUE
RAM3000007ABC345890
RAM3000015ABC67848
RAM3000023

ABC77545

 

 

 

 

And Here is what is required 

 

Account_IDAPPL_LOAN_PCT_GROSS_INCAPPL_LVRAPPL_NO_SPLITAPPLICATION_IDBASEL_RETAIL_CORP_CODEBROKER_FLG
ABC3458901.5384307690.6667400001700400001700CONSN
ABC678484.29220.7368400004000400004000CONSN
ABC775454.29220.7368400004000400004001CONSN

 

 

Below code takes huge time 

 

proc sql;
update master as u
set  account_id=(select MASKED_VALUE from new as n
where u.ACCOUNT_ID=n.ORIGINAL_VALUE)
where u.ACCOUNT_ID in (select ORIGINAL_VALUE from new);
quit;

 

But this is taking huge time. Any help on alternatives is really appreciated

 

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User

Try Hash Table, if your memory of PC is big enough .

 

data master;
infile cards expandtabs truncover;
input (Account_ID	APPL_LOAN_PCT_GROSS_INC	APPL_LVR	APPL_NO_SPLIT	APPLICATION_ID	BASEL_RETAIL_CORP_CODE	BROKER_FLG) (: $40.);
cards;
RAM3000007	1.538430769	0.6667	400001700	400001700	CONS	N
RAM3000015	4.2922	0.7368	400004000	400004000	CONS	N
RAM3000023	4.2922	0.7368	400004000	400004001	CONS	N
;

 
data Transaction ; 
infile cards expandtabs truncover;
input (ORIGINAL_VALUE	MASKED_VALUE) (: $40.);
cards;
RAM3000007	ABC345890
RAM3000015	ABC67848
RAM3000023	ABC77545
;

data want;
 if _n_=1 then do;
  if 0 then set Transaction;
  declare hash h(dataset:'Transaction',hashexp:20);
  h.definekey('ORIGINAL_VALUE');
  h.definedata('MASKED_VALUE');
  h.definedone();
 end;
set master;
call missing(MASKED_VALUE);
rc=h.find(key:Account_ID);
drop rc ORIGINAL_VALUE Account_ID;
rename MASKED_VALUE=new_Account_ID;
run;

View solution in original post

11 REPLIES 11
PeterClemmensen
Tourmaline | Level 20

First question. Is either of your tables located on a data base? Or are they both SAS data sets?

Swapnil_21
Obsidian | Level 7
Both are sas datasets
PeterClemmensen
Tourmaline | Level 20

How large are they?

PeterClemmensen
Tourmaline | Level 20

Also, are both data sets sorted ?

Swapnil_21
Obsidian | Level 7
Both are sorted.

Master table size 10003476
Transaction table size 13567892
PeterClemmensen
Tourmaline | Level 20

Then try this

 

data master;
input Account_ID :$10. APPL_LOAN_PCT_GROSS_INC APPL_LVR APPL_NO_SPLIT APPLICATION_ID BASEL_RETAIL_CORP_CODE $ BROKER_FLG $;
datalines;
RAM3000007 1.538430769 0.6667 400001700 400001700 CONS N
RAM3000015 4.2922      0.7368 400004000 400004000 CONS N
RAM3000023 4.2922      0.7368 400004000 400004001 CONS N
;

data transaction;
input ORIGINAL_VALUE :$10. MASKED_VALUE :$10.;
datalines;
RAM3000007 ABC345890
RAM3000015 ABC67848
RAM3000023 ABC77545
;

data master;
   modify master transaction(rename = ORIGINAL_VALUE = Account_ID);
   by Account_ID;
   Account_ID = MASKED_VALUE;
run;
Kurt_Bremser
Super User

It's a simple MERGE then:

data
  master_new
  unmasked
;
merge
  master (in=m)
  transaction (in=t)
;
by account_id;
if m;
if t
then account_id = masked_value;
else output unmasked;
output master_new;
drop masked_value;
run;

I have added an additional output dataset, since this looks like some kind of anonymization where no unmasked values will be wanted,

Ksharp
Super User

Try Hash Table, if your memory of PC is big enough .

 

data master;
infile cards expandtabs truncover;
input (Account_ID	APPL_LOAN_PCT_GROSS_INC	APPL_LVR	APPL_NO_SPLIT	APPLICATION_ID	BASEL_RETAIL_CORP_CODE	BROKER_FLG) (: $40.);
cards;
RAM3000007	1.538430769	0.6667	400001700	400001700	CONS	N
RAM3000015	4.2922	0.7368	400004000	400004000	CONS	N
RAM3000023	4.2922	0.7368	400004000	400004001	CONS	N
;

 
data Transaction ; 
infile cards expandtabs truncover;
input (ORIGINAL_VALUE	MASKED_VALUE) (: $40.);
cards;
RAM3000007	ABC345890
RAM3000015	ABC67848
RAM3000023	ABC77545
;

data want;
 if _n_=1 then do;
  if 0 then set Transaction;
  declare hash h(dataset:'Transaction',hashexp:20);
  h.definekey('ORIGINAL_VALUE');
  h.definedata('MASKED_VALUE');
  h.definedone();
 end;
set master;
call missing(MASKED_VALUE);
rc=h.find(key:Account_ID);
drop rc ORIGINAL_VALUE Account_ID;
rename MASKED_VALUE=new_Account_ID;
run;
Swapnil_21
Obsidian | Level 7

@Ksharp  This Hash solution is working fine. Need one more help  on the same, In my master, I have few more columns which I need to mask with Masked_value from transaction, then in that case how can i generalize this solution. See example as beow 

Master 

Account_ID APPL_LOAN_PCT_GROSS_INC APPL_LVR APPL_NO_SPLIT APPLICATION_ID BASEL_RETAIL_CORP_CODE BROKER_FLG Customer_id
RAM3000007 1.538430769 0.6667 400001700 400001700 CONS N 545966
RAM3000015 4.2922 0.7368 400004000 400004000 CONS N 345182
RAM3000023 4.2922 0.7368 400004000 400004001 CONS N 773280

 

Transaction

 

 

ORIGINAL_VALUE MASKED_VALUE
RAM3000007 ABC345890
RAM3000015 ABC67848
RAM3000023 ABC77545
545966 884597
345182 489012
773280 771811

 

Output Required :

 

Account_ID APPL_LOAN_PCT_GROSS_INC APPL_LVR APPL_NO_SPLIT APPLICATION_ID BASEL_RETAIL_CORP_CODE BROKER_FLG Customer_id
ABC345890 1.538430769 0.6667 400001700 400001700 CONS N 884597
ABC67848 4.2922 0.7368 400004000 400004000 CONS N 489012
ABC77545 4.2922 0.7368 400004000 400004001 CONS N

771811

 

 

 

Thank you so  much in advance

Kurt_Bremser
Super User

You can use multiple lookups from the hash:

rc1 = h.find(key:Account_ID);
if rc1 = 0 then account_id = masked_value;
rc2 = h.find(key:customer_id);
if rc2 = 0 then customer_id = masked_value;
if rc1 or rc2 /* if either find failed */
then output unmasked;
else output want;
drop rc1 rc2;
run;

Note that I removed the RENAME from @Ksharp 's code.

Ksharp
Super User
As Kurt said.

rc1 = h.find(key:Account_ID);
if rc1 = 0 then account_id = masked_value;
rc2 = h.find(key:customer_id);
if rc2 = 0 then customer_id = masked_value;
run;

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 11 replies
  • 1840 views
  • 6 likes
  • 4 in conversation