Hi,
I have a requirement where I need to combine two datasets, but here is what is causing me trouble.
For example: master dataset
Emp_loc Emp_sal
USA 100000.00
Newyork 25000.00
California 25000.00
Texas 25000.00
Nevada 25000.00
Transaction dataset:
Emp_loc Emp_sal
Texas 30000.00
USA 105000.00
I want to update the master dataset, but need to keep the values of the EMP_Loc in the same order ie, after my change the dataset should look like this:
Updated Master dataset:
Emp_loc Emp_sal
USA 105000.00
Newyork 25000.00
California 25000.00
Texas 30000.00
Nevada 25000.00
Please share your ideas.
Thanks
Hash Table:
data master; input Emp_loc $ Emp_sal ; cards; USA 100000.00 Newyork 25000.00 California 25000.00 Texas 25000.00 Nevada 25000.00 ; run; data Transaction; input Emp_loc $ Emp_sal ; cards; Texas 30000.00 USA 105000.00 ; run; data want(drop=rc); if _n_ eq 1 then do; if 0 then set Transaction; declare hash ha(dataset:'Transaction'); ha.definekey('Emp_loc'); ha.definedata('Emp_sal'); ha.definedone(); end; set master; rc=ha.find(); run;
Ksharp
The hash code used by Ksharp above is more efficient.
However there is alternative approach as well by Merge statement, please try the below code
data dat1;
input Emp_loc $ Emp_sal;
order=_n_;
cards;
USA 100000.00
Newyork 25000.00
California 25000.00
Texas 25000.00
Nevada 25000.00
;
run;
data dat2;
input Emp_loc $ Emp_sal;
cards;
Texas 30000.00
USA 105000.00
;
run;
proc sort data = dat1;
by emp_loc;
run;
proc sort data = dat2;
by emp_loc;
run;
data all;
merge dat1(in=a) dat2(in=b);
by emp_loc;
run;
proc sort data = all;
by order;
run;
Hope this helps you.
Thanks,
Jagadish
From what I understand Hash tables are very useful in instances where you have a large table and a small lookup table. There is a point where hash tables become less effecient, so it is important that you undertake sufficient testing to ensure you are on the right track. Therefore I have assumed this is a sample and have provided the below alternative.
DATA MASTER;
INPUT EMP_LOC $ EMP_SAL ;
CARDS;
USA 100000.00
NEWYORK 25000.00
CALIFORNIA 25000.00
TEXAS 25000.00
NEVADA 25000.00
;
RUN;
DATA TRANSACTION;
INPUT EMP_LOC $ EMP_SAL ;
CARDS;
TEXAS 30000.00
USA 105000.00
;
RUN;
PROC SQL;
UPDATE MASTER
SET EMP_SAL = (SELECT EMP_SAL
FROM TRANSACTION
WHERE MASTER.EMP_LOC = TRANSACTION.EMP_LOC)
WHERE EMP_LOC IN (SELECT EMP_LOC
FROM TRANSACTION)
;
QUIT;
Regards,
Scott
Thank you all for sharing your thoughts!
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.