I have three data sets. I want to append two of them to each other and then join the appended data set to the third. See the code below for sample data and my existing process. I would like to get the same result as the Cust_ID_Check data set in a single data step, using the individual, company, and exist_check data sets.
data individual;
input Record_ID Cust_ID $5.;
datalines;
35 AD123
74 NEW
24 GH456
;
run;
data company;
input Record_ID Cust_ID $5.;
datalines;
21 YE789
62 AG&7
93 JI245
;
run;
data exist_check;
input Customer $4. CN_ID $5.;
datalines;
Sam AD123
Mia JI245
Jon YE789
Ben GH456
;
run;
Data TEST;
set individual (keep=Record_ID Cust_ID) company(keep=Record_ID Cust_ID);
run;
proc sql;
create table Cust_ID_Check as
select A.*, B.CN_ID
from TEST as A
left join exist_check as B
on (A.Cust_ID=B.CN_ID);
quit;
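Yes, you can do the same thing with a data step merge, but the three files would first have to be sorted by the key. For example:
proc sort data=individual; by Cust_ID; run;
proc sort data=company; by Cust_ID; run;
proc sort data=exist_check; by CN_ID; run;
data want;
  /* rename CN_ID so all three data sets share the BY variable */
  merge individual (in=inI)
        company (in=inCo)
        exist_check (drop=Customer in=inC rename=(CN_ID=Cust_ID));
  by Cust_ID;
  /* keep only IDs that occur in individual or company, as the left join does */
  if inI or inCo;
  /* recreate CN_ID when the ID was also found in exist_check */
  if inC then CN_ID=Cust_ID;
run;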
Art, CEO, AnalystFinder.com
A hash object lets you avoid the sorts and do everything in one data step, as you wanted:
data individual;
input Record_ID Cust_ID $5.;
datalines;
35 AD123
74 NEW
24 GH456
;
run;
data company;
input Record_ID Cust_ID $5.;
datalines;
21 YE789
62 AG&7
93 JI245
;
run;
data exist_check;
input Customer $4. CN_ID $5.;
datalines;
Sam AD123
Mia JI245
Jon YE789
Ben GH456
;
run;
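data want;
  if _n_=1 then do;
    /* never executes; only adds the variables to the PDV at compile time */
    if 0 then do; set individual; set exist_check; end;
    /* load exist_check into memory, keyed on CN_ID */
    dcl hash H (dataset:'exist_check');
    h.definekey ("CN_ID");
    h.definedata ("CN_ID");
    h.definedone ();
  end;
  /* append the two data sets */
  set individual company;
  /* look up Cust_ID; if not found, clear CN_ID left over from the previous match */
  if h.find(key:Cust_ID) ne 0 then call missing(CN_ID);
  drop customer;
run;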
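If you want to confirm that want really matches Cust_ID_Check, a comparison along these lines might help. This is only a sketch: it assumes both data sets exist in WORK and that Record_ID is unique, and the names check_sorted and want_sorted are made up for illustration. Both data sets are sorted first because none of the approaches guarantees the same row order.

```sas
/* check_sorted and want_sorted are hypothetical names used only here */
proc sort data=Cust_ID_Check out=check_sorted; by Record_ID; run;
proc sort data=want out=want_sorted; by Record_ID; run;

/* match rows on Record_ID and report any differing values */
proc compare base=check_sorted compare=want_sorted;
  id Record_ID;
run;
```

If the two data sets agree, the PROC COMPARE report should show no observations with unequal values.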