Hi,
I have two tables, A and B. I want to create a variable on Table A( TableB ID?) that scans iDS on table B and returns the ID if a similar one is found or puts Na, when none is available.
Example.
Table A
ID fruit colour TableB ID?
165 avocado orange Na
111 orange orange 111
111 cabbage green 111
123 mango green Na
333 strawberry red 333
333 strawberry red 333
555 berry orange Na
Table B
ID fruit colour
555 avocado Orange
111 orange Orange
178 cabbage green
236 mango green
333 strawberry red
333 strawberry red
148 berry orange
Thanks.
Message was edited by: zetter rweza I have edited the table to make things clear/ I just want to look at ID don't mind about fruit type or colour, please help.
It sounds like you are used to using Excel. In a programming language you would typical run some step to generate the new data.
1) You could do a data step merge of tablea to tableb and create the new variable based on the existence of a record in tableb.
data want ;
merge tablea (in=in1) tableb(in=in2 keep=id);
by id;
if in1;
if in2 then tablebid=cats(id);
else tablebid='NA';
run;
2) You could use an SQL join instead.
proc sql noprint;
create table want2 as
select a.*
, case when exists (select id from tableb b where a.id=b.id) then cats(id)
else 'NA'
end as TABLEBID
from tablea a
;
quit;
3) Or to be more like the VLOOKUP() function you could convert TABLEB into a format and use the PUT() function.
data format ;
set tableb end=eof;
fmtname='TABLEB';
start=id;
label=cats(id);
output;
if eof then do;
start=.; label='NA'; hlo='O'; output;
end;
run;
proc format cntlin=format;
run;
data want3;
set tablea ;
tablebid=put(id,tableb.);
run;
Note that for options (1) and (3) you will need to eliminate duplicate records from TABLEB for the same value of ID.
proc sort data=tableb nodupkey; by id; run;
Also note that in your example you incorrectly list ID=555 in TABLEA as not being found in TABLEB.
Generally you can use a proc format or SQL merge for this, but you're logic doesn't quite make sense to me.
You're not merging solely on ID, but on ID and fruit? or is it ID, fruit, and colour?
I'm just interested in ID, not fruit type or colour. Please help, thanks.
data TableA;
input ID $ fruit $ colour $;
n=_n_;
datalines;
165 avocado orange
111 orange orange
111 cabbage green
123 mango green
333 strawberry red
333 strawberry red
555 berry orange
;
data TableB;
input ID $ fruit $ colour $;
datalines;
555 avocado Orange
111 orange Orange
178 cabbage green
236 mango green
333 strawberry red
333 strawberry red
148 berry orange
;
proc sql;
create table tableb_unique as
select distinct id,fruit,colour from tableb;
quit;
proc sql;
create table want as
select a.id,a.fruit,a.colour , case when a.id = b.id then b.id else 'Na' end as TableB_id
from TableA a left join tableb_unique b
on a.id=b.id
order by n;
quit;
proc print;
run;
It sounds like you are used to using Excel. In a programming language you would typical run some step to generate the new data.
1) You could do a data step merge of tablea to tableb and create the new variable based on the existence of a record in tableb.
data want ;
merge tablea (in=in1) tableb(in=in2 keep=id);
by id;
if in1;
if in2 then tablebid=cats(id);
else tablebid='NA';
run;
2) You could use an SQL join instead.
proc sql noprint;
create table want2 as
select a.*
, case when exists (select id from tableb b where a.id=b.id) then cats(id)
else 'NA'
end as TABLEBID
from tablea a
;
quit;
3) Or to be more like the VLOOKUP() function you could convert TABLEB into a format and use the PUT() function.
data format ;
set tableb end=eof;
fmtname='TABLEB';
start=id;
label=cats(id);
output;
if eof then do;
start=.; label='NA'; hlo='O'; output;
end;
run;
proc format cntlin=format;
run;
data want3;
set tablea ;
tablebid=put(id,tableb.);
run;
Note that for options (1) and (3) you will need to eliminate duplicate records from TABLEB for the same value of ID.
proc sort data=tableb nodupkey; by id; run;
Also note that in your example you incorrectly list ID=555 in TABLEA as not being found in TABLEB.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.