Help using Base SAS procedures

Vlookup code please help

Accepted Solution Solved
Reply
Contributor
Posts: 36
Accepted Solution

Vlookup code please help

Hi,


I have two tables, A and B. I want to create a variable on Table A( TableB ID?) that scans iDS on table B and returns the ID if a similar one is found or puts Na, when none is available.


Example.


Table A


ID       fruit                     colour           TableB ID?

165     avocado             orange         Na

111    orange                 orange        111

111    cabbage             green            111

123     mango                 green            Na

333    strawberry             red                333

333    strawberry             red                333

555      berry                   orange          Na

Table B

ID    fruit                 colour  

           

555  avocado         Orange           

111  orange            Orange          

178  cabbage         green            

236   mango            green             

333  strawberry       red                 

333  strawberry       red                  

148  berry                 orange             

  Thanks.

Message was edited by: zetter rweza I have edited the table to make things clear/ I just want to look at ID don't mind about fruit type or colour, please help.


Accepted Solutions
Solution
‎09-06-2014 08:57 AM
Super User
Super User
Posts: 6,502

Re: Vlookup code please help

It sounds like you are used to using Excel.  In a programming language you would typical run some step to generate the new data.

1) You could do a data step merge of tablea to tableb and create the new variable based on the existence of a record in tableb.


data want ;

merge tablea (in=in1) tableb(in=in2 keep=id);

by id;

if in1;

if in2 then tablebid=cats(id);

else tablebid='NA';

run;


2) You could use an SQL join instead.

proc sql noprint;

create table want2 as

select a.*

      , case when exists (select id from tableb b where a.id=b.id) then cats(id)

             else 'NA'

        end as TABLEBID

from tablea a

;

quit;

3) Or to be more like the VLOOKUP() function you could convert TABLEB into a format and use the PUT() function.

data format ;

  set tableb end=eof;

  fmtname='TABLEB';

  start=id;

  label=cats(id);

  output;

  if eof then do;

    start=.; label='NA'; hlo='O'; output;

  end;

run;

proc format cntlin=format;

run;

data want3;

  set tablea ;

  tablebid=put(id,tableb.);

run;

Note that for options (1) and (3) you will need to eliminate duplicate records from TABLEB for the same value of ID.

proc sort data=tableb nodupkey; by id; run;

Also note that in your example you incorrectly list ID=555 in TABLEA as not being found in TABLEB.

View solution in original post


All Replies
Super User
Posts: 17,868

Re: Vlookup code

Generally you can use a proc format or SQL merge for this, but you're logic doesn't quite make sense to me.

You're not merging solely on ID, but on ID and fruit? or is it ID, fruit, and colour?

Contributor
Posts: 36

Re: Vlookup code

I'm just interested in ID, not fruit type or colour. Please help, thanks.

Trusted Advisor
Posts: 1,204

Re: Vlookup code please help

data TableA;
input ID $      fruit $   colour  $;
n=_n_;
datalines;
165    avocado     orange
111    orange      orange
111    cabbage     green
123    mango       green
333    strawberry  red
333    strawberry  red
555    berry       orange
;

data TableB;
input ID  $     fruit $   colour  $;
datalines; 
555  avocado         Orange
111  orange          Orange
178  cabbage         green
236  mango           green
333  strawberry      red
333  strawberry      red
148  berry           orange
;

proc sql;
create table tableb_unique as
select distinct id,fruit,colour from tableb;
quit;

proc sql;
create table want as
select a.id,a.fruit,a.colour , case when a.id = b.id then b.id else 'Na' end as TableB_id
from TableA a left join tableb_unique b
on a.id=b.id
order by n;
quit;

proc print;

run;

Solution
‎09-06-2014 08:57 AM
Super User
Super User
Posts: 6,502

Re: Vlookup code please help

It sounds like you are used to using Excel.  In a programming language you would typical run some step to generate the new data.

1) You could do a data step merge of tablea to tableb and create the new variable based on the existence of a record in tableb.


data want ;

merge tablea (in=in1) tableb(in=in2 keep=id);

by id;

if in1;

if in2 then tablebid=cats(id);

else tablebid='NA';

run;


2) You could use an SQL join instead.

proc sql noprint;

create table want2 as

select a.*

      , case when exists (select id from tableb b where a.id=b.id) then cats(id)

             else 'NA'

        end as TABLEBID

from tablea a

;

quit;

3) Or to be more like the VLOOKUP() function you could convert TABLEB into a format and use the PUT() function.

data format ;

  set tableb end=eof;

  fmtname='TABLEB';

  start=id;

  label=cats(id);

  output;

  if eof then do;

    start=.; label='NA'; hlo='O'; output;

  end;

run;

proc format cntlin=format;

run;

data want3;

  set tablea ;

  tablebid=put(id,tableb.);

run;

Note that for options (1) and (3) you will need to eliminate duplicate records from TABLEB for the same value of ID.

proc sort data=tableb nodupkey; by id; run;

Also note that in your example you incorrectly list ID=555 in TABLEA as not being found in TABLEB.

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 4 replies
  • 551 views
  • 0 likes
  • 4 in conversation