DATA Step, Macro, Functions and more

Vlook up function in SAS

Accepted Solution Solved
Reply
Contributor SKP
Contributor
Posts: 25
Accepted Solution

Vlook up function in SAS

Hi,

 

I have 3 variables in a dataset 1.  The 3 variables are Profit Center (e.g. 123456), Product (e.g. A100) and 2016 expense total (e.g. $100,000).  I have another 2 variables in a dataset 2; Profit Center and Group Name (e.g. ABCD).

I'm trying to pull Group Name from dataset 2 to dataset 1 based on Profit Center, like we can do with vlook up function in excel.  Can someone please help me code this?

I have repeated records in dataset 1, meaning same Profit Center appears multiple times.

 

Thanks!


Accepted Solutions
Solution
‎07-04-2016 02:17 AM
Super User
Posts: 11,335

Re: Vlook up function in SAS

The proc format approach doesn't care at all about how many data records are involved. It essentially is the same as saying "show a value with 2 decimal places". Other approaches, either with Proc Sql or a hash object actually look at data set.

 

The Proc Sql method is basically a Join on a key value and looks something like

Proc Sql;
   create table want as
   select dataset.*, lookup.value
   from dataset left join lookup
      on dataset.keyvariable=lookup.keyvariable;
quit;

where dataset represents the name of your dataset, lookup is the name of the dataset with the value you want to bring in and keyvariable is a variable that contains the identifying values. The name of the keyvariable does not have to be the same but the values do. If there are capitalization issues in spelling (city vs City vs CITY) then you could use UPCASE function to get them to match.

View solution in original post


All Replies
Super User
Posts: 19,767

Re: Vlook up function in SAS

You have quite a few options. My personal recommendation is proc format. 

 

This paper covers many of them. For a beginner I would recommend either a SQL join or proc format. 

 

http://www2.sas.com/proceedings/forum2008/095-2008.pdf

Contributor SKP
Contributor
Posts: 25

Re: Vlook up function in SAS

Hi Reeza,

 

Thank you so much for the link.  As I did nt even know how I can start, it definitely helped me.  But I researched Proc Format and it seems that's good if the DS has ~50K records?  I have more than 1M records, so not sure if this is good option.  What would you recomment to do vlookup in large data set in SAS?  

 

Thanks,

SKP

Solution
‎07-04-2016 02:17 AM
Super User
Posts: 11,335

Re: Vlook up function in SAS

The proc format approach doesn't care at all about how many data records are involved. It essentially is the same as saying "show a value with 2 decimal places". Other approaches, either with Proc Sql or a hash object actually look at data set.

 

The Proc Sql method is basically a Join on a key value and looks something like

Proc Sql;
   create table want as
   select dataset.*, lookup.value
   from dataset left join lookup
      on dataset.keyvariable=lookup.keyvariable;
quit;

where dataset represents the name of your dataset, lookup is the name of the dataset with the value you want to bring in and keyvariable is a variable that contains the identifying values. The name of the keyvariable does not have to be the same but the values do. If there are capitalization issues in spelling (city vs City vs CITY) then you could use UPCASE function to get them to match.

Contributor SKP
Contributor
Posts: 25

Re: Vlook up function in SAS

ballardw
Thank you so much for your comment. I tried it and it worked as the way I wanted. If I need to look up a value based on multiple valuables, can I list them in "keyvariable" in the formula?

Thank you again.
SKP
Super User
Posts: 19,767

Re: Vlook up function in SAS

Yes, they can be added into the join condition. 

 

For example:

 

a.keyvar1=b.keyvar1
AND 
a.keyvar2=b.keyvar2
AND
...
;
Contributor SKP
Contributor
Posts: 25

Re: Vlook up function in SAS

Reeza,

 

Thank you!  This definitely helps.

 

SKP

Super User
Posts: 10,018

Re: Vlook up function in SAS

Give you an example:



data dataset1;
input ProfitCenter Product $ expense;
cards;
1234 A100 100
1231 A100 100
;
run;
data dataset2;
input ProfitCenter GroupName $;
cards;
1234 ABCD
;
run;

data want;
if _n_=1 then do;
 if 0 then set dataset2;
 declare hash h(dataset:'dataset2',hashexp:20);
 h.definekey('ProfitCenter');
 h.definedata('GroupName ');
 h.definedone();
end;
 set dataset1; 
 call missing(GroupName);
 rc=h.find();
 drop rc;
run;

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 7 replies
  • 339 views
  • 2 likes
  • 4 in conversation