BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
SKP
Calcite | Level 5 SKP
Calcite | Level 5

Hi,

 

I have 3 variables in a dataset 1.  The 3 variables are Profit Center (e.g. 123456), Product (e.g. A100) and 2016 expense total (e.g. $100,000).  I have another 2 variables in a dataset 2; Profit Center and Group Name (e.g. ABCD).

I'm trying to pull Group Name from dataset 2 to dataset 1 based on Profit Center, like we can do with vlook up function in excel.  Can someone please help me code this?

I have repeated records in dataset 1, meaning same Profit Center appears multiple times.

 

Thanks!

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

The proc format approach doesn't care at all about how many data records are involved. It essentially is the same as saying "show a value with 2 decimal places". Other approaches, either with Proc Sql or a hash object actually look at data set.

 

The Proc Sql method is basically a Join on a key value and looks something like

Proc Sql;
   create table want as
   select dataset.*, lookup.value
   from dataset left join lookup
      on dataset.keyvariable=lookup.keyvariable;
quit;

where dataset represents the name of your dataset, lookup is the name of the dataset with the value you want to bring in and keyvariable is a variable that contains the identifying values. The name of the keyvariable does not have to be the same but the values do. If there are capitalization issues in spelling (city vs City vs CITY) then you could use UPCASE function to get them to match.

View solution in original post

7 REPLIES 7
Reeza
Super User

You have quite a few options. My personal recommendation is proc format. 

 

This paper covers many of them. For a beginner I would recommend either a SQL join or proc format. 

 

http://www2.sas.com/proceedings/forum2008/095-2008.pdf

SKP
Calcite | Level 5 SKP
Calcite | Level 5

Hi Reeza,

 

Thank you so much for the link.  As I did nt even know how I can start, it definitely helped me.  But I researched Proc Format and it seems that's good if the DS has ~50K records?  I have more than 1M records, so not sure if this is good option.  What would you recomment to do vlookup in large data set in SAS?  

 

Thanks,

SKP

ballardw
Super User

The proc format approach doesn't care at all about how many data records are involved. It essentially is the same as saying "show a value with 2 decimal places". Other approaches, either with Proc Sql or a hash object actually look at data set.

 

The Proc Sql method is basically a Join on a key value and looks something like

Proc Sql;
   create table want as
   select dataset.*, lookup.value
   from dataset left join lookup
      on dataset.keyvariable=lookup.keyvariable;
quit;

where dataset represents the name of your dataset, lookup is the name of the dataset with the value you want to bring in and keyvariable is a variable that contains the identifying values. The name of the keyvariable does not have to be the same but the values do. If there are capitalization issues in spelling (city vs City vs CITY) then you could use UPCASE function to get them to match.

SKP
Calcite | Level 5 SKP
Calcite | Level 5
ballardw
Thank you so much for your comment. I tried it and it worked as the way I wanted. If I need to look up a value based on multiple valuables, can I list them in "keyvariable" in the formula?

Thank you again.
SKP
Reeza
Super User

Yes, they can be added into the join condition. 

 

For example:

 

a.keyvar1=b.keyvar1
AND 
a.keyvar2=b.keyvar2
AND
...
;
SKP
Calcite | Level 5 SKP
Calcite | Level 5

Reeza,

 

Thank you!  This definitely helps.

 

SKP

Ksharp
Super User
Give you an example:



data dataset1;
input ProfitCenter Product $ expense;
cards;
1234 A100 100
1231 A100 100
;
run;
data dataset2;
input ProfitCenter GroupName $;
cards;
1234 ABCD
;
run;

data want;
if _n_=1 then do;
 if 0 then set dataset2;
 declare hash h(dataset:'dataset2',hashexp:20);
 h.definekey('ProfitCenter');
 h.definedata('GroupName ');
 h.definedone();
end;
 set dataset1; 
 call missing(GroupName);
 rc=h.find();
 drop rc;
run;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 1178 views
  • 2 likes
  • 4 in conversation