BookmarkSubscribeRSS Feed
fastandcurious
Obsidian | Level 7

Hello, beginner posting for the 1st time,

I want to create a new column that assigns weighted values to a large dataset using a vlookup table in SAS, rather than write hundreds of if-then statements. For example, I have LARGE dataset and each record has a unique groupid that specifies the stratified group it belongs to. (I have hundreds of groups and weighted values)

Vlookup table example:

 

fastandcurious_0-1673279787756.png

How can I create a new column where the weighted value is returned based off the string? (Example: If Unique ID=221 then Weight=17.49)

Desired output:

fastandcurious_2-1673279069957.png

 

I could use excel for this but I want to use SAS and improve my programming skills. Would array be the best function? What are other ways to do this in SAS? Thank you in advance!

 

4 REPLIES 4
Reeza
Super User

Here's a good paper that illustrates quite a few methods.

https://support.sas.com/resources/papers/proceedings15/2219-2015.pdf

 

  • Join (MERGE or SQL)
  • Format
  • Arrays
  • Key Indexing
  • Hash Table

For the start, Joins, Formats and Arrays can get you 99% of the way there. I've used those three almost exclusively for my entire SAS programming career.

 

 

Tom
Super User Tom
Super User

The BEST way depends on a lot on things like the size of two datasets and the types of the variables being created.

The most natural SAS way is to use a FORMAT to translate a code into a string.  If you need to create a numeric variable, like your WEIGHT example, then you will need to use an INFORMAT instead. But an INFORMAT needs a character value as the input.

 

Another natural way is to just to combine the two datasets.  Use a JOIN with PROC SQL.  Or MERGE using a data step.  Both of those require sorting by the key variable (but the SQL method will do the sorting for you) so if the datasets are large and not already sorted by the key variable that can be a performance issue.

 

A newer method in SAS is a HASH object. But that requires that the lookup table is small enough to fit into memory.  it does not require the other (potentially much larger dataset) to be sorted.

 

data group_wt;
  input group_id $ weight ;
cards;
111 42.38
121 14.32
;

data have;
  input id $ group_id $ date :date.;
  format date date9.;
cards;
1 121 01JAN2022
2 100 01FEB2022
3 111 01MAR2023
;

data want;
  set have;
  if _n_=1 then do;
    if 0 then set group_wt(keep=group_id weight);
    declare hash vlookup(dataset:"group_wt(keep=group_id weight");
    _error_ = vlookup.definekey("group_id");
    _error_ = vlookup.definedata(all:'yes');
    _error_ = vlookup.definedone();
  end;
  if vlookup.find() then call missing(of weight);
run;

 

 

fastandcurious
Obsidian | Level 7

Thank you so much! This is helpful.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 742 views
  • 2 likes
  • 3 in conversation