Hello SAS Community,
I am trying to do something that I think should be easy, but I have not stumbled upon the right search terms to find an example. I have 2 datasets. The first is my set of records. The second is a reference dataset. For example:
Dataset 1:
RECORD   STATE   COUNTYFIPS
001            AL           005
002            AL           005
003            AL           005
004            AL           133
005            AK           020
006           AK            020
007           AK            020
DATASET 2: a reference dataset containing the STATE, FIPS, and the rural/urban code for all 50 states.
STATE   COUNTYFIPS   RU_CODE
AL          001                    3
AL          003                    4
AL          005                    6
. 
.
.
AL         133                   4
AK         013                   6
AK         016                   6
AK         020                   3
.
.
.
DATASET NEED:
RECORD STATE COUNTYFIPS RU_CODE
001 AL 005 6
002 AL 005 6
I know I have to use a condition where the STATE and FIPS match between the 2 datasets, but how do I pull over just the RU_CODE code that matches for each record when I have over 17K records, across 50 states. My dataset has dozens of records that are in the same state and county.
Can someone point me in the right direction?
Try this:
data need;
set ds1;
if _n_ = 1
then do;
  length ru_code $1; /* define as needed */
  declare hash d2 (dataset:"ds2");
  d2.definekey("state","countyfips");
  d2.definedata("ru_code");
  d2.definedone();
end;
rc = d2.find();
drop rc;
run;
It does not need any sorting of the datasets.
Code is untested. For tested code, provide example data in usable form (data steps with datalines).
In your dataset 2, the ru_code for state = "AL" and countyfips = "005" is "6", but in your need dataset you have a value of "3" for that particular combination. What is your rule for matching?
Try this:
data need;
set ds1;
if _n_ = 1
then do;
  length ru_code $1; /* define as needed */
  declare hash d2 (dataset:"ds2");
  d2.definekey("state","countyfips");
  d2.definedata("ru_code");
  d2.definedone();
end;
rc = d2.find();
drop rc;
run;
It does not need any sorting of the datasets.
Code is untested. For tested code, provide example data in usable form (data steps with datalines).
Thank you so much! I'm checking it now, but it looks like it worked. I've never seen the "declare hash" code that you used, but it seems to have worked brilliantly!
I really appreciate your time, and thanks for teaching me something new!
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.