BookmarkSubscribeRSS Feed
EmilyAV
Calcite | Level 5

Hello,

I have a dataset with thousands of observations, and need to reclassify hundreds of observations in three different variables, including the lookup/key variable. I'm trying to use a hash table but my current code simply adds the hash table variables to the old dataset and (mostly) codes them as missing. What am I doing wrong?

 

*Hash table;

   data reclass;
   input key new_key new_var1 new_var2;
datalines;
      2 1 . 9
     17 1 . 17
     32 1 . 3
     33 1 . 14
     41 1 . 15
;
run;

 

*Applying hash table to dataset old;

data output;
length key new_key new_var1 new_var2 8.;

 

if 0 then set old reclass;

if _N_ = 1 then do;
Declare hash w (dataset:'reclass', ordered:'Y');
w.defineKey ("key");
w.defineData("key","new_key", "new_var1", "new_var2");
w.defineDone();
w.replace(key:17, data: 17, data:1, data:., data:7);
w.replace(key:2, data: 2, data:1, data:., data:9);
w.replace(key:32, data:32, data:1, data:., data:3);
w.replace(key:33, data:33, data:1, data:., data:14);
w.replace(key:41, data:41, data:1, data:., data:15);
w.output (dataset:'test');
w.delete ();
end;

set old;
run;

2 REPLIES 2
Tom
Super User Tom
Super User

Sounds like your want to do:

Load a small table  into a hash object. You seem to have provided an example of that table.  Then you have a large table that you want to modify by checking if the current value matches an entry in the hash object. You have not provided any example of this table.

 

So your pattern should be something like:

  • Load hash object
  • SET main table
  • use FIND() method to check if variable value is in the hash
  • potentially copy the values from the variables defined in the hash back into the variables in the main table(if the names are different)
Patrick
Opal | Level 21

If I understand right what you're trying to achieve then a merge is may-be simpler. I've used for your reclass table the same variable names than in the have table. That's not a must but makes things even simpler. IF in the step merging the tables there is a match with the reclass table then the values from this table will overwrite the ones in the have table. If the variable names would be different then you'd have to code for it.

Also consider to not replace the key but to introduce some additional group_key or the like. 

data reclass;
  input key group_key var1 var2;
  datalines;
2 1 . 9
17 1 . 17
32 1 . 3
33 1 . 14
41 1 . 15
;

data have;;
  set reclass end=last;
  var1=888;
  var2=999;
  output;
  if last then
    do;
      key=10000;
      output;
    end;
run;

proc sort data=reclass;
  by key;
run;
proc sort data=have;
  by key;
run;

data want;
  merge have(in=ina) reclass;
  by key;
  if ina;
run;

Patrick_0-1633102135943.png