DATA Step, Macro, Functions and more

hash tables

Accepted Solution Solved
Reply
Contributor fre
Contributor
Posts: 35
Accepted Solution

hash tables

Hi Community,

 

I've searched for a solution but I didn't find it.  Hopefully someone can give me some advice.

 

I have an input_table which contains a KEY variable (unique value) with 1 data variable "variable_to_be_corrected".

 

Based on another table, tableX, (which will be the hash table) I want to lookup the KEY and correct the variable "variable_to_be_corrected"  based on the value for the variable 'correction' in the hash table tableX.

 

How do I manage this correctly?  I thought the find method could be used here, but I can't make it work.

The last line of code  doesn't work, which I indicated in the code below:

 

 

data want;

if _n_=1 then do; 
	
  if 0 then set tableX;
  declare hash o(dataset:'tableX');
  o.definekey('KEY');
  o.defineData('correction');
  o.definedone();

end;

set input_table;

if o.check(key:KEY)=0 then variable_to_be_corrected = o.find('correction') ; /* <-- ERROR HERE */
run;

 

 

 


Accepted Solutions
Solution
‎05-20-2016 09:43 AM
Super User
Super User
Posts: 7,966

Re: hash tables

So:

data tablea;
  input key $ variable_a;
datalines;
001          1 
002          1
003          2
004          5
;
run;
 
data tableb;
  input key $ variable_a;
datalines;
003          3
;
run;

data want;
  update tablea tableb;
  by key;
run;

View solution in original post


All Replies
Super Contributor
Posts: 308

Re: hash tables

Hello,

 

The hash does not work like this. I think you shall start by reading some useful papers, for example:

http://support.sas.com/resources/papers/proceedings10/151-2010.pdf

http://www2.sas.com/proceedings/forum2008/029-2008.pdf

http://www2.sas.com/proceedings/sugi30/236-30.pdf

 

Super User
Super User
Posts: 7,966

Re: hash tables

You can do such updating in a couple of different ways:

- Datastep with update statement

- Datastep with if then construct

Or SQL (just an example as no test datstep provided):

proc sql;
  update DATA_TO_BE_UPDATED
  set VARIABLE=xyz
  where  <condition>;
quit;
Contributor fre
Contributor
Posts: 35

Re: hash tables

Thank you both for your feedback.

 

What I wanted to realise is to correct some observations in table A based on observations in table B in a datastep.

 

table A:

KEY

variable_A

 

table B:

KEY

variable_A

 

Table A contains mostly correct values for variable A, but sometimes not.

For each value in table A, my program has to look if there is an occurance in table B (based on KEY), and if so, it has to update the value from variable_A in table A with the value from variable_A in table B.

 

As I understand, hash tables is not the best approach.

I was looking for a solution using a data step, but maybe with a proc SQL, it will be easier.

 

Super User
Super User
Posts: 7,966

Re: hash tables

If you post test data (form of a datastep) for each and what the output should look like we can provide code.  What happens for instance if there are multiple examples of KEY?  What is a "correct" value?  Its sounds like the best solution, based on what you provided is the datastep with update setup.

Contributor fre
Contributor
Posts: 35

Re: hash tables

okay, see here some testdata:

 

Given are the following tables:

 

- table A:

KEY          Variable_A

001          1 

002          1

003          2

004          5

 

- table B:

KEY          Variable_A

003          3

 

 

What code would I need if I want this output:

- table C:

KEY          Variable_A

001          1 

002          1

003          3

004          5

 

Table C is the same as Table A, except for the observatons that occur in table B.

How could this be done in a datastep?

Solution
‎05-20-2016 09:43 AM
Super User
Super User
Posts: 7,966

Re: hash tables

So:

data tablea;
  input key $ variable_a;
datalines;
001          1 
002          1
003          2
004          5
;
run;
 
data tableb;
  input key $ variable_a;
datalines;
003          3
;
run;

data want;
  update tablea tableb;
  by key;
run;
Contributor fre
Contributor
Posts: 35

Re: hash tables

Thank you very much.  This is indeed a very simple and compact way.

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 7 replies
  • 346 views
  • 4 likes
  • 3 in conversation