Hi Community,
I've searched for a solution but I didn't find it. Hopefully someone can give me some advice.
I have an input_table which contains a KEY variable (unique value) with 1 data variable "variable_to_be_corrected".
Based on another table, tableX, (which will be the hash table) I want to lookup the KEY and correct the variable "variable_to_be_corrected" based on the value for the variable 'correction' in the hash table tableX.
How do I manage this correctly? I thought the find method could be used here, but I can't make it work.
The last line of code doesn't work, which I indicated in the code below:
data want;
if _n_=1 then do;
if 0 then set tableX;
declare hash o(dataset:'tableX');
o.definekey('KEY');
o.defineData('correction');
o.definedone();
end;
set input_table;
if o.check(key:KEY)=0 then variable_to_be_corrected = o.find('correction') ; /* <-- ERROR HERE */
run;
So:
data tablea; input key $ variable_a; datalines; 001 1 002 1 003 2 004 5 ; run; data tableb; input key $ variable_a; datalines; 003 3 ; run; data want; update tablea tableb; by key; run;
Hello,
The hash does not work like this. I think you shall start by reading some useful papers, for example:
http://support.sas.com/resources/papers/proceedings10/151-2010.pdf
http://www2.sas.com/proceedings/forum2008/029-2008.pdf
http://www2.sas.com/proceedings/sugi30/236-30.pdf
You can do such updating in a couple of different ways:
- Datastep with update statement
- Datastep with if then construct
Or SQL (just an example as no test datstep provided):
proc sql; update DATA_TO_BE_UPDATED set VARIABLE=xyz where <condition>; quit;
Thank you both for your feedback.
What I wanted to realise is to correct some observations in table A based on observations in table B in a datastep.
table A:
KEY
variable_A
table B:
KEY
variable_A
Table A contains mostly correct values for variable A, but sometimes not.
For each value in table A, my program has to look if there is an occurance in table B (based on KEY), and if so, it has to update the value from variable_A in table A with the value from variable_A in table B.
As I understand, hash tables is not the best approach.
I was looking for a solution using a data step, but maybe with a proc SQL, it will be easier.
If you post test data (form of a datastep) for each and what the output should look like we can provide code. What happens for instance if there are multiple examples of KEY? What is a "correct" value? Its sounds like the best solution, based on what you provided is the datastep with update setup.
okay, see here some testdata:
Given are the following tables:
- table A:
KEY Variable_A
001 1
002 1
003 2
004 5
- table B:
KEY Variable_A
003 3
What code would I need if I want this output:
- table C:
KEY Variable_A
001 1
002 1
003 3
004 5
Table C is the same as Table A, except for the observatons that occur in table B.
How could this be done in a datastep?
So:
data tablea; input key $ variable_a; datalines; 001 1 002 1 003 2 004 5 ; run; data tableb; input key $ variable_a; datalines; 003 3 ; run; data want; update tablea tableb; by key; run;
Thank you very much. This is indeed a very simple and compact way.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.