Hi Community,
I've searched for a solution but I didn't find it. Hopefully someone can give me some advice.
I have an input_table which contains a KEY variable (unique value) with 1 data variable "variable_to_be_corrected".
Based on another table, tableX, (which will be the hash table) I want to lookup the KEY and correct the variable "variable_to_be_corrected" based on the value for the variable 'correction' in the hash table tableX.
How do I manage this correctly? I thought the find method could be used here, but I can't make it work.
The last line of code doesn't work, which I indicated in the code below:
data want;
if _n_=1 then do;
if 0 then set tableX;
declare hash o(dataset:'tableX');
o.definekey('KEY');
o.defineData('correction');
o.definedone();
end;
set input_table;
if o.check(key:KEY)=0 then variable_to_be_corrected = o.find('correction') ; /* <-- ERROR HERE */
run;
So:
data tablea; input key $ variable_a; datalines; 001 1 002 1 003 2 004 5 ; run; data tableb; input key $ variable_a; datalines; 003 3 ; run; data want; update tablea tableb; by key; run;
Hello,
The hash does not work like this. I think you shall start by reading some useful papers, for example:
http://support.sas.com/resources/papers/proceedings10/151-2010.pdf
http://www2.sas.com/proceedings/forum2008/029-2008.pdf
http://www2.sas.com/proceedings/sugi30/236-30.pdf
You can do such updating in a couple of different ways:
- Datastep with update statement
- Datastep with if then construct
Or SQL (just an example as no test datstep provided):
proc sql; update DATA_TO_BE_UPDATED set VARIABLE=xyz where <condition>; quit;
Thank you both for your feedback.
What I wanted to realise is to correct some observations in table A based on observations in table B in a datastep.
table A:
KEY
variable_A
table B:
KEY
variable_A
Table A contains mostly correct values for variable A, but sometimes not.
For each value in table A, my program has to look if there is an occurance in table B (based on KEY), and if so, it has to update the value from variable_A in table A with the value from variable_A in table B.
As I understand, hash tables is not the best approach.
I was looking for a solution using a data step, but maybe with a proc SQL, it will be easier.
If you post test data (form of a datastep) for each and what the output should look like we can provide code. What happens for instance if there are multiple examples of KEY? What is a "correct" value? Its sounds like the best solution, based on what you provided is the datastep with update setup.
okay, see here some testdata:
Given are the following tables:
- table A:
KEY Variable_A
001 1
002 1
003 2
004 5
- table B:
KEY Variable_A
003 3
What code would I need if I want this output:
- table C:
KEY Variable_A
001 1
002 1
003 3
004 5
Table C is the same as Table A, except for the observatons that occur in table B.
How could this be done in a datastep?
So:
data tablea; input key $ variable_a; datalines; 001 1 002 1 003 2 004 5 ; run; data tableb; input key $ variable_a; datalines; 003 3 ; run; data want; update tablea tableb; by key; run;
Thank you very much. This is indeed a very simple and compact way.
Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.
Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.