Hi,
I'm using 3 hash tables in a data step to pull in data for a large dataset. In many cases the key won't be found, and in those situations I just want to ignore the condition. What I'm seeing is each time this happens, an error gets sent to the log file in the form of:
ERROR: Key not found.
I didn't spot any obvious way to suppress these. Is there one?
Thanks!
--Ben
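hi ... I suspect you are calling the method as a bare statement, something like ... o.find() ... rather than capturing its return code ... rc=o.find()
when the return code is not captured, a failed lookup is written to the log as an error ... no error messages in the following ...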
data one;
  input x y @@;
  datalines;
1 100 2 200 3 200
;
data two;
  input x @@;
  datalines;
1 9 2 9 3
;
data three (drop=rc);
  /* load data set ONE into a hash keyed on x */
  declare hash o (dataset:'one');
  o.definekey ('x');
  o.definedata ('y');
  o.definedone();
  do until(done);
    set two end=done;
    /* rc is 0 when the key is found, non-zero otherwise */
    rc=o.find();
    if rc then call missing(y);
    output;
  end;
  stop;
run;
DATA SET THREE
y x
100 1
. 9
200 2
. 9
200 3
OH! Yes, one of the find() calls was missing the rc= assignment. I didn't realize that would generate a 20 MB log file.
Much appreciated!
--Ben
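For the three-hash-table case Ben describes, a minimal sketch could look like the following. The data set names (big, lookup1, lookup2, lookup3) and variables (id, a, b, c) are made up for illustration and are not from his program. It shows two ways of consuming the return code so that a missing key is not reported as an error: assigning it to a variable, or using the method call directly in an expression.

```sas
/* Hypothetical data: one driver table and three small lookup tables */
data big;
  input id @@;
  datalines;
1 2 3 4
;
data lookup1;
  input id a @@;
  datalines;
1 10 2 20
;
data lookup2;
  input id b @@;
  datalines;
2 200 3 300
;
data lookup3;
  input id c @@;
  datalines;
4 4000
;

data want (drop=rc);
  if _n_ = 1 then do;
    /* define id, a, b, c in the PDV without reading any rows */
    if 0 then set lookup1 lookup2 lookup3;
    declare hash h1 (dataset:'lookup1');
    h1.definekey('id');
    h1.definedata('a');
    h1.definedone();
    declare hash h2 (dataset:'lookup2');
    h2.definekey('id');
    h2.definedata('b');
    h2.definedone();
    declare hash h3 (dataset:'lookup3');
    h3.definekey('id');
    h3.definedata('c');
    h3.definedone();
  end;
  set big;
  /* option 1: capture the return code in a variable (0 = key found) */
  rc = h1.find();
  if rc ne 0 then call missing(a);
  /* option 2: use the return code directly in an expression */
  if h2.find() ne 0 then call missing(b);
  if h3.find() ne 0 then call missing(c);
run;
```

The call missing() lines matter here because a, b and c come from a SET statement and are therefore retained: without them, a row with no match would carry over the value from the previous successful lookup.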
Hi Mike,
Can you explain why you need 'stop'? Thanks!
hi ... sure
LOG with stop ...
NOTE: There were 3 observations read from the data set WORK.ONE.
NOTE: There were 5 observations read from the data set WORK.TWO.
NOTE: The data set WORK.THREE has 5 observations and 2 variables.
LOG without stop ...
NOTE: There were 3 observations read from the data set WORK.ONE.
NOTE: There were 3 observations read from the data set WORK.ONE.
NOTE: There were 5 observations read from the data set WORK.TWO.
NOTE: The data set WORK.THREE has 5 observations and 2 variables.
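without the stop, the data step goes back to the top for a second iteration to check whether there is any more data to read ... on that second pass it declares and loads the hash again (that is the extra "3 observations read from WORK.ONE" note) before it reaches the DOW loop, where the SET statement finds no more data and ends the step
the job works either way, just "better" with stop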
OK?
Thank you Mike!
Below is how I code the hash:
data three (drop=rc);
  if _n_=1 then do;
    if 0 then set one;
    declare hash o (dataset:'one');
    o.definekey ('x');
    o.definedata ('y');
    o.definedone();
  end;
  do until(done);
    set two end=done;
    rc=o.find();
    if rc then call missing(y);
    output;
  end;
run;
Linlin
hi ... without the stop, the data step still goes back to the start with your code to check if there's any more data to be read
try this and look at the LOG, you'll see the PUT statement wrote two lines ... it still works, but it does that extra, unnecessary check
data three (drop=rc);
  if _n_=1 then do;
    if 0 then set one;
    declare hash o (dataset:'one');
    o.definekey ('x');
    o.definedata ('y');
    o.definedone();
  end;
  put "HI LINLIN ... " _n_;
  do until(done);
    set two end=done;
    rc=o.find();
    if rc then call missing(y);
    output;
  end;
run;
Thank you Mike! I will remember to add 'stop'. See you on Tuesday. - Linlin
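As a side note, here is a sketch of a third layout (not posted by either Mike or Linlin) that reuses data sets ONE and TWO from the example above: keep the if _n_=1 block but drop the DOW loop and let the data step's implicit loop read TWO. The hash is then loaded only once, and the step ends on its own when the SET statement runs out of data, so no stop is required.

```sas
data three (drop=rc);
  if _n_ = 1 then do;
    /* define y in the PDV without reading any rows */
    if 0 then set one;
    declare hash o (dataset:'one');
    o.definekey ('x');
    o.definedata ('y');
    o.definedone();
  end;
  /* implicit loop: one row of TWO per iteration; the step ends at end of file */
  set two;
  rc = o.find();
  /* y is retained because it comes from a SET statement, so clear it on a miss */
  if rc ne 0 then call missing(y);
run;
```

The trade-off is that the if _n_=1 test is evaluated on every iteration, whereas the DOW loop plus stop version runs the hash setup code exactly once without any test.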
Hey,
I'm a bit confused here. Thanks in advance for your help. This is what I have: Col1-Col49 form a symmetric 49*49 matrix whose entries are the distances between the countries.
Country  GDP  Col1  Col2  Col3  ...  Col49
Germany    5     0     3     1  ...      2
France     4     3     0     2  ...      1
Belgium    1     1     2     0  ...      1
So basically, market potential for Germany = (GDP of France)/(distance between Germany and France, 3 in this case) + (GDP of Belgium)/(distance between Germany and Belgium, 1 in this case), and so on over the other countries. Using just the three rows shown, that is 4/3 + 1/1.
I got some help from Ksharp here, but when I run the code the log says ERROR: Key not found.
data have;
  set have;
  k=cats('Col',_n_);
run;
data want;
  if _n_ eq 1 then do;
    if 0 then set have(keep=k rgdpinmi rename=(rgdpinmi=_gdp));
    declare hash ha(dataset:'have(keep=k rgdpinmi rename=(rgdpinmi=_gdp))');
    ha.definekey('k');
    ha.definedata('_gdp');
    ha.definedone();
  end;
  set have;
  array x{*} col: ;
  sum=0;
  do i=1 to dim(x);
    if _n_ ne i then do;
      k=vname(x{i});
      ha.find();
      sum + _gdp/x{i};
    end;
  end;
run;
Can you help me solve the problem, please? The program worked when there were just 3 observations (a 3*3 matrix), but it doesn't work with my dataset (a 49*49 matrix). I realize it has to do with ha.find(), but I couldn't figure out a way to solve it. Thanks for your help.
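Following the advice earlier in this thread, one way to investigate is to capture the return code from ha.find() and print the key that failed, instead of letting the bare call raise the error. The sketch below is the same step with only that change, plus a guard against dividing by a zero or missing distance, which is my own addition. I can't tell from the post which keys are missing. Two possible causes, both of them guesses on my part: the hash comparison is case-sensitive, so a variable stored as col1 or COL1 will not match the key 'Col1' built by cats(); or HAVE has fewer rows than there are Col variables, so some keys were never created.

```sas
data want (drop=rc);
  if _n_ eq 1 then do;
    if 0 then set have(keep=k rgdpinmi rename=(rgdpinmi=_gdp));
    declare hash ha(dataset:'have(keep=k rgdpinmi rename=(rgdpinmi=_gdp))');
    ha.definekey('k');
    ha.definedata('_gdp');
    ha.definedone();
  end;
  set have;
  array x{*} col: ;
  sum=0;
  do i=1 to dim(x);
    if _n_ ne i then do;
      k=vname(x{i});
      /* capture the return code: 0 means the key was found */
      rc=ha.find();
      /* write the unmatched key to the log so it can be inspected */
      if rc ne 0 then put 'No match for key: ' k= _n_= i=;
      /* only add a term when the lookup worked and the distance is positive */
      else if x{i} > 0 then sum + _gdp/x{i};
    end;
  end;
run;
```

If the log shows keys that differ from the Col variable names only in case, building the key with the same case as the variable names (or applying upcase() to both sides) should make them match.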
