I am running into a problem with an inner join on big data. I created a hash object, but I'm not sure how to use all the variables without typing all 200 variable names into defineData() and call missing(). Should I be using call missing() at all? Please take a look at my code and give any suggestions.
data ltdmerg (drop=rc);
  if 0 then set bigdata.file;
  declare hash hh_pat(dataset:"bigdata.file");
  rc=hh_pat.defineKey("id_cd");
  rc=hh_pat.defineData(ALL:'Yes');
  rc=hh_pat.defineDone();
  do until(eof);
    set work.data end=eof;
    call missing (?????);
    rc=hh_pat.find();
    if rc=0 then output;
  end;
  stop;
run;
When you load a data set into a hash table, you don't need call missing, because
if 0 then set bigdata.file;
already creates, at compile time, the PDV host variables that match the hash variable names. If I have correctly understood what I've read so far in PD's book, I think I am right.
An illustration:
data w;
  set sashelp.class;
  keep name;
run;
data want (drop=rc);
  if 0 then set sashelp.class; /* never executes; defines the host variables at compile time */
  declare hash hh_pat(dataset:"sashelp.class");
  rc=hh_pat.defineKey("name");
  rc=hh_pat.defineData(ALL:'Yes');
  rc=hh_pat.defineDone();
  do until(eof);
    set w end=eof;
    rc=hh_pat.find();
    if rc=0 then output; /* inner join: keep matches only */
  end;
  stop;
run;
Here's another way of setting up the PDV host variables. I call it PD's style :). Love it when he toils with SAS coding.
data want (drop=rc);
  declare hash hh_pat(dataset:"sashelp.class");
  rc=hh_pat.defineKey("name");
  rc=hh_pat.defineData(ALL:'Yes');
  rc=hh_pat.defineDone();
  do until(eof);
    set w end=eof;
    rc=hh_pat.find();
    if rc=0 then output;
  end;
  stop;
  set sashelp.class; /* notice here: never executed (it follows STOP), but it defines the host variables at compile time */
run;
I remember PD's style by reading PDV (program data vector) as "Paul Dorfman's Vector", aka PDV host variables.
Recommended reading: the book title and notes are below.
Thanks! I took call missing out and it works great!
If you had a left join, this could be used to avoid listing all the variables:
data W;
  set SASHELP.CLASS(keep=NAME);
  output;
  if _N_=1 then do; NAME='x'; output; end; /* add one key with no match */
run;
data want (drop=rc);
  if _N_=1 then do;
    if 0 then set SASHELP.CLASS;
    declare hash hh_pat(dataset:"SASHELP.CLASS");
    rc=hh_pat.defineKey("NAME");
    rc=hh_pat.defineData(ALL:'Yes');
    rc=hh_pat.defineDone();
  end;
  set W;
  call missing(of SEX -- WEIGHT); /* clear values retained from the previous match */
  rc=hh_pat.find();
run;
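The name range SEX -- WEIGHT in the left join above depends on knowing the first and last non-key variables and their order in the data set. If that is awkward with 200 variables, one possible alternative is to build the list from the dictionary tables. This is only a sketch: the macro variable name datavars is made up for illustration, and it assumes the lookup table is SASHELP.CLASS with NAME as the only key.

```sas
/* Sketch: collect every non-key variable name of the lookup table
   into a macro variable (datavars is a hypothetical name). */
proc sql noprint;
  select name into :datavars separated by ' '
  from dictionary.columns
  where libname = 'SASHELP'
    and memname = 'CLASS'
    and upcase(name) ne 'NAME';   /* exclude the key variable */
quit;

data want (drop=rc);
  if _N_=1 then do;
    if 0 then set SASHELP.CLASS;  /* defines the host variables at compile time */
    declare hash hh_pat(dataset:"SASHELP.CLASS");
    rc=hh_pat.defineKey("NAME");
    rc=hh_pat.defineData(ALL:'Yes');
    rc=hh_pat.defineDone();
  end;
  set W;
  call missing(of &datavars);     /* reset the lookup columns before each find */
  rc=hh_pat.find();               /* unmatched keys keep missing values */
run;
```

For the inner join this is not needed, since only matched rows are output and find() overwrites every data variable on a match.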