Hi,
I loaded one dataset into a hash, with "ID" as its key. The hash also holds other variables that I want to keep, and I included them in defineData.
I'd like to output the hash entries whose "ID" cannot be found in the main dataset, because I'm interested in the IDs that don't exist in the main dataset.
The code I'm using now looks like this: if hash.find()^=0 then output .... But it seems to output rows from the main dataset instead of entries from the hash.
I can't solve it by swapping the hash and the main dataset, because the main dataset is far larger than my hash. What code should I use? Any help will be appreciated!
Remove each ID from the hash when you find it in the main dataset, and at the end of the data step output whatever is left in the hash.
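data _null_;
  set main end=last;
  ...........
  /* ID found in main: drop it from the hash */
  if hash.check()=0 then hash.remove();
  /* after the last row, the hash holds only IDs absent from main */
  if last then hash.output(dataset:'want');
run;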
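The snippet above leaves out the hash declaration. A fuller sketch might look like the following. It assumes the lookup dataset is called SMALL, the large dataset is called MAIN, and the hash object is named h; these names are placeholders for illustration, not taken from the original poster's code.

```sas
data _null_;
  /* Assumption: SMALL is the lookup dataset, MAIN is the large one. */
  /* Never executed; only places SMALL's variables in the PDV at compile time. */
  if 0 then set small;

  if _n_ = 1 then do;
    declare hash h(dataset: 'small');  /* load SMALL into memory once */
    h.defineKey('ID');
    h.defineData(all: 'yes');          /* keep every variable, including ID */
    h.defineDone();
  end;

  /* Only ID is needed from the large dataset. */
  set main(keep=ID) end=last;

  /* check() tests for the key without overwriting any PDV variables. */
  if h.check() = 0 then h.remove();

  /* What remains are the SMALL rows whose ID never appeared in MAIN. */
  if last then h.output(dataset: 'want');
run;
```

Two points about this design. The key variable appears in the output dataset only if it is also listed as a data variable, which `all: 'yes'` takes care of here; with an explicit defineData list, ID would need to be added to it. Also, the original `if hash.find()^=0 then output` runs once per row of the main dataset, so it can only ever write main rows. Writing the hash's own contents requires the output method, as above.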
In a similar situation I prefer SQL.
Let the datasets be SMALL (in place of the hash) and BIG (the main dataset), with ID common to both. Then:
proc sql;
  /* rows of SMALL whose ID does not occur in BIG */
  create table WANT as select * from SMALL
  where ID not in (select ID from BIG);
quit;
The main dataset is really large (more than 400 GB), so I was advised to use a hash. But thanks for your help!