data want; set have1; where var in ( proc sql; select distinct var from have2; quit;);
Or how to write the proc sql part in pure data step?
Your code would not work. I think you want all the observations in HAVE1 with a value for VAR that also occurs in HAVE2:
data want;
set have1;
if _n_=1 then do;
declare hash h (dataset:'have2 (keep=var)');
h.definekey('var');
h.definedone();
end;
rc=h.find();
if rc=0;
run;
Or if you don't want implement hash objects, you can use a couple of proc sorts, followed by a merge:
proc sort data=have2 (keep=var) out=need2 nodupkey;
by var;
run;
proc sort data=have1;
by var;
run;
data want;
merge have1 have2 (in=in2);
by var;
if in2;
run;
The latter requires for more input/output.
The data step looks more complicated.
I think a full proc sql is the simplest and readable.
proc sql; create table want as select * from have1 where var in (select distinct var in have2); quit;
@afiqcjohari, yes, SQL is simpler in this case, but you need the proper syntax:
proc sql;
create table want as
select * from have1
where var in (select var from have2);
quit;
Agree with other that stick to SQL for the whole query is the preferred technique.
But if there's another situation, and really would benefit from nesting SQL code, take a look at DS2.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.