Re: How to embed proc sql within a data step?

afiqcjohari · Posted 03-14-2017 09:23 PM

data want;
set have1;
where var in (
   proc sql; 
      select distinct var from have2;
  quit;);

Or how to write the proc sql part in pure data step?

mkeintz · Posted 03-14-2017 09:28 PM

Your code would not work. I think you want all the observations in HAVE1 with a value for VAR that also occurs in HAVE2:

data want;

set have1;

if _n_=1 then do;

declare hash h (dataset:'have2 (keep=var)');

h.definekey('var');

h.definedone();

end;

rc=h.find();

if rc=0;

run;

Or if you don't want implement hash objects, you can use a couple of proc sorts, followed by a merge:

proc sort data=have2 (keep=var) out=need2 nodupkey;

by var;

run;

proc sort data=have1;

by var;

run;

data want;

merge have1 have2 (in=in2);

by var;

if in2;

run;

The latter requires for more input/output.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

afiqcjohari · Posted 03-14-2017 09:37 PM

The data step looks more complicated.

I think a full proc sql is the simplest and readable.

proc sql;
 create table want as
 select * from have1
 where var in (select distinct var in have2);
quit;

PGStats · Posted 03-14-2017 11:22 PM

@afiqcjohari, yes, SQL is simpler in this case, but you need the proper syntax:

proc sql;
 create table want as
 select * from have1
 where var in (select var from have2);
quit;

PG

LinusH · Posted 03-15-2017 04:38 AM

Agree with other that stick to SQL for the whole query is the preferred technique.

But if there's another situation, and really would benefit from nesting SQL code, take a look at DS2.

Data never sleeps

afiqcjohari · Posted 03-15-2017 05:17 AM

Thanks for this, it may come handy later. 🙂

How to embed proc sql within a data step?