I have two data sets that look like the following (simplified):
Dataset1:
Patient  Drug  StartDate   EndDate     etc
A        E     2016-02-20  2016-02-21  -
A        F     2016-01-21  2016-01-25  -
A        G     2016-03-20  2016-03-21  -
B        E     2016-01-05  2016-01-24  -
B        G     2016-01-07  2016-01-14  -
C        H     2016-02-20  2016-02-28  -

Dataset2:
Drug  etc
E     -
G     -
I am trying to keep only the observations where the Drug lists match:
Patient Drug StartDate EndDate etc
A E 2016-02-20 2016-02-21 -
A G 2016-03-20 2016-03-21 -
B E 2016-01-05 2016-01-24 -
B G 2016-01-07 2016-01-14 -
I've tried the following but I'm not sure how to read from the drug list in the second data set:
data remove;
set mydata;
if Drug = [matching drugs from drug list] then delete;
run;
Thanks for any help!
proc sql;
create table remove as
select *
from one
where drug in (select drug from two);
quit;
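To try this against the sample rows from the question, here is a minimal sketch. It assumes `one` stands for Dataset1 and `two` for Dataset2, and the DATA steps that build them are only illustrative (the "etc" columns are left out). With these rows the result should contain the four A/E, A/G, B/E and B/G observations shown in the question.

```sas
/* Illustrative sample data; ONE is assumed to stand in for Dataset1 */
data one;
  input Patient $ Drug $ StartDate :yymmdd10. EndDate :yymmdd10.;
  format StartDate EndDate yymmdd10.;
  datalines;
A E 2016-02-20 2016-02-21
A F 2016-01-21 2016-01-25
A G 2016-03-20 2016-03-21
B E 2016-01-05 2016-01-24
B G 2016-01-07 2016-01-14
C H 2016-02-20 2016-02-28
;
run;

/* Illustrative drug list; TWO is assumed to stand in for Dataset2 */
data two;
  input Drug $;
  datalines;
E
G
;
run;

/* Keep only rows of ONE whose drug appears in TWO */
proc sql;
  create table remove as
  select *
  from one
  where drug in (select drug from two);
quit;

/* Inspect the result */
proc print data=remove noobs;
run;
```

The subquery does not need the drug list to be sorted or de-duplicated, which is one reason the SQL route is the shortest of the options in this thread.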
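Something like the (untested) code below should work.
data want;
  if _n_=1 then
    do;
      /* load the drug list from dataset2 into a hash table once */
      dcl hash druglist(dataset:'dataset2');
      druglist.defineKey('drug');
      druglist.defineData('drug');
      druglist.defineDone();
    end;
  set dataset1;
  /* check() returns 0 when the current drug is found in the hash */
  if druglist.check()=0 then output;
run;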
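Via a data step merge:
proc sort data=Dataset1;
  by drug;
run;
/* write the de-duplicated drug list to a new data set so Dataset2 is not overwritten */
proc sort data=Dataset2(keep=drug) out=druglist nodupkey;
  by drug;
run;
data want;
  merge Dataset1(in=a) druglist(in=b);
  by drug;
  /* keep only observations whose drug appears in both data sets */
  if a and b;
run;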