I have two datasets. Dataset A has all the raw data where each subject ID code has multiple (sometimes thousands) of observations for it. Dataset B has a list of subject ID codes that need to be removed from Dataset A (i.e., if the ID code is in Dataset B then all observations linked to the ID code need to be removed from Dataset A).
I'm thinking it would seem like a job for proc sql, but don't know how to go about it. Thanks!
proc sort data=dataset_A;
by id;
run;
proc sort data=dataset_B;
by id;
run;
data want;
merge dataset_A(in=a) dataset_B(in=b);
by id;
if a and not b;
run;
by proc sql;
proc sql;
create table want as select a.* from dataset_A as a, dataset_b as b where a.id^=b.id;
quit;
Thanks,
Jagadish
proc sort data=dataset_A;
by id;
run;
proc sort data=dataset_B;
by id;
run;
data want;
merge dataset_A(in=a) dataset_B(in=b);
by id;
if a and not b;
run;
by proc sql;
proc sql;
create table want as select a.* from dataset_A as a, dataset_b as b where a.id^=b.id;
quit;
Thanks,
Jagadish
The SQL code creates a cartesian product that takes forever to run before timing out after it has consumed all available memory, but the data merge code above it works well.
Thanks!
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.