I have two datasets. Dataset A has all the raw data where each subject ID code has multiple (sometimes thousands) of observations for it. Dataset B has a list of subject ID codes that need to be removed from Dataset A (i.e., if the ID code is in Dataset B then all observations linked to the ID code need to be removed from Dataset A).
I'm thinking it would seem like a job for proc sql, but don't know how to go about it. Thanks!
proc sort data=dataset_A;
by id;
run;
proc sort data=dataset_B;
by id;
run;
data want;
merge dataset_A(in=a) dataset_B(in=b);
by id;
if a and not b;
run;
by proc sql;
proc sql;
create table want as select a.* from dataset_A as a, dataset_b as b where a.id^=b.id;
quit;
Thanks,
Jagadish
proc sort data=dataset_A;
by id;
run;
proc sort data=dataset_B;
by id;
run;
data want;
merge dataset_A(in=a) dataset_B(in=b);
by id;
if a and not b;
run;
by proc sql;
proc sql;
create table want as select a.* from dataset_A as a, dataset_b as b where a.id^=b.id;
quit;
Thanks,
Jagadish
The SQL code creates a cartesian product that takes forever to run before timing out after it has consumed all available memory, but the data merge code above it works well.
Thanks!
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.