I have a problem I'm trying to solve, and not really sure where to start.
I have two tables Table A and Table B. I'm trying to delete ALL of the customer IDs from Table A if they appear in Table B, but Table A sometimes has multiple entries for the same customer ID. My end result should be Table A with only records that are not in Table B.
Example:
Table A
Customer ID | Date |
A | 5/7/2020 |
A | 5/11/2020 |
B | 5/8/2020 |
B | 5/9/2020 |
B | 5/10/2020 |
C | 5/11/2020 |
Table B
Customer ID | Delete Flag |
B | Y |
B | Y |
A | Y |
In this example I would only want Customer ID C to show up in my final table.
I have a few thousand rows of data in Table A that I'm checking against. Any help is greatly appreciated!
This is relatively straightforward with a subquery like this:
(I have altered the name of the variable "Customer ID". It can work as a named literal like this: 'Customer ID'n
proc sql;
create table want as
select * from A
where customer_ID not in (select customer_ID from B);
quit;
This is relatively straightforward with a subquery like this:
(I have altered the name of the variable "Customer ID". It can work as a named literal like this: 'Customer ID'n
proc sql;
create table want as
select * from A
where customer_ID not in (select customer_ID from B);
quit;
Thank you! Super easy way to do this, not sure how I didn't think of this.
If data size and performance is an issue, use a hash object:
data want;
set a;
if _n_ 01
then do;
declare hash b (dataset:"b");
b.definekey("customer_id");
b.definedone();
end;
if b.check() ne 0;
run;
proc sort data=tableB out=b_unique nodupkey;
by customerID;
run;
proc sort data=tableA;
by customerID;
data tableC;
merge tableA(in=ina) b_unique(in=inb drop=deleteflag);
by customerID;
if not inb;
run;
A SQL way that will delete records from the table - test thoroughly before using something like this:
data tableA;
infile cards dlm='09'x;
input CustomerID $ Date : ddmmyy10.;
format date date9.;
cards;
A 5/7/2020
A 5/11/2020
B 5/8/2020
B 5/9/2020
B 5/10/2020
C 5/11/2020
;;;;
run;
data tableB;
infile cards dlm='09'x;
input CustomerID $ DeleteFlag $;
cards;
B Y
B Y
A Y
;;;;
run;
proc sql;
delete * from tableA where customerID in (select distinct(CustomerID) from TableB);
quit;
Thank you!
This step below worked as well.
proc sort data=tableB out=b_unique nodupkey;
by customerID;
run;
proc sort data=tableA;
by customerID;
data tableC;
merge tableA(in=ina) b_unique(in=inb drop=deleteflag);
by customerID;
if not inb;
run;
Catch the best of SAS Innovate 2025 — anytime, anywhere. Stream powerful keynotes, real-world demos, and game-changing insights from the world’s leading data and AI minds.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.