Solved: How to delete rows from table based on another table

Sas_noob25 · Posted 10-27-2022 01:13 PM

I have a problem I'm trying to solve, and not really sure where to start.

I have two tables Table A and Table B. I'm trying to delete ALL of the customer IDs from Table A if they appear in Table B, but Table A sometimes has multiple entries for the same customer ID. My end result should be Table A with only records that are not in Table B.

Example:

Table A

Customer ID	Date
A	5/7/2020
A	5/11/2020
B	5/8/2020
B	5/9/2020
B	5/10/2020
C	5/11/2020

Table B

Customer ID	Delete Flag
B	Y
B	Y
A	Y

In this example I would only want Customer ID C to show up in my final table.

I have a few thousand rows of data in Table A that I'm checking against. Any help is greatly appreciated!

svh · Posted 10-27-2022 01:35 PM

This is relatively straightforward with a subquery like this:

(I have altered the name of the variable "Customer ID". It can work as a named literal like this: 'Customer ID'n

proc sql;
   create table want as
   select * from A
  where customer_ID not in (select customer_ID from B);
quit;

View solution in original post

svh · Posted 10-27-2022 01:35 PM

This is relatively straightforward with a subquery like this:

(I have altered the name of the variable "Customer ID". It can work as a named literal like this: 'Customer ID'n

proc sql;
   create table want as
   select * from A
  where customer_ID not in (select customer_ID from B);
quit;

Sas_noob25 · Posted 10-27-2022 02:01 PM

Thank you! Super easy way to do this, not sure how I didn't think of this.

Kurt_Bremser · Posted 10-27-2022 01:39 PM

If data size and performance is an issue, use a hash object:

data want;
set a;
if _n_ 01
then do;
  declare hash b (dataset:"b");
  b.definekey("customer_id");
  b.definedone();
end;
if b.check() ne 0;
run;

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

Reeza · Posted 10-27-2022 01:40 PM


proc sort data=tableB out=b_unique nodupkey;
by customerID;
run;

proc sort data=tableA;
by customerID;

data tableC;
merge tableA(in=ina) b_unique(in=inb drop=deleteflag);
by customerID;
if not inb;
run;

A SQL way that will delete records from the table - test thoroughly before using something like this:

data tableA;
infile cards dlm='09'x;
input CustomerID  $	Date : ddmmyy10.;
format date date9.;
cards;
A	5/7/2020
A	5/11/2020
B	5/8/2020
B	5/9/2020
B	5/10/2020
C	5/11/2020
;;;;
run;

data tableB;
infile cards dlm='09'x;
input CustomerID $	DeleteFlag $;
cards;
B	Y
B	Y
A	Y
;;;;
run;

proc sql;
delete * from tableA where customerID  in (select distinct(CustomerID) from TableB);
quit;

Sas_noob25 · Posted 10-27-2022 02:12 PM

Thank you!

This step below worked as well.

proc sort data=tableB out=b_unique nodupkey;
by customerID;
run;

proc sort data=tableA;
by customerID;

data tableC;
merge tableA(in=ina) b_unique(in=inb drop=deleteflag);
by customerID;
if not inb;
run;

How to delete rows from table based on another table

Re: How to delete rows from table based on another table

Re: How to delete rows from table based on another table

Re: How to delete rows from table based on another table

Re: How to delete rows from table based on another table

Re: How to delete rows from table based on another table

Re: How to delete rows from table based on another table

SAS Innovate 2025: Call for Content

Classroom Training Available!