BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
sarahzhou
Quartz | Level 8

Hi, 

 

I have a table A:

User_ID Customer_ID Email_add Invoice_no Invoice_account
u1 c1 example1@gmail.com v1 100
u2 c2 example2@gmail.com v2 120
u2 c2 example2b@gmail.com v3 130
u3 c3 example3@gmail.com v4 150
u4 c4 example4@gmail.com v5 165
u5 c5 example5@gmail.com v6 180

 

And table B:

User_ID Customer_ID Email_add
  c1 example1@gmail.com
    example2@gmail.com
u3    

 

I wish to delete the whole entry when either User_ID, Customer_ID or Email_add appeared in table B.

 

So the result table C should look like this:

User_ID Customer_ID Email_add Invoice_no Invoice_account
u4 c4 example4@gmail.com v5 165
u5 c5 example5@gmail.com v6 180

 

I've tried using where clause A.User_ID=B.User_ID or A.Customer_ID=B.Customer_ID or A.Email_add=B.Email_add to delete the rows (failed T.T.). This operation is very slow when the datasets for table A and Table B is large.

 

Please advise, thanks!

1 ACCEPTED SOLUTION

Accepted Solutions
PeterClemmensen
Tourmaline | Level 20

Try this

 

data a;
input User_ID $ Customer_ID $ Email_add :$20. Invoice_no $ Invoice_account;
datalines;
u1 c1 example1@gmail.com  v1 100 
u2 c2 example2@gmail.com  v2 120 
u2 c2 example2b@gmail.com v3 130 
u3 c3 example3@gmail.com  v4 150 
u4 c4 example4@gmail.com  v5 165 
u5 c5 example5@gmail.com  v6 180 
;
 
data b;
input User_ID $ Customer_ID $ Email_add :$20.;
infile datalines missover dlm = '|';
datalines;
   | c1 | example1@gmail.com 
   |    | example2@gmail.com 
u3 |    |                    
;   


proc sql;
   create table want as
   select * from a
   where User_ID     not in (select User_ID     from b)
     and Customer_ID not in (select Customer_ID from b)
     and Email_add   not in (select Email_add   from b)
   ;
quit;

View solution in original post

4 REPLIES 4
PeterClemmensen
Tourmaline | Level 20

Try this

 

data a;
input User_ID $ Customer_ID $ Email_add :$20. Invoice_no $ Invoice_account;
datalines;
u1 c1 example1@gmail.com  v1 100 
u2 c2 example2@gmail.com  v2 120 
u2 c2 example2b@gmail.com v3 130 
u3 c3 example3@gmail.com  v4 150 
u4 c4 example4@gmail.com  v5 165 
u5 c5 example5@gmail.com  v6 180 
;
 
data b;
input User_ID $ Customer_ID $ Email_add :$20.;
infile datalines missover dlm = '|';
datalines;
   | c1 | example1@gmail.com 
   |    | example2@gmail.com 
u3 |    |                    
;   


proc sql;
   create table want as
   select * from a
   where User_ID     not in (select User_ID     from b)
     and Customer_ID not in (select Customer_ID from b)
     and Email_add   not in (select Email_add   from b)
   ;
quit;
PeterClemmensen
Tourmaline | Level 20

Also, the 3rd obs should be in the desired result as well, correct?

Patrick
Opal | Level 21

You could use the EXIST clause.

/* create a new table */
proc sql;
  create table want as
    select * from a
    where not exists
      ( select * from b 
        where 
          a.user_id=b.user_id or
          a.customer_id=b.customer_id or
          a.email_add=b.email_add
      )
    ;
quit;

/* delete from existing table in-place */
proc sql;
  delete from a 
    where not exists
      ( select * from b 
        where 
          a.user_id=b.user_id or
          a.customer_id=b.customer_id or
          a.email_add=b.email_add
      )
    ;
quit;

If you're dealing with SAS tables then it's normally better to create a new table and not to directly delete rows in an existing table ....unless you have a big source table and need only to delete a minor percentage of rows. Reason: SAS deleted the rows only logically but no physically. The deleted rows still remain in the table and add to the tables volume (file size).

proc contents data=a;
quit;

Patrick_0-1665561613802.png

 

If it's a table in a database then using DELETE is likely the "correct" approach. Having said that: Delete is also in a database a slow process so if you need to delete a high percentage of rows then creating a new table will likely execute faster.

 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 1077 views
  • 0 likes
  • 3 in conversation