SAS Programming

cmccor · Posted 06-09-2022 02:32 PM

Update: selection solution actually doesn't work well in full data. See thread. Sorry for being hasty to accept a response!

I am working with some payment data that has several variables in it (ID is the customer ID, date is the date of service, and payment is how much the customer paid). However, some of the payment data are refunds of prior purchases, that match the date of the original purchase but have a negative value of the purchase price. I need to remove both data rows of the purchase and the refund because it does not represent what was actually sold off by the end of the month. Below is an example of what I have and what I want. Note the "seqeunce_num" is just the row of the observation to make it easier to show what is deleted in the Want dataset:

Have:

sequence_num	ID	Date	payment
1	1	1/1/2012	5
2	1	1/2/2012	5
3	1	1/2/2012	-5
4	2	1/6/2012	7
5	2	1/6/2012	7
6	3	1/6/2012	13
7	4	1/12/2012	10
8	4	1/12/2012	-10

Want:

sequence_num	ID	Date	payment
1	1	1/1/2012	5
4	2	1/6/2012	7
5	2	1/6/2012	7
6	3	1/6/2012	13

I'm not sure what to do because using the nodupkey in proc sort only removes one of the claims (such as the refund) but not both (the original purchase and refund).

PeterClemmensen · Posted 06-09-2022 03:31 PM

data have;
input sequence_num ID Date :ddmmyy10. payment;
format Date ddmmyy10.;
datalines;
1 1 1/1/2012   8  
2 1 1/2/2012   5  
3 1 1/2/2012  -5  
4 2 1/6/2012   7  
5 2 1/6/2012   7  
6 3 1/6/2012   13 
7 4 1/12/2012  10 
8 4 1/12/2012 -10 
;

proc sql;
   create table want as
   select * from have
   where payment not in
   (select payment * -1 from have)
   ;
quit;

The DATA to DATA Step Macro
Blog: SASnrd

View solution in original post

PeterClemmensen · Posted 06-09-2022 03:31 PM

data have;
input sequence_num ID Date :ddmmyy10. payment;
format Date ddmmyy10.;
datalines;
1 1 1/1/2012   8  
2 1 1/2/2012   5  
3 1 1/2/2012  -5  
4 2 1/6/2012   7  
5 2 1/6/2012   7  
6 3 1/6/2012   13 
7 4 1/12/2012  10 
8 4 1/12/2012 -10 
;

proc sql;
   create table want as
   select * from have
   where payment not in
   (select payment * -1 from have)
   ;
quit;

The DATA to DATA Step Macro
Blog: SASnrd

cmccor · Posted 06-09-2022 03:48 PM

Updated: I thought this did work but when testing it on the larger data set, if two patients had matching sales, it would delete all of them.

I need it to look within each patient and only delete the original purchase and refund if the date matches and the payment/refund amount are identical (given that one is positive and one is negative)

Reeza · Posted 06-09-2022 03:53 PM

You sure?

Something like this seems to break it:

data have;
input sequence_num ID Date :ddmmyy10. payment;
format Date ddmmyy10.;
datalines;
1 1 1/1/2012   8  
2 1 1/2/2012   5  
3 1 1/2/2012  -5  
3 2 1/2/2012  5 
4 2 1/6/2012   7  
5 2 1/6/2012   7  
6 3 1/6/2012   13 
7 4 1/12/2012  10 
8 4 1/12/2012 -10 
;

cmccor · Posted 06-09-2022 04:03 PM

I admit I got hasty in my response seeing it work on the smaller data sample I had pulled and when I tried it out on the larger data set it messed up. I need it to be within customers and be based on matching dates for them to make sure that they are the same sale and it's not just deleting any other sale.

Reeza · Posted 06-10-2022 11:11 AM

What's your ultimate goal here in the end? Calculate total sales and number of sales?
One easy method to achieve that add a variable, 1/-1 for each row. If the row is positive it's 1, if it's negative it's -1. To get the number of transactions sum the variable. To get the amounts, summing the amounts will be fine. Average? Use the variable as weights.

If the transactions are always on the same date this works fine.

Ksharp · Posted 06-10-2022 08:11 AM

https://communities.sas.com/t5/SAS-Programming/How-to-delete-rows-with-the-same-absolute-value-in-PR...

SAS Programming

Delete rows within a dataset that match on certain variables

Re: Delete rows within a dataset that match on certain variables

Re: Delete rows within a dataset that match on certain variables

Re: Delete rows within a dataset that match on certain variables

Re: Delete rows within a dataset that match on certain variables

Re: Delete rows within a dataset that match on certain variables

Re: Delete rows within a dataset that match on certain variables

Re: Delete rows within a dataset that match on certain variables

Follow Us

What is...

SAS Programming

Our biggest data and AI event of the year.

SAS Training: Just a Click Away

Follow Us

What is...