Hi everyone,
I need some help to solve this problem.
I've tried to make code in proc SQL, but it would be fine in basic SAS too.
I have a table like below. (Actual data has about 90 columns.)
I want to delete rows that have the same size of L_price and P_price but have different ctype and ccnt in the same group by contract_repeat.
(When ctype is new, ccnt is -1 and L_price and P_price have negative values.)
(Original table)
rowN | contractNo | repeatNo | contract_repeat | cate | ctype | ccnt | L_tot_price | P_tot_price |
1 | 1555012 | 7 | 15550127 | partial | cancel | -1 | -700 | -200 |
2 | 1555012 | 7 | 15550127 | partial | new | 1 | 2000 | 100 |
3 | 1555017 | 8 | 15550178 | partial | new | 1 | 9620 | 93 |
4 | 1555018 | 12 | 155501812 | partial | new | 1 | 0 | 0 |
5 | 1555021 | 6 | 15550216 | partial | new | 1 | 23430 | 455 |
6 | 1555021 | 6 | 15550216 | partial | cancel | -1 | -22230 | -405 |
7 | 1555021 | 6 | 15550216 | partial | new | 1 | 22230 | 405 |
8 | 1555021 | 6 | 15550216 | partial | cancel | -1 | -23430 | -455 |
9 | 1555021 | 6 | 15550216 | partial | new | 1 | 14210 | 296 |
(Result which I want to get)
rowN | contractNo | repeatNo | contract_repeat | cate | ctype | ccnt | L_tot_price | P_tot_price |
1 | 1555012 | 7 | 15550127 | partial | cancel | -1 | -700 | -200 |
2 | 1555012 | 7 | 15550127 | partial | new | 1 | 2000 | 100 |
3 | 1555017 | 8 | 15550178 | partial | new | 1 | 9620 | 93 |
4 | 1555018 | 12 | 155501812 | partial | new | 1 | 0 | 0 |
9 | 1555021 | 6 | 15550216 | partial | new | 1 | 14210 | 296 |
For example,
[contract_repeat 15550127 ]
There are two rows each one has a different ctype, but the sum of L_price of two rows is 1300 and the sum of P_price of two rows is -100, so this data will not be deleted.
[contract_repeat 15550216]
There are five rows and two pairs(=four rows) should be deleted.
(pair1) rowN : 5 & rowN : 8
-> the sum of two L_price = 23430 - 23430 = 0 and the sum of two P_price = 455- 455 = 0
(pair2) rowN : 6 & rowN : 7
-> the sum of two L_price = -22230 + 22230 = 0 and the sum of two P_price = -405 + 405 = 0
If I can get any help, I will really appreciate it!
Here is one option that works with this particular data set:
data orig;
input rowN contractNo repeatNo contract_repeat cate:$10. ctype:$10. ccnt L_tot_price P_tot_price;
datalines;
1 1555012 7 15550127 partial cancel -1 -700 -200
2 1555012 7 15550127 partial new 1 2000 100
3 1555017 8 15550178 partial new 1 9620 93
4 1555018 12 155501812 partial new 1 0 0
5 1555021 6 15550216 partial new 1 23430 455
6 1555021 6 15550216 partial cancel -1 -22230 -405
7 1555021 6 15550216 partial new 1 22230 405
8 1555021 6 15550216 partial cancel -1 -23430 -455
9 1555021 6 15550216 partial new 1 14210 296
;
run;
data working;
set orig;
absccnt = abs(ccnt);
absL_tot_price = abs(L_tot_price);
absP_tot_price = abs(P_tot_price);
run;
proc sql;
select *
from working
group by absccnt, absL_tot_price, absP_tot_price
having count(*) < 2
order by rowN;
quit;
Output data set is:
rowN | contractNo | repeatNo | contract_repeat | cate | ctype | ccnt | L_tot_price | P_tot_price | absccnt | absL_tot_price | absP_tot_price |
1 | 1555012 | 7 | 15550127 | partial | cancel | -1 | -700 | -200 | 1 | 700 | 200 |
2 | 1555012 | 7 | 15550127 | partial | new | 1 | 2000 | 100 | 1 | 2000 | 100 |
3 | 1555017 | 8 | 15550178 | partial | new | 1 | 9620 | 93 | 1 | 9620 | 93 |
4 | 1555018 | 12 | 1.56E+08 | partial | new | 1 | 0 | 0 | 1 | 0 | 0 |
9 | 1555021 | 6 | 15550216 | partial | new | 1 | 14210 | 296 | 1 | 14210 | 296 |
You might also want to look at FIRST. and LAST. DATA Step Variables
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.