Hi everyone,
I need some help to solve this problem.
I've tried to make code in proc SQL, but it would be fine in basic SAS too.
I have a table like below. (Actual data has about 90 columns.)
I want to delete rows that have the same size of L_price and P_price but have different ctype and ccnt in the same group by contract_repeat.
(When ctype is new, ccnt is -1 and L_price and P_price have negative values.)
(Original table)
rowN | contractNo | repeatNo | contract_repeat | cate | ctype | ccnt | L_tot_price | P_tot_price |
1 | 1555012 | 7 | 15550127 | partial | cancel | -1 | -700 | -200 |
2 | 1555012 | 7 | 15550127 | partial | new | 1 | 2000 | 100 |
3 | 1555017 | 8 | 15550178 | partial | new | 1 | 9620 | 93 |
4 | 1555018 | 12 | 155501812 | partial | new | 1 | 0 | 0 |
5 | 1555021 | 6 | 15550216 | partial | new | 1 | 23430 | 455 |
6 | 1555021 | 6 | 15550216 | partial | cancel | -1 | -22230 | -405 |
7 | 1555021 | 6 | 15550216 | partial | new | 1 | 22230 | 405 |
8 | 1555021 | 6 | 15550216 | partial | cancel | -1 | -23430 | -455 |
9 | 1555021 | 6 | 15550216 | partial | new | 1 | 14210 | 296 |
(Result which I want to get)
rowN | contractNo | repeatNo | contract_repeat | cate | ctype | ccnt | L_tot_price | P_tot_price |
1 | 1555012 | 7 | 15550127 | partial | cancel | -1 | -700 | -200 |
2 | 1555012 | 7 | 15550127 | partial | new | 1 | 2000 | 100 |
3 | 1555017 | 8 | 15550178 | partial | new | 1 | 9620 | 93 |
4 | 1555018 | 12 | 155501812 | partial | new | 1 | 0 | 0 |
9 | 1555021 | 6 | 15550216 | partial | new | 1 | 14210 | 296 |
For example,
[contract_repeat 15550127 ]
There are two rows each one has a different ctype, but the sum of L_price of two rows is 1300 and the sum of P_price of two rows is -100, so this data will not be deleted.
[contract_repeat 15550216]
There are five rows and two pairs(=four rows) should be deleted.
(pair1) rowN : 5 & rowN : 8
-> the sum of two L_price = 23430 - 23430 = 0 and the sum of two P_price = 455- 455 = 0
(pair2) rowN : 6 & rowN : 7
-> the sum of two L_price = -22230 + 22230 = 0 and the sum of two P_price = -405 + 405 = 0
If I can get any help, I will really appreciate it!
Here is one option that works with this particular data set:
data orig;
input rowN contractNo repeatNo contract_repeat cate:$10. ctype:$10. ccnt L_tot_price P_tot_price;
datalines;
1 1555012 7 15550127 partial cancel -1 -700 -200
2 1555012 7 15550127 partial new 1 2000 100
3 1555017 8 15550178 partial new 1 9620 93
4 1555018 12 155501812 partial new 1 0 0
5 1555021 6 15550216 partial new 1 23430 455
6 1555021 6 15550216 partial cancel -1 -22230 -405
7 1555021 6 15550216 partial new 1 22230 405
8 1555021 6 15550216 partial cancel -1 -23430 -455
9 1555021 6 15550216 partial new 1 14210 296
;
run;
data working;
set orig;
absccnt = abs(ccnt);
absL_tot_price = abs(L_tot_price);
absP_tot_price = abs(P_tot_price);
run;
proc sql;
select *
from working
group by absccnt, absL_tot_price, absP_tot_price
having count(*) < 2
order by rowN;
quit;
Output data set is:
rowN | contractNo | repeatNo | contract_repeat | cate | ctype | ccnt | L_tot_price | P_tot_price | absccnt | absL_tot_price | absP_tot_price |
1 | 1555012 | 7 | 15550127 | partial | cancel | -1 | -700 | -200 | 1 | 700 | 200 |
2 | 1555012 | 7 | 15550127 | partial | new | 1 | 2000 | 100 | 1 | 2000 | 100 |
3 | 1555017 | 8 | 15550178 | partial | new | 1 | 9620 | 93 | 1 | 9620 | 93 |
4 | 1555018 | 12 | 1.56E+08 | partial | new | 1 | 0 | 0 | 1 | 0 | 0 |
9 | 1555021 | 6 | 15550216 | partial | new | 1 | 14210 | 296 | 1 | 14210 | 296 |
You might also want to look at FIRST. and LAST. DATA Step Variables
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.