Dear All,
I have a dataset like below with 300000 observations;
The calculation is ;
if A>=A then sum(B) [if observation in A column is greater than equal to A column Sum the values in B column]
Finally, TOTAL=sum(all B)
| Input | output | |||
| A | B | A | B | |
| 1 | 100 | 1 | 1209 | |
| 1 | 101 | 2 | 1008 | |
| 2 | 102 | 3 | 706 | |
| 2 | 200 | 4 | 501 | |
| 3 | 205 | TOTAL | 3424 | |
| 4 | 501 |
Could anyone please help me here.?
Thanks,
Chithra
This code is not the full solution as it misses out the final Total. There is probably also a more efficient way of doing this. However I hope the below code helps.
data have;
input a b;
cards;
1 100
1 101
2 102
2 200
3 205
4 501
;
run;
proc sort data=have
out=haveSrt;
by descending a;
run;
data want(drop=originalB);
set haveSrt(rename=(b=originalB));
by descending a;
retain b 0;
b + originalB;
if last.a;
run;
proc sort data=want;
by a;
run;
This code is not the full solution as it misses out the final Total. There is probably also a more efficient way of doing this. However I hope the below code helps.
data have;
input a b;
cards;
1 100
1 101
2 102
2 200
3 205
4 501
;
run;
proc sort data=have
out=haveSrt;
by descending a;
run;
data want(drop=originalB);
set haveSrt(rename=(b=originalB));
by descending a;
retain b 0;
b + originalB;
if last.a;
run;
proc sort data=want;
by a;
run;
Thanks a lot, this helps
one more thing, is it possible to do the same with respect to one more column?
data have;
input a1 $9. a b;
cards;
IME_TOTAL 11 100
IME_PREM 11 101
IME_TOTAL 2 102
IME_ANNUI 2 200
IME_TOTAL 3 205
IME_ANNUI 4 501
;
run;
output looks like :
| Input | output | ||||
| A1 | A | B | A1 | A | B |
| IME_TOTAL | 11 | 100 | IME_TOTAL | 11 | 100 |
| IME_PREM | 11 | 101 | IME_TOTAL | 3 | 407 |
| IME_TOTAL | 3 | 102 | IME_PREM | 11 | 101 |
| IME_ANNUI | 2 | 200 | IME_ANNUI | 2 | 701 |
| IME_TOTAL | 3 | 205 | IME_ANNUI | 4 | 501 |
| IME_ANNUI | 4 | 501 | TOTAL | 1810 |
What are your rules with the additional Column?
I also cannot work out what sorting method you want for the Output, please could you clarify.
Thanks
if a1=ime_total then do the calculation as done before, the sam if a1=ime_prem and ime_annui.
ie; for a1=ime_total, when a>=a sum(b). like for all a1.
in this example for a1=ime_total we get 2 rows :
11>=11 --> one row
3>=3(2 rows),3>=11 sum together--> one row
hope you will get this.
data have;
input a1 $9. a b;
cards;
IME_TOTAL 11 100
IME_PREM 11 101
IME_TOTAL 2 102
IME_ANNUI 2 200
IME_TOTAL 3 205
IME_ANNUI 4 501
;
run;
proc sort data=have
out=haveSrt;
by a1 descending a;
run;
data want(drop=originalB);
set haveSrt(rename=(b=originalB));
by a1 descending a;
retain b 0;
if first.a1 then b=0;
b + originalB;
if last.a;
run;
proc sort data=want;
by a;
run;Hello. This seems to match your rules but not your output. (for the three column method).
Could you have a look at it and if necessary tell me exactly which rows each row of your output should be adding up.
Apologies for this.
But, this actually helps me.
Thanks
Thanks a lot, this helps.
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Still thinking about your presentation idea? The submission deadline has been extended to Friday, Nov. 14, at 11:59 p.m. ET.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.