Dear, I hope you are very well, my question is the following: I have 3 columns with numerical variables and I must accumulate them from the oldest to the newest as I show in the following table of how the data is and how the final table should be.
DATE | CATEG_A | CATEG_B | CATEG_C |
16-01-2022 | 1 | 5 | 0 |
15-01-2022 | 8 | 1 | 3 |
14-01-2022 | 4 | 0 | 1 |
13-01-2022 | 1 | 1 | 0 |
12-01-2022 | 1 | 0 | 0 |
Results.
DATE | CATEG_A | CUM_A | CATEG_B | CUM_B | CATEG_C | CUM_C |
16-01-2022 | 1 | 15 | 5 | 7 | 0 | 4 |
15-01-2022 | 8 | 14 | 1 | 2 | 3 | 4 |
14-01-2022 | 4 | 6 | 0 | 1 | 1 | 1 |
13-01-2022 | 1 | 2 | 1 | 1 | 0 | 0 |
12-01-2022 | 1 | 1 | 0 | 0 | 0 | 0 |
thanks for your time
If you have PROC EXPAND, this will do cumulative sums as follows:
proc sort data=have;
by date;
run;
proc expand data=have out=want;
convert categ_a=cum_a categ_b=cum_b categ_c=cum_c/transformout=(cusum);
run;
Otherwise you would have to do this in a DATA step.
proc sort data=have;
by date;
run;
data want;
set have;
cum_a+categ_a;
cum_b+categ_b;
cum_c+categ_c;
run;
Please explain how in the original data CATEG_C is always zero but in the output CUM_C is sometimes not zero. And lots of other discrepancies between the input and output also exist and require explanation. Thanks.
sorry, I changed the table for a smaller one for the example, I will renew it.
If you have PROC EXPAND, this will do cumulative sums as follows:
proc sort data=have;
by date;
run;
proc expand data=have out=want;
convert categ_a=cum_a categ_b=cum_b categ_c=cum_c/transformout=(cusum);
run;
Otherwise you would have to do this in a DATA step.
proc sort data=have;
by date;
run;
data want;
set have;
cum_a+categ_a;
cum_b+categ_b;
cum_c+categ_c;
run;
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Check out this tutorial series to learn how to build your own steps in SAS Studio.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.