Dear SAS experts
I would like to categorize my data based on cumulative sums. I would like to compute cumulative sums, but at the row where the sum reaches >=5, I would like the sum to start over. Moreover, I would like the data to be categorized so that the rows in each "bout" of cumulative sums share the same category.
Given the example data below:
data example;
input value;
datalines;
2
3
6
2
6
2
;
run;
I would like the resulting dataset to look like this:
value  cumsum_var  Categorical_var
2      2           1
3      5           1
6      6           2
2      2           3
6      8           3
2      2           .
Is this possible to do? I have not managed to find a solution yet. I am thinking that a cumulative sum variable must be created, but there may be a more efficient way of achieving the desired result.
I probably would be able to create the categorical variable if I could create the sum variable.
Thank you
data example;
input value;
datalines;
2
3
6
2
6
2
;
run;

data want;
set example;
cum+value;
if cum>5 then do;
  cum=value;
  Categorical_var+1;
end;
run;
Hey KSharp
Thanks for the code.
It appears that rows 4 and 5 are categorized separately when they should be in the same category. The code seems to work as intended only for the first 2 rows.
Thank you
On row 2, the cumulative sum is >=5, but you don't change categorical_var. That seems to be different than what your text is saying.
Dear PaigeMiller
True. I meant to say that the row where the cumulative sum reaches >=5 should be placed in the same category as the row(s) preceding it, and the sum should start over on the next row.
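For anyone following along, here is one possible sketch of that clarified rule. It is not code posted by anyone in this thread, and it has not been confirmed by the original poster. It assumes the example dataset from the question already exists, and the names bouts, bout, start_new and closed are made up for illustration. The first step restarts the sum on the row after it reaches >=5. The second step reads each bout twice so that a bout which never reaches 5 (the last row in the example) gets a missing Categorical_var.

```sas
/* Step 1: running sum that restarts on the row AFTER it reaches >= 5. */
data bouts;
  set example;
  retain bout 1 start_new 0;      /* bout = tentative category number  */
  if start_new then do;           /* previous row closed a bout        */
    cumsum_var = 0;
    bout + 1;
  end;
  cumsum_var + value;             /* sum statement: retained across rows */
  start_new = (cumsum_var >= 5);  /* 1 if this row closes the bout     */
  drop start_new;
run;

/* Step 2: read each bout twice (double DOW loop).                     */
/* First pass finds out whether the bout ever reached 5,               */
/* second pass writes the rows with the final category.                */
data want;
  do until (last.bout);
    set bouts;
    by bout;
    closed = (cumsum_var >= 5);   /* value on the bout's last row      */
  end;
  do until (last.bout);
    set bouts;
    by bout;
    if closed then Categorical_var = bout;
    else Categorical_var = .;     /* incomplete bout stays missing     */
    output;
  end;
  drop bout closed;
run;
```

Traced by hand on the six example values, this should give the cumsum_var and Categorical_var columns shown in the question, including the missing category on the last row. If incomplete bouts should keep their number instead, the second step can be dropped and bout renamed to Categorical_var.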