Try out this code (modified according to @Tom's reply):
data want;
set have;
by year;
if first.year then cumsum=0;
cumsum+number;
run;
As a SAS user, I think computing cumulative sums in SAS is far less intuitive than doing so in Microsoft Excel. You will hardly be able to know what the line "cumsum+number;" means if nobody tells you that it is for computing cumulative sums.
data want;
set have;
by year;
if first.year then runsum=number;
else runsum+number;
run;
The FIRST. reference is incorrect, you need to use variable listed in the By statement.
@toomanystepsint wrote:
I have a dataset.It has 3 variables:-year-date-numberI want the cumulative value of number resetting every year.The typical sas advice does not work for me.For example, according to this answer my code should be:data want;set have;by year;if first.date then runsum=number;else runsum+number;run;But this code gives me all null values for runsum. Why? That makes no sense. There's nothing I'm doing that the other answers aren't doing, this just isn't working.I've tried many other things, including adding a retain, and doing by year and date. Retain does nothing, all my values are just null. Doing by year and date just sets runsum to the value of number, it never cumulates. I've tried turning my date variable into datetime, into character, nothing works. Why?
I'm having a difficult time reconciling your response to @Reeza
@toomanystepsint wrote:
I AM using a variable listed in the by statement. You can see I'm saying "by year" and if first.year.
Unbelievable.
with your original post:
data want;set have;by year;if first.date then runsum=number;else runsum+number;run;
Re-read your code it says first.date, not first.year as posted.
I posted the corrected code.
@toomanystepsint wrote:
I AM using a variable listed in the by statement. You can see I'm saying "by year" and if first.year.
Unbelievable.
@toomanystepsint wrote:
I AM using a variable listed in the by statement. You can see I'm saying "by year" and if first.year.
Unbelievable.
Please share the lines from the SAS log for your data step.
The code you posted in your question was using FIRST.DATE, but only listed YEAR in the BY statement. That will cause SAS to invent a first.date variable, which will never be true.
If the goal is to re-start the accumulator when starting a new year then use FIRST.YEAR in the IF statement. You can keep DATE in the BY statement if you want have the data step make sure the observations are actually sorted by DATE within YEAR and stop if it is not.
Let's make some sample data:
data have; input year date :date. count; format date date9. ; cards; 1 01JAN2000 10 1 02JAN2000 20 1 03JAN2000 30 2 01JAN2001 10 2 01FEB2001 15 ;
And try making a running sum of the COUNT variable.
Remember to use a NEW variable for the running sum. If you use a variable that already exists on the input dataset then the value read by the SET statement will replace the value calculated on the previous observation.
data want;
set have ;
by year date;
if first.year then running_sum=0;
running_sum + count;
run;
Result
running_ Obs year date count sum 1 1 01JAN2000 10 10 2 1 02JAN2000 20 30 3 1 03JAN2000 30 60 4 2 01JAN2001 10 10 5 2 01FEB2001 15 25
If the goal is something else then you need to explain it more clearly. Sharing example input data and the desired result from that input will help clarify what you want.
It is not needed, but if you have a year AND date variable it ensures that the data is sorted correctly within the year to get the correct running sum. Otherwise, there could be a sum that's reset by year, but out of order by day.
This isn't stated in the question but best practice based on experience.
@Season wrote:
Your code is generally correct except for an issue that could be improved: the variable date is not needed in the BY statement.
All right. I have got it. Thank you for your patient reply!
@Season wrote:
Your code is generally correct except for an issue that could be improved: the variable date is not needed in the BY statement.
Please re-read my full response and you will see why it is there.
Try out this code (modified according to @Tom's reply):
data want;
set have;
by year;
if first.year then cumsum=0;
cumsum+number;
run;
As a SAS user, I think computing cumulative sums in SAS is far less intuitive than doing so in Microsoft Excel. You will hardly be able to know what the line "cumsum+number;" means if nobody tells you that it is for computing cumulative sums.
If you read the LOG the reason is there. (Highlight added). An Uninitialized variable is always missing and so will never be "true". Which may well lead to @Reeza's correction to use the proper FIRST.<variablename>.
284 data want; 285 set have; 286 by year; 287 if first.date then runsum=number; 288 else runsum+number; 289 run; NOTE: Variable 'first.date'n is uninitialized. NOTE: There were 2 observations read from the data set USER.HAVE. NOTE: The data set USER.WANT has 2 observations and 4 variables. NOTE: DATA statement used (Total process time): real time 0.01 seconds cpu time 0.01 seconds
You asked for advice to (emphasis mine)
I want the cumulative value of number resetting every year.
Yet you accepted a response (reproduced below) from @Season that does not do a yearly reset:
data want;
set have;
by year;
cumsum+number;
output;
run;
Please clarify.
Otherwise, naive users may believe the accepted solution correctly addresses the problem as originally stated. Help us make sas communities a reliable source of advice.
Thank you for following my reply. Referencing @Tom's code, I have modified mine. It is at @toomanystepsint's discretion to choose the reply of whom as the solution.
A reminder for @toomanystepsint is that you can retract the acceptance of one's reply as the solution and accepting another reply as solution afterwards, regardless of whether the second reply is posted by the original solution provider.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.