I have been using the following, which is quite neat when you want to sum values in a column.
IF FIRST.var1 THEN ACC_var1=var1;
ELSE ACC_var+var1;
However, I only have the skills to do it when the dataset only contains 12 months. ^^
How do I accumulate the values within in a year (period = months) when I have multiple years?
For the program code attached I would like to accumulated the 12 months for 2019 and 2020 and the current ones for 2021 for both var1 and var2.
Should I use a DO-loop/some sort of lag solution or is there an another way to use First./ Last.?
I am open for any suggestion. Many thanks!
DATA WORK.temp;
Infile datalines delimiter=',';
INPUT YearMonth $ var1 var2 Desired_Accu1 Desired_Accu2
;
CARDS;
2019-01,100,500,100,500
2019-02,200,600,300,1100
2019-03,300,700,600,1800
2019-04,400,100,1000,1900
2019-05,500,200,1500,2100
2019-06,600,300,2100,2400
2019-07,700,400,2800,2800
2019-08,800,500,3600,3300
2019-09,100,500,3700,3800
2019-10,200,600,3900,4400
2019-11,300,700,4200,5100
2019-12,400,100,4600,5200
2020-01,500,200,500,200
2020-02,600,300,1100,500
2020-03,700,400,1800,900
2020-04,800,500,2600,1400
2020-05,100,500,2700,1900
2020-06,200,600,2900,2500
2020-07,300,700,3200,3200
2020-08,400,100,3600,3300
2020-09,500,200,4100,3500
2020-10,600,300,4700,3800
2020-11,700,400,5400,4200
2020-12,800,500,6200,4700
2021-01,100,500,100,500
2021-02,50,300,150,800
;
RUN;
Thank you!
It worked like a charm! 😁
Of course it helps to change to BY Year. If I wasn't completely blocked I should have been able to figure it out. 🤔
You can use the retain statement to accumulate a variable or a sum statement. This topic is discussed in the SAS Programming 2 class in detail. I would recommend you take this course.
Here is an example to your question but only using one of the variables var1 by year.
DATA WORK.temp;
Infile datalines delimiter=',';
INPUT YearMonth $ var1 var2 Desired_Accu1 Desired_Accu2
;
year=substr(yearmonth,1,4);
CARDS;
2019-01,100,500,100,500
2019-02,200,600,300,1100
2019-03,300,700,600,1800
2019-04,400,100,1000,1900
2019-05,500,200,1500,2100
2019-06,600,300,2100,2400
2019-07,700,400,2800,2800
2019-08,800,500,3600,3300
2019-09,100,500,3700,3800
2019-10,200,600,3900,4400
2019-11,300,700,4200,5100
2019-12,400,100,4600,5200
2020-01,500,200,500,200
2020-02,600,300,1100,500
2020-03,700,400,1800,900
2020-04,800,500,2600,1400
2020-05,100,500,2700,1900
2020-06,200,600,2900,2500
2020-07,300,700,3200,3200
2020-08,400,100,3600,3300
2020-09,500,200,4100,3500
2020-10,600,300,4700,3800
2020-11,700,400,5400,4200
2020-12,800,500,6200,4700
2021-01,100,500,100,500
2021-02,50,300,150,800
;
RUN;
proc sort data=WORK.temp;
by Year;
run;
Data work.temp(keep=year var1 ret_var1);
set work.temp;
by year;
if first.year=1 then ret_var1=0;
ret_var1+var1;
run;
Thank you!
It worked like a charm! 😁
Of course it helps to change to BY Year. If I wasn't completely blocked I should have been able to figure it out. 🤔
Create a new variable called YEAR that you can use in your BY statement. You can create it by taking the first four characters of your year month variable.
Now you have a variable you can use with your FIRST and LAST.
DATA WORK.temp;
Infile datalines delimiter=',';
INPUT YearMonth $ var1 var2 Desired_Accu1 Desired_Accu2
;
Year = substr(yearMonth, 1, 4);
....
run;
data want;
set temp;
by year;
if first.year then do;
accu1 = var1;
accu2 = var2;
end;
else do;
accu1 = sum(accu1, var1);
accu2 = sum(accu2, var2);
end;
run;
Or if you can assume you will have complete data, every year/month, you can use the first month as your condition instead, no new variable needed this way.
if substr(yearMonth, 6, 2) = '01' then .....;
@Pili1100 wrote:
I have been using the following, which is quite neat when you want to sum values in a column.
IF FIRST.var1 THEN ACC_var1=var1;ELSE ACC_var+var1;
However, I only have the skills to do it when the dataset only contains 12 months. ^^
How do I accumulate the values within in a year (period = months) when I have multiple years?
For the program code attached I would like to accumulated the 12 months for 2019 and 2020 and the current ones for 2021 for both var1 and var2.
Should I use a DO-loop/some sort of lag solution or is there an another way to use First./ Last.?
I am open for any suggestion. Many thanks!
DATA WORK.temp; Infile datalines delimiter=','; INPUT YearMonth $ var1 var2 Desired_Accu1 Desired_Accu2 ; CARDS; 2019-01,100,500,100,500 2019-02,200,600,300,1100 2019-03,300,700,600,1800 2019-04,400,100,1000,1900 2019-05,500,200,1500,2100 2019-06,600,300,2100,2400 2019-07,700,400,2800,2800 2019-08,800,500,3600,3300 2019-09,100,500,3700,3800 2019-10,200,600,3900,4400 2019-11,300,700,4200,5100 2019-12,400,100,4600,5200 2020-01,500,200,500,200 2020-02,600,300,1100,500 2020-03,700,400,1800,900 2020-04,800,500,2600,1400 2020-05,100,500,2700,1900 2020-06,200,600,2900,2500 2020-07,300,700,3200,3200 2020-08,400,100,3600,3300 2020-09,500,200,4100,3500 2020-10,600,300,4700,3800 2020-11,700,400,5400,4200 2020-12,800,500,6200,4700 2021-01,100,500,100,500 2021-02,50,300,150,800 ; RUN;
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Still thinking about your presentation idea? The submission deadline has been extended to Friday, Nov. 14, at 11:59 p.m. ET.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.