BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
bijayadhikar
Obsidian | Level 7

Hi,

I am trying to calculate last year's YDT (i.e. 2023 Jan), last 5 year's YTD (i.e. 2019-2023 Jan), last 5 year's annual average (i.e. average 2019-2023 Jan) and compare with this year's YTD counts (i.e. 2024 Jan) to see if this year's YTD counts are high.  Once we have Feb full data we will calc YTD accordingly. Here is the dataset, thanks in advance for your help.

DATA SHOWN ONLY FOR 3 MONTHS HERE

 

data have;
infile datalines delimiter=',';
input disease$ year month tot;

cards;
A,2019,1,7
A,2019,2,9
A,2019,3,13
A,2020,1,7
A,2020,2,10
A,2020,3,9
A,2021,1,8
A,2021,2,16
A,2021,3,6
A,2022,1,8
A,2022,2,5
A,2022,3,7
A,2023,1,8
A,2023,2,4
A,2023,3,11
B,2019,1,11
B,2019,2,2
B,2019,3,5
B,2020,1,3
B,2020,2,6
B,2020,3,7
B,2021,1,2
B,2021,2,2
B,2021,3,1
B,2022,1,4
B,2022,2,4
B,2022,3,1
B,2023,1,2
B,2023,2,11
B,2023,3,5
A,2024,1,4
B,2024,1,7
;
run;

 

Expected OUTPUT

bijayadhikar_0-1707171954806.png

 

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

Assuming you have no missing data this is one way. If you do have missing data, you will need another method.

 

Edited to reflect @mkeintz comment (thanks!)


proc sort data=have;
    by disease month year;
run;

proc transpose data=have out=wide prefix=YEAR;
    by disease month;
    id year;
    var tot;
run;

data want;
    set wide;
    by disease month;
    retain running_total;
     if first.disease then running_total=0;
    total=sum(of year2019-year2023);
    running_total+total;
    YTD_AVG_2019_2023=running_total/5;
    drop running_total total _name_;
run;

@bijayadhikar wrote:

Here is my revised desired output.

 

bijayadhikar_0-1707225368863.png

5-year average year-to-date COUNT = all 5 Jan/5


 

View solution in original post

7 REPLIES 7
Reeza
Super User
Is your desired shown correctly for the headings? What value does YTD 2019-2023 provide when the period is not same as the average? I would expect that to be the average. And you want to include the average including the current year? or is that a typo?
bijayadhikar
Obsidian | Level 7

Thank you Reeza. Likely I made a mistake. My objective is to produce 2024 monthly report. Each month of 2024 will be compared avg of last 5 years of that month. i.e. JAN this year will be compared with five previous January divided by 5, so I get historical average to compare with this year. Here is an example report I found and trying to produce automated report that I will schedule in SAS Management Console. Basically I need to figure out 2nd last column of the report below using macro year and month, so that the it will be good next year and so on. You can ignore other YTD columns. I need "5-year average year-to-date COUNT" column. 

https://www.publichealthontario.ca/-/media/Documents/Surveillance-Reports/Infectious/2022/surveillan... 

bijayadhikar
Obsidian | Level 7

Here is my revised desired output.

 

bijayadhikar_0-1707225368863.png

5-year average year-to-date COUNT = all 5 Jan/5

Reeza
Super User

Assuming you have no missing data this is one way. If you do have missing data, you will need another method.

 

Edited to reflect @mkeintz comment (thanks!)


proc sort data=have;
    by disease month year;
run;

proc transpose data=have out=wide prefix=YEAR;
    by disease month;
    id year;
    var tot;
run;

data want;
    set wide;
    by disease month;
    retain running_total;
     if first.disease then running_total=0;
    total=sum(of year2019-year2023);
    running_total+total;
    YTD_AVG_2019_2023=running_total/5;
    drop running_total total _name_;
run;

@bijayadhikar wrote:

Here is my revised desired output.

 

bijayadhikar_0-1707225368863.png

5-year average year-to-date COUNT = all 5 Jan/5


 

bijayadhikar
Obsidian | Level 7

Thank you @Reeza  This works for me. Very apricated.

mkeintz
PROC Star
I think a

if first.disease then call missing(running_total) ;

statement is needed to prevent including prior diseases in the ytd values for the current disease.
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 553 views
  • 4 likes
  • 3 in conversation