I'm new to SAS and have cleaned my dataset and have it as follows:
VxDate | Moderna | Pfizer | Spikevax | Total | Year | VxMonth |
07/05/2023 | 0 | 1 | 0 | 1 | 2023 | July |
08/07/2023 | 0 | 1 | 0 | 1 | 2023 | August |
09/05/2023 | 1 | 1 | 1 | 3 | 2023 | September |
I would like to create a report that summarizes the vaccines by month and the total vaccines administered each month. The above only shows 3 dates, my data has dates from Jan 2023 to Dec 2023.
I would like data in the following format:
VaxType | July | August | September | Total |
Moderna |
|
|
|
|
Pfizer |
|
|
|
|
Spikevax |
|
|
|
|
Thank you
There's a lot of things to get right here in order to produce the desired table. First, a long data set (like HAVE_T in the program below) is always preferable to a wide data set, but you start with a wide data set. Why is a long data set preferable? Because almost all SAS data handling and analysis procedures work properly with a long data set. So as a long-term best practice, do not create wide data sets, create long data sets.
So the first step is to use PROC TRANSPOSE to create a long data set named HAVE_T. I cannot work with HAVE, but I can work with HAVE_T.
Next, here in the SAS Communities, we prefer (and usually insist) that data be presented as working SAS data step code. Screen captures and table images such as the one your present are difficult to work with, and so many people will request you create a data set that is working SAS data step code, as I have presented (please see examples and instructions). I did the work to create SAS code in this case, please you do the work from now on to present data that way, and test your code to make sure it work before including it into the problem statement.
Lastly, you don't need a variable MONTH or YEAR to do this, in fact those are redundant if you have an actual valid numeric SAS date. Note that my code only uses VXDATE and not MONTH or YEAR. If you were to work with text strings such as July, August and September, SAS will sort those alphabetically and August then becomes the first month of those three. If you work with dates as numbers (as I am doing), then you can tell SAS to sort in numerical order (order=internal), and so July comes first numerically, then August and so on. I also don't use variable TOTAL, but rather compute it in PROC REPORT.
data have;
input VxDate :mmddyy10. Moderna Pfizer Spikevax Total Year VxMonth $9.;
format vxdate date9.;
cards;
07/05/2023 0 1 0 1 2023 July
08/07/2023 0 1 0 1 2023 August
09/05/2023 1 1 1 3 2023 September
;
proc transpose data=have out=have_t;
by vxdate;
var moderna pfizer spikevax;
run;
proc report data=have_t;
columns _name_ col1,vxdate col1=total;
define _name_/group ' ';
define col1/' ';
define vxdate/across format=monname12. ' ' order=internal;
define total/sum 'Total';
run;
There's a lot of things to get right here in order to produce the desired table. First, a long data set (like HAVE_T in the program below) is always preferable to a wide data set, but you start with a wide data set. Why is a long data set preferable? Because almost all SAS data handling and analysis procedures work properly with a long data set. So as a long-term best practice, do not create wide data sets, create long data sets.
So the first step is to use PROC TRANSPOSE to create a long data set named HAVE_T. I cannot work with HAVE, but I can work with HAVE_T.
Next, here in the SAS Communities, we prefer (and usually insist) that data be presented as working SAS data step code. Screen captures and table images such as the one your present are difficult to work with, and so many people will request you create a data set that is working SAS data step code, as I have presented (please see examples and instructions). I did the work to create SAS code in this case, please you do the work from now on to present data that way, and test your code to make sure it work before including it into the problem statement.
Lastly, you don't need a variable MONTH or YEAR to do this, in fact those are redundant if you have an actual valid numeric SAS date. Note that my code only uses VXDATE and not MONTH or YEAR. If you were to work with text strings such as July, August and September, SAS will sort those alphabetically and August then becomes the first month of those three. If you work with dates as numbers (as I am doing), then you can tell SAS to sort in numerical order (order=internal), and so July comes first numerically, then August and so on. I also don't use variable TOTAL, but rather compute it in PROC REPORT.
data have;
input VxDate :mmddyy10. Moderna Pfizer Spikevax Total Year VxMonth $9.;
format vxdate date9.;
cards;
07/05/2023 0 1 0 1 2023 July
08/07/2023 0 1 0 1 2023 August
09/05/2023 1 1 1 3 2023 September
;
proc transpose data=have out=have_t;
by vxdate;
var moderna pfizer spikevax;
run;
proc report data=have_t;
columns _name_ col1,vxdate col1=total;
define _name_/group ' ';
define col1/' ';
define vxdate/across format=monname12. ' ' order=internal;
define total/sum 'Total';
run;
@PaigeMiller Thank you so much for the detailed explanation, this worked. Moving forward I'll also make sure that my data is presented as a working SAS data step code.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.