Hi,
I have read a few posts about this, but none of the codes work for what I am trying to do and I am not well-versed enough in proc sql to figure out how to tweak them correctly.
I am trying to prepare data for run charts and study process metrics for a clinical trial. I need to calculate the number of providers enrolled at each month in this study (cumulative). I coded it so that I have which month the provider enrolled in the study and I basically need to calculate how many providers are enrolled during month 1, month 2, month 3, etc. So there are 3 providers enrolled at month 1, and 2 more enroll during month 2. The number of enrolled providers for month 2 would then be 5 (3 enrolled month 1, 2 enrolled month 2). I need a code to calculate the number enrolled per month, cumulatively, and output this as a new variable. It would also be great to be able to get this number by site.
For reference this is what my data looks like:
username site month
user1 117 1
user2 117 1
user3 117 1
user4 104 2
user5 104 2
user6 104 3
user7 104 3
user8 110 3
user9 110 3
user10 114 3
user11 114 3
user12 117 3
Is provider the same as username? Can a provider be listed more than once?
/* First get non-cumulative results */
proc summary data=have nway;
class site month;
var month;
output out=counts n=count;
run;
/* Next get cumulative results */
data want;
set counts;
by site;
if first.site then cumulative=0;
cumulative+count;
run;
Is provider the same as username? Can a provider be listed more than once?
/* First get non-cumulative results */
proc summary data=have nway;
class site month;
var month;
output out=counts n=count;
run;
/* Next get cumulative results */
data want;
set counts;
by site;
if first.site then cumulative=0;
cumulative+count;
run;
Providers are only listed once, so the usernames are unique and serve as the providerid. And that worked! With the minor substitution of set counts instead of set want. Thank you 🙂
@fordcr2 wrote:
With the minor substitution of set counts instead of set want.
Yep. My bad! I went back and made the correction.
I know but I needed it in the actual dataset as a variable to be exported - didn't know how to make the cumulative number from proc freq into an output.
In my example, you could easily replace PROC SUMMARY with PROC FREQ to get non-cumulative results, then the same method to get cumulative results, although I leave the issue of writing the actual code to others.
Got it. I actually used proc summary to get the total number of providers across all sites using your code to get the cumulative numbers per site, and then proc summary to total of those numbers for the overall cumulative number.
@fordcr2 wrote:
I know but I needed it in the actual dataset as a variable to be exported - didn't know how to make the cumulative number from proc freq into an output.
Example using the OUTCUM option to get cumulative totals.
proc freq data = sashelp.class noprint; tables age/ out=work.freq outcum ; run;
However that only works on one-way tables since the concept of "cumulative" gets interesting in multidimensional (2 or more variables) terms.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.