- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi,
I have read a few posts about this, but none of the codes work for what I am trying to do and I am not well-versed enough in proc sql to figure out how to tweak them correctly.
I am trying to prepare data for run charts and study process metrics for a clinical trial. I need to calculate the number of providers enrolled at each month in this study (cumulative). I coded it so that I have which month the provider enrolled in the study and I basically need to calculate how many providers are enrolled during month 1, month 2, month 3, etc. So there are 3 providers enrolled at month 1, and 2 more enroll during month 2. The number of enrolled providers for month 2 would then be 5 (3 enrolled month 1, 2 enrolled month 2). I need a code to calculate the number enrolled per month, cumulatively, and output this as a new variable. It would also be great to be able to get this number by site.
For reference this is what my data looks like:
username site month
user1 117 1
user2 117 1
user3 117 1
user4 104 2
user5 104 2
user6 104 3
user7 104 3
user8 110 3
user9 110 3
user10 114 3
user11 114 3
user12 117 3
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Is provider the same as username? Can a provider be listed more than once?
/* First get non-cumulative results */
proc summary data=have nway;
class site month;
var month;
output out=counts n=count;
run;
/* Next get cumulative results */
data want;
set counts;
by site;
if first.site then cumulative=0;
cumulative+count;
run;
Paige Miller
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Is provider the same as username? Can a provider be listed more than once?
/* First get non-cumulative results */
proc summary data=have nway;
class site month;
var month;
output out=counts n=count;
run;
/* Next get cumulative results */
data want;
set counts;
by site;
if first.site then cumulative=0;
cumulative+count;
run;
Paige Miller
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Providers are only listed once, so the usernames are unique and serve as the providerid. And that worked! With the minor substitution of set counts instead of set want. Thank you 🙂
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@fordcr2 wrote:
With the minor substitution of set counts instead of set want.
Yep. My bad! I went back and made the correction.
Paige Miller
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I know but I needed it in the actual dataset as a variable to be exported - didn't know how to make the cumulative number from proc freq into an output.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
In my example, you could easily replace PROC SUMMARY with PROC FREQ to get non-cumulative results, then the same method to get cumulative results, although I leave the issue of writing the actual code to others.
Paige Miller
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Got it. I actually used proc summary to get the total number of providers across all sites using your code to get the cumulative numbers per site, and then proc summary to total of those numbers for the overall cumulative number.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@fordcr2 wrote:
I know but I needed it in the actual dataset as a variable to be exported - didn't know how to make the cumulative number from proc freq into an output.
Example using the OUTCUM option to get cumulative totals.
proc freq data = sashelp.class noprint; tables age/ out=work.freq outcum ; run;
However that only works on one-way tables since the concept of "cumulative" gets interesting in multidimensional (2 or more variables) terms.