Hi, relatively new SAS user here. Any help is greatly appreciated!
I need a table with the number of occurrences in each month by 2 criteria (TP and Elevated). For the following example, I need a table showing the number of records with TP=1 and Elevated=1 occurring in January.
Example data:
ID DOV Elevated TP
1 01JAN2017 1 1
2 05JAN2017 0 2
3 07JAN2017 1 3
4 01MAR2017 0 1
5 09MAR2017 1 1
I am able to calculate number of occurrences in each month by TP using:
proc sql;
  /* One row per TP: distinct IDs plus the number of visits in each month */
  CREATE TABLE screen_mth_detail AS
  SELECT TP,
         COUNT(DISTINCT ID) AS TP_Count,
         SUM(DOV BETWEEN '01AUG2017'd AND '31AUG2017'd) AS AUG17_Count,
         SUM(DOV BETWEEN '01SEP2017'd AND '30SEP2017'd) AS SEP17_Count,
         SUM(DOV BETWEEN '01OCT2017'd AND '31OCT2017'd) AS OCT17_Count,
         SUM(DOV BETWEEN '01NOV2017'd AND '30NOV2017'd) AS NOV17_Count,
         SUM(DOV BETWEEN '01DEC2017'd AND '31DEC2017'd) AS DEC17_Count
  FROM epds1
  GROUP BY TP;
quit;
How can I calculate the same but for only the cases in which Elevated=1?
Please supply more example data that illustrates why you need DISTINCT.
Nothing shown indicates the need for DISTINCT and if you don't need it, it's much easier to switch to a SAS PROC such as PROC MEANS which will aggregate data by month automatically without you having to hardcode the time intervals.
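A minimal sketch of the PROC MEANS idea, assuming Elevated is always 0 or 1 and never missing. The dataset names epds_example and month_counts and the output variables Visits and Elevated_Visits are made up for illustration; the rows are the example data from the question.

```sas
/* Example data copied from the question */
data epds_example;
  input ID DOV :date9. Elevated TP;
  format DOV date9.;
  datalines;
1 01JAN2017 1 1
2 05JAN2017 0 2
3 07JAN2017 1 3
4 01MAR2017 0 1
5 09MAR2017 1 1
;

/* A month-level format on the CLASS variable makes PROC MEANS
   group the dates by month, with no hardcoded date ranges */
proc means data=epds_example nway noprint;
  class DOV TP;
  format DOV monyy7.;
  var Elevated;
  /* N = visits in the month/TP cell, SUM = visits with Elevated=1 */
  output out=month_counts(drop=_type_ _freq_)
         n=Visits sum=Elevated_Visits;
run;

proc print data=month_counts noobs;
run;
```

MONYY7. keeps the year in the grouping, so January 2017 and January 2018 would stay separate; a format such as MONNAME. would pool them.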
Looks like I may be making this more complicated than it needs to be. My data looks like this:
ID | DOV | Aged | Score | Elevate | TP |
13 | 8-Aug-17 | 32 | 3 | 0 | 1 |
11 | 9-Aug-17 | 1195 | 14 | 1 | 3 |
19 | 9-Aug-17 | 43 | 0 | 0 | 2 |
14 | 9-Aug-17 | 42 | 14 | 1 | 2 |
10 | 15-Aug-17 | 275 | 14 | 1 | 3 |
14 | 15-Aug-17 | 19 | 12 | 1 | 3 |
17 | 16-Sep-17 | 35 | 0 | 0 | 2 |
18 | 17-Sep-17 | 68 | 14 | 1 | 2 |
15 | 17-Sep-17 | 32 | 3 | 0 | 1 |
I need an output like this (the count of visits at TP 1-3 in each month and, of those, how many were elevated (Elevate=1)):
Month | TP 1 | TP 1 Elevated | TP 2 | TP 2 Elevated | TP 3 | TP 3 Elevated |
August | 1 | 0 | 2 | 1 | 3 | 3 |
Sept | 1 | 0 | 2 | 1 | 0 | 0 |
Thank you for your reply!
Try this:
data have;
input ID DOV :anydtdte. Aged Score Elevate TP ;
datalines;
13 8-Aug-17 32 3 0 1
11 9-Aug-17 1195 14 1 3
19 9-Aug-17 43 0 0 2
14 9-Aug-17 42 14 1 2
10 15-Aug-17 275 14 1 3
14 15-Aug-17 19 12 1 3
17 16-Sep-17 35 0 0 2
18 17-Sep-17 68 14 1 2
15 17-Sep-17 32 3 0 1
;
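/* Count visits per month x TP x Elevate; the MONNAME format groups dates by month name */
proc freq data=have;
  format DOV monname10.;
  table DOV*tp*elevate / out=freqs sparse noprint;
run;
/* Build a column name: "TP 1" when Elevate=0, "TP 1 Elevated" when Elevate=1 */
data fixed;
  set freqs;
  id = catx(" ", "TP", tp, ifc(elevate, "Elevated", ""));
run;
/* One row per month, one column per TP / Elevate combination */
proc transpose data=fixed out=want(drop=_: );
  by DOV;
  id id;
  idlabel id;
  var count;
run;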
Just create a variable using the MONTH function, i.e. month = month(DOV), and then use PROC FREQ as someone else indicated, except use the LIST option on the TABLES statement, i.e. tables month*tp*elevate / list;
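A rough sketch of that suggestion, for illustration only. The dataset names have_small and have_month and the variable visit_month are hypothetical, and the rows are a few of the poster's records with the dates retyped in DATE9. form.

```sas
/* A few rows from the thread's data, dates retyped as ddMONyyyy */
data have_small;
  input ID DOV :date9. Elevate TP;
  format DOV date9.;
  datalines;
13 08AUG2017 0 1
11 09AUG2017 1 3
14 09AUG2017 1 2
17 16SEP2017 0 2
18 17SEP2017 1 2
;

/* Store the month as an explicit variable (1-12) */
data have_month;
  set have_small;
  visit_month = month(DOV);
run;

/* LIST prints one row per month/TP/Elevate combination
   instead of a stack of two-way tables */
proc freq data=have_month;
  tables visit_month*TP*Elevate / list;
run;
```

MONTH() returns only the month number, so data spanning more than one year would need a year variable as well to keep, say, August 2017 and August 2018 apart.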
@pau13rown wrote:
just create a var using the month function ie month=month(), and then use proc freq as someone else indicated, except use the 'list' options on the table statement ie
tables date*tp*elevate /list; format date monname3.;
That's not required; in SAS you can apply the month format within PROC FREQ, so you don't need a new variable.
'Not required' depends on your temperament. Having a variable that is merely implied doesn't encourage defensive coding, etc.
What's 'defensive coding'?
It's a euphemism for paranoia in the drug industry. For example, if I "know" my text variable (a lab parameter, say) is in caps, I still write my code as "if upcase(param)='HCT' then ..." rather than simply "if param='HCT' then ...". If you make an error in industry, there is nothing you can say to get you out of it; the cost is too high, hence defensive coding.
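To make the UPCASE example concrete, here is a small sketch. The dataset labs, its values, and the flag names are invented for illustration; only the param='HCT' comparison comes from the post.

```sas
/* Hypothetical lab data where the case of PARAM is not consistent */
data labs;
  input param $;
  datalines;
HCT
hct
Hgb
;

data flagged;
  set labs;
  /* Naive test: matches only the exact upper-case spelling */
  is_hct_naive = (param = 'HCT');
  /* Defensive test: normalise case and padding before comparing */
  is_hct_safe = (upcase(strip(param)) = 'HCT');
run;

proc print data=flagged noobs;
run;
```

The two flags differ only on the lower-case "hct" row, which is the case the defensive version is meant to catch.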
I agree with the virtues of defensive coding. I consider the temporary association of a format inside an analysis procedure such as PROC FREQ to be defensive against any format association that might have been made outside the procedure. It also defends against the multiplication of versions of your data.
Well, we could waste time discussing it; the industry is a different universe and we'll speak past each other. There would be no multiplication of data, obviously. The analysis plan would stipulate or imply that the variable is needed, so it would exist in a permanent dataset (refer to CDISC, SDTM, ADaM). The format applied would be indicated within that documentation, etc. When I said 'etc' above I was alluding to validation: you cannot validate a variable that doesn't exist. The process is extremely pedantic; maybe they document things with PROC COMPARE, etc. Thus, throwing a format within PROC FREQ shows an old-school nonchalance that doesn't exist anymore. Each to his own, but I would try to encourage (only incidentally) good programming practice when discussing code: http://www.phusewiki.org/wiki/index.php?title=Good_Programming_Practice