Hello
I have a dataset that looks roughly like this:
Subject ID | Follow-Up Visit | Date |
---|---|---|
1 | 1 | 20/05/14 |
1 | 2 | 20/06/14 |
1 | 3 | |
2 | 1 | 15/04/13 |
2 | 2 | 17/05/13 |
3 | 1 | 25/06/14 |
4 | 1 | 14/03/13 |
4 | 2 | 20/04/13 |
4 | 3 | 27/05/13 |
For each subject I have multiple rows, each representing a different follow-up visit, and the date of the visit (and other columns, not important for my question).
I want to create a summary table (using PROC TABULATE) that gives the number and percentage of subjects completing each follow-up visit (let's say there are 5 of them).
In order for a follow-up visit to be completed, I need it to have a row in the table, with a date which is not a missing value ! Not having a row, or having one, with a missing value, means the follow-up was not completed.
I am not sure how to do this. I thought maybe to "inflate" the number of rows, adding rows that are missing for each subject, with a missing date. But I am not sure how to do this. If you can help me with this, or suggest a better way to tackle this problem, it will be mostly appreciated !
Thank you in advance
You probably want to provide it a CLASSDATA dataset.
proc sql;
select Subject_ID,count(*) as Completed_freq,count(*)/5 as percentage format=percent6.1
from have where date ne .
group by subject_id;
quit;
Am I missing something?
data T;
input SUBJECT ID DATE anydtdte.;
cards;
1 1 20/05/14
1 2 20/06/14
1 3
2 1 15/04/13
2 2 17/05/13
3 1 25/06/14
4 1 14/03/13
4 2 20/04/13
4 3 27/05/13
4 4 27/05/13
4 5 27/05/13
run;
proc tabulate data=T;
where DATE ne .;
class ID;
table ID=' '
,(n='#' pctn='%')
/ box='Visit Number';
run;
Visit Number | # | % |
1 | 4 | 40.00 |
2 | 3 | 30.00 |
3 | 1 | 10.00 |
4 | 1 | 10.00 |
5 | 1 | 10.00 |
Thank you all for replying, 3 different approaches, and all interesting.
Chris, your approach seems straightforward, but I have a question, if non of the subjects completed follow-up #5, will your code provide a line for follow-up number 5 with a percentage of 0% ?
where in the code you tell SAS to do that ?
In this case see Tom's post.
data T;
input SUBJECT ID DATE anydtdte.;
cards;
1 1 20/05/14
1 2 20/06/14
1 3
2 1 15/04/13
2 2 17/05/13
3 1 25/06/14
4 1 14/03/13
4 2 20/04/13
4 3 27/05/13
4 5 27/05/13
run;
data CLASSDATA;
do ID=1 to 5;
output;
end;
run;
option missing='0';
proc tabulate data=T classdata=CLASSDATA;
where DATE ne .;
class ID;
table ID=' '
,(n='#' pctn='%')
/ box='Visit Number';
run;
Visit Number | # | % |
1 | 4 | 44.44 |
2 | 3 | 33.33 |
3 | 1 | 11.11 |
4 | 0 | 0.00 |
5 | 1 | 11.11 |
Thank you, the idea of classdata worked good.
I got a similar table to the one Chris shows. Is there a way to add columns of those who did not complete the follow up visits, and other columns for all ?
for example, if the first row is 4, 44.44%, and let's assume (for example sake) that I have 20 subjects, I want two columns of 16, 55.56% and two columns for all 20, 100%
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.