Hello
I have a dataset that looks roughly like this:
Subject ID | Follow-Up Visit | Date |
---|---|---|
1 | 1 | 20/05/14 |
1 | 2 | 20/06/14 |
1 | 3 | |
2 | 1 | 15/04/13 |
2 | 2 | 17/05/13 |
3 | 1 | 25/06/14 |
4 | 1 | 14/03/13 |
4 | 2 | 20/04/13 |
4 | 3 | 27/05/13 |
For each subject I have multiple rows, each representing a different follow-up visit, and the date of the visit (and other columns, not important for my question).
I want to create a summary table (using PROC TABULATE) that gives the number and percentage of subjects completing each follow-up visit (let's say there are 5 of them).
In order for a follow-up visit to be completed, I need it to have a row in the table, with a date which is not a missing value ! Not having a row, or having one, with a missing value, means the follow-up was not completed.
I am not sure how to do this. I thought maybe to "inflate" the number of rows, adding rows that are missing for each subject, with a missing date. But I am not sure how to do this. If you can help me with this, or suggest a better way to tackle this problem, it will be mostly appreciated !
Thank you in advance
You probably want to provide it a CLASSDATA dataset.
proc sql;
select Subject_ID,count(*) as Completed_freq,count(*)/5 as percentage format=percent6.1
from have where date ne .
group by subject_id;
quit;
Am I missing something?
data T;
input SUBJECT ID DATE anydtdte.;
cards;
1 1 20/05/14
1 2 20/06/14
1 3
2 1 15/04/13
2 2 17/05/13
3 1 25/06/14
4 1 14/03/13
4 2 20/04/13
4 3 27/05/13
4 4 27/05/13
4 5 27/05/13
run;
proc tabulate data=T;
where DATE ne .;
class ID;
table ID=' '
,(n='#' pctn='%')
/ box='Visit Number';
run;
Visit Number | # | % |
1 | 4 | 40.00 |
2 | 3 | 30.00 |
3 | 1 | 10.00 |
4 | 1 | 10.00 |
5 | 1 | 10.00 |
Thank you all for replying, 3 different approaches, and all interesting.
Chris, your approach seems straightforward, but I have a question, if non of the subjects completed follow-up #5, will your code provide a line for follow-up number 5 with a percentage of 0% ?
where in the code you tell SAS to do that ?
In this case see Tom's post.
data T;
input SUBJECT ID DATE anydtdte.;
cards;
1 1 20/05/14
1 2 20/06/14
1 3
2 1 15/04/13
2 2 17/05/13
3 1 25/06/14
4 1 14/03/13
4 2 20/04/13
4 3 27/05/13
4 5 27/05/13
run;
data CLASSDATA;
do ID=1 to 5;
output;
end;
run;
option missing='0';
proc tabulate data=T classdata=CLASSDATA;
where DATE ne .;
class ID;
table ID=' '
,(n='#' pctn='%')
/ box='Visit Number';
run;
Visit Number | # | % |
1 | 4 | 44.44 |
2 | 3 | 33.33 |
3 | 1 | 11.11 |
4 | 0 | 0.00 |
5 | 1 | 11.11 |
Thank you, the idea of classdata worked good.
I got a similar table to the one Chris shows. Is there a way to add columns of those who did not complete the follow up visits, and other columns for all ?
for example, if the first row is 4, 44.44%, and let's assume (for example sake) that I have 20 subjects, I want two columns of 16, 55.56% and two columns for all 20, 100%
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.