DATA Step, Macro, Functions and more

Summarizing a data with missing rows

Reply
Regular Contributor
Posts: 180

Summarizing a data with missing rows

Hello

I have a dataset that looks roughly like this:

Subject IDFollow-Up VisitDate
1120/05/14
1220/06/14
13
2115/04/13
2217/05/13
3125/06/14
4114/03/13
4220/04/13
4327/05/13

For each subject I have multiple rows, each representing a different follow-up visit, and the date of the visit (and other columns, not important for my question).

I want to create a summary table (using PROC TABULATE) that gives the number and percentage of subjects completing each follow-up visit (let's say there are 5 of them).

In order for a follow-up visit to be completed, I need it to have a row in the table, with a date which is not a missing value ! Not having a row, or having one, with a missing value, means the follow-up was not completed.

I am not sure how to do this. I thought maybe to "inflate" the number of rows, adding rows that are missing for each subject, with a missing date. But I am not sure how to do this. If you can help me with this, or suggest a better way to tackle this problem, it will be mostly appreciated !

Thank you in advance

Super User
Super User
Posts: 7,076

Re: Summarizing a data with missing rows

Trusted Advisor
Posts: 1,230

Re: Summarizing a data with missing rows

proc sql;

select Subject_ID,count(*) as Completed_freq,count(*)/5 as percentage format=percent6.1

from have where date ne .

group by subject_id;

quit;

PROC Star
Posts: 1,760

Re: Summarizing a data with missing rows

Am I missing something?

data T;

input SUBJECT ID DATE anydtdte.;   

cards;

1     1     20/05/14

1     2     20/06/14

1     3    

2     1     15/04/13

2     2     17/05/13

3     1     25/06/14

4     1     14/03/13

4     2     20/04/13

4     3     27/05/13

4     4     27/05/13

4     5     27/05/13

run;

proc tabulate data=T;

  where DATE ne .;

  class ID;

  table ID=' '

       ,(n='#'  pctn='%')

       / box='Visit Number';

run;

Visit Number

#

%

1

4

40.00

2

3

30.00

3

1

10.00

4

1

10.00

5

1

10.00

Regular Contributor
Posts: 180

Re: Summarizing a data with missing rows

Thank you all for replying, 3 different approaches, and all interesting.

Chris, your approach seems straightforward, but I have a question, if non of the subjects completed follow-up #5, will your code provide a line for follow-up number 5 with a percentage of 0% ?

where in the code you tell SAS to do that ?

PROC Star
Posts: 1,760

Re: Summarizing a data with missing rows

In this case see Tom's post.

data T;

input SUBJECT ID DATE anydtdte.;   

cards;

1     1     20/05/14

1     2     20/06/14

1     3    

2     1     15/04/13

2     2     17/05/13

3     1     25/06/14

4     1     14/03/13

4     2     20/04/13

4     3     27/05/13

4     5     27/05/13

run;

data CLASSDATA;

  do ID=1 to 5;

    output;

  end;

run;

option missing='0';

proc tabulate data=T classdata=CLASSDATA;

  where DATE ne .;

  class ID;

  table ID=' '

       ,(n='#'  pctn='%')

       / box='Visit Number';

run;

Visit Number

#

%

1

4

44.44

2

3

33.33

3

1

11.11

4

0

0.00

5

1

11.11

Regular Contributor
Posts: 180

Re: Summarizing a data with missing rows

Thank you, the idea of classdata worked good.

I got a similar table to the one Chris shows. Is there a way to add columns of those who did not complete the follow up visits, and other columns for all ?

for example, if the first row is 4, 44.44%, and let's assume (for example sake) that I have 20 subjects, I want two columns of 16, 55.56% and two columns for all 20, 100%

Ask a Question
Discussion stats
  • 6 replies
  • 388 views
  • 0 likes
  • 4 in conversation