The SAS Output Delivery System and reporting techniques

not to display missing value in a table

Reply
Occasional Contributor
Posts: 12

not to display missing value in a table

Hi all:

   I am wondering how to avoid displaying missing values in a table by proc tabulate. Say, for example, I have a clinical dataset with several variables such as

treatment: group = 'case'/'control'

frailty scale: fscale= 1,2,3

symtom: sym = 1,2,3;

proc data=mydata out=test;

class group;

class fscale sym/ missing;

table fscale sym, (group ALL)*(N PCTN) / missingtext='0';

run;

I put 'missing' in the 'class' statement in order to get correct statistics. However, I don't want summary for missing values e.g

                   Case       Control

fscale    N  PCTN   N PCTN

.              10  10      10  10                            /* I don't want this line */

1              0     0        5    5

......

Is there any way to do that directly? I cannot find this option online. I tried to output the result to a table (test in this case) and remove the record corresponding to the first line. However, there is problem with this approach. Notice there is no value for Case and fscale = 1, which I printed as 0. SAS will not output it in the output table test.

   group fscale N...

    Control 1     10     /* this record is there in test*/

    Case    1      0 ...      /* this will not be outputed, but I need this to reconstruct my table */

Thanks,


Super User
Posts: 11,343

Re: not to display missing value in a table

Posted in reply to tradepeter

Look at the CLASSDATA and EXCLUSIVE options on the PROC TABULATE statement. CLASSDATA references a dataset containing the various levels of you CLASS variables you are interested in and EXCLUSIVE says to display only the levels in that set.

I think you can get the CLASSDATA set for your purposes by:

Proc summary data=mydata nway;

     class group fscale sym;

     output out=classdataset (drop= _Smiley Happy;

run;

as summary will by default not include missing values of class variables.

Occasional Contributor
Posts: 12

Re: not to display missing value in a table

I tried to use CLASSDATA and EXCLUSIVE. However, tabulate will not generate correct summaries if missing value is not set as a level in CLASSDATA. If missing value is set as a valid level, the row&column for missing will still be display in tabulate. Maybe I am not doing it right.......?

Thanks,

Super User
Posts: 11,343

Re: not to display missing value in a table

Posted in reply to tradepeter

This may be a case of pre-processing the data before the final version.

You could use PROC tabulate once to generater an output data set by adding OUT=datasetname on the PROC statement. This will generate additional variable names for the statistics but the class variables stay the same. You also will get something like GROUP_all as a class var. You could then use TABULATE or PRINT to get the desired output using a WHERE fscale ne . ; If there are more variables with missing that you don't want to include you probably need a data step to use more complex logic to keep the ones you want.

Super Contributor
Posts: 349

Re: not to display missing value in a table

Hi,

Is this what you want...or can you post the data also.

data aaa;

input treatment $ 1-8 fscale sym;

cards;

case    1 1

control 2 2

           5 5

control 3 3

;

run;

proc tabulate data=aaa out=test missing;

class treatment;

var fscale sym;

table fscale sym, (treatment ALL)*(N PCTN);

run;

data aaa1;

input treatment $ 1-8 fscale sym;

cards;

case    1 1

control 2 2

case    . .

control 3 3

;

run;

proc tabulate data=aaa1 out=test1 missing;

class treatment;

var fscale sym;

table fscale sym, (treatment ALL)*(N PCTN);

run;

Thanks,

Shiva

SAS Super FREQ
Posts: 8,864

Re: not to display missing value in a table

Posted in reply to tradepeter

Hi:

  Let's go back to the beginning. I have code that generates exactly 100 observations and then runs a TABULATE step and creates WORK.TEST using the OUT= option.

data mydata;

  infile datalines dlm=',' dsd;

  input ID $ grp $ fscale sym;

return;

datalines;

"1001","case",.,.

"1002","case",.,.

"1003","case",.,.

"1004","case",.,.

"1005","case",.,.

"1006","case",.,.

"1007","case",.,.

"1008","case",.,.

"1009","case",.,.

"1010","case",.,1

"1011","case",2,1

"1012","case",3,2

"1013","case",3,3

"1014","case",2,1

"1015","case",2,2

"1016","case",3,3

"1017","case",3,1

"1018","case",2,2

"1019","case",2,3

"1020","case",3,2

"1021","case",3,3

"1022","case",2,2

"1023","case",3,1

"1024","case",2,3

"1025","case",3,2

"1026","case",2,1

"1027","case",3,3

"1028","case",2,2

"1029","case",2,1

"1030","case",3,3

"1031","case",2,2

"1032","case",2,1

"1033","case",2,2

"1034","case",2,1

"1035","case",2,3

"1036","case",2,1

"1037","case",2,2

"1038","case",2,1

"1039","case",2,3

"1040","case",2,1

"1041","case",2,2

"1042","case",2,1

"1043","case",2,1

"1044","case",2,2

"1045","case",2,3

"1046","case",2,1

"1047","case",3,1

"1048","case",3,2

"1049","case",3,3

"1050","case",3,1

"1051","case",3,2

"1052","case",3,3

"1053","case",3,1

"1054","case",3,2

"1055","case",3,3

"1056","case",3,1

"1057","case",3,2

"1058","case",3,3

"1059","control",.,1

"1060","control",.,1

"1061","control",.,1

"1062","control",.,1

"1063","control",.,1

"1064","control",.,1

"1065","control",.,1

"1066","control",.,1

"1067","control",.,1

"1068","control",.,1

"1069","control",1,1

"1070","control",1,2

"1071","control",1,3

"1072","control",1,1

"1073","control",1,2

"1074","control",2,3

"1075","control",3,1

"1076","control",2,2

"1077","control",3,3

"1078","control",2,1

"1079","control",2,2

"1080","control",3,1

"1081","control",2,3

"1082","control",2,2

"1083","control",2,1

"1084","control",3,3

"1085","control",3,2

"1086","control",2,1

"1087","control",2,3

"1088","control",3,2

"1089","control",2,1

"1090","control",2,2

"1091","control",3,1

"1092","control",3,2

"1093","control",3,3

"1094","control",3,2

"1095","control",3,1

"1096","control",3,3

"1097","control",3,2

"1098","control",3,1

"1099","control",3,3

"1100","control",3,3

;

run;

         

ods listing close;

options nodate nonumber center;

title;

 

ods pdf file='c:\temp\report.pdf' style=journal;

proc tabulate data=mydata out=test f=comma8.;

class grp;

class fscale sym/ missing;

table (fscale ) (sym ),

      (grp ALL)*(N PCTN) / misstext='0' printmiss;

run;

    

proc print data=work.test;

title 'work.test';

run;

ods pdf close;

I used PDF so I could take a screen shot of the TABULATE and PRINT outputs side by side. If you look, you will see that the _TYPE_ column in WORK.TEST can be directly tied back to the rows and columns in your TABULATE table. So the information you want to delete is inside the red circle, where  _TYPE_ 110, and FSCALE = . for both CASE and CONTROL values of the group variable (GRP). Notice how each of the values for _TYPE_ corresponds to a section of the finished table.

If you need the missing values for the percents to calculate correctly, there is no way to tell TABULATE to use the missing, but NOT display the missing. As others have suggested, the most usual way of dealing with what you want to do (use the missing to calculate N and PCTN), but then not display is to post-process the data set from TABULATE: create an output data set from TABULATE, and then post-process with a program that uses the values of _TYPE_ to delete the information you do NOT want. Note that I used the PRINTMISS option, with MISSING in order to show the N and PCTN for FSCALE=1 and have that also appear in WORK.TEST.

In a DATA step program, you could assign a value to the group variable in your program based on the values of _TYPE_...something like:

if _type_ = '010' then grp = 'Overall';

and you could also selectively get rid of the rows you did not want. Then you would have to take this post-processd data and either get it restructured the way you want with PROC TRANSPOSE and then use PROC PRINT or use the data with PROC REPORT to get the report you want. Two possible reports with PROC REPORT are shown -- one with 2 separate tables for each variable and the other with one table for both variables.

My suggestion is that you work to understand TABULATE and getting the output and percents you want in a SAS dataset before you move on to the post-processing and using REPORT or some other method to write out what you need.

cynthia


case_control_explan.jpgtwo_sep_tables.jpgalternate_final.jpg
Ask a Question
Discussion stats
  • 5 replies
  • 1332 views
  • 0 likes
  • 4 in conversation