BookmarkSubscribeRSS Feed
tradepeter
Calcite | Level 5

Hi all:

   I am wondering how to avoid displaying missing values in a table by proc tabulate. Say, for example, I have a clinical dataset with several variables such as

treatment: group = 'case'/'control'

frailty scale: fscale= 1,2,3

symtom: sym = 1,2,3;

proc data=mydata out=test;

class group;

class fscale sym/ missing;

table fscale sym, (group ALL)*(N PCTN) / missingtext='0';

run;

I put 'missing' in the 'class' statement in order to get correct statistics. However, I don't want summary for missing values e.g

                   Case       Control

fscale    N  PCTN   N PCTN

.              10  10      10  10                            /* I don't want this line */

1              0     0        5    5

......

Is there any way to do that directly? I cannot find this option online. I tried to output the result to a table (test in this case) and remove the record corresponding to the first line. However, there is problem with this approach. Notice there is no value for Case and fscale = 1, which I printed as 0. SAS will not output it in the output table test.

   group fscale N...

    Control 1     10     /* this record is there in test*/

    Case    1      0 ...      /* this will not be outputed, but I need this to reconstruct my table */

Thanks,


11 REPLIES 11
ballardw
Super User

Look at the CLASSDATA and EXCLUSIVE options on the PROC TABULATE statement. CLASSDATA references a dataset containing the various levels of you CLASS variables you are interested in and EXCLUSIVE says to display only the levels in that set.

I think you can get the CLASSDATA set for your purposes by:

Proc summary data=mydata nway;

     class group fscale sym;

     output out=classdataset (drop= _:);

run;

as summary will by default not include missing values of class variables.

tradepeter
Calcite | Level 5

I tried to use CLASSDATA and EXCLUSIVE. However, tabulate will not generate correct summaries if missing value is not set as a level in CLASSDATA. If missing value is set as a valid level, the row&column for missing will still be display in tabulate. Maybe I am not doing it right.......?

Thanks,

ballardw
Super User

This may be a case of pre-processing the data before the final version.

You could use PROC tabulate once to generater an output data set by adding OUT=datasetname on the PROC statement. This will generate additional variable names for the statistics but the class variables stay the same. You also will get something like GROUP_all as a class var. You could then use TABULATE or PRINT to get the desired output using a WHERE fscale ne . ; If there are more variables with missing that you don't want to include you probably need a data step to use more complex logic to keep the ones you want.

shivas
Pyrite | Level 9

Hi,

Is this what you want...or can you post the data also.

data aaa;

input treatment $ 1-8 fscale sym;

cards;

case    1 1

control 2 2

           5 5

control 3 3

;

run;

proc tabulate data=aaa out=test missing;

class treatment;

var fscale sym;

table fscale sym, (treatment ALL)*(N PCTN);

run;

data aaa1;

input treatment $ 1-8 fscale sym;

cards;

case    1 1

control 2 2

case    . .

control 3 3

;

run;

proc tabulate data=aaa1 out=test1 missing;

class treatment;

var fscale sym;

table fscale sym, (treatment ALL)*(N PCTN);

run;

Thanks,

Shiva

Cynthia_sas
SAS Super FREQ

Hi:

  Let's go back to the beginning. I have code that generates exactly 100 observations and then runs a TABULATE step and creates WORK.TEST using the OUT= option.

data mydata;

  infile datalines dlm=',' dsd;

  input ID $ grp $ fscale sym;

return;

datalines;

"1001","case",.,.

"1002","case",.,.

"1003","case",.,.

"1004","case",.,.

"1005","case",.,.

"1006","case",.,.

"1007","case",.,.

"1008","case",.,.

"1009","case",.,.

"1010","case",.,1

"1011","case",2,1

"1012","case",3,2

"1013","case",3,3

"1014","case",2,1

"1015","case",2,2

"1016","case",3,3

"1017","case",3,1

"1018","case",2,2

"1019","case",2,3

"1020","case",3,2

"1021","case",3,3

"1022","case",2,2

"1023","case",3,1

"1024","case",2,3

"1025","case",3,2

"1026","case",2,1

"1027","case",3,3

"1028","case",2,2

"1029","case",2,1

"1030","case",3,3

"1031","case",2,2

"1032","case",2,1

"1033","case",2,2

"1034","case",2,1

"1035","case",2,3

"1036","case",2,1

"1037","case",2,2

"1038","case",2,1

"1039","case",2,3

"1040","case",2,1

"1041","case",2,2

"1042","case",2,1

"1043","case",2,1

"1044","case",2,2

"1045","case",2,3

"1046","case",2,1

"1047","case",3,1

"1048","case",3,2

"1049","case",3,3

"1050","case",3,1

"1051","case",3,2

"1052","case",3,3

"1053","case",3,1

"1054","case",3,2

"1055","case",3,3

"1056","case",3,1

"1057","case",3,2

"1058","case",3,3

"1059","control",.,1

"1060","control",.,1

"1061","control",.,1

"1062","control",.,1

"1063","control",.,1

"1064","control",.,1

"1065","control",.,1

"1066","control",.,1

"1067","control",.,1

"1068","control",.,1

"1069","control",1,1

"1070","control",1,2

"1071","control",1,3

"1072","control",1,1

"1073","control",1,2

"1074","control",2,3

"1075","control",3,1

"1076","control",2,2

"1077","control",3,3

"1078","control",2,1

"1079","control",2,2

"1080","control",3,1

"1081","control",2,3

"1082","control",2,2

"1083","control",2,1

"1084","control",3,3

"1085","control",3,2

"1086","control",2,1

"1087","control",2,3

"1088","control",3,2

"1089","control",2,1

"1090","control",2,2

"1091","control",3,1

"1092","control",3,2

"1093","control",3,3

"1094","control",3,2

"1095","control",3,1

"1096","control",3,3

"1097","control",3,2

"1098","control",3,1

"1099","control",3,3

"1100","control",3,3

;

run;

         

ods listing close;

options nodate nonumber center;

title;

 

ods pdf file='c:\temp\report.pdf' style=journal;

proc tabulate data=mydata out=test f=comma8.;

class grp;

class fscale sym/ missing;

table (fscale ) (sym ),

      (grp ALL)*(N PCTN) / misstext='0' printmiss;

run;

    

proc print data=work.test;

title 'work.test';

run;

ods pdf close;

I used PDF so I could take a screen shot of the TABULATE and PRINT outputs side by side. If you look, you will see that the _TYPE_ column in WORK.TEST can be directly tied back to the rows and columns in your TABULATE table. So the information you want to delete is inside the red circle, where  _TYPE_ 110, and FSCALE = . for both CASE and CONTROL values of the group variable (GRP). Notice how each of the values for _TYPE_ corresponds to a section of the finished table.

If you need the missing values for the percents to calculate correctly, there is no way to tell TABULATE to use the missing, but NOT display the missing. As others have suggested, the most usual way of dealing with what you want to do (use the missing to calculate N and PCTN), but then not display is to post-process the data set from TABULATE: create an output data set from TABULATE, and then post-process with a program that uses the values of _TYPE_ to delete the information you do NOT want. Note that I used the PRINTMISS option, with MISSING in order to show the N and PCTN for FSCALE=1 and have that also appear in WORK.TEST.

In a DATA step program, you could assign a value to the group variable in your program based on the values of _TYPE_...something like:

if _type_ = '010' then grp = 'Overall';

and you could also selectively get rid of the rows you did not want. Then you would have to take this post-processd data and either get it restructured the way you want with PROC TRANSPOSE and then use PROC PRINT or use the data with PROC REPORT to get the report you want. Two possible reports with PROC REPORT are shown -- one with 2 separate tables for each variable and the other with one table for both variables.

My suggestion is that you work to understand TABULATE and getting the output and percents you want in a SAS dataset before you move on to the post-processing and using REPORT or some other method to write out what you need.

cynthia


case_control_explan.jpgtwo_sep_tables.jpgalternate_final.jpg
Araya32097
Fluorite | Level 6

 

 So my code is this - and basically works - except it's giving me missing values ' . ' that I don't want (also a 192 obs data set)

where could I add 'missing'/'no missing'/;misstext' to ensure the table is cleaner?

or do I change my format _all_ to a different variable? 

 

DATA proj4.gas;
SET proj4.gas;
IF MONTH in (1,2,3) THEN Q1 = 1;
ELSE IF MONTH in (4,5,6) THEN Q2 = 2;
ELSE IF MONTH in (7,8,9) THEN Q3 = 3;
ELSE IF MONTH in (10,11,12) THEN Q4 = 4;
quarter = MONTH; FORMAT _all_;
RUN;

 

Kurt_Bremser
Super User

First, a warning:

DATA proj4.gas;
SET proj4.gas;

Do not do this, especially not during code development. If this step fails, it destroys your dataset, and you need to recreate it, either from a backup or by rerunning the code that created it.

Create a new dataset and proceed from there. Only overwrite existing datasets once you have verified that your code has worked.

 

You work with date (or date-related) values, so you should use SAS date values, which opens you up to use the many tools (formats, functions) which make working with date values easier.

SAS dates are counts of days, starting with 1960-01-01 as day zero, and a format assigned to make them human-readable. Such a format can specifically display the quarter, so you do not need to do any calculations.

 

Finally, you attached your question to a 10-year-old (!!!) thread, across which I stumbled by accident. Post your question in a new thread to increase your audience.

Araya32097
Fluorite | Level 6
Thank you so much! I really appreciate your feedback and I am looking it over and dissecting it then digesting it. I feel really lost. I can't even fig out how to take a 192 ob, 15 year data set and chop it into qtrs. I know SQL, some Python and some R. All simple comparatively. I don't know why I am having such a hard time. Thank you again. -Kate
Patrick
Opal | Level 21

@Araya32097 wrote:
Thank you so much! I really appreciate your feedback and I am looking it over and dissecting it then digesting it. I feel really lost. I can't even fig out how to take a 192 ob, 15 year data set and chop it into qtrs. I know SQL, some Python and some R. All simple comparatively. I don't know why I am having such a hard time. Thank you again. -Kate

Do know that you can use SQL with SAS?

proc sql;

...your SQL statements...;

quit;

 

For printing missing numerical values in reports: If you don't want the default period then define somewhere earlier in your code...

options missing=' ';

 

Kurt_Bremser
Super User

With the use of SAS date values, all it takes is a format to make statistical procs work with quarters:

data have;
format period yymmn6.;
do year = 2007 to 2022;
  do month = 1 to 12;
    period = mdy(month,1,year);
    output;
  end;
end;
run;

proc freq data=have;
format period yyq6.;
tables period;
run;

If you need "manual" analysis which you cannot solve with a SAS procedure, create a "quarter" variable by using the same format in a PUT function. You can then use this new variable in a BY statement.

 

Edit: fixed a typo.

Araya32097
Fluorite | Level 6
WOW. This is the first response that makes sense! I will give it a whirl and let you know how it goes. THANK YOU!!

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 11 replies
  • 4885 views
  • 0 likes
  • 7 in conversation