Hi all:
I am wondering how to avoid displaying missing values in a table by proc tabulate. Say, for example, I have a clinical dataset with several variables such as
treatment: group = 'case'/'control'
frailty scale: fscale= 1,2,3
symtom: sym = 1,2,3;
proc data=mydata out=test;
class group;
class fscale sym/ missing;
table fscale sym, (group ALL)*(N PCTN) / missingtext='0';
run;
I put 'missing' in the 'class' statement in order to get correct statistics. However, I don't want summary for missing values e.g
Case Control
fscale N PCTN N PCTN
. 10 10 10 10 /* I don't want this line */
1 0 0 5 5
......
Is there any way to do that directly? I cannot find this option online. I tried to output the result to a table (test in this case) and remove the record corresponding to the first line. However, there is problem with this approach. Notice there is no value for Case and fscale = 1, which I printed as 0. SAS will not output it in the output table test.
group fscale N...
Control 1 10 /* this record is there in test*/
Case 1 0 ... /* this will not be outputed, but I need this to reconstruct my table */
Thanks,
Look at the CLASSDATA and EXCLUSIVE options on the PROC TABULATE statement. CLASSDATA references a dataset containing the various levels of you CLASS variables you are interested in and EXCLUSIVE says to display only the levels in that set.
I think you can get the CLASSDATA set for your purposes by:
Proc summary data=mydata nway;
class group fscale sym;
output out=classdataset (drop= _:);
run;
as summary will by default not include missing values of class variables.
I tried to use CLASSDATA and EXCLUSIVE. However, tabulate will not generate correct summaries if missing value is not set as a level in CLASSDATA. If missing value is set as a valid level, the row&column for missing will still be display in tabulate. Maybe I am not doing it right.......?
Thanks,
This may be a case of pre-processing the data before the final version.
You could use PROC tabulate once to generater an output data set by adding OUT=datasetname on the PROC statement. This will generate additional variable names for the statistics but the class variables stay the same. You also will get something like GROUP_all as a class var. You could then use TABULATE or PRINT to get the desired output using a WHERE fscale ne . ; If there are more variables with missing that you don't want to include you probably need a data step to use more complex logic to keep the ones you want.
Hi,
Is this what you want...or can you post the data also.
data aaa;
input treatment $ 1-8 fscale sym;
cards;
case 1 1
control 2 2
5 5
control 3 3
;
run;
proc tabulate data=aaa out=test missing;
class treatment;
var fscale sym;
table fscale sym, (treatment ALL)*(N PCTN);
run;
data aaa1;
input treatment $ 1-8 fscale sym;
cards;
case 1 1
control 2 2
case . .
control 3 3
;
run;
proc tabulate data=aaa1 out=test1 missing;
class treatment;
var fscale sym;
table fscale sym, (treatment ALL)*(N PCTN);
run;
Thanks,
Shiva
Hi:
Let's go back to the beginning. I have code that generates exactly 100 observations and then runs a TABULATE step and creates WORK.TEST using the OUT= option.
data mydata;
infile datalines dlm=',' dsd;
input ID $ grp $ fscale sym;
return;
datalines;
"1001","case",.,.
"1002","case",.,.
"1003","case",.,.
"1004","case",.,.
"1005","case",.,.
"1006","case",.,.
"1007","case",.,.
"1008","case",.,.
"1009","case",.,.
"1010","case",.,1
"1011","case",2,1
"1012","case",3,2
"1013","case",3,3
"1014","case",2,1
"1015","case",2,2
"1016","case",3,3
"1017","case",3,1
"1018","case",2,2
"1019","case",2,3
"1020","case",3,2
"1021","case",3,3
"1022","case",2,2
"1023","case",3,1
"1024","case",2,3
"1025","case",3,2
"1026","case",2,1
"1027","case",3,3
"1028","case",2,2
"1029","case",2,1
"1030","case",3,3
"1031","case",2,2
"1032","case",2,1
"1033","case",2,2
"1034","case",2,1
"1035","case",2,3
"1036","case",2,1
"1037","case",2,2
"1038","case",2,1
"1039","case",2,3
"1040","case",2,1
"1041","case",2,2
"1042","case",2,1
"1043","case",2,1
"1044","case",2,2
"1045","case",2,3
"1046","case",2,1
"1047","case",3,1
"1048","case",3,2
"1049","case",3,3
"1050","case",3,1
"1051","case",3,2
"1052","case",3,3
"1053","case",3,1
"1054","case",3,2
"1055","case",3,3
"1056","case",3,1
"1057","case",3,2
"1058","case",3,3
"1059","control",.,1
"1060","control",.,1
"1061","control",.,1
"1062","control",.,1
"1063","control",.,1
"1064","control",.,1
"1065","control",.,1
"1066","control",.,1
"1067","control",.,1
"1068","control",.,1
"1069","control",1,1
"1070","control",1,2
"1071","control",1,3
"1072","control",1,1
"1073","control",1,2
"1074","control",2,3
"1075","control",3,1
"1076","control",2,2
"1077","control",3,3
"1078","control",2,1
"1079","control",2,2
"1080","control",3,1
"1081","control",2,3
"1082","control",2,2
"1083","control",2,1
"1084","control",3,3
"1085","control",3,2
"1086","control",2,1
"1087","control",2,3
"1088","control",3,2
"1089","control",2,1
"1090","control",2,2
"1091","control",3,1
"1092","control",3,2
"1093","control",3,3
"1094","control",3,2
"1095","control",3,1
"1096","control",3,3
"1097","control",3,2
"1098","control",3,1
"1099","control",3,3
"1100","control",3,3
;
run;
ods listing close;
options nodate nonumber center;
title;
ods pdf file='c:\temp\report.pdf' style=journal;
proc tabulate data=mydata out=test f=comma8.;
class grp;
class fscale sym/ missing;
table (fscale ) (sym ),
(grp ALL)*(N PCTN) / misstext='0' printmiss;
run;
proc print data=work.test;
title 'work.test';
run;
ods pdf close;
I used PDF so I could take a screen shot of the TABULATE and PRINT outputs side by side. If you look, you will see that the _TYPE_ column in WORK.TEST can be directly tied back to the rows and columns in your TABULATE table. So the information you want to delete is inside the red circle, where _TYPE_ 110, and FSCALE = . for both CASE and CONTROL values of the group variable (GRP). Notice how each of the values for _TYPE_ corresponds to a section of the finished table.
If you need the missing values for the percents to calculate correctly, there is no way to tell TABULATE to use the missing, but NOT display the missing. As others have suggested, the most usual way of dealing with what you want to do (use the missing to calculate N and PCTN), but then not display is to post-process the data set from TABULATE: create an output data set from TABULATE, and then post-process with a program that uses the values of _TYPE_ to delete the information you do NOT want. Note that I used the PRINTMISS option, with MISSING in order to show the N and PCTN for FSCALE=1 and have that also appear in WORK.TEST.
In a DATA step program, you could assign a value to the group variable in your program based on the values of _TYPE_...something like:
if _type_ = '010' then grp = 'Overall';
and you could also selectively get rid of the rows you did not want. Then you would have to take this post-processd data and either get it restructured the way you want with PROC TRANSPOSE and then use PROC PRINT or use the data with PROC REPORT to get the report you want. Two possible reports with PROC REPORT are shown -- one with 2 separate tables for each variable and the other with one table for both variables.
My suggestion is that you work to understand TABULATE and getting the output and percents you want in a SAS dataset before you move on to the post-processing and using REPORT or some other method to write out what you need.
cynthia
So my code is this - and basically works - except it's giving me missing values ' . ' that I don't want (also a 192 obs data set)
where could I add 'missing'/'no missing'/;misstext' to ensure the table is cleaner?
or do I change my format _all_ to a different variable?
DATA proj4.gas;
SET proj4.gas;
IF MONTH in (1,2,3) THEN Q1 = 1;
ELSE IF MONTH in (4,5,6) THEN Q2 = 2;
ELSE IF MONTH in (7,8,9) THEN Q3 = 3;
ELSE IF MONTH in (10,11,12) THEN Q4 = 4;
quarter = MONTH; FORMAT _all_;
RUN;
First, a warning:
DATA proj4.gas;
SET proj4.gas;
Do not do this, especially not during code development. If this step fails, it destroys your dataset, and you need to recreate it, either from a backup or by rerunning the code that created it.
Create a new dataset and proceed from there. Only overwrite existing datasets once you have verified that your code has worked.
You work with date (or date-related) values, so you should use SAS date values, which opens you up to use the many tools (formats, functions) which make working with date values easier.
SAS dates are counts of days, starting with 1960-01-01 as day zero, and a format assigned to make them human-readable. Such a format can specifically display the quarter, so you do not need to do any calculations.
Finally, you attached your question to a 10-year-old (!!!) thread, across which I stumbled by accident. Post your question in a new thread to increase your audience.
@Araya32097 wrote:
Thank you so much! I really appreciate your feedback and I am looking it over and dissecting it then digesting it. I feel really lost. I can't even fig out how to take a 192 ob, 15 year data set and chop it into qtrs. I know SQL, some Python and some R. All simple comparatively. I don't know why I am having such a hard time. Thank you again. -Kate
Do know that you can use SQL with SAS?
proc sql;
...your SQL statements...;
quit;
For printing missing numerical values in reports: If you don't want the default period then define somewhere earlier in your code...
options missing=' ';
With the use of SAS date values, all it takes is a format to make statistical procs work with quarters:
data have;
format period yymmn6.;
do year = 2007 to 2022;
do month = 1 to 12;
period = mdy(month,1,year);
output;
end;
end;
run;
proc freq data=have;
format period yyq6.;
tables period;
run;
If you need "manual" analysis which you cannot solve with a SAS procedure, create a "quarter" variable by using the same format in a PUT function. You can then use this new variable in a BY statement.
Edit: fixed a typo.
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.