Hello,
I have the following sample codes listed below. I found out that the Age_group counts in Proc Freg are much higher than the sum of Proc Tabulate [(Underly=0)+(Underly=1)]. Could anyone explain this strange situation?
data test_1;
set test;
if 2 =< ageyears =< 4 then Age_group=1;
else if 5 <= ageyears <= 11 then Age_group=2;
else if 12 <= ageyears <= 17 then Age_group=3;
if preterm=1 or lung=1 or airway=1 or heart=1 or gastro=1 or endo=1 or heme=1 or immuncpm=1 or gm=1 or other=1 then Underly=1;
else Underly=0;
run;
Proc freq data=test_1;
table Underly/missing;
where Age_group in (1,2,3);
run;
proc tabulate data=test_1;
class Age_group Sex race C_Underly;
tables (Age_group Sex race)*(n colpctn*f=4.2), Underly;
where Age_group in (1,2,3);
format Age_group Age_group. Sex Sex. Race Race.;
run;
@ybz12003 wrote:
I added Missing option in the Proc Tabulate table; I found out all the Age_group counts are increasing to the same as the Proc Freq. So my question is, although no missing value in the Age_group, if there are missing value in other class (Sex or Race), Proc Tabulate is counting the subjects that are not missing in any of the three classes. Am I right?
Any variable that appears on a CLASS statement that does not include the option MISSING will exclude the observations with missing values.
Note that you can have multiple CLASS statements in Proc Tabulate with different options. So you may leverage this behavior if desirable by including some variables on a Class statement with missing and others on a different Class statement without the missing.
I think my person maximum for Class statements in a single Proc Tabulate call is seven. Using ORDER= options, PRELOADFMT and/or MLF (Tabulate being one of the few procedures that support multilabel formats) and mixes of Missing.
If you haven't noticed yet Tabulate will also allow multiple table statements though all tables will use the same Class, Classlev, Var and other non-Table statement settings.
Can you please show us the output, so we can see what is different?
Can you please show us the data (or a portion of it) in data set TEST_1 as working SAS data step code, so we can see what the program is doing on your data (as you did in an earlier thread).
Proc Freq Age_group=1, N=8678
Proc Tabulate Underly=0 Age_group=1196
Proc Tabulate Underly=1 Age_group=2328
As shown above, Sum N [Underly=0 + Underly=1] not equal Freq N
Why do I have to ask twice to see (a portion of) the data? I shouldn't even have to ask once, by now @ybz12003 you should know we are going to ask to see (a portion of) the data. You should just provide the data without us even asking. This will enable us to provide you with quicker and better answers.
If you have missing values for any of the CLASS variables, add the MISSING option to your PROC TABULATE statement. Otherwise a record with a missing value for any class variable will be excluded from the table.
I added Missing option in the Proc Tabulate table; I found out all the Age_group counts are increasing to the same as the Proc Freq. So my question is, although no missing value in the Age_group, if there are missing value in other class (Sex or Race), Proc Tabulate is counting the subjects that are not missing in any of the three classes. Am I right?
@ybz12003 wrote:
I added Missing option in the Proc Tabulate table; I found out all the Age_group counts are increasing to the same as the Proc Freq. So my question is, although no missing value in the Age_group, if there are missing value in other class (Sex or Race), Proc Tabulate is counting the subjects that are not missing in any of the three classes. Am I right?
Any variable that appears on a CLASS statement that does not include the option MISSING will exclude the observations with missing values.
Note that you can have multiple CLASS statements in Proc Tabulate with different options. So you may leverage this behavior if desirable by including some variables on a Class statement with missing and others on a different Class statement without the missing.
I think my person maximum for Class statements in a single Proc Tabulate call is seven. Using ORDER= options, PRELOADFMT and/or MLF (Tabulate being one of the few procedures that support multilabel formats) and mixes of Missing.
If you haven't noticed yet Tabulate will also allow multiple table statements though all tables will use the same Class, Classlev, Var and other non-Table statement settings.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.