I recently learned about using PRELOADFMT in MEANS, SUMMARY, and REPORT procedures. Can I achieve this when counting the number of Records in PROC FREQ or SQL?
In the following example, I also look for my data to display the sex= 'Missing' record count as '0'. In my way, I used to create a dummy record and then append it to my dataset. But I would like to learn the easy way if that's possible.
Proc FREQ with 'Sparse' option ,I was able to get the '0' for sex ='Female' and Age =16 . But I am don't have the 'Missing' sex.
in SQL I don't get any of the missing counts.
proc format;
value $shf 'F'= 'Female'
'M'= 'Male'
'U' = 'Missing';
quit;
proc freq data = sashelp.class noprint;
tables sex*age/ sparse out= cnt0 nocum norow nopercent;
format sex $shf.;
quit;
proc sql;
create table cnt1 as
select sex format $shf.,age, count(*) as cnt from sashelp.class
group by sex,age;
quit;
Maybe use one of the procedures that uses preloadfmt to create an output data set.
Can you show what you actually expect for output? If you expect a missing for each for each age that can be done in Sql be creating an output set from the Proc Format code and joining that with the data:
proc format out=work.shffmt; value $shf 'F'= 'Female' 'M'= 'Male' 'U' = 'Missing'; select $shf; quit; proc sql; create table cnt1 as select b.start as sex format $shf.,age, count(*) as cnt from sashelp.class as a ,work.shffmt as b group by b.start,age; quit;
But that gets very cumbersome if you want to do this for multiple variables.
Thank you, @ballardw. If with sql code works and is an easy solution, I am happy with that. Your code almost got what I wanted, but I have a question (request). There is no 'Female ' with age 16 in the ' class' data set ( selected observation in image)
This is the output I am looking for (looking for highlighted one also to display in my final output along with other rows that have the count).
You don't want to count(*). Instead you want to count(something) where something is a variable that will MISSING when the combination does not appear in the actual data and NOT missing when it does appear in the actual data.
Thanks, @Tom. Unfortunately, I couldn't understand what you were saying. If I replace the "something" with any Variable, it gives me back the normal SQL counts with fewer records ( excluding those that are not in my data). However, I am looking for my final data to display like this.
How I want:
As per @ballardw Suggesion: ( Sex ='Missing' is shows up in cnt1, but there count also changed along with male and female counts)
The COUNT() function counts the non-missing values of the variable. You cannot use SEX if you want to get the counts of the number of times the value of SEX is missing in the real data. So you need some variable that is not missing on every observation in the actual dataset. If you don't have such a variable you will have make one.
So if you created a template dataset with all of the combinations of SEX and AGE you want to report, left call it SHELL, then you could use code like this to join with SASHELP.CLASS and generate the counts.
data shell;
do sex='M','F',' ';
do age=10 to 16 ;
output;
end;
end;
run;
proc sql noprint;
create table cnt1 as
select
b.sex
,b.age
,count(a.constant) as cnt
from (select sex,age, 1 as constant from sashelp.class) a
full join shell b
on a.sex = b.sex and a.age = b.age
group by 1,2
;
quit;
Results
Obs sex age cnt 1 10 0 2 11 0 3 12 0 4 13 0 5 14 0 6 15 0 7 16 0 8 F 10 0 9 F 11 1 10 F 12 2 11 F 13 2 12 F 14 2 13 F 15 2 14 F 16 0 15 M 10 0 16 M 11 1 17 M 12 3 18 M 13 1 19 M 14 2 20 M 15 2 21 M 16 1
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.