Hi everyone,
I have a very large data set with many variables, and I want the percentage of missing values for each variable, shown by column. The first thing that came to mind was PROC TABULATE. I tried the following code, but it did not give the output I want. I have prepared sample code below, along with an image of the desired output. Can somebody help me, please?
In addition, I have hundreds of variables in my data set. Is it possible to write macro code to get the missing percentage for all variables more easily?
Data Have;
Length Variable1 8 Variable2 8 Variable3 8 Variable4 8 Variable5 8 Variable6 8 Variable7 8 Variable8 8;
Infile Datalines Missover;
Input Variable1 Variable2 Variable3 Variable4 Variable5 Variable6 Variable7 Variable8;
Datalines;
1 2 3 4 5 6 7 8
. 2 3 4 5 6 7 8
. . 3 4 5 6 7 8
. . . 4 5 6 7 8
. . . . 5 6 7 8
. . . . . 6 7 8
. . . . . . 7 8
. . . . . . . 8
. . . . . . . .
;
Run;
PROC TABULATE DATA=HAVE;
VAR Variable1 Variable2 Variable3 Variable4 Variable5 Variable6 Variable7 Variable8;
TABLE /* Column Dimension */
Variable1*(N NMiss)
Variable2*(N NMiss)
Variable3*(N NMiss)
Variable4*(N NMiss)
Variable5*(N NMiss)
Variable6*(N NMiss)
Variable7*(N NMiss)
Variable8*(N NMiss)
;
RUN;
Thanks
Any idea about this subject?
PROC TABULATE will not use the result of one statistic to calculate another. You have two options: either make a couple of passes through the data, using PROC SUMMARY to get N and NMISS and then a DATA step to compute the percentage, or possibly use PROC REPORT, which lets you compute new columns from the results of statistics.
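A minimal sketch of the two-pass approach, under a few assumptions: it uses PROC MEANS (which shares its statistics with PROC SUMMARY) with the STACKODSOUTPUT option, so it needs SAS 9.3 or later; it covers numeric variables only; and the data set names MissCounts and MissPct and the variables Total and PctMiss are made up for illustration. The _NUMERIC_ list picks up every numeric variable, so no macro is needed to handle hundreds of them.

```sas
/* Pass 1: count non-missing (N) and missing (NMISS) values per variable.   */
/* _NUMERIC_ selects every numeric variable, so no names are typed by hand. */
/* STACKODSOUTPUT writes one row per variable to the ODS table "Summary".   */
ods select none;                      /* suppress the printed PROC MEANS table */
proc means data=Have n nmiss stackodsoutput;
   var _numeric_;
   ods output Summary=MissCounts;     /* MissCounts is a hypothetical name */
run;
ods select all;                       /* restore normal printed output */

/* Pass 2: derive the missing percentage from the two counts. */
data MissPct;                         /* MissPct is a hypothetical name */
   set MissCounts;
   Total = N + NMiss;                 /* number of rows for this variable */
   if Total > 0 then PctMiss = 100 * NMiss / Total;
   format PctMiss 6.2;
run;

/* One row per variable: name, counts, and percent missing. */
proc print data=MissPct noobs;
   var Variable N NMiss PctMiss;
run;
```

With the sample data above, Variable1 has 1 non-missing and 8 missing values out of 9 rows, so its missing percentage should come out at about 88.89. Character variables would need a different approach, since PROC MEANS only analyzes numeric variables.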