Hi everyone,
I have a very large data set that includes many variables. I want to find the percentage of missing values for each variable (column). The first thing that came to mind was PROC TABULATE. I tried the following code, but it did not give my desired output. I prepared sample code and an image of the desired output below. Can somebody help me, please?
Additionally, I have hundreds of variables in my data set. Is it possible to write macro code to get the missing percentage for all variables more easily?
Data Have;
Length Variable1 8 Variable2 8 Variable3 8 Variable4 8 Variable5 8 Variable6 8 Variable7 8 Variable8 8;
Infile Datalines Missover;
Input Variable1 Variable2 Variable3 Variable4 Variable5 Variable6 Variable7 Variable8;
Datalines;
1 2 3 4 5 6 7 8
. 2 3 4 5 6 7 8
. . 3 4 5 6 7 8
. . . 4 5 6 7 8
. . . . 5 6 7 8
. . . . . 6 7 8
. . . . . . 7 8
. . . . . . . 8
. . . . . . . .
;
Run;
PROC TABULATE DATA=HAVE;
VAR Variable1 Variable2 Variable3 Variable4 Variable5 Variable6 Variable7 Variable8;
TABLE /* Column Dimension */
Variable1*(N NMiss)
Variable2*(N NMiss)
Variable3*(N NMiss)
Variable4*(N NMiss)
Variable5*(N NMiss)
Variable6*(N NMiss)
Variable7*(N NMiss)
Variable8*(N NMiss)
;
RUN;
Thanks
Any idea about this subject?
PROC TABULATE cannot use the result of one statistic to calculate another. You have two options. One is a couple of passes through the data: PROC SUMMARY to get N and NMISS, then a DATA step to compute the percentage. The other is possibly PROC REPORT, which lets you use the results of statistics in calculations on the column results.
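The two-pass idea could be sketched roughly as follows. This is only one possible way to do it, not a tested solution for the real data: it uses PROC MEANS (rather than PROC SUMMARY) so that the ODS table can be captured, it assumes SAS 9.3 or later for the STACKODSOUTPUT option, and the data set names MissStats and MissPct and the variable PctMiss are made up for illustration. The _NUMERIC_ list picks up every numeric variable, so no macro should be needed for hundreds of columns; character variables are not covered.

```sas
/* Pass 1: N and NMISS for every numeric variable in Have.         */
/* STACKODSOUTPUT (assumed available, SAS 9.3+) writes one row per  */
/* variable to the ODS table instead of one wide row.               */
proc means data=Have n nmiss stackodsoutput;
  var _numeric_;
  ods output summary=MissStats;   /* MissStats is a hypothetical name */
run;

/* Pass 2: compute the missing percentage from the two counts. */
data MissPct;                     /* MissPct is a hypothetical name */
  set MissStats;
  /* N + NMiss is the total number of rows read for the variable */
  if (N + NMiss) > 0 then PctMiss = 100 * NMiss / (N + NMiss);
  format PctMiss 6.2;
run;

proc print data=MissPct noobs;
  var Variable N NMiss PctMiss;
run;
```

With the sample data above, Variable1 is missing in 8 of the 9 rows and Variable8 in 1 of 9, so those are the counts the percentages would be based on.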