I want to count the missing values in each column of my dataset and write the results to a separate dataset, so that I can later drop the variables with more than X% missing values. This is the code I am using:
data have(drop=x);
   do x = 1 to 5;
      very_long_variable_name_12m = x;
      very_long_variable_name_24m = x*2;
      if x > 3 then call missing(very_long_variable_name_12m, very_long_variable_name_24m);
      output;
   end;
run;

proc means data=have nmiss;
   output out=not_want nmiss= /autoname;
run;
When PROC MEANS writes the output dataset, it cuts off a significant part of each variable name, making it impossible to tell which variable is being referenced.
My original dataset has thousands of variables, so fixing this issue by hand is not an option. Can someone please help?
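Don't use /autoname. In this simplified case there is only one output statistic, so why would you need _NMiss at the end? If you write NMISS= with no names after it, each count is stored under the original variable name and nothing gets truncated.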
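For reference, here is a minimal sketch of that suggestion applied to the example data. SAS variable names are limited to 32 characters, so appending _NMiss to a 27-character name forces truncation; leaving the names alone avoids that. The dataset names want, nmiss_long, and pct_missing, and the transpose and percentage steps, are my own additions for illustration and not part of the original post. Treat it as a starting point rather than a finished solution.

```sas
/* Count missing values per numeric variable, keeping the original names. */
/* With a single statistic and no names after NMISS=, the output          */
/* variables carry the same names as the analysis variables.              */
proc means data=have noprint;
   output out=want(drop=_type_) nmiss=;
run;

/* Assumed follow-up step: reshape to one row per variable.               */
/* With no VAR statement, PROC TRANSPOSE transposes all numeric columns.  */
proc transpose data=want(drop=_freq_)
               out=nmiss_long(rename=(col1=nmiss))
               name=variable;
run;

/* Assumed follow-up step: percentage missing, using _FREQ_ from the      */
/* PROC MEANS output as the total number of observations.                 */
data pct_missing;
   if _n_ = 1 then set want(keep=_freq_);
   set nmiss_long;
   pct_missing = 100 * nmiss / _freq_;
run;
```

Because there is no VAR statement, PROC MEANS analyzes all numeric variables, so character variables would need separate handling. The pct_missing dataset can then be filtered against whatever X% threshold you choose to build the list of variables to drop.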