Hello!
I am just beginning my adventure with SAS and I need a little help with missing data. I have a dataset of 2310 variables and 52841 observations, and I am supposed to analyze the missing data using graphs and reports. The graphs should help identify which variables have no missing data and which have a lot, but with 2310 variables I don't know how to do that.
I tried to show the % of missing data in a graph, but I would like the x axis to be more detailed (I want tick marks at 10, 30, 50 % and so on to show).
Do you have any recommendations on how I can achieve that, and on what kinds of graphs and reports I can make? I would appreciate any kind of help 🙂
Thank you in advance!
For my part, I wouldn't bother with any graphs if the question is simply how many values are missing.
If your data set has variables named ZZZ, AAA, or I, replace those names in the following data step.
The Junk data set, which is used solely to identify missing and non-missing values, assigns 1 to each variable that is missing on a record and 0 otherwise. The numeric variables will hold numeric 1/0 and the character variables will hold character 1/0.
PROC FREQ will create a table for each variable in the data set. In each table, the frequency of 1 is the number of missing values and its percent is the percent missing.
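data junk;
   set have;
   /* Group variables by type; rename ZZZ, AAA, or I if they clash with your data */
   array zzz (*) _numeric_;
   array aaa (*) _character_;
   /* Numeric variables: overwrite each value with 1 (missing) or 0 (present) */
   do i = 1 to dim(zzz);
      zzz[i] = missing(zzz[i]);
   end;
   /* Character variables: overwrite each value with '1' (missing) or '0' (present) */
   do i = 1 to dim(aaa);
      aaa[i] = put(missing(aaa[i]), f1.);
   end;
   drop i;
run;

proc freq data=junk;
   /* Remove stored formats so the flags print as 0/1 (a date format would show dates) */
   format _all_;
   tables _all_;
run;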
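
With 2310 variables, one frequency table per variable is a lot to read, so it may help to collapse the counts into one row per variable and plot those. The following is only a sketch of that idea, not part of the reply above. It assumes the data set is WORK.HAVE, that it contains at least one numeric and one character variable, and that it has no variables named i, n_obs, last, variable, n_missing, or pct_missing. The names miss_summary, cnt_n, cnt_c, n_num, and n_char are made up for illustration, and the TRIMMED keyword needs SAS 9.3 or later.

```sas
/* Count numeric and character variables so the counter arrays can be sized */
proc sql noprint;
   select sum(type = 'num'), sum(type = 'char')
      into :n_num trimmed, :n_char trimmed
      from dictionary.columns
      where libname = 'WORK' and memname = 'HAVE';
quit;

data miss_summary (keep = variable n_missing pct_missing);
   if 0 then set have;                      /* never executes; only defines the variables */
   array nums  (*) _numeric_;               /* declared before any new variables exist */
   array chars (*) _character_;
   array cnt_n [&n_num]  _temporary_ (&n_num*0);    /* missing counts, numeric vars */
   array cnt_c [&n_char] _temporary_ (&n_char*0);   /* missing counts, character vars */
   length variable $32;

   /* Read every observation once and accumulate the missing counts */
   do until (last);
      set have end = last;
      n_obs = sum(n_obs, 1);
      do i = 1 to dim(nums);
         cnt_n[i] = cnt_n[i] + missing(nums[i]);
      end;
      do i = 1 to dim(chars);
         cnt_c[i] = cnt_c[i] + missing(chars[i]);
      end;
   end;

   /* Write one row per variable */
   do i = 1 to dim(nums);
      variable    = vname(nums[i]);
      n_missing   = cnt_n[i];
      pct_missing = 100 * n_missing / n_obs;
      output;
   end;
   do i = 1 to dim(chars);
      variable    = vname(chars[i]);
      n_missing   = cnt_c[i];
      pct_missing = 100 * n_missing / n_obs;
      output;
   end;
   stop;
run;

/* Report: variables ordered from most to least missing */
proc sort data = miss_summary;
   by descending pct_missing;
run;

/* Graph: how many variables fall into each 10-point band of percent missing */
proc sgplot data = miss_summary;
   histogram pct_missing / binstart = 5 binwidth = 10 scale = count;
   xaxis values = (0 to 100 by 10) label = 'Percent missing';
   yaxis label = 'Number of variables';
run;
```

The VALUES= option on the XAXIS statement is what controls the tick marks, so a list such as (10 to 90 by 20) should give marks at 10, 30, 50 and so on. A step like proc print data=miss_summary; where pct_missing = 0; run; would then list the variables with no missing data.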