Hello!
I am just beginning my adventure with SAS and I need a little help with a subject connected to missing data. I have a dataset of 2310 variables and 52841 observations and I am supposed to analyze the missing data in the form of graphs and reports. The graphs should help with identifying variables which have no missing data and variables which have a lot of missing data, but with 2310 variables I don't know how to do that.
I tried to show the % of missing data in a graph, but I would like the x axis to be more detailed (I want tick marks at 10%, 30%, 50% and so on to show).
Do you have any recommendations on how I can achieve that, and what kinds of graphs and reports I can make? I would appreciate any kind of help 🙂
Thank you in advance!
In this case I wouldn't bother with any graphs if the question is just how many values are missing.
If your data set already has variables named ZZZ, AAA or I, replace those names in the following data step with names that are not in use.
The data step creates a data set named Junk, which is used solely for identifying missing/non-missing values: each variable is set to 1 where it is missing on a record and 0 otherwise. The numeric variables will have numeric 1/0 and the character variables will have character 1/0.
PROC FREQ will create a table for each variable in the data set. In each table, the frequency for 1 is the number of missing values and the percent is the percent missing.
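data junk;
   set have;
   /* group all numeric and all character variables into arrays */
   array zzz (*) _numeric_;
   array aaa (*) _character_;
   /* numeric variables: overwrite with 1 if missing, 0 otherwise */
   do i = 1 to dim(zzz);
      zzz[i] = missing(zzz[i]);
   end;
   /* character variables: overwrite with '1' if missing, '0' otherwise */
   do i = 1 to dim(aaa);
      aaa[i] = put(missing(aaa[i]), f1.);
   end;
   drop i;
run;

/* one frequency table per variable */
proc freq data=junk;
   tables _all_;
run;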
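With 2310 variables, one frequency table per variable is a lot to read, so it may help to collapse the result into one row per variable and plot that. The following is only a rough sketch of a different technique, not the data step above: it counts missing values per variable in temporary arrays and writes a summary data set. The names miss_summary, varname, pct_missing, n_cnt and c_cnt are made up for illustration, the array size of 5000 is an assumed upper bound on the number of variables of each type, and the small HAVE data set exists only so the example runs on its own.

```sas
/* Toy stand-in for the real data; skip this step on your own data set. */
data have;
   input id x y name $;
datalines;
1 10 . Ann
2 . . Bob
3 30 5 .
4 40 . Dee
;
run;

data miss_summary;
   set have end=last;
   /* Declare these arrays first so they pick up only the variables in HAVE. */
   array nums  (*) _numeric_;
   array chars (*) _character_;
   /* Running counts of missing values; temporary arrays are retained.     */
   /* 5000 is an assumed upper bound: raise it if you have more variables. */
   array n_cnt (5000) _temporary_;
   array c_cnt (5000) _temporary_;
   length varname $32 pct_missing 8;

   do i = 1 to dim(nums);
      n_cnt[i] = sum(n_cnt[i], missing(nums[i]));
   end;
   do i = 1 to dim(chars);
      c_cnt[i] = sum(c_cnt[i], missing(chars[i]));
   end;

   /* On the last record, write one row per variable. */
   if last then do;
      do i = 1 to dim(nums);
         varname = vname(nums[i]);
         pct_missing = 100 * n_cnt[i] / _n_;
         output;
      end;
      do i = 1 to dim(chars);
         varname = vname(chars[i]);
         pct_missing = 100 * c_cnt[i] / _n_;
         output;
      end;
   end;
   keep varname pct_missing;
run;

/* Report: variables with the most missing data first. */
proc sort data=miss_summary;
   by descending pct_missing;
run;

/* Graph: number of variables in each 10-point band of percent missing. */
proc sgplot data=miss_summary;
   histogram pct_missing / binwidth=10 binstart=5 scale=count;
   xaxis values=(0 to 100 by 10) label="Percent missing";
   yaxis label="Number of variables";
run;
```

The VALUES= option on the XAXIS statement is what controls which tick marks are shown, so a list such as (10 to 90 by 20) should give ticks at 10, 30, 50 and so on. Because variables with exactly 0% missing share the first bin with those up to 10%, a PROC PRINT of miss_summary with WHERE pct_missing = 0 is probably the easier way to list the fully populated variables.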