GabbieS
Fluorite | Level 6

Hello!

I am just starting out with SAS and need some help with missing data. I have a data set with 2310 variables and 52841 observations, and I am supposed to analyze its missing data using graphs and reports. The graphs should help identify which variables have no missing data and which have a lot, but with 2310 variables I don't know how to do that.

I tried to show the % of missing data in a graph, but I would like the x-axis to be more detailed (I want tick marks at 10, 30, 50 % and so on).

[Screenshot 2021-01-12 at 19.59.00.png]

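/* Set the output size and enable tooltips for the graph */
ods graphics / reset width=7in height=4.8in imagemap;

/* Histogram of the percentage of missing values per variable */
proc sgplot data=SASUSER.TMISS_NUM;
   histogram miss_proc / scale=count fillattrs=(color=CX7fb1e0) dataskin=matte;
   xaxis label="Missing data";
   yaxis grid label="Number of variables";
run;

/* Restore the default ODS Graphics settings */
ods graphics / reset;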
 

Do you have any recommendations on how I can achieve that, and on what kinds of graphs and reports I could make? I would appreciate any kind of help 🙂

 

Thank you in advance!

4 REPLIES
PaigeMiller
Diamond | Level 26

The HISTOGRAM statement has options that control the width and location of the bins in your plot, such as BINWIDTH=, BINSTART=, and NBINS=.

The XAXIS statement has options that control the appearance and spacing of the x-axis; VALUES= sets the tick marks explicitly.
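
As a rough illustration of those two kinds of options, here is a minimal sketch. It reuses SASUSER.TMISS_NUM and miss_proc from the question; the assumption that miss_proc is on a 0 to 100 percent scale, the bin width of 10, and the tick spacing are all guesses that may need adjusting for the real data.

```sas
/* Sketch only: assumes miss_proc holds percentages from 0 to 100.
   If it holds fractions (0 to 1), scale the numbers below accordingly. */
proc sgplot data=SASUSER.TMISS_NUM;
   /* BINSTART= is the midpoint of the first bin, so 5 with BINWIDTH=10
      gives bins covering 0-10, 10-20, and so on */
   histogram miss_proc / scale=count binstart=5 binwidth=10;
   /* VALUES= lists the tick marks to draw on the axis */
   xaxis label="Missing data (%)" values=(0 to 100 by 10);
   yaxis grid label="Number of variables";
run;
```

To get ticks only at 10, 30, 50 and so on, values=(10 to 90 by 20) could be used instead.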

--
Paige Miller
GabbieS
Fluorite | Level 6
Thank you!
ballardw
Super User

In my case, I wouldn't bother with graphs if the question is how many values are missing.

If your data set has variables named ZZZ, AAA, or I, replace those names in the following data step.

The Junk data set, which is used solely to identify missing/non-missing values, assigns 1 to variables that are missing on a record and 0 otherwise. The numeric variables will have numeric 1/0 and the character variables will have character 1/0.

Proc Freq will create a table for each variable in the data set. In each table, the frequency of 1 is the number of missing values and its percent is the percent missing.

 

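data junk;
   set have;
   /* Rename ZZZ, AAA, or I if HAVE already contains variables with those names */
   array zzz (*) _numeric_;
   array aaa (*) _character_;
   /* Numeric variables: 1 if missing on this record, 0 otherwise */
   do i=1 to dim(zzz);
      zzz[i]= missing(zzz[i]);
   end;
   /* Character variables: '1' if missing, '0' otherwise */
   do i= 1 to dim(aaa);
      aaa[i]= put(missing(aaa[i]),f1.);
   end;
   drop i;
run;

/* One frequency table per variable; the row for 1 gives the count and percent missing */
proc freq data=junk;
   tables _all_;
run;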
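
With 2310 variables, one table per variable can be a lot to read, so a single ranked list may be easier to scan. The sketch below is one possible way to get that from the Junk data set above, and it covers only the numeric variables, whose flags are numeric 0/1. The names MISS_NUM, MISS_LONG, VARIABLE, MISS_FRAC, and MISS_PCT are hypothetical, chosen just for illustration.

```sas
/* Sketch only: JUNK is the data set built above; all other data set
   and variable names here are hypothetical. */

/* The mean of a 0/1 flag is the fraction of records that are missing */
proc means data=junk noprint;
   var _numeric_;
   output out=miss_num(drop=_type_ _freq_) mean=;
run;

/* Turn the single wide row into one row per variable */
proc transpose data=miss_num
               out=miss_long(rename=(_name_=variable col1=miss_frac));
run;

/* Convert the fraction to a percentage */
data miss_long;
   set miss_long;
   miss_pct = 100 * miss_frac;
   keep variable miss_pct;
run;

/* Variables with the most missing data first */
proc sort data=miss_long;
   by descending miss_pct;
run;

proc print data=miss_long(obs=20);
run;
```

The resulting MISS_LONG data set has one row per numeric variable, so it could also feed a histogram like the one in the question.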
GabbieS
Fluorite | Level 6
Thank you so much!
