MSALKAR
Calcite | Level 5

Hello, is there any way of treating missing values in a dataset? I have a lot of missing values, and because of them my results are not reliable. SAS shows me that 71% of the data is missing, which is a large share. Please let me know how this can be solved.

yyy.png

4 REPLIES
lkeyes
Obsidian | Level 7

Well, from the embedded picture I can't really tell what those results say or what you are trying to do with the data (i.e. which proc you are using); however, here would be my solutions for missing values:

  1. I would use a data step and do one of the following:
    1. For character variables, use an "if" statement: if yourvar = "" then yourvar = "Unknown";
    2. For numeric variables, use an "if" statement: if yourvar = . then yourvar = 99999;
  2. If 71% of the data contains missing values, I would also consider whether to get rid of those values entirely, or find some way to get that information in there. Not having information on roughly three quarters of the data seems rather problematic in most situations.
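
The two "if" statements above can be sketched as a complete data step. This is only an illustration: the dataset names work.have and work.want and the variables id, region, and income are made up for the example.

```sas
/* Hypothetical demo data: a single period is read as missing by list input */
data work.have;
  length region $ 8;          /* must be long enough to hold "Unknown" (7 characters) */
  input id region $ income;
  datalines;
1 East 52000
2 . 48000
3 West .
;
run;

data work.want;
  set work.have;
  /* Character variable: label missing values explicitly */
  if region = "" then region = "Unknown";
  /* Numeric variable: replace missing with a sentinel value */
  if income = . then income = 99999;
run;
```

Keep in mind that a numeric sentinel such as 99999 is an ordinary number to SAS, so it takes part in means and sums. It suits grouping and reporting better than numeric analysis.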
jklaverstijn
Rhodochrosite | Level 12

It is unclear how this is a problem. Would you like to have them converted to zeroes? Or minus 1? Or delete the rows or columns containing them? A value is generally missing because, well, it is. Probably born that way. There is no generic treatment for that; you'll need to tell us how you want it 'fixed'. Many procs (including the ones you are using, like FREQ) have options designed to deal with missing values. Please elaborate on your intentions with the data so we can better advise.
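
As one illustration of such proc options, the TABLES statement in PROC FREQ accepts MISSING and MISSPRINT. A minimal sketch, assuming a made-up dataset work.demo_freq with a character variable region:

```sas
/* Hypothetical demo data: region is missing in 2 of 5 rows */
data work.demo_freq;
  input id region $;
  datalines;
1 East
2 .
3 West
4 .
5 East
;
run;

proc freq data=work.demo_freq;
  /* MISSING: treat missing as its own level and include it in the percentages */
  tables region / missing;
  /* MISSPRINT: show the missing count, but leave it out of the percentages */
  tables region / missprint;
run;
```

Which of the two is appropriate depends on whether "missing" is a meaningful category for the analysis.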

 

Regards,

- Jan.

ballardw
Super User

The first thing I would consider is where the data came from. If you used Proc Import to bring the data into SAS, then you may have issues with variables having been guessed to be the wrong type. If the first rows of a column hold numeric values but some values further down are text, the proc can guess from those first few rows that the column is numeric, and then all of the character values are set to missing.

Also, did you compare the raw data source with the result after it was brought into SAS to see that things look right?
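
One way to check for the type-guessing issue is to have PROC IMPORT scan more rows before deciding on types and then inspect the result. A minimal sketch, assuming a delimited text file and a recent SAS 9.4 release; the fileref demo, the column names, and the tiny CSV written here are made up for the example:

```sas
/* Write a small hypothetical CSV to a temporary file so the example is self-contained */
filename demo temp;
data _null_;
  file demo;
  put "id,score";
  put "1,10";
  put "2,20";
  put "3,N/A";
run;

proc import datafile=demo out=work.imported dbms=csv replace;
  getnames=yes;
  guessingrows=max;   /* scan all rows before guessing each column's type */
run;

/* Compare the variable types SAS chose against what you expect from the raw file */
proc contents data=work.imported;
run;
```

With every row scanned, a column that contains text such as N/A should be typed as character rather than numeric, so those values are kept instead of being set to missing. In a file this small the default scan would already cover all rows; the option matters for longer files.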

 

Another common cause of many missing values is combining two or more data sets. If a variable does not exist in one of the data sets when they are appended (Proc Append with FORCE, or a SET statement like set data1 data2;), then the records that come from that data set will have missing values for the variable.

Or if you match-merge two or more data sets (MERGE, or some types of Proc SQL joins) and there is no match on the key variables, you can get additional rows with missing values.
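
To see whether a match-merge is the source of the missing values, the IN= data set option can flag which input contributed each row. A minimal sketch with two made-up data sets, work.a and work.b, keyed on a hypothetical variable id:

```sas
/* Hypothetical inputs, already sorted by id */
data work.a;
  input id x;
  datalines;
1 10
2 20
3 30
;
run;

data work.b;
  input id y;
  datalines;
2 200
3 300
4 400
;
run;

data work.merged;
  merge work.a(in=in_a) work.b(in=in_b);
  by id;
  length source $ 6;
  /* Rows found in only one input carry missing values for the other input's variables */
  if in_a and in_b then source = "both";
  else if in_a then source = "a only";
  else source = "b only";
run;

/* How many rows matched, and how many came from one side only? */
proc freq data=work.merged;
  tables source;
run;
```

Here id 1 has no match in work.b, so y is missing for that row, and id 4 has no match in work.a, so x is missing.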

So you need to examine the entire process from start to the displayed output to consider where these come from.

Then you can decide what to do.
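
When examining the data at each stage, it can help to count the missing values per variable. A minimal sketch, assuming a made-up dataset work.demo_counts; the format names missfmt and $missfmt are also just chosen for this example:

```sas
/* Hypothetical demo data with missing values in both character and numeric variables */
data work.demo_counts;
  input id region $ income age;
  datalines;
1 East 52000 34
2 . 48000 .
3 West . 51
4 . . 29
;
run;

/* Numeric variables: non-missing (N) and missing (NMISS) counts */
proc means data=work.demo_counts n nmiss;
  var _numeric_;
run;

/* All variables: collapse every value into Missing / Not missing, then tabulate */
proc format;
  value $missfmt ' ' = 'Missing' other = 'Not missing';
  value  missfmt  .  = 'Missing' other = 'Not missing';
run;

proc freq data=work.demo_counts;
  format _character_ $missfmt. _numeric_ missfmt.;
  tables _all_ / missing nocum;
run;
```

Running this after each import, append, or merge step shows where in the process the missing values first appear.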

 

 

AaroninMN
Obsidian | Level 7

If you want the missing values converted to zeroes, you can use code like this to replace them all at once.

 

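data work.out_dataset;
  set WORK.your_dataset;
  /* implicit array over every numeric variable read from the input data set */
  array nums _numeric_;
  /* visit each element and replace a missing value with zero */
  do over nums;
    if nums=. then nums=0;
  end;
run;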

