Architecting, installing and maintaining your SAS environment

Filling in Missing Data

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 101
Accepted Solution

Filling in Missing Data

Hello,

I have data that takes the following form:

data have;

input month percent $;

cards;

1 .1

1 .3

1 NULL

1 NULL

1 NULL

2 .1

2 NULL

2 NULL

3 NULL

3 NULL

3 NULL

3 NULL

3 NULL

3 NULL

3 NULL

4 .1

4 .5

4 NULL

4 NULL

;

run;

I would like to fill in all of the data that is stored as NULL (char) with an equal-weighted percent that makes all of the percentages per month add to 1.  So for the first month, all three NULL entries would become .2.

I know that I am going to have to turn the character variables into numeric variables, and I can change all of the NULL entries to zero without worry, because there are no entries that are 0 in the dataset, so I won't be losing data.

Is this something that I can use PROC STANDARD for?  I have only used that for setting the mean and the SD before, but maybe this can work here too.

Thanks,

John


Accepted Solutions
Solution
‎06-23-2014 11:19 AM
Super User
Posts: 10,497

Re: Filling in Missing Data

You can turn them into numeric by dropping the $ on the read. You'll get warnings but that's okay. Then the story is to replace MISSING.

Is there any chance that the sum of percents will already equal 1 and still have a NULL value within the group?
Assuming percent is numeric:

proc summary data=have nway;
class month;
var percent;
output out=summary (drop=_type_ _freq_) sum=PctSum Nmiss=NumberMissing;
run;

data addpct;
set summary;
If missing(pctsum) then pctsum=0;
NewPct = (1-pctsum)/NumberMissing;
run;

data want;
merge have addpct (keep=month newpct);
by month;
if missing(percent) then percent = NewPct;
drop newpct;
run;

View solution in original post


All Replies
Solution
‎06-23-2014 11:19 AM
Super User
Posts: 10,497

Re: Filling in Missing Data

You can turn them into numeric by dropping the $ on the read. You'll get warnings but that's okay. Then the story is to replace MISSING.

Is there any chance that the sum of percents will already equal 1 and still have a NULL value within the group?
Assuming percent is numeric:

proc summary data=have nway;
class month;
var percent;
output out=summary (drop=_type_ _freq_) sum=PctSum Nmiss=NumberMissing;
run;

data addpct;
set summary;
If missing(pctsum) then pctsum=0;
NewPct = (1-pctsum)/NumberMissing;
run;

data want;
merge have addpct (keep=month newpct);
by month;
if missing(percent) then percent = NewPct;
drop newpct;
run;

Frequent Contributor
Posts: 101

Re: Filling in Missing Data

Thank you so much! There is no worries about there being a NULL and the percent already adding up to 1.  I am just using this as an assumption that I will explain in my analysis.

Thanks for your help!

John

Super User
Posts: 9,676

Re: Filling in Missing Data

data have;
input month percent $;
cards;
1 .1
1 .3
1 NULL
1 NULL
1 NULL
2 .1
2 NULL
2 NULL
3 NULL
3 NULL
3 NULL
3 NULL
3 NULL
3 NULL
3 NULL
4 .1
4 .5
4 NULL
4 NULL
;
run;
proc sort data=have; by month percent;run;
data want(drop=x n) ;
x=1; n=0;
 do until(last.month);
  set have;
  by month;
  if not missing(input(percent,?? best8.)) then x=x-percent;
   else n+1;
 end;
 do until(last.month);
  set have;
  by month;
  if percent ='NULL' then wt=divide(x,n);
   else wt=input(percent, best8.);
   output;
end;
run;

Xia Keshan

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 3 replies
  • 2717 views
  • 0 likes
  • 3 in conversation