## Filling in Missing Data

Solved
Frequent Contributor
Posts: 101

# Filling in Missing Data

Hello,

I have data that takes the following form:

data have;

input month percent \$;

cards;

1 .1

1 .3

1 NULL

1 NULL

1 NULL

2 .1

2 NULL

2 NULL

3 NULL

3 NULL

3 NULL

3 NULL

3 NULL

3 NULL

3 NULL

4 .1

4 .5

4 NULL

4 NULL

;

run;

I would like to fill in all of the data that is stored as NULL (char) with an equal-weighted percent that makes all of the percentages per month add to 1.  So for the first month, all three NULL entries would become .2.

I know that I am going to have to turn the character variables into numeric variables, and I can change all of the NULL entries to zero without worry, because there are no entries that are 0 in the dataset, so I won't be losing data.

Is this something that I can use PROC STANDARD for?  I have only used that for setting the mean and the SD before, but maybe this can work here too.

Thanks,

John

Accepted Solutions
Solution
‎06-23-2014 11:19 AM
Super User
Posts: 13,358

## Re: Filling in Missing Data

You can turn them into numeric by dropping the \$ on the read. You'll get warnings but that's okay. Then the story is to replace MISSING.

Is there any chance that the sum of percents will already equal 1 and still have a NULL value within the group?
Assuming percent is numeric:

proc summary data=have nway;
class month;
var percent;
output out=summary (drop=_type_ _freq_) sum=PctSum Nmiss=NumberMissing;
run;

set summary;
If missing(pctsum) then pctsum=0;
NewPct = (1-pctsum)/NumberMissing;
run;

data want;
by month;
if missing(percent) then percent = NewPct;
drop newpct;
run;

All Replies
Solution
‎06-23-2014 11:19 AM
Super User
Posts: 13,358

## Re: Filling in Missing Data

You can turn them into numeric by dropping the \$ on the read. You'll get warnings but that's okay. Then the story is to replace MISSING.

Is there any chance that the sum of percents will already equal 1 and still have a NULL value within the group?
Assuming percent is numeric:

proc summary data=have nway;
class month;
var percent;
output out=summary (drop=_type_ _freq_) sum=PctSum Nmiss=NumberMissing;
run;

set summary;
If missing(pctsum) then pctsum=0;
NewPct = (1-pctsum)/NumberMissing;
run;

data want;
by month;
if missing(percent) then percent = NewPct;
drop newpct;
run;

Frequent Contributor
Posts: 101

## Re: Filling in Missing Data

Thank you so much! There is no worries about there being a NULL and the percent already adding up to 1.  I am just using this as an assumption that I will explain in my analysis.

John

Super User
Posts: 10,699

## Re: Filling in Missing Data

```data have;
input month percent \$;
cards;
1 .1
1 .3
1 NULL
1 NULL
1 NULL
2 .1
2 NULL
2 NULL
3 NULL
3 NULL
3 NULL
3 NULL
3 NULL
3 NULL
3 NULL
4 .1
4 .5
4 NULL
4 NULL
;
run;
proc sort data=have; by month percent;run;
data want(drop=x n) ;
x=1; n=0;
do until(last.month);
set have;
by month;
if not missing(input(percent,?? best8.)) then x=x-percent;
else n+1;
end;
do until(last.month);
set have;
by month;
if percent ='NULL' then wt=divide(x,n);
else wt=input(percent, best8.);
output;
end;
run;
```

Xia Keshan

🔒 This topic is solved and locked.