Solved: Imputing numbers based on the frequency distribution

Cruise · Posted 04-15-2020 10:52 AM

Hi Folks:

I'd like to impute data where id1name='D'. I'd like to bring the current total of n=21 up to n=120 while keeping the frequency distribution of n by idname. I tried doing it manually: I ran proc freq on n, multiplied 120 by each resulting percent, and allocated those numbers to idname. I will have to do the same for the other id1name values, such as 'C'.

Could you please help me accomplish this imputation more efficiently than doing it manually?

data have;
input n id1name $ idname $; 
cards;
0 D Buk 
4 D Dalseo 
1 D Dalseong 
1 D Dong 
3 D Jung 
8 D Nam 
2 D Seo 
2 D Suseong
5 C Hart
6 C Sous
;

Reeza · Posted 04-15-2020 12:51 PM

data have;
input n id1name $ idname $; 
cards;
0 D Buk 
4 D Dalseo 
1 D Dalseong 
1 D Dong 
3 D Jung 
8 D Nam 
2 D Seo 
2 D Suseong
5 C Hart
6 C Sous
; 



data new_values;
input id1name $ newBase;
cards;
D 120
C 50
;;;
run;

proc sort data=have;
by id1name;
run;

proc sort data=new_values;
by id1name;
run;

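/* Weighted frequencies: PERCENT is the share of n for each idname within its id1name */
proc freq data=have noprint;
by id1name;
table idname / out=percents;
weight n;
run;

/* Scale each share to the new group total and round to a whole number */
data want;
merge percents new_values;
by id1name;
newValue = round(newBase*percent/100, 1);
run;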
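
One thing to watch with the approach above: by default PROC FREQ skips observations whose weight is 0, so Buk (n=0) would not appear in the output, and rounding each row separately is not guaranteed to add up to the target in general. The following is only a sketch, assuming the have and new_values data sets above already exist and are sorted by id1name. The names percents_z, want_z, allocated and target are made up for illustration. It uses the ZEROS option of the WEIGHT statement to keep zero-count rows, then compares the rounded total with the target for each group.

```sas
/* Same weighted frequencies, but keep rows with a zero weight (e.g. Buk) */
proc freq data=have noprint;
  by id1name;
  table idname / out=percents_z;
  weight n / zeros;
run;

/* Scale to the new base exactly as in the step above */
data want_z;
  merge percents_z new_values;
  by id1name;
  newValue = round(newBase*percent/100, 1);
run;

/* Sanity check: does the rounded allocation add up to the target per group? */
proc sql;
  select id1name,
         sum(newValue) as allocated,
         max(newBase)  as target
  from want_z
  group by id1name;
quit;
```

If allocated and target differ for a group, the gap comes from rounding, and you would need to decide how to distribute the remainder (for example, by adjusting the largest category).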

