Dear community,
I was hoping for a bit of help with a simple question. I have a dataset with two columns such as this:
Column1 Column2
A 123
A .
A 123
A 456
B .
B 789
All I'm looking for is to fill in the missings of Column2 with the mode by Column 1, like this:
Column1 Column2
A 123
A 123
A 123
A 456
B 789
B 789
I know that it's possible with a select distinct and left join in proc sql, but since I am an R user, I'd highly appreciate a loop solution.
In R, one solution is this:
for(X in unique(Data[,"Column1"]))
{
Data[Data[,"Column1"]==X&is.na(Data[,"Column2"]),"Column2"] <- which.max(table(Data[Data[,"Column1"]==X,"Column2"]))
}
Surely there's some equivalent in SAS?
Thanks a lot!
Try this
data have;
input Column1 $ Column2;
datalines;
A 123
A .
A 123
A 456
B .
B 789
;
data want (drop=c);
do _N_=1 by 1 until (last.Column1);
set have;
by Column1;
c = max(c, Column2);
end;
do _N_=1 to _N_;
set have;
Column2 = c;
output;
end;
run;
Please explain the algorithm that replaces the missing A with 123 and the missing B with 789. What if the missing A was underneath a 123 and above a 456, what would the algorithm do in that case?
Most common observation is the mode, but if there are ties (for example two records with 123 and two records with 456) SAS will pick one arbitrarily.
proc summary data=have;
class column1;
var columns;
output out=mode mode=mode;
run;
data want;
merge have mode;
if missing(column2) then column2=mode;
run;
Adding: I see you have marked an answer as correct, but it does not seem to compute the "most common observation". So which is it?
Try this
data have;
input Column1 $ Column2;
datalines;
A 123
A .
A 123
A 456
B .
B 789
;
data want (drop=c);
do _N_=1 by 1 until (last.Column1);
set have;
by Column1;
c = max(c, Column2);
end;
do _N_=1 to _N_;
set have;
Column2 = c;
output;
end;
run;
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.