How to fill in missings with the non-missing value for each session_id group? Here device_name(character) and time_sum(numeric) have missings for each session_id group. The non-missing only appear once for each group, and I'm hoping to repeat the non-missing value by the group.
session_id device_name time_sum
1 . .
1 desktop .
1 . 12
1 . .
2 . 11
2 . .
2 mobile .
3 desktop 10
3 . 10
Hoping to have this outcome:
session_id device_name time_sum
1 desktop 12
1 desktop 12
1 desktop 12
1 desktop 12
2 mobile 11
2 mobile 11
2 mobile 11
3 desktop 10
3 desktop 10Grateful for any input!
data have;
input session_id device_name $ time_sum ;
cards;
1 . .
1 desktop .
1 . 12
1 . .
2 . 11
2 . .
2 mobile .
3 desktop 10
3 . 10
;
proc sql;
create table want as
select a.session_id, device_name, time_sum
from have(keep=session_id) a inner join
(select session_id,max(device_name) as device_name, max(time_sum) as time_sum from have group by session_id) b
on a.session_id=b.session_id
order by session_id;
quit;
data have;
input session_id device_name $ time_sum ;
cards;
1 . .
1 desktop .
1 . 12
1 . .
2 . 11
2 . .
2 mobile .
3 desktop 10
3 . 10
;
proc sql;
create table want as
select a.session_id, device_name, time_sum
from have(keep=session_id) a inner join
(select session_id,max(device_name) as device_name, max(time_sum) as time_sum from have group by session_id) b
on a.session_id=b.session_id
order by session_id;
quit;
Hi @lydiawawa The carry over variables should be part of
from have(keep=session_id) a
I just took your sample, but of course you could have session_id and the carry over variables
however this part below
(select session_id,max(device_name) as device_name, max(time_sum) as time_sum from have group by session_id) b
should only include sesion_id and vars that need filling.
I hope you are able to follow
There are many ways to do this...here is but one.
You also don't say what you want if there are duplicate data within a session_id, for example session_id=2 and time_sum in (11,99). In my code, the first value wins.
data have;
input session_id :8. device_name :$10. time_sum :8.;
datalines;
1 . .
1 desktop .
1 . 12
1 . .
2 . 11
2 . .
2 mobile .
3 desktop 10
3 . 10
;
run;
data want;
if 0 then set have;
%let _hashnum_=0;
%hash_define(data=have,keys=session_id,vars=device_name,where=device_name is not missing);
%hash_define(data=have,keys=session_id,vars=time_sum,where=time_sum is not missing);
set have;
%hash_lookup;
drop _:;
run;Download %hash_define and %hash_lookup (and any supporting macros, such as %parmv) from here:
https://github.com/scottbass/SAS/tree/master/Macro
data have;
input session_id device_name $ time_sum ;
cards;
1 . .
1 desktop .
1 . 12
1 . .
2 . 11
2 . .
2 mobile .
3 desktop 10
3 . 10
;
data want;
merge have(keep= session_id)
have(keep= session_id device_name where=(device_name is not missing) )
have(keep= session_id time_sum where=(time_sum is not missing) );
by session_id;
run;
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Still thinking about your presentation idea? The submission deadline has been extended to Friday, Nov. 14, at 11:59 p.m. ET.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.