How to fill in missings with the non-missing value for each session_id group? Here device_name(character) and time_sum(numeric) have missings for each session_id group. The non-missing only appear once for each group, and I'm hoping to repeat the non-missing value by the group.
session_id device_name time_sum 1 . . 1 desktop . 1 . 12 1 . .
2 . 11
2 . .
2 mobile .
3 desktop 10
3 . 10
Hoping to have this outcome:
session_id device_name time_sum 1 desktop 12 1 desktop 12 1 desktop 12 1 desktop 12
2 mobile 11
2 mobile 11
2 mobile 11
3 desktop 10
3 desktop 10
Grateful for any input!
data have;
input session_id device_name $ time_sum ;
cards;
1 . .
1 desktop .
1 . 12
1 . .
2 . 11
2 . .
2 mobile .
3 desktop 10
3 . 10
;
proc sql;
create table want as
select a.session_id, device_name, time_sum
from have(keep=session_id) a inner join
(select session_id,max(device_name) as device_name, max(time_sum) as time_sum from have group by session_id) b
on a.session_id=b.session_id
order by session_id;
quit;
data have;
input session_id device_name $ time_sum ;
cards;
1 . .
1 desktop .
1 . 12
1 . .
2 . 11
2 . .
2 mobile .
3 desktop 10
3 . 10
;
proc sql;
create table want as
select a.session_id, device_name, time_sum
from have(keep=session_id) a inner join
(select session_id,max(device_name) as device_name, max(time_sum) as time_sum from have group by session_id) b
on a.session_id=b.session_id
order by session_id;
quit;
Hi @lydiawawa The carry over variables should be part of
from have(keep=session_id) a
I just took your sample, but of course you could have session_id and the carry over variables
however this part below
(select session_id,max(device_name) as device_name, max(time_sum) as time_sum from have group by session_id) b
should only include sesion_id and vars that need filling.
I hope you are able to follow
There are many ways to do this...here is but one.
You also don't say what you want if there are duplicate data within a session_id, for example session_id=2 and time_sum in (11,99). In my code, the first value wins.
data have;
input session_id :8. device_name :$10. time_sum :8.;
datalines;
1 . .
1 desktop .
1 . 12
1 . .
2 . 11
2 . .
2 mobile .
3 desktop 10
3 . 10
;
run;
data want;
if 0 then set have;
%let _hashnum_=0;
%hash_define(data=have,keys=session_id,vars=device_name,where=device_name is not missing);
%hash_define(data=have,keys=session_id,vars=time_sum,where=time_sum is not missing);
set have;
%hash_lookup;
drop _:;
run;
Download %hash_define and %hash_lookup (and any supporting macros, such as %parmv) from here:
https://github.com/scottbass/SAS/tree/master/Macro
data have;
input session_id device_name $ time_sum ;
cards;
1 . .
1 desktop .
1 . 12
1 . .
2 . 11
2 . .
2 mobile .
3 desktop 10
3 . 10
;
data want;
merge have(keep= session_id)
have(keep= session_id device_name where=(device_name is not missing) )
have(keep= session_id time_sum where=(time_sum is not missing) );
by session_id;
run;
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.