BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
lydiawawa
Lapis Lazuli | Level 10

How to fill in missings with the non-missing value for each session_id group? Here device_name(character) and time_sum(numeric) have missings for each session_id group. The non-missing only appear once for each group, and I'm hoping to repeat the non-missing value by the group.

 

session_id  device_name time_sum
    1             .          .
    1          desktop       .
    1             .          12
    1             .          .
2 . 11
2 . .
2 mobile .
3 desktop 10
3 . 10

 

 Hoping to have this outcome:

 

session_id  device_name time_sum
    1          desktop       12
    1          desktop       12
    1          desktop       12
    1          desktop       12
2 mobile 11
2 mobile 11
2 mobile 11
3 desktop 10
3 desktop 10

Grateful for any input!

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20
data have;
input session_id  device_name $ time_sum ;
cards;
    1             .          .
    1          desktop       .
    1             .          12
    1             .          .
    2             .          11
    2             .          .
    2          mobile        .  
    3          desktop       10
    3             .          10
	;

proc sql;
create table want as
select 	a.session_id, device_name, time_sum
from have(keep=session_id) a inner join
(select session_id,max(device_name) as device_name, max(time_sum) as time_sum from have group by session_id) b
on a.session_id=b.session_id
order by  session_id;
quit;

View solution in original post

7 REPLIES 7
novinosrin
Tourmaline | Level 20
data have;
input session_id  device_name $ time_sum ;
cards;
    1             .          .
    1          desktop       .
    1             .          12
    1             .          .
    2             .          11
    2             .          .
    2          mobile        .  
    3          desktop       10
    3             .          10
	;

proc sql;
create table want as
select 	a.session_id, device_name, time_sum
from have(keep=session_id) a inner join
(select session_id,max(device_name) as device_name, max(time_sum) as time_sum from have group by session_id) b
on a.session_id=b.session_id
order by  session_id;
quit;
lydiawawa
Lapis Lazuli | Level 10
@novinosrin thank you for the prompt response and efficient coding. Just an add-on question, if I want to carry over other variables that do not need filling missings, how should I add them to the code?
novinosrin
Tourmaline | Level 20

Hi @lydiawawa  The carry over variables should be part of

from have(keep=session_id) a

I just took your sample, but of course you could have session_id and the carry over variables

 

however  this part below

(select session_id,max(device_name) as device_name, max(time_sum) as time_sum from have group by session_id) b

 should only include sesion_id and vars that need filling.

 

I hope you are able to follow

lydiawawa
Lapis Lazuli | Level 10
@novinosrin I have a column that carries over unique observation numbers that have no missings.
ScottBass
Rhodochrosite | Level 12

There are many ways to do this...here is but one. 

 

You also don't say what you want if there are duplicate data within a session_id, for example session_id=2 and time_sum in (11,99).  In my code, the first value wins.

 

data have;
   input session_id :8. device_name :$10. time_sum :8.;
   datalines;
    1             .          .
    1          desktop       .
    1             .          12
    1             .          .
    2             .          11
    2             .          .
    2          mobile        .  
    3          desktop       10
    3             .          10
;
run; 

data want;
   if 0 then set have;
   %let _hashnum_=0;
   %hash_define(data=have,keys=session_id,vars=device_name,where=device_name is not missing);
   %hash_define(data=have,keys=session_id,vars=time_sum,where=time_sum is not missing);
   set have;
   %hash_lookup;
   drop _:;
run;

Download %hash_define and %hash_lookup (and any supporting macros, such as %parmv) from here:

 

https://github.com/scottbass/SAS/tree/master/Macro


Please post your question as a self-contained data step in the form of "have" (source) and "want" (desired results).
I won't contribute to your post if I can't cut-and-paste your syntactically correct code into SAS.
Astounding
PROC Star
Two short steps?

Data temp;
Update have (obs=0) have;
by id;
run;

data want;
update temp have;
by id;
output;
run;
Ksharp
Super User
data have;
input session_id  device_name $ time_sum ;
cards;
    1             .          .
    1          desktop       .
    1             .          12
    1             .          .
    2             .          11
    2             .          .
    2          mobile        .  
    3          desktop       10
    3             .          10
	;
data want;
 merge have(keep= session_id)
 have(keep= session_id device_name where=(device_name is not missing) )
 have(keep= session_id time_sum  where=(time_sum  is not missing) );
by session_id;
run;
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 1925 views
  • 6 likes
  • 5 in conversation