BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
lydiawawa
Lapis Lazuli | Level 10

How to fill in missings with the non-missing value for each session_id group? Here device_name(character) and time_sum(numeric) have missings for each session_id group. The non-missing only appear once for each group, and I'm hoping to repeat the non-missing value by the group.

 

session_id  device_name time_sum
    1             .          .
    1          desktop       .
    1             .          12
    1             .          .
2 . 11
2 . .
2 mobile .
3 desktop 10
3 . 10

 

 Hoping to have this outcome:

 

session_id  device_name time_sum
    1          desktop       12
    1          desktop       12
    1          desktop       12
    1          desktop       12
2 mobile 11
2 mobile 11
2 mobile 11
3 desktop 10
3 desktop 10

Grateful for any input!

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20
data have;
input session_id  device_name $ time_sum ;
cards;
    1             .          .
    1          desktop       .
    1             .          12
    1             .          .
    2             .          11
    2             .          .
    2          mobile        .  
    3          desktop       10
    3             .          10
	;

proc sql;
create table want as
select 	a.session_id, device_name, time_sum
from have(keep=session_id) a inner join
(select session_id,max(device_name) as device_name, max(time_sum) as time_sum from have group by session_id) b
on a.session_id=b.session_id
order by  session_id;
quit;

View solution in original post

7 REPLIES 7
novinosrin
Tourmaline | Level 20
data have;
input session_id  device_name $ time_sum ;
cards;
    1             .          .
    1          desktop       .
    1             .          12
    1             .          .
    2             .          11
    2             .          .
    2          mobile        .  
    3          desktop       10
    3             .          10
	;

proc sql;
create table want as
select 	a.session_id, device_name, time_sum
from have(keep=session_id) a inner join
(select session_id,max(device_name) as device_name, max(time_sum) as time_sum from have group by session_id) b
on a.session_id=b.session_id
order by  session_id;
quit;
lydiawawa
Lapis Lazuli | Level 10
@novinosrin thank you for the prompt response and efficient coding. Just an add-on question, if I want to carry over other variables that do not need filling missings, how should I add them to the code?
novinosrin
Tourmaline | Level 20

Hi @lydiawawa  The carry over variables should be part of

from have(keep=session_id) a

I just took your sample, but of course you could have session_id and the carry over variables

 

however  this part below

(select session_id,max(device_name) as device_name, max(time_sum) as time_sum from have group by session_id) b

 should only include sesion_id and vars that need filling.

 

I hope you are able to follow

lydiawawa
Lapis Lazuli | Level 10
@novinosrin I have a column that carries over unique observation numbers that have no missings.
ScottBass
Rhodochrosite | Level 12

There are many ways to do this...here is but one. 

 

You also don't say what you want if there are duplicate data within a session_id, for example session_id=2 and time_sum in (11,99).  In my code, the first value wins.

 

data have;
   input session_id :8. device_name :$10. time_sum :8.;
   datalines;
    1             .          .
    1          desktop       .
    1             .          12
    1             .          .
    2             .          11
    2             .          .
    2          mobile        .  
    3          desktop       10
    3             .          10
;
run; 

data want;
   if 0 then set have;
   %let _hashnum_=0;
   %hash_define(data=have,keys=session_id,vars=device_name,where=device_name is not missing);
   %hash_define(data=have,keys=session_id,vars=time_sum,where=time_sum is not missing);
   set have;
   %hash_lookup;
   drop _:;
run;

Download %hash_define and %hash_lookup (and any supporting macros, such as %parmv) from here:

 

https://github.com/scottbass/SAS/tree/master/Macro


Please post your question as a self-contained data step in the form of "have" (source) and "want" (desired results).
I won't contribute to your post if I can't cut-and-paste your syntactically correct code into SAS.
Astounding
PROC Star
Two short steps?

Data temp;
Update have (obs=0) have;
by id;
run;

data want;
update temp have;
by id;
output;
run;
Ksharp
Super User
data have;
input session_id  device_name $ time_sum ;
cards;
    1             .          .
    1          desktop       .
    1             .          12
    1             .          .
    2             .          11
    2             .          .
    2          mobile        .  
    3          desktop       10
    3             .          10
	;
data want;
 merge have(keep= session_id)
 have(keep= session_id device_name where=(device_name is not missing) )
 have(keep= session_id time_sum  where=(time_sum  is not missing) );
by session_id;
run;

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 763 views
  • 6 likes
  • 5 in conversation