BookmarkSubscribeRSS Feed
Emma2021
Quartz | Level 8

I have the following data--how can I impute within the same ID and DAY the flag variable--if multiple flag values are available then take the first value and impute within the same ID and DAY--see example ID=3 and DAY=3:

 

data have;
input id day flag;
cards;
1   1     .
1   1     2
2   4     2
2   4     .
2   5     .
3   3     2
3   3     .
3   3     8

 

 

data wanted;
input id day flag;
cards;
1   1     2
1   1     2
2   4     2
2   4     2
2   5     .
3   3     2
3   3     2
3   3     8

 

10 REPLIES 10
novinosrin
Tourmaline | Level 20

data have;
input id day flag;
cards;
1   1     .
1   1     2
2   4     2
2   4     .
2   5     .
3   3     2
3   3     .
3   3     8
;

data want ;
  do until (last.day) ;
    set have ;
	by id day ;
	if flag and _n=. then _n = flag ;
  end ;
  do until (last.day) ;
    set have ;
	by id day ;
	if flag=. then flag = _n ;
	output ;
  end ;
  drop _n ;
run ;
acordes
Rhodochrosite | Level 12

it's not the most effcient code.

I do not manage to save the sort operation not even by putting tjhe notsorted option in the data step. 

 

data have;
input id day flag;
cards;
1   1     .
1   1     2
2   4     2
2   4     .
2   5     .
3   3     2
3   3     .
3   3     8
;
run;

data have;
set have;
helper=missing(flag);
run;

proc sort data=have;
by id day helper;
run;

data want;
set have;
by id day helper;
retain lasty;
lasty=lag(flag);
if first.day then call missing(lasty);
if missing(flag) then flag=lasty;
drop helper lasty;
run;

 

 

novinosrin
Tourmaline | Level 20

data have;
input id day flag;
cards;
1   1     .
1   1     2
2   4     2
2   4     .
2   5     .
3   3     2
3   3     .
3   3     8
;

data want ;
  if _n_ = 1 then do ;
    dcl hash h (dataset:'have(where=(flag))' ) ;
	h.definekey ('id', 'day') ;
	h.definedata ('flag') ;
	h.definedone ( ) ;
  end ;
  set have ;
  if flag=. then _n_ = h.find( ) ;
run ;
Emma2021
Quartz | Level 8
Is it possible to change this SAS code to SQL (maybe mysql or any SQL)?
Thank you!
Reeza
Super User
I'd actually recommend you look into PROC EXPAND and the different options for imputation for time series. You can get some different options using that procedure. You do need a license for SAS/ETS though.

SQL is typically not a good solution for this type of problem and it requires more intensive computations as you're joining data sets with itself. What's your rational for trying to use SQL instead of data steps?
r_behata
Barite | Level 11

proc sql;
    create table WANT(drop=flag rename=(flag_=flag)) as
   select id,day,flag,max(flag) as flag_
   from have
   group by 1,2;
quit;
novinosrin
Tourmaline | Level 20

Hi @Emma2021   @Reeza 's recommendation is spot on and very wise. Right, I am just posting for fun and to get my fingers back into some practice. I'd still suggest to go with Reeza's suggestion or use a DB specific SQL making use of the PARTIITON BY() option, enabling the use of min from each partition.

 

Anyways, for fun-


data have;
input id day flag;
cards;
1   1     .
1   1     2
2   4     2
2   4     .
2   5     .
3   3     2
3   3     .
3   3     8
;


proc sql;
  create table want as
  select a.id,a.day,coalesce(a.flag,b.flag) as flag
  from have a
  left join 
  (select *
  from (select *,monotonic() as n from have)
  group by id, day
  having min(case when flag>. then n else . end)=n) b
  on a.id=b.id and a.day=b.day;
quit;

 

 

Emma2021
Quartz | Level 8
Thank you so much.
I have oracle sql. Would you change the sas sql into oracle sql?
Thanks again!
Ksharp
Super User
data have;
input id day flag;
cards;
1   1     .
1   1     2
2   4     2
2   4     .
2   5     .
3   3     .
3   3     2
3   3     .
3   3     8
3   3     .
3   3     88
3   3     .
;

data temp;
 set have;
 n+1;
run;
data temp1;
 set temp;
 by id day;
 retain new_flag;
 if first.day then call missing(new_flag);
 if not missing(flag) then new_flag=flag;
run;
proc sort data=temp1;by descending n;run;

data want;
 set temp1;
 by id  day notsorted;
 retain new_flag2;
 if first.day then call missing(new_flag2);
 if not missing(new_flag) then new_flag2=new_flag;
 drop new_flag ;
run;
proc sort data=want;by  n;run;

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 10 replies
  • 2595 views
  • 7 likes
  • 7 in conversation