BookmarkSubscribeRSS Feed
eric2
Calcite | Level 5

Hi,

how do i fill gaps between records. since there is gap between visit 1 and visit 2 i dont want those gaps to be filled but i want gaps filled between same visit like there is gap between visit 2, so that gap should be filled with same visit thats 2.

data one;
input id $3. visit;
datalines;
101 1
101 1
101 .
101 .
101 2
101 .
101 2
101 3
;
run;

 

final output expected should be 

101 1
101 1
101 .
101 .
101 2
101 2 
101 2
101 3

 

thanks,

er

17 REPLIES 17
SuryaKiran
Meteorite | Level 14
data one;
input id $3. visit;
datalines;
101 1
101 1
101 .
101 .
101 2
101 .
101 2
101 3
;
run;


data want;
retain _Visit;
do i=1 by 1 until(not missing(Visit_));
	set one(rename=Visit=Visit_);
	end;
do j=1 to i;
set one;
If not missing(visit) then _Visit=Visit;
if _Visit=Visit_ then Visit=_Visit;
Output;
end;
keep id visit;
run;

I need to thank @PGStats , I learned this approach from him Smiley Embarassed

 

Thanks,
Suryakiran
eric2
Calcite | Level 5

Thanks. Good one but if your last record is has missing visit then your total number of records are decreased by one and we want to have same number of records as original one.

data one;
input id $3. visit;
datalines;
101 1
101 1
101 .
101 .
101 2
101 .
101 2
101 3
101 .
;
run;

thx

mkeintz
PROC Star

This program uses the lag function, which updates lagged value of visit only  once per visit "group", i.e. only  at the end of a sequence of identical  visit values.   If the lagged value is missing, and the double-lagged value matches the current value then set _FILLIN=1.

 

Reread the records read above, and if _FILLIN=1 then replace missing visit values with the double-lagged value.  After the  re-read output the records:

 

data want (drop=_:);
  do _N=1 by 1 until (last.id or _visit_lag1=.);
    set one;
    by id visit notsorted;
    if last.visit then do;
      _visit_lag1=lag1(visit);
      _visit_lag2=lag2(visit);
      _fillin= (visit=_visit_lag2);
    end;
  end;
  do _I=1 to _N;
    set one;
    if visit=. and _fillin=1 then visit=_visit_lag2;
    output;
  end;
run;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
novinosrin
Tourmaline | Level 20
data one;
input id $3. visit;
datalines;
101 1
101 1
101 .
101 .
101 2
101 .
101 2
101 3
;
run;

data _null_;
if _n_=1 then do;
dcl hash H (multidata:'y',ordered:'a') ;
   h.definekey  ("id",'_visit') ;
   h.definedata ('id',"visit") ;
   h.definedone () ;
end;
set one end=last;
by id ;
retain _visit;
if not missing(visit) and lag(visit)=. and visit=_visit then 
h.replace(key:id,key:_visit, data:id,data: _visit);
if not missing(visit) then _visit=visit;
h.add();
if last then h.output(dataset:'want');
run;
eric2
Calcite | Level 5

Thanks . I am not so familiar with hash objects. How do you keep other variables if you have say more  variables in this dataset but dont use them in your logic?

 

thx

novinosrin
Tourmaline | Level 20

specify all of them in the definedata section as you initialize the hash obj at the top

 

  h.definedata ('id',"visit") ;/*and others to be included */

 but i think this will require the data: in replace method to be expanded too making it a verbose 

eric2
Calcite | Level 5

Thanks . I will try it out but i wish there is a way to do it by not using hash objects.

novinosrin
Tourmaline | Level 20

@eric2  Ok, an easy solution for you and not to worry about variables at all:

 

data one;
input id $3. visit;
datalines;
101 1
101 1
101 .
101 .
101 2
101 .
101 2
101 3
101 .
;
run;
data want;
do _n_=1 by 1 until(last.id);
set one;
by id  ;
array _t(10);/*Please assign a high value subscript as mine is arbitrary/small for demo*/
if not missing(visit) and lag(visit)=. and visit=_visit then _t(_n_)=visit;
if not missing(visit) then _visit=visit;
end;
do until(last.id);
set one;
by id;
if not missing(visit) then _visit=visit;
if _visit  in _t then  visit=_visit;
output;
end;
drop _:;
run;

 

eric2
Calcite | Level 5

awesome. thanks much. this worked ....: smileyhappy:

PGStats
Opal | Level 21

Covering all cases:

 

data one;
input id $3. visit;
datalines;
101 1
101 1
101 .
101 .
101 2
101 .
101 2
101 3
102 .
102 3
102 .
102 3
103 4
103 5
103 .
;

data want;
do until(last.id);
    do i = 1 by 1 until(last.id or not missing(Visit_));
    	set one(rename=Visit=Visit_); by id;
    	end;
    do j = 1 to i;
        set one;
        If missing(visit) then if _Visit = Visit_ then Visit = _Visit;
        Output;
        end;
    _Visit = Visit_;
    end;
keep id visit;
run;

proc print; run;
PG
novinosrin
Tourmaline | Level 20

@PGStats Sir, may i request a min or two for a quick clarification on your very interesting statement that i want to learn and comprehend

 

   If missing(visit) then if _Visit = Visit_ then Visit = _Visit;

 

how does the above evaluate  if then follows if then 

 Did i miss something not in doc http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a000202239.htm

 

I would appreciate your response and time. Thank you

 

 

EDITED at 6:35 chicago time after figuring out with some diligent reading:

I got the understanding. My bad I didn't realize "if then" is executable and is used in one if then without an if then do. Sorry for the bother

Ksharp
Super User
data one;
input id visit;
datalines;
101 1
101 1
101 .
101 .
101 2
101 .
101 2
101 3
101 .
1011 1
1011 1
1011 .
1011 .
1011 2
1011 .
1011 2
1011 3
1011 .
;
run;
data have;
 set one;
 by id;
 if first.id then n=0;
 n+1;
run;
data temp;
 set have;
 if not missing(visit);
run;
data temp1;
 merge temp temp(rename=(id=_id n=_n visit=_visit) firstobs=2);
 output;
 if id=_id then do;
  do i=n+1 to _n-1;
    if visit ne _visit then call missing(visit);
	n=i;
	output;
  end;
 end;
 drop _: i ;
run;
data want;
 merge have temp1(rename=(visit=_visit) in=inb) ;
 by id n;
 if inb then visit=_visit;
 drop _visit;
run;
Ksharp
Super User
data one;
input id visit;
datalines;
101 1
101 1
101 .
101 .
101 2
101 .
101 2
101 3
101 .
1011 1
1011 1
1011 .
1011 .
1011 2
1011 .
1011 2
1011 3
1011 .
;
run;

data temp;
 retain _visit;
 do until(last.id or not missing(visit));
   set one;
   by id;
   if first.id then call missing(_visit);
 end;
 flag=(_visit=visit);
 do until(last.id or not missing(visit));
   set one;
   by id;
   if flag then visit1=_visit;
   output;
 end;
 _visit=visit;
 drop flag _visit;
 run;
 data want;
  set temp;
  if not missing(visit1) then visit=visit1;
  drop visit1;
run;
s_lassen
Meteorite | Level 14

Here is a solution which updates the data in place:

 

data temp;
  set one;
  if not missing(visit);
  obsno=_N_;
  rename visit=replace_val;
run;

data one;
  set temp;
  by id replace_val notsorted;
  last_n=lag(obsno);
  if not first.replace_val then do _N_=last_n+1 to obsno-1;
    modify one point=_N_;
    visit=replace_val;
    replace;
    end;
run;

 

 

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 17 replies
  • 2906 views
  • 2 likes
  • 7 in conversation