BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
AKHILA
Obsidian | Level 7
I have a dataset like this
Id visits val date1 date2
101 1 2.4 02jan2019 05jan2019
101 2 4 05jan2019 05jan2019
101 3 5 10jan2019 05jan2019
102 1 6 05jan2019 15jan2019
102 2 7 12jan2019 15jan2019
102 3 8 15jan2019 15jan2019
103 1 2.3 17 jan 2019 20jan2019
103 2 6 18jan2019 20jan2019
103 3 8 19jan2019 20jan2019

I want to derive a flag variable , as per the conditions
The flag should be populated ONLY for last non missing 'val' variable for each subjects
Also date1 < date2 like

Id visits val date1 date2 flag
101 1 2.4 02jan2019 05jan2019 Y
101 2 4 05jan2019 05jan2019
101 3 5 10jan2019 05jan2019
102 1 6 05jan2019 15jan2019
102 2 7 12jan2019 15jan2019 Y
102 3 8 15jan2019 15jan2019
103 1 2.3 17 jan 2019 20jan2019
103 2 6 18jan2019 20jan2019
103 3 8 19jan2019 20jan2019 Y
1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

If performance is an issue, consider a double do loop:

data have;
  input Id visits val date1: date9. date2 date9.;
  format date1 date2 date9.;
cards;
101 1 2.4 02jan2019 05jan2019
101 2 4 05jan2019 05jan2019
101 3 5 10jan2019 05jan2019
102 1 6 05jan2019 15jan2019
102 2 7 12jan2019 15jan2019
102 3 8 15jan2019 15jan2019
103 1 2.3 17jan2019 20jan2019
103 2 6 18jan2019 20jan2019
103 3 8 19jan2019 20jan2019
;
run;

data want;
do until (last.id);
  set have;
  by id;
  if date1 < date2
  then do;
    _date1 = date1;
    _date2 = date2;
  end;
end;
do until (last.id);
  set have;
  by id;
  if date1 = _date1 and date2 = _date2
  then flag = 'Y';
  else flag = ' ';
  output;
end;
drop _date1 _date2;
run;

proc print data=want noobs;
run;

Result:

 Id    visits    val        date1        date2    flag

101       1      2.4    02JAN2019    05JAN2019     Y  
101       2      4.0    05JAN2019    05JAN2019        
101       3      5.0    10JAN2019    05JAN2019        
102       1      6.0    05JAN2019    15JAN2019        
102       2      7.0    12JAN2019    15JAN2019     Y  
102       3      8.0    15JAN2019    15JAN2019        
103       1      2.3    17JAN2019    20JAN2019        
103       2      6.0    18JAN2019    20JAN2019        
103       3      8.0    19JAN2019    20JAN2019     Y  

View solution in original post

6 REPLIES 6
andreas_lds
Jade | Level 19

Please post data in usable form: as data-step using datalines-statement.

 

"The flag should be populated ONLY for last non missing 'val' variable for each subjects" -- how are "subjects" identified?

RW9
Diamond | Level 26 RW9
Diamond | Level 26

As @andreas_lds  has said, post test data in a datastep using he code window - its the {i} above post area.

 

At a guess I would say reverse sort the data, so the date is descending.  Then something like:

data want;
  set have;
  retain flag;
  if first.id then flag="";
  if lag(id)=id and lag(flag) ne "Y" and date1 < date2 then flag="Y";
run;

Then sort the data back again.  You could also do something in sql like:

proc sql;
  create table want as 
  select a.*,
         b.flag
  from   have a
  left join (select *,"Y" as flag from have group by id having date1 <= date2) b
  on     a.id=b.id 
  and    a.date1=b.date1;
quit;
s_lassen
Meteorite | Level 14

Something like this, perhaps?

data have;
  input Id visits val date1: date9. date2 date9.;
  format date1 date2 date9.;
cards;
101 1 2.4 02jan2019 05jan2019
101 2 4 05jan2019 05jan2019
101 3 5 10jan2019 05jan2019
102 1 6 05jan2019 15jan2019
102 2 7 12jan2019 15jan2019
102 3 8 15jan2019 15jan2019
103 1 2.3 17jan2019 20jan2019
103 2 6 18jan2019 20jan2019
103 3 8 19jan2019 20jan2019
;run;

data flagged;
  set have;
  where date1<date2;
  by id;
  if last.id;
  retain flag 'Y';
run;

data want;
  merge have flagged;
  by id date1 date2;
run;
Ksharp
Super User
data have;
  input Id visits val date1: date9. date2 date9.;
  format date1 date2 date9.;
cards;
101 1 2.4 02jan2019 05jan2019
101 2 4 05jan2019 05jan2019
101 3 5 10jan2019 05jan2019
102 1 6 05jan2019 15jan2019
102 2 7 12jan2019 15jan2019
102 3 8 15jan2019 15jan2019
103 1 2.3 17jan2019 20jan2019
103 2 6 18jan2019 20jan2019
103 3 8 19jan2019 20jan2019
;run;
data temp;
 set have;
 by id;
 if first.id or missing(val) or date1 ge date2 then group+1;
run;
data want;
 set temp;
 by group;
 if last.group and not missing(val) and date1 lt date2 then flag='Y';
run;
Kurt_Bremser
Super User

If performance is an issue, consider a double do loop:

data have;
  input Id visits val date1: date9. date2 date9.;
  format date1 date2 date9.;
cards;
101 1 2.4 02jan2019 05jan2019
101 2 4 05jan2019 05jan2019
101 3 5 10jan2019 05jan2019
102 1 6 05jan2019 15jan2019
102 2 7 12jan2019 15jan2019
102 3 8 15jan2019 15jan2019
103 1 2.3 17jan2019 20jan2019
103 2 6 18jan2019 20jan2019
103 3 8 19jan2019 20jan2019
;
run;

data want;
do until (last.id);
  set have;
  by id;
  if date1 < date2
  then do;
    _date1 = date1;
    _date2 = date2;
  end;
end;
do until (last.id);
  set have;
  by id;
  if date1 = _date1 and date2 = _date2
  then flag = 'Y';
  else flag = ' ';
  output;
end;
drop _date1 _date2;
run;

proc print data=want noobs;
run;

Result:

 Id    visits    val        date1        date2    flag

101       1      2.4    02JAN2019    05JAN2019     Y  
101       2      4.0    05JAN2019    05JAN2019        
101       3      5.0    10JAN2019    05JAN2019        
102       1      6.0    05JAN2019    15JAN2019        
102       2      7.0    12JAN2019    15JAN2019     Y  
102       3      8.0    15JAN2019    15JAN2019        
103       1      2.3    17JAN2019    20JAN2019        
103       2      6.0    18JAN2019    20JAN2019        
103       3      8.0    19JAN2019    20JAN2019     Y  
novinosrin
Tourmaline | Level 20

data have;
  input Id visits val date1: date9. date2 date9.;
  format date1 date2 date9.;
cards;
101 1 2.4 02jan2019 05jan2019
101 2 4 05jan2019 05jan2019
101 3 5 10jan2019 05jan2019
102 1 6 05jan2019 15jan2019
102 2 7 12jan2019 15jan2019
102 3 8 15jan2019 15jan2019
103 1 2.3 17jan2019 20jan2019
103 2 6 18jan2019 20jan2019
103 3 8 19jan2019 20jan2019
;
run;

proc sql;
create table want(drop=t) as
select *,date1<date2 as t,ifc(calculated t and max(date1)=date1,'Y',' ') as Flag
from have
group by id,t
order by id,date1,date2;
quit;

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 1543 views
  • 1 like
  • 7 in conversation