BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
AKHILA
Obsidian | Level 7
I have a dataset like this
Id visits val date1 date2
101 1 2.4 02jan2019 05jan2019
101 2 4 05jan2019 05jan2019
101 3 5 10jan2019 05jan2019
102 1 6 05jan2019 15jan2019
102 2 7 12jan2019 15jan2019
102 3 8 15jan2019 15jan2019
103 1 2.3 17 jan 2019 20jan2019
103 2 6 18jan2019 20jan2019
103 3 8 19jan2019 20jan2019

I want to derive a flag variable , as per the conditions
The flag should be populated ONLY for last non missing 'val' variable for each subjects
Also date1 < date2 like

Id visits val date1 date2 flag
101 1 2.4 02jan2019 05jan2019 Y
101 2 4 05jan2019 05jan2019
101 3 5 10jan2019 05jan2019
102 1 6 05jan2019 15jan2019
102 2 7 12jan2019 15jan2019 Y
102 3 8 15jan2019 15jan2019
103 1 2.3 17 jan 2019 20jan2019
103 2 6 18jan2019 20jan2019
103 3 8 19jan2019 20jan2019 Y
1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

If performance is an issue, consider a double do loop:

data have;
  input Id visits val date1: date9. date2 date9.;
  format date1 date2 date9.;
cards;
101 1 2.4 02jan2019 05jan2019
101 2 4 05jan2019 05jan2019
101 3 5 10jan2019 05jan2019
102 1 6 05jan2019 15jan2019
102 2 7 12jan2019 15jan2019
102 3 8 15jan2019 15jan2019
103 1 2.3 17jan2019 20jan2019
103 2 6 18jan2019 20jan2019
103 3 8 19jan2019 20jan2019
;
run;

data want;
do until (last.id);
  set have;
  by id;
  if date1 < date2
  then do;
    _date1 = date1;
    _date2 = date2;
  end;
end;
do until (last.id);
  set have;
  by id;
  if date1 = _date1 and date2 = _date2
  then flag = 'Y';
  else flag = ' ';
  output;
end;
drop _date1 _date2;
run;

proc print data=want noobs;
run;

Result:

 Id    visits    val        date1        date2    flag

101       1      2.4    02JAN2019    05JAN2019     Y  
101       2      4.0    05JAN2019    05JAN2019        
101       3      5.0    10JAN2019    05JAN2019        
102       1      6.0    05JAN2019    15JAN2019        
102       2      7.0    12JAN2019    15JAN2019     Y  
102       3      8.0    15JAN2019    15JAN2019        
103       1      2.3    17JAN2019    20JAN2019        
103       2      6.0    18JAN2019    20JAN2019        
103       3      8.0    19JAN2019    20JAN2019     Y  

View solution in original post

6 REPLIES 6
andreas_lds
Jade | Level 19

Please post data in usable form: as data-step using datalines-statement.

 

"The flag should be populated ONLY for last non missing 'val' variable for each subjects" -- how are "subjects" identified?

RW9
Diamond | Level 26 RW9
Diamond | Level 26

As @andreas_lds  has said, post test data in a datastep using he code window - its the {i} above post area.

 

At a guess I would say reverse sort the data, so the date is descending.  Then something like:

data want;
  set have;
  retain flag;
  if first.id then flag="";
  if lag(id)=id and lag(flag) ne "Y" and date1 < date2 then flag="Y";
run;

Then sort the data back again.  You could also do something in sql like:

proc sql;
  create table want as 
  select a.*,
         b.flag
  from   have a
  left join (select *,"Y" as flag from have group by id having date1 <= date2) b
  on     a.id=b.id 
  and    a.date1=b.date1;
quit;
s_lassen
Meteorite | Level 14

Something like this, perhaps?

data have;
  input Id visits val date1: date9. date2 date9.;
  format date1 date2 date9.;
cards;
101 1 2.4 02jan2019 05jan2019
101 2 4 05jan2019 05jan2019
101 3 5 10jan2019 05jan2019
102 1 6 05jan2019 15jan2019
102 2 7 12jan2019 15jan2019
102 3 8 15jan2019 15jan2019
103 1 2.3 17jan2019 20jan2019
103 2 6 18jan2019 20jan2019
103 3 8 19jan2019 20jan2019
;run;

data flagged;
  set have;
  where date1<date2;
  by id;
  if last.id;
  retain flag 'Y';
run;

data want;
  merge have flagged;
  by id date1 date2;
run;
Ksharp
Super User
data have;
  input Id visits val date1: date9. date2 date9.;
  format date1 date2 date9.;
cards;
101 1 2.4 02jan2019 05jan2019
101 2 4 05jan2019 05jan2019
101 3 5 10jan2019 05jan2019
102 1 6 05jan2019 15jan2019
102 2 7 12jan2019 15jan2019
102 3 8 15jan2019 15jan2019
103 1 2.3 17jan2019 20jan2019
103 2 6 18jan2019 20jan2019
103 3 8 19jan2019 20jan2019
;run;
data temp;
 set have;
 by id;
 if first.id or missing(val) or date1 ge date2 then group+1;
run;
data want;
 set temp;
 by group;
 if last.group and not missing(val) and date1 lt date2 then flag='Y';
run;
Kurt_Bremser
Super User

If performance is an issue, consider a double do loop:

data have;
  input Id visits val date1: date9. date2 date9.;
  format date1 date2 date9.;
cards;
101 1 2.4 02jan2019 05jan2019
101 2 4 05jan2019 05jan2019
101 3 5 10jan2019 05jan2019
102 1 6 05jan2019 15jan2019
102 2 7 12jan2019 15jan2019
102 3 8 15jan2019 15jan2019
103 1 2.3 17jan2019 20jan2019
103 2 6 18jan2019 20jan2019
103 3 8 19jan2019 20jan2019
;
run;

data want;
do until (last.id);
  set have;
  by id;
  if date1 < date2
  then do;
    _date1 = date1;
    _date2 = date2;
  end;
end;
do until (last.id);
  set have;
  by id;
  if date1 = _date1 and date2 = _date2
  then flag = 'Y';
  else flag = ' ';
  output;
end;
drop _date1 _date2;
run;

proc print data=want noobs;
run;

Result:

 Id    visits    val        date1        date2    flag

101       1      2.4    02JAN2019    05JAN2019     Y  
101       2      4.0    05JAN2019    05JAN2019        
101       3      5.0    10JAN2019    05JAN2019        
102       1      6.0    05JAN2019    15JAN2019        
102       2      7.0    12JAN2019    15JAN2019     Y  
102       3      8.0    15JAN2019    15JAN2019        
103       1      2.3    17JAN2019    20JAN2019        
103       2      6.0    18JAN2019    20JAN2019        
103       3      8.0    19JAN2019    20JAN2019     Y  
novinosrin
Tourmaline | Level 20

data have;
  input Id visits val date1: date9. date2 date9.;
  format date1 date2 date9.;
cards;
101 1 2.4 02jan2019 05jan2019
101 2 4 05jan2019 05jan2019
101 3 5 10jan2019 05jan2019
102 1 6 05jan2019 15jan2019
102 2 7 12jan2019 15jan2019
102 3 8 15jan2019 15jan2019
103 1 2.3 17jan2019 20jan2019
103 2 6 18jan2019 20jan2019
103 3 8 19jan2019 20jan2019
;
run;

proc sql;
create table want(drop=t) as
select *,date1<date2 as t,ifc(calculated t and max(date1)=date1,'Y',' ') as Flag
from have
group by id,t
order by id,date1,date2;
quit;

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 1046 views
  • 1 like
  • 7 in conversation