BookmarkSubscribeRSS Feed
supp
Pyrite | Level 9

Hello,

 

If this is my starting data:

data have;
	infile datalines dsd dlm=',';
	input person $ status date yymmdd10.;
	format date yymmdd10.;
datalines;
a,.,2019-02-01
a,0,2019-03-01
a,1,2019-04-01
a,0,2019-05-01
a,1,2019-06-14
a,0,2019-12-31
b,.,2019-01-31
b,0,2019-03-01
b,1,2019-05-03
b,1,2019-05-04
c,0,2019-01-30
c,0,2019-01-31
c,1,2019-07-06
c,1,2019-07-07
d,1,2019-03-06
e,0,2019-06-01
f,.,2019-02-01
f,0,2019-03-01
f,1,2019-04-01
f,1,2019-05-01
f,0,2019-06-14
f,1,2019-11-30
f,1,2019-12-31
;
run;

How do I get the following results?

data want;
	infile datalines dsd dlm=',';
	input person $ status date yymmdd10.;
	format date yymmdd10.;
datalines;
b,1,2019-05-03
c,1,2019-07-06
d,1,2019-03-06
f,1,2019-11-30
;
run;

Where the members returned have a last status of 1 and the date returned is the first date where status = 1 and there is no later status of 0 or null.

 

Person 'a' doesn't get returned because their last status is not 1.

Person 'b' gets returned with a date of 2019-05-03. Their last status = 1 and that is the earliest date of status 1 with no following statuses <> 1

Person 'c' gets returned with a date of 2019-07-06. Their last status = 1 and that is the earliest date of status 1 with no following statuses <> 1

Person 'd' gets returned with a date of 2019-06-06. Their only status = 1

Person 'e' doesn't gets returned because their only status <> = 1

Person 'f' gets returned with a date of 2019-11-30. Their last status = 1 and that is the earliest date of status 1 with no following statuses <> 1. In this example we wouldn't return the date of 2019-04-01 because there is a subsequent event with a status <> 1 on 2019-06-14.

 

How can I code to get the desired output?

4 REPLIES 4
novinosrin
Tourmaline | Level 20

Ah after all some fun question amid the corona thing. Thank you

 

data have;
	infile datalines dsd dlm=',';
	input person $ status date yymmdd10.;
	format date yymmdd10.;
datalines;
a,.,2019-02-01
a,0,2019-03-01
a,1,2019-04-01
a,0,2019-05-01
a,1,2019-06-14
a,0,2019-12-31
b,.,2019-01-31
b,0,2019-03-01
b,1,2019-05-03
b,1,2019-05-04
c,0,2019-01-30
c,0,2019-01-31
c,1,2019-07-06
c,1,2019-07-07
d,1,2019-03-06
e,0,2019-06-01
f,.,2019-02-01
f,0,2019-03-01
f,1,2019-04-01
f,1,2019-05-01
f,0,2019-06-14
f,1,2019-11-30
f,1,2019-12-31
;
run;
data want;
 if _n_=1 then do;
  if 0 then set have;
  dcl hash H () ;
  h.definekey  ("person") ;
  h.definedata ('person','status','date') ;
  h.definedone () ;
 end;
 do until(last.person);
   set have;
   by person status notsorted;
   if first.status and status then h.replace();
 end;
 if status=1 then do;
   h.find();
   output;
 end;
 h.clear();
run;

 

 

novinosrin
Tourmaline | Level 20

Though not for prod use, my colleagues at work challenged me to attempt the same in SQL, and for fun

 

data have;
	infile datalines dsd dlm=',';
	input person $ status date yymmdd10.;
	format date yymmdd10.;
datalines;
a,.,2019-02-01
a,0,2019-03-01
a,1,2019-04-01
a,0,2019-05-01
a,1,2019-06-14
a,0,2019-12-31
b,.,2019-01-31
b,0,2019-03-01
b,1,2019-05-03
b,1,2019-05-04
c,0,2019-01-30
c,0,2019-01-31
c,1,2019-07-06
c,1,2019-07-07
d,1,2019-03-06
e,0,2019-06-01
f,.,2019-02-01
f,0,2019-03-01
f,1,2019-04-01
f,1,2019-05-01
f,0,2019-06-14
f,1,2019-11-30
f,1,2019-12-31
;
run;
proc sql;
create table want as
select a.*
from have(where=(status=1)) a ,
(select *,count(distinct status)=1 as _n_ from have group by person having date=max(date) and status=1) b
where a.person=b.person and a.date<b.date or a.person=b.person and _n_
group by a.person
having max(date)=date;
quit;

Well Our bank lives and relies on SQL, so here no matter whatever stuff contradicts SQL, they overrule in favor of SQL. God bless!

PGStats
Opal | Level 21

My version:

 

data want;
do until(last.person);
	set have; by person;
	if status ne 1 then call missing(d);
	else if missing(d) then d = date;
	end;
if status = 1 then output;
drop date; format d yymmdd10.; rename d=date;
run;
PG
Kurt_Bremser
Super User

That's where the data step and the by statement really come into it's own:

data want;
set have (rename=(date=_date));
by person status notsorted;
retain date;
format date yymmdd10.;
if first.status and status = 1 then date = _date;
if last.person and status = 1;
drop _date;
run;

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 1032 views
  • 4 likes
  • 4 in conversation