BookmarkSubscribeRSS Feed
supp
Pyrite | Level 9

Hello,

 

If this is my starting data:

data have;
	infile datalines dsd dlm=',';
	input person $ status date yymmdd10.;
	format date yymmdd10.;
datalines;
a,.,2019-02-01
a,0,2019-03-01
a,1,2019-04-01
a,0,2019-05-01
a,1,2019-06-14
a,0,2019-12-31
b,.,2019-01-31
b,0,2019-03-01
b,1,2019-05-03
b,1,2019-05-04
c,0,2019-01-30
c,0,2019-01-31
c,1,2019-07-06
c,1,2019-07-07
d,1,2019-03-06
e,0,2019-06-01
f,.,2019-02-01
f,0,2019-03-01
f,1,2019-04-01
f,1,2019-05-01
f,0,2019-06-14
f,1,2019-11-30
f,1,2019-12-31
;
run;

How do I get the following results?

data want;
	infile datalines dsd dlm=',';
	input person $ status date yymmdd10.;
	format date yymmdd10.;
datalines;
b,1,2019-05-03
c,1,2019-07-06
d,1,2019-03-06
f,1,2019-11-30
;
run;

Where the members returned have a last status of 1 and the date returned is the first date where status = 1 and there is no later status of 0 or null.

 

Person 'a' doesn't get returned because their last status is not 1.

Person 'b' gets returned with a date of 2019-05-03. Their last status = 1 and that is the earliest date of status 1 with no following statuses <> 1

Person 'c' gets returned with a date of 2019-07-06. Their last status = 1 and that is the earliest date of status 1 with no following statuses <> 1

Person 'd' gets returned with a date of 2019-06-06. Their only status = 1

Person 'e' doesn't gets returned because their only status <> = 1

Person 'f' gets returned with a date of 2019-11-30. Their last status = 1 and that is the earliest date of status 1 with no following statuses <> 1. In this example we wouldn't return the date of 2019-04-01 because there is a subsequent event with a status <> 1 on 2019-06-14.

 

How can I code to get the desired output?

4 REPLIES 4
novinosrin
Tourmaline | Level 20

Ah after all some fun question amid the corona thing. Thank you

 

data have;
	infile datalines dsd dlm=',';
	input person $ status date yymmdd10.;
	format date yymmdd10.;
datalines;
a,.,2019-02-01
a,0,2019-03-01
a,1,2019-04-01
a,0,2019-05-01
a,1,2019-06-14
a,0,2019-12-31
b,.,2019-01-31
b,0,2019-03-01
b,1,2019-05-03
b,1,2019-05-04
c,0,2019-01-30
c,0,2019-01-31
c,1,2019-07-06
c,1,2019-07-07
d,1,2019-03-06
e,0,2019-06-01
f,.,2019-02-01
f,0,2019-03-01
f,1,2019-04-01
f,1,2019-05-01
f,0,2019-06-14
f,1,2019-11-30
f,1,2019-12-31
;
run;
data want;
 if _n_=1 then do;
  if 0 then set have;
  dcl hash H () ;
  h.definekey  ("person") ;
  h.definedata ('person','status','date') ;
  h.definedone () ;
 end;
 do until(last.person);
   set have;
   by person status notsorted;
   if first.status and status then h.replace();
 end;
 if status=1 then do;
   h.find();
   output;
 end;
 h.clear();
run;

 

 

novinosrin
Tourmaline | Level 20

Though not for prod use, my colleagues at work challenged me to attempt the same in SQL, and for fun

 

data have;
	infile datalines dsd dlm=',';
	input person $ status date yymmdd10.;
	format date yymmdd10.;
datalines;
a,.,2019-02-01
a,0,2019-03-01
a,1,2019-04-01
a,0,2019-05-01
a,1,2019-06-14
a,0,2019-12-31
b,.,2019-01-31
b,0,2019-03-01
b,1,2019-05-03
b,1,2019-05-04
c,0,2019-01-30
c,0,2019-01-31
c,1,2019-07-06
c,1,2019-07-07
d,1,2019-03-06
e,0,2019-06-01
f,.,2019-02-01
f,0,2019-03-01
f,1,2019-04-01
f,1,2019-05-01
f,0,2019-06-14
f,1,2019-11-30
f,1,2019-12-31
;
run;
proc sql;
create table want as
select a.*
from have(where=(status=1)) a ,
(select *,count(distinct status)=1 as _n_ from have group by person having date=max(date) and status=1) b
where a.person=b.person and a.date<b.date or a.person=b.person and _n_
group by a.person
having max(date)=date;
quit;

Well Our bank lives and relies on SQL, so here no matter whatever stuff contradicts SQL, they overrule in favor of SQL. God bless!

PGStats
Opal | Level 21

My version:

 

data want;
do until(last.person);
	set have; by person;
	if status ne 1 then call missing(d);
	else if missing(d) then d = date;
	end;
if status = 1 then output;
drop date; format d yymmdd10.; rename d=date;
run;
PG
Kurt_Bremser
Super User

That's where the data step and the by statement really come into it's own:

data want;
set have (rename=(date=_date));
by person status notsorted;
retain date;
format date yymmdd10.;
if first.status and status = 1 then date = _date;
if last.person and status = 1;
drop _date;
run;

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 626 views
  • 4 likes
  • 4 in conversation