BookmarkSubscribeRSS Feed
supp
Pyrite | Level 9

Hello,

 

If this is my starting data:

data have;
	infile datalines dsd dlm=',';
	input person $ status date yymmdd10.;
	format date yymmdd10.;
datalines;
a,.,2019-02-01
a,0,2019-03-01
a,1,2019-04-01
a,0,2019-05-01
a,1,2019-06-14
a,0,2019-12-31
b,.,2019-01-31
b,0,2019-03-01
b,1,2019-05-03
b,1,2019-05-04
c,0,2019-01-30
c,0,2019-01-31
c,1,2019-07-06
c,1,2019-07-07
d,1,2019-03-06
e,0,2019-06-01
f,.,2019-02-01
f,0,2019-03-01
f,1,2019-04-01
f,1,2019-05-01
f,0,2019-06-14
f,1,2019-11-30
f,1,2019-12-31
;
run;

How do I get the following results?

data want;
	infile datalines dsd dlm=',';
	input person $ status date yymmdd10.;
	format date yymmdd10.;
datalines;
b,1,2019-05-03
c,1,2019-07-06
d,1,2019-03-06
f,1,2019-11-30
;
run;

Where the members returned have a last status of 1 and the date returned is the first date where status = 1 and there is no later status of 0 or null.

 

Person 'a' doesn't get returned because their last status is not 1.

Person 'b' gets returned with a date of 2019-05-03. Their last status = 1 and that is the earliest date of status 1 with no following statuses <> 1

Person 'c' gets returned with a date of 2019-07-06. Their last status = 1 and that is the earliest date of status 1 with no following statuses <> 1

Person 'd' gets returned with a date of 2019-06-06. Their only status = 1

Person 'e' doesn't gets returned because their only status <> = 1

Person 'f' gets returned with a date of 2019-11-30. Their last status = 1 and that is the earliest date of status 1 with no following statuses <> 1. In this example we wouldn't return the date of 2019-04-01 because there is a subsequent event with a status <> 1 on 2019-06-14.

 

How can I code to get the desired output?

4 REPLIES 4
novinosrin
Tourmaline | Level 20

Ah after all some fun question amid the corona thing. Thank you

 

data have;
	infile datalines dsd dlm=',';
	input person $ status date yymmdd10.;
	format date yymmdd10.;
datalines;
a,.,2019-02-01
a,0,2019-03-01
a,1,2019-04-01
a,0,2019-05-01
a,1,2019-06-14
a,0,2019-12-31
b,.,2019-01-31
b,0,2019-03-01
b,1,2019-05-03
b,1,2019-05-04
c,0,2019-01-30
c,0,2019-01-31
c,1,2019-07-06
c,1,2019-07-07
d,1,2019-03-06
e,0,2019-06-01
f,.,2019-02-01
f,0,2019-03-01
f,1,2019-04-01
f,1,2019-05-01
f,0,2019-06-14
f,1,2019-11-30
f,1,2019-12-31
;
run;
data want;
 if _n_=1 then do;
  if 0 then set have;
  dcl hash H () ;
  h.definekey  ("person") ;
  h.definedata ('person','status','date') ;
  h.definedone () ;
 end;
 do until(last.person);
   set have;
   by person status notsorted;
   if first.status and status then h.replace();
 end;
 if status=1 then do;
   h.find();
   output;
 end;
 h.clear();
run;

 

 

novinosrin
Tourmaline | Level 20

Though not for prod use, my colleagues at work challenged me to attempt the same in SQL, and for fun

 

data have;
	infile datalines dsd dlm=',';
	input person $ status date yymmdd10.;
	format date yymmdd10.;
datalines;
a,.,2019-02-01
a,0,2019-03-01
a,1,2019-04-01
a,0,2019-05-01
a,1,2019-06-14
a,0,2019-12-31
b,.,2019-01-31
b,0,2019-03-01
b,1,2019-05-03
b,1,2019-05-04
c,0,2019-01-30
c,0,2019-01-31
c,1,2019-07-06
c,1,2019-07-07
d,1,2019-03-06
e,0,2019-06-01
f,.,2019-02-01
f,0,2019-03-01
f,1,2019-04-01
f,1,2019-05-01
f,0,2019-06-14
f,1,2019-11-30
f,1,2019-12-31
;
run;
proc sql;
create table want as
select a.*
from have(where=(status=1)) a ,
(select *,count(distinct status)=1 as _n_ from have group by person having date=max(date) and status=1) b
where a.person=b.person and a.date<b.date or a.person=b.person and _n_
group by a.person
having max(date)=date;
quit;

Well Our bank lives and relies on SQL, so here no matter whatever stuff contradicts SQL, they overrule in favor of SQL. God bless!

PGStats
Opal | Level 21

My version:

 

data want;
do until(last.person);
	set have; by person;
	if status ne 1 then call missing(d);
	else if missing(d) then d = date;
	end;
if status = 1 then output;
drop date; format d yymmdd10.; rename d=date;
run;
PG
Kurt_Bremser
Super User

That's where the data step and the by statement really come into it's own:

data want;
set have (rename=(date=_date));
by person status notsorted;
retain date;
format date yymmdd10.;
if first.status and status = 1 then date = _date;
if last.person and status = 1;
drop _date;
run;

sas-innovate-2026-white.png



April 27 – 30 | Gaylord Texan | Grapevine, Texas

Registration is open

Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!

Register now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 1139 views
  • 4 likes
  • 4 in conversation