BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
hk2013
Fluorite | Level 6
Name IDStart_dateEnd_date
TPSON101MAY2017:00:00:00.
TPSON101MAR2016:00:00:0031DEC2016:00:00:00
TPSON114NOV2014:00:00:00.
RED201MAY2015:00:00:00.
RED201DEC2005:00:00:0030APR2015:00:00:00
RICH301NOV2018:00:00:00.
RICH301JAN2018:00:00:0031DEC2018:00:00:00
VOR401APR2016:00:00:00.
VOR401OCT2018:00:00:0031OCT2018:00:00:00
Fan 507JUL2007:00:00:0019OCT2015:00:00:00
VIX620AUG2018:00:00:0031DEC2018:00:00:00


I have the data set above where members have  dates associated to them. members with multiple dates i want to keep the max start_date and the  end_date that is empty and also keep the members with both start and end date 

 

output: 

 

Name IDStart_dateEnd_date
TPSON101MAY2017:00:00:00.
RED201MAY2015:00:00:00.
RICH301NOV2018:00:00:00.
VOR401APR2016:00:00:00.
Fan 507JUL2007:00:00:0019OCT2015:00:00:00
VIX620AUG2018:00:00:0031DEC2018:00:00:00
1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20

Assuming i understand your requirement-->

 

data have;
input Name $	ID	Start_date :datetime20.	End_date :datetime20.;
format start_date end_Date datetime20.;
cards;
TPSON	1	01MAY2017:00:00:00	.
TPSON	1	01MAR2016:00:00:00	31DEC2016:00:00:00
TPSON	1	14NOV2014:00:00:00	.
RED	2	01MAY2015:00:00:00	.
RED	2	01DEC2005:00:00:00	30APR2015:00:00:00
RICH	3	01NOV2018:00:00:00	.
RICH	3	01JAN2018:00:00:00	31DEC2018:00:00:00
VOR	4	01APR2016:00:00:00	.
VOR	4	01OCT2018:00:00:00	31OCT2018:00:00:00
Fan 	5	07JUL2007:00:00:00	19OCT2015:00:00:00
VIX	6	20AUG2018:00:00:00	31DEC2018:00:00:00
;

proc sql;
create table want as
select *
from
(select Name,id, case when start_date=max(start_date) and end_date=. then start_date
		when start_date=max(start_date) and end_date>. and count(*)>1 then .
		else start_date
		end as start_date format=datetime20.,end_date
from have
group by id,name)
group by id,name
having start_date=max(start_date)
order by id,name;
quit;

View solution in original post

6 REPLIES 6
r_behata
Barite | Level 11
Data have;
Input Name $ ID Start_date:	datetime26.	End_date: datetime26.;
format Start_date End_date datetime26.;
cards;
TPSON 1 01MAY2017:00:00:00 .
TPSON 1 01MAR2016:00:00:00 31DEC2016:00:00:00
TPSON 1 14NOV2014:00:00:00 .
RED   2 01MAY2015:00:00:00 .
RED	  2 01DEC2005:00:00:00 30APR2015:00:00:00
RICH  3 01NOV2018:00:00:00 .
RICH  3 01JAN2018:00:00:00 31DEC2018:00:00:00
VOR	  4 01APR2016:00:00:00 .
VOR	  4 01OCT2018:00:00:00 31OCT2018:00:00:00
Fan   5 07JUL2007:00:00:00 19OCT2015:00:00:00
VIX	  6 20AUG2018:00:00:00 31DEC2018:00:00:00
run;

Proc sort data=have ;
	by ID   descending Start_date  End_date;
run;
Data want;
	set have;
	by ID;

	if first.id;
run;
hk2013
Fluorite | Level 6

thanks for replying however for this data point i dont want the max start date i want the date with empty end date 

VOR	  4 01APR2016:00:00:00 .
VOR	  4 01OCT2018:00:00:00 31OCT2018:00:00:00

 and you code i picking the max start date 

novinosrin
Tourmaline | Level 20

Assuming i understand your requirement-->

 

data have;
input Name $	ID	Start_date :datetime20.	End_date :datetime20.;
format start_date end_Date datetime20.;
cards;
TPSON	1	01MAY2017:00:00:00	.
TPSON	1	01MAR2016:00:00:00	31DEC2016:00:00:00
TPSON	1	14NOV2014:00:00:00	.
RED	2	01MAY2015:00:00:00	.
RED	2	01DEC2005:00:00:00	30APR2015:00:00:00
RICH	3	01NOV2018:00:00:00	.
RICH	3	01JAN2018:00:00:00	31DEC2018:00:00:00
VOR	4	01APR2016:00:00:00	.
VOR	4	01OCT2018:00:00:00	31OCT2018:00:00:00
Fan 	5	07JUL2007:00:00:00	19OCT2015:00:00:00
VIX	6	20AUG2018:00:00:00	31DEC2018:00:00:00
;

proc sql;
create table want as
select *
from
(select Name,id, case when start_date=max(start_date) and end_date=. then start_date
		when start_date=max(start_date) and end_date>. and count(*)>1 then .
		else start_date
		end as start_date format=datetime20.,end_date
from have
group by id,name)
group by id,name
having start_date=max(start_date)
order by id,name;
quit;
novinosrin
Tourmaline | Level 20

Hi @hk2013,   The proc sql approach I gave you should work and the reason I took that approach was I noticed @r_behata had already responded with a datastep, to have some distinction. If you are not comfortable with proc sql, and want only datastep, let us know

 

EDIT: Also, Please clarify, should the maxdate of startdate record had non missing enddate, is it ok to  assume the next maxdate would have a missing enddate as your sample suggests? If not, the logic would have to scan until it finds a record with end_Date=. and the highest record of startdate?

 

Either way, with this clarification, the coding is less than a minute work

novinosrin
Tourmaline | Level 20

This is much simpler to understand in my opinion

 


data have;
input Name $	ID	Start_date :datetime20.	End_date :datetime20.;
format start_date end_Date datetime20.;
cards;
TPSON	1	01MAY2017:00:00:00	.
TPSON	1	01MAR2016:00:00:00	31DEC2016:00:00:00
TPSON	1	14NOV2014:00:00:00	.
RED	2	01MAY2015:00:00:00	.
RED	2	01DEC2005:00:00:00	30APR2015:00:00:00
RICH	3	01NOV2018:00:00:00	.
RICH	3	01JAN2018:00:00:00	31DEC2018:00:00:00
VOR	4	01APR2016:00:00:00	.
VOR	4	01OCT2018:00:00:00	31OCT2018:00:00:00
Fan 	5	07JUL2007:00:00:00	19OCT2015:00:00:00
VIX	6	20AUG2018:00:00:00	31DEC2018:00:00:00
;

proc sql;
create table want as
select *
from have
where end_date=.
group by id
having max(start_date)=start_date
union all
select *
from have 
where id not in (select id from have where end_date=.)
group by id
having max(start_date)=start_date
order by id;
quit;
novinosrin
Tourmaline | Level 20

data have;
input Name $	ID	Start_date :datetime20.	End_date :datetime20.;
format start_date end_Date datetime20.;
cards;
TPSON	1	01MAY2017:00:00:00	.
TPSON	1	01MAR2016:00:00:00	31DEC2016:00:00:00
TPSON	1	14NOV2014:00:00:00	.
RED	2	01MAY2015:00:00:00	.
RED	2	01DEC2005:00:00:00	30APR2015:00:00:00
RICH	3	01NOV2018:00:00:00	.
RICH	3	01JAN2018:00:00:00	31DEC2018:00:00:00
VOR	4	01APR2016:00:00:00	.
VOR	4	01OCT2018:00:00:00	31OCT2018:00:00:00
Fan 	5	07JUL2007:00:00:00	19OCT2015:00:00:00
VIX	6	20AUG2018:00:00:00	31DEC2018:00:00:00
;
proc sort data=have out=_have;
by id end_date  descending Start_date ;
run;

data want;
do until(last.id);
merge _have(in=a) _have(drop=start_date where=(_End_date=.) rename=(End_date=_End_date) in=b);
by id;
if a and not b then _m=max(_m,Start_date);
end;
do until(last.id);
set _have;
by id;
if first.id and end_date=. or _m=start_date then output;
end;
drop _:;
run;

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 6 replies
  • 2018 views
  • 2 likes
  • 3 in conversation