Hello Friends,
I am stuck on one problem and request for some help. Please see table below -
Id | Date | Flag | Start | End | ||
1 | 31/01/2013 | Y | ||||
1 | 28/02/2013 | Y | ||||
1 | 31/03/2013 | Y | ||||
1 | 30/04/2013 | |||||
1 | 31/05/2013 | |||||
1 | 30/06/2013 | |||||
1 | 31/07/2013 | |||||
1 | 31/08/2013 | |||||
1 | 30/09/2013 | Y | ||||
1 | 31/10/2013 | Y | ||||
1 | 30/11/2013 | Y | ||||
1 | 31/12/2013 | Y | ||||
What I want to do is to create two new fields - Start_Date & End_Date which gets populated only when the Flag = "Y" something like the table below -
Id | Date | Flag | Start | End | ||
1 | 31/01/2013 | Y | 31/01/2013 | |||
1 | 31/03/2013 | Y | 31/03/2013 | |||
1 | 30/09/2013 | Y | 30/09/2013 | |||
1 | 31/12/2013 | Y | 31/12/2013 | |||
Any help will be geratly appreciated.
Regards
Siddharth
Assuming your source data is sorted by ID and Date something like below should work.
data want1;
set have;
by id flag notsorted;
format start_dt end_dt ddmmyy10.;
if first.flag and flag='Y' then start_dt=date;
if last.flag and flag='Y' then end_dt=date;
run;
data want2;
set have;
by id flag notsorted;
if first.flag and flag='Y' then start_end_ind='S';
if last.flag and flag='Y' then start_end_ind='E';
run;
Assuming your source data is sorted by ID and Date something like below should work.
data want1;
set have;
by id flag notsorted;
format start_dt end_dt ddmmyy10.;
if first.flag and flag='Y' then start_dt=date;
if last.flag and flag='Y' then end_dt=date;
run;
data want2;
set have;
by id flag notsorted;
if first.flag and flag='Y' then start_end_ind='S';
if last.flag and flag='Y' then start_end_ind='E';
run;
Thanks Patrick.
This works but not perfectly!
The solution gives me 2 entries for Start & End date for the id. What I am trying is to have the values popualted for "???" in table below. Soin final output, Id 1 should have 4 rows (but the solution you have will give me 2 rows!!) -
id | Date | Flag | start_dt | end_dt |
1 | 30/04/2013 | . | . | |
1 | 31/05/2013 | . | . | |
1 | 30/06/2013 | . | . | |
1 | 31/07/2013 | . | . | |
1 | 31/08/2013 | . | . | |
1 | 31/01/2013 | Y | 31/01/2013 | . |
1 | 28/02/2013 | Y | . | . |
1 | 31/03/2013 | Y | . | ??? |
1 | 30/09/2013 | Y | . | ??? |
1 | 31/10/2013 | Y | . | . |
1 | 30/11/2013 | Y | . | . |
1 | 31/12/2013 | Y | . |
31/12/2013 |
Hi Patrick,
Sorry that I did not mention that the Start & End Dates get populated whenever there is a break in the actual "Date" column that is why we should have 4 entries for the Id 1.
Kind regards
Siddharth
Hi. This gives four observations for ID=1. I added an ID=2 set of data that starts with FLAG=Y and modified the data step that Patrick posted ...
data have;
input id date :ddmmyy. flag :$1. @@;
format date ddmmyy10.;
datalines;
1 31/01/2013 Y 1 28/02/2013 Y 1 31/03/2013 Y 1 30/04/2013 .
1 31/05/2013 . 1 30/06/2013 . 1 31/07/2013 . 1 31/08/2013 .
1 30/09/2013 Y 1 31/10/2013 Y 1 30/11/2013 Y 1 31/12/2013 Y
2 31/01/2013 Y 2 28/02/2013 Y 2 31/03/2013 Y 2 30/04/2013 .
2 31/05/2013 Y 2 30/06/2013 Y 2 31/07/2013 . 2 31/08/2013 .
2 30/09/2013 Y 2 31/10/2013 . 2 30/11/2013 Y 2 31/12/2013 Y
;
data want;
format start_dt end_dt ddmmyy10.;
set have;
by id flag notsorted;
if flag eq 'Y' then do;
if first.flag or first.id then start_dt=date;
if last.flag then end_dt=date;
if first.flag or last.flag then output;
end;
run;
data set WANT (printed BY ID) ...
id=1
Obs start_dt end_dt date flag
1 31/01/2013 . 31/01/2013 Y
2 . 31/03/2013 31/03/2013 Y
3 30/09/2013 . 30/09/2013 Y
4 . 31/12/2013 31/12/2013 Y
id=2
Obs start_dt end_dt date flag
5 31/01/2013 . 31/01/2013 Y
6 . 31/03/2013 31/03/2013 Y
7 31/05/2013 . 31/05/2013 Y
8 . 30/06/2013 30/06/2013 Y
9 30/09/2013 30/09/2013 30/09/2013 Y
10 30/11/2013 . 30/11/2013 Y
11 . 31/12/2013 31/12/2013 Y
If I understand correctly what you're after then the following should work.
data have;
infile datalines truncover dlm=',' dsd;
input Id $ Date:ddmmyy10. Flag:$1.;
format date ddmmyy10.;
datalines;
1,31/01/2013,Y,
1,28/02/2013,Y,
1,31/03/2013,Y,
1,30/04/2013,
1,31/05/2013,
1,30/06/2013,
1,31/07/2013,
1,31/08/2013,
1,30/09/2013,Y,
1,31/10/2013,Y,
1,30/11/2013,Y,
1,31/12/2013,Y,
1,31/12/2014,Y,
1,30/11/2015,,
1,31/12/2015,Y,
2,31/01/2013,Y,
2,28/02/2013,Y,
2,31/03/2013,Y,
2,30/09/2013,Y,
2,31/10/2013,Y,
2,30/11/2013,Y,
2,31/12/2013,Y,
;
run;
proc sort data=have;
by id date;
run;
data want1(drop=_:);
set have curobs=curobs nobs=nobs;
by id flag notsorted;
format start_dt end_dt ddmmyy10.;
_lag_date=lag(date);
if flag='Y' then
do;
/* start_dt */
if first.flag then start_dt=date;
else if intnx('month',_lag_date,1,'e') ne date then start_dt=date;
/* end_dt */
if last.flag then end_dt=date;
else if curobs<nobs then
do;
curobs+1;
set have(keep=date rename=(date=_next_date)) point=curobs;
if intnx('month',_next_date,-1,'e') ne date then end_dt=date;
end;
end;
run;
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.