Hello Friends,
I am stuck on one problem and request for some help. Please see table below -
Id | Date | Flag | Start | End | ||
1 | 31/01/2013 | Y | ||||
1 | 28/02/2013 | Y | ||||
1 | 31/03/2013 | Y | ||||
1 | 30/04/2013 | |||||
1 | 31/05/2013 | |||||
1 | 30/06/2013 | |||||
1 | 31/07/2013 | |||||
1 | 31/08/2013 | |||||
1 | 30/09/2013 | Y | ||||
1 | 31/10/2013 | Y | ||||
1 | 30/11/2013 | Y | ||||
1 | 31/12/2013 | Y | ||||
What I want to do is to create two new fields - Start_Date & End_Date which gets populated only when the Flag = "Y" something like the table below -
Id | Date | Flag | Start | End | ||
1 | 31/01/2013 | Y | 31/01/2013 | |||
1 | 31/03/2013 | Y | 31/03/2013 | |||
1 | 30/09/2013 | Y | 30/09/2013 | |||
1 | 31/12/2013 | Y | 31/12/2013 | |||
Any help will be geratly appreciated.
Regards
Siddharth
Assuming your source data is sorted by ID and Date something like below should work.
data want1;
set have;
by id flag notsorted;
format start_dt end_dt ddmmyy10.;
if first.flag and flag='Y' then start_dt=date;
if last.flag and flag='Y' then end_dt=date;
run;
data want2;
set have;
by id flag notsorted;
if first.flag and flag='Y' then start_end_ind='S';
if last.flag and flag='Y' then start_end_ind='E';
run;
Assuming your source data is sorted by ID and Date something like below should work.
data want1;
set have;
by id flag notsorted;
format start_dt end_dt ddmmyy10.;
if first.flag and flag='Y' then start_dt=date;
if last.flag and flag='Y' then end_dt=date;
run;
data want2;
set have;
by id flag notsorted;
if first.flag and flag='Y' then start_end_ind='S';
if last.flag and flag='Y' then start_end_ind='E';
run;
Thanks Patrick.
This works but not perfectly!
The solution gives me 2 entries for Start & End date for the id. What I am trying is to have the values popualted for "???" in table below. Soin final output, Id 1 should have 4 rows (but the solution you have will give me 2 rows!!) -
id | Date | Flag | start_dt | end_dt |
1 | 30/04/2013 | . | . | |
1 | 31/05/2013 | . | . | |
1 | 30/06/2013 | . | . | |
1 | 31/07/2013 | . | . | |
1 | 31/08/2013 | . | . | |
1 | 31/01/2013 | Y | 31/01/2013 | . |
1 | 28/02/2013 | Y | . | . |
1 | 31/03/2013 | Y | . | ??? |
1 | 30/09/2013 | Y | . | ??? |
1 | 31/10/2013 | Y | . | . |
1 | 30/11/2013 | Y | . | . |
1 | 31/12/2013 | Y | . |
31/12/2013 |
Hi Patrick,
Sorry that I did not mention that the Start & End Dates get populated whenever there is a break in the actual "Date" column that is why we should have 4 entries for the Id 1.
Kind regards
Siddharth
Hi. This gives four observations for ID=1. I added an ID=2 set of data that starts with FLAG=Y and modified the data step that Patrick posted ...
data have;
input id date :ddmmyy. flag :$1. @@;
format date ddmmyy10.;
datalines;
1 31/01/2013 Y 1 28/02/2013 Y 1 31/03/2013 Y 1 30/04/2013 .
1 31/05/2013 . 1 30/06/2013 . 1 31/07/2013 . 1 31/08/2013 .
1 30/09/2013 Y 1 31/10/2013 Y 1 30/11/2013 Y 1 31/12/2013 Y
2 31/01/2013 Y 2 28/02/2013 Y 2 31/03/2013 Y 2 30/04/2013 .
2 31/05/2013 Y 2 30/06/2013 Y 2 31/07/2013 . 2 31/08/2013 .
2 30/09/2013 Y 2 31/10/2013 . 2 30/11/2013 Y 2 31/12/2013 Y
;
data want;
format start_dt end_dt ddmmyy10.;
set have;
by id flag notsorted;
if flag eq 'Y' then do;
if first.flag or first.id then start_dt=date;
if last.flag then end_dt=date;
if first.flag or last.flag then output;
end;
run;
data set WANT (printed BY ID) ...
id=1
Obs start_dt end_dt date flag
1 31/01/2013 . 31/01/2013 Y
2 . 31/03/2013 31/03/2013 Y
3 30/09/2013 . 30/09/2013 Y
4 . 31/12/2013 31/12/2013 Y
id=2
Obs start_dt end_dt date flag
5 31/01/2013 . 31/01/2013 Y
6 . 31/03/2013 31/03/2013 Y
7 31/05/2013 . 31/05/2013 Y
8 . 30/06/2013 30/06/2013 Y
9 30/09/2013 30/09/2013 30/09/2013 Y
10 30/11/2013 . 30/11/2013 Y
11 . 31/12/2013 31/12/2013 Y
If I understand correctly what you're after then the following should work.
data have;
infile datalines truncover dlm=',' dsd;
input Id $ Date:ddmmyy10. Flag:$1.;
format date ddmmyy10.;
datalines;
1,31/01/2013,Y,
1,28/02/2013,Y,
1,31/03/2013,Y,
1,30/04/2013,
1,31/05/2013,
1,30/06/2013,
1,31/07/2013,
1,31/08/2013,
1,30/09/2013,Y,
1,31/10/2013,Y,
1,30/11/2013,Y,
1,31/12/2013,Y,
1,31/12/2014,Y,
1,30/11/2015,,
1,31/12/2015,Y,
2,31/01/2013,Y,
2,28/02/2013,Y,
2,31/03/2013,Y,
2,30/09/2013,Y,
2,31/10/2013,Y,
2,30/11/2013,Y,
2,31/12/2013,Y,
;
run;
proc sort data=have;
by id date;
run;
data want1(drop=_:);
set have curobs=curobs nobs=nobs;
by id flag notsorted;
format start_dt end_dt ddmmyy10.;
_lag_date=lag(date);
if flag='Y' then
do;
/* start_dt */
if first.flag then start_dt=date;
else if intnx('month',_lag_date,1,'e') ne date then start_dt=date;
/* end_dt */
if last.flag then end_dt=date;
else if curobs<nobs then
do;
curobs+1;
set have(keep=date rename=(date=_next_date)) point=curobs;
if intnx('month',_next_date,-1,'e') ne date then end_dt=date;
end;
end;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.