Hello,
I've been unable to find any help on this problem so I thought I'd try here. I'm wanting to split rows into multiple rows based on dates if particular conditions are met, and then update the start and end points of the rows.
I've got a dataset:
data data_to_modify;
input @1 ID $
@3 EffDate ddmmyy10.
@15 ExpDate ddmmyy10.;
format EffDate date9. ExpDate date9.;
datalines;
A 01/04/2008 01/08/2008
B 01/02/2008 01/07/2010
;
run;
And I want it to look like this:
data data_modified;
input @1 ID $
@3 EffDate ddmmyy10.
@14 ExpDate ddmmyy10.
@25 Year;
format EffDate date9. ExpDate date9.;
datalines;
A 01/04/2008 01/08/2008 2008
B 01/02/2008 31/12/2008 2008
B 01/01/2009 31/12/2009 2009
B 01/01/2010 01/07/2010 2010
;
run;
I've tried using a full cartesian join based on by year variables 2008-2011 however I still can’t see the logic pattern. I’ve been unable to figure out how to use if-then-else and output statements either…
Any help is greatly appreciated !
Cheers
Jason
EDIT: Fixed code and struggling with text import
Or, in one datastep:
data data_to_modify;
input @1 ID $
@3 EffDate ddmmyy10.
@15 ExpDate ddmmyy10.;
format EffDate date9. ExpDate date9.;
datalines;
A 01/04/2008 01/08/2008
B 01/02/2008 01/07/2010
;
run;
data want (drop=_: i);
set data_to_modify;
_EffYear=year(EffDate);
_ExpYear=year(ExpDate);
_ExpDate=ExpDate;
do i=_EffYear to _ExpYear;
if _EffYear ne i then do;
EffDate=mdy(1,1,i);
end;
if _ExpYear ne i then do;
ExpDate=mdy(12,31,i);
end;
else ExpDate=_ExpDate;
output;
end;
run;
Here’s my solution, not sure how computationally efficient it is though:
data time;
input @1 start_date ddmmyy10. @12 end_date ddmmyy10. ;
format start_date date9. end_date date9. ;
datalines;
01/01/2008 31/12/2008
01/01/2009 31/12/2009
01/01/2010 31/12/2010
01/01/2011 31/12/2011
;
run;
PROC SQL;
create table dm1 as
select
*
from data_to_modify, time ;
QUIT;
data dm2;
set dm1;
format NewEffDate date9. NewExpDate date9.;
if year(effdate) = year(expdate) then do;
Year = year(effdate);
NewEffDate = EffDate;
NewExpDate = ExpDate;
end;
if year(effdate) ^= year(expdate) then do;
Year = year(Start_date);
NewEffDate = max(EffDate, Start_Date);
NewExpDate = min(ExpDate, End_Date);
end;
if ExpDate < Start_Date then delete;
drop effdate expdate start_date end_date;
rename neweffdate = Effdate;
rename newexpdate = Expdate;
run;
Or, in one datastep:
data data_to_modify;
input @1 ID $
@3 EffDate ddmmyy10.
@15 ExpDate ddmmyy10.;
format EffDate date9. ExpDate date9.;
datalines;
A 01/04/2008 01/08/2008
B 01/02/2008 01/07/2010
;
run;
data want (drop=_: i);
set data_to_modify;
_EffYear=year(EffDate);
_ExpYear=year(ExpDate);
_ExpDate=ExpDate;
do i=_EffYear to _ExpYear;
if _EffYear ne i then do;
EffDate=mdy(1,1,i);
end;
if _ExpYear ne i then do;
ExpDate=mdy(12,31,i);
end;
else ExpDate=_ExpDate;
output;
end;
run;
No IFs needed...
data data_to_modify;
input @1 ID $
@3 EffDate ddmmyy10.
@15 ExpDate ddmmyy10.;
format EffDate date9. ExpDate date9.;
datalines;
A 01/04/2008 01/08/2008
B 01/02/2008 01/07/2010
;
run;
data data_modified(drop=_endDate);
set data_to_modify;
_endDate = expDate;
year = year(EffDate);
do while (year < year(_endDate));
expDate = mdy(12,31,year);
output;
year + 1;
effDate = mdy(1,1,year);
end;
expDate = _endDate;
output;
run;
PG
These pieces of code, really put my coding ability to shame haha.. Cheers for the help.
data data_to_modify;
input @1 ID $
@3 EffDate ddmmyy10.
@15 ExpDate ddmmyy10.;
format EffDate date9. ExpDate date9.;
datalines;
A 01/04/2008 01/08/2008
B 01/02/2008 01/07/2010
;
run;
data want(drop=_: n);
set data_to_modify;
n=year(expdate)-year(effdate);
_expdate = expdate;
if n=0 then output;
else do;
do _n_=0 to n-1;
expdate=mdy(12,31,year(effdate)+_n_);
output;
effdate=expdate;
end;
expdate=_expdate ;
output;
end;
proc print;run;
And another one, featuring intnx():
data have;
input ID $ EffDate :ddmmyy10. ExpDate :ddmmyy10.;
format EffDate date9. ExpDate date9.;
cards;
A 01/04/2008 01/08/2008
B 01/02/2008 01/07/2010
;
data want (drop=exp);
set have end=done;
exp=expdate;
if expdate <= intnx('year',effdate,0,'e') then output;
else do until (intnx('year',effdate,0,'e') > exp);
expdate= intnx('year',effdate,0,'e');
output;
effdate=intnx('year',effdate,1,'b');
if (intnx('year',effdate,0,'e') > exp) then do;
expdate=exp;
output;
end;
end;
run;
data a; input @1 ID $ @3 EffDate ddmmyy10. @15 ExpDate ddmmyy10.; format EffDate date9. ExpDate date9.; datalines; A 01/04/2008 01/08/2008 B 01/02/2008 01/07/2010 ; run; data b(keep= id effdate expdate); set a; end=intnx('year',effdate,0,'e') ; exp=expdate; do while(end lt exp); expdate=end; output; effdate=expdate+1; end=intnx('year',effdate,0,'e'); end; expdate=exp; output; run;
Ksharp
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.