proc format ;
value formula
low - 15 =0
16-40 =1
41-70=2
71-100=3
101-140=4
141-175=5
176-225=6
226-300=7
301-high=8
;
run;
data have;
input ID Email $ type $ edate mmddyy10. ;
format edate mmddyy10.;
datalines;
1 1 rdate 05/20/2015
1 2 rdate 05/21/2015
1 3 Sdate 10/05/2015
1 4 Sdate 11/09/2015
1 5 Sdate 12/11/2015
2 1 rdate 06/21/2015
2 2 rdate 07/05/2015
2 3 rdate 07/09/2015
2 4 Sdate 06/21/2016
2 5 Sdate 07/05/2016
2 6 Sdate 07/09/2016
;
run;
Im trying to subtract the last rdate ( which is email 2 in this case) from first sdate ( email 3 in this case) for example if the diff between rdate and sdate = 130 . ( 4 from the proc format ) this means a 4 rows will be inserted to the table . and will be evenly distributed 130/4+1.
below is the final desire output .
any ideas how to approach that ?
;
Hi
Below is an example of a data I have and what im trying to achieve.
data have;
input ID Email $ type $ edate mmddyy10. ;
format edate mmddyy10.;
datalines;
1 1 rdate 05/20/2015
1 2 rdate 05/21/2015
1 3 Formula 06/26/2015
1 4 Formula 08/02/2015
1 5 Formula 09/07/2015
1 6 Formula 10/13/2015
1 7 Sdate 11/18/2015
1 8 Sdate 11/30/2015
1 9 Sdate 12/11/2015
2 1 rdate 06/21/2015
2 2 rdate 07/05/2015
2 3 rdate 07/31/2015
2 3 Formula 08/21/2015
2 3 Formula 09/11/2015
2 4 Sdate 10/01/2016
2 5 Sdate 11/10/2015
2 6 Sdate 01/09/2016
;
run;
I am no SAS expert and i'm sure the real experts on this sight can make this better. I would love some suggestions but this is what i came up with. There is currently no logic to increment your email numbers up but i wasn't sure that was necessary
proc format ;
value formula
low - 15 =0
16-40 =1
41-70=2
71-100=3
101-140=4
141-175=5
176-225=6
226-300=7
301-high=8
;
run;
data have;
input ID Email $ type $ edate mmddyy10. ;
format edate mmddyy10.;
datalines;
1 1 rdate 05/20/2015
1 2 rdate 05/21/2015
1 3 Sdate 10/05/2015
1 4 Sdate 11/09/2015
1 5 Sdate 12/11/2015
2 1 rdate 06/21/2015
2 2 rdate 07/05/2015
2 3 rdate 07/09/2015
2 4 Sdate 06/21/2016
2 5 Sdate 07/05/2016
2 6 Sdate 07/09/2016
;
run;
/*Sort have dataset*/
proc sort data=have;
by id email type edate;
run;
data want;
/*merge have dataset to itself with on incremented forward on record*/
merge have
have(firstobs=2 rename=(id=_nxt_id email=_nxt_email
type=_nxt_type edate=_nxt_edate));
/*test if current and next record are same id and expected types*/
if id=_nxt_id and type = 'rdate' and _nxt_type = 'Sdate' then do;
output; /*ouput rdate observations*/
/*calculate the interval between the rdate and sdate records*/
/*then get formula count from the format*/
_interval = intck('days', edate, _nxt_edate);
_form_cnt = put(_interval, formula.);
type = 'formula';
/*loop through and create formula observations based on above calc's*/
do i=1 to _form_cnt;
edate = intnx('days',edate, _interval/(_form_cnt+1));
output; /*output formula observations*/
end;
end;
else output; /*output the Sdate observations*/
drop _: i; /*drop temp variables and loop variable*/
run;
You can recreate the email sequence with the following additional Data step:
data want(rename=seq=email);
set want(drop=email);
by ID;
retain seq;
if first.id then seq=1;
else seq=seq+1;
run;
CTorres
Is it only this instance, or is it every rdate/sdate combination?
I think this is what you are looking for but I"m not sure:
data want;
set have;
by type notsorted;
format l_edate mmddyy10.;
l_edate = lag(edate);
if first.type and type = 'Sdate' then do;
diff_date = edate - l_edate;
end;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 16. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.