proc format ;
value formula
low - 15 =0
16-40 =1
41-70=2
71-100=3
101-140=4
141-175=5
176-225=6
226-300=7
301-high=8
;
run;
data have;
input ID Email $ type $ edate mmddyy10. ;
format edate mmddyy10.;
datalines;
1 1 rdate 05/20/2015
1 2 rdate 05/21/2015
1 3 Sdate 10/05/2015
1 4 Sdate 11/09/2015
1 5 Sdate 12/11/2015
2 1 rdate 06/21/2015
2 2 rdate 07/05/2015
2 3 rdate 07/09/2015
2 4 Sdate 06/21/2016
2 5 Sdate 07/05/2016
2 6 Sdate 07/09/2016
;
run;
Im trying to subtract the last rdate ( which is email 2 in this case) from first sdate ( email 3 in this case) for example if the diff between rdate and sdate = 130 . ( 4 from the proc format ) this means a 4 rows will be inserted to the table . and will be evenly distributed 130/4+1.
below is the final desire output .
any ideas how to approach that ?
;
Hi
Below is an example of a data I have and what im trying to achieve.
data have;
input ID Email $ type $ edate mmddyy10. ;
format edate mmddyy10.;
datalines;
1 1 rdate 05/20/2015
1 2 rdate 05/21/2015
1 3 Formula 06/26/2015
1 4 Formula 08/02/2015
1 5 Formula 09/07/2015
1 6 Formula 10/13/2015
1 7 Sdate 11/18/2015
1 8 Sdate 11/30/2015
1 9 Sdate 12/11/2015
2 1 rdate 06/21/2015
2 2 rdate 07/05/2015
2 3 rdate 07/31/2015
2 3 Formula 08/21/2015
2 3 Formula 09/11/2015
2 4 Sdate 10/01/2016
2 5 Sdate 11/10/2015
2 6 Sdate 01/09/2016
;
run;
I am no SAS expert and i'm sure the real experts on this sight can make this better. I would love some suggestions but this is what i came up with. There is currently no logic to increment your email numbers up but i wasn't sure that was necessary
proc format ;
value formula
low - 15 =0
16-40 =1
41-70=2
71-100=3
101-140=4
141-175=5
176-225=6
226-300=7
301-high=8
;
run;
data have;
input ID Email $ type $ edate mmddyy10. ;
format edate mmddyy10.;
datalines;
1 1 rdate 05/20/2015
1 2 rdate 05/21/2015
1 3 Sdate 10/05/2015
1 4 Sdate 11/09/2015
1 5 Sdate 12/11/2015
2 1 rdate 06/21/2015
2 2 rdate 07/05/2015
2 3 rdate 07/09/2015
2 4 Sdate 06/21/2016
2 5 Sdate 07/05/2016
2 6 Sdate 07/09/2016
;
run;
/*Sort have dataset*/
proc sort data=have;
by id email type edate;
run;
data want;
/*merge have dataset to itself with on incremented forward on record*/
merge have
have(firstobs=2 rename=(id=_nxt_id email=_nxt_email
type=_nxt_type edate=_nxt_edate));
/*test if current and next record are same id and expected types*/
if id=_nxt_id and type = 'rdate' and _nxt_type = 'Sdate' then do;
output; /*ouput rdate observations*/
/*calculate the interval between the rdate and sdate records*/
/*then get formula count from the format*/
_interval = intck('days', edate, _nxt_edate);
_form_cnt = put(_interval, formula.);
type = 'formula';
/*loop through and create formula observations based on above calc's*/
do i=1 to _form_cnt;
edate = intnx('days',edate, _interval/(_form_cnt+1));
output; /*output formula observations*/
end;
end;
else output; /*output the Sdate observations*/
drop _: i; /*drop temp variables and loop variable*/
run;
You can recreate the email sequence with the following additional Data step:
data want(rename=seq=email);
set want(drop=email);
by ID;
retain seq;
if first.id then seq=1;
else seq=seq+1;
run;
CTorres
Is it only this instance, or is it every rdate/sdate combination?
I think this is what you are looking for but I"m not sure:
data want;
set have;
by type notsorted;
format l_edate mmddyy10.;
l_edate = lag(edate);
if first.type and type = 'Sdate' then do;
diff_date = edate - l_edate;
end;
run;
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.