Hi,
I’m trying to get the difference between dates by row.
Here is my data set:
data have;
input CUSTOMER_ID date anydtdte23.;
format date date9.;
cards;
1057086 21AUG2014
1057086 25AUG2014
1057086 17SEP2014
1057086 17SEP2014
1057086 19SEP2014
1057086 26SEP2014
;
And this is the program I have so far:
data want;
set have;
days_between = intck('days',date,lag(date)); run;
My results look like this :
CUSTOMER_ID | date | days_between |
1057086 | 21-Aug-14 | . |
1057086 | 25-Aug-14 | -4 |
1057086 | 17-Sep-14 | -23 |
1057086 | 19-Sep-14 | -2 |
1057086 | 26-Sep-14 | -7 |
What I need is the results to be positive values
CUSTOMER_ID | date | days_between |
1057086 | 21-Aug-14 | . |
1057086 | 25-Aug-14 | 4 |
1057086 | 17-Sep-14 | 23 |
1057086 | 19-Sep-14 | 2 |
1057086 | 26-Sep-14 | 7 |
And after that I need to transpose the values so they look like this:
CUSTOMER_ID | First_Purchase Date | Purchase_2 | Purchase_3 | Purchase_4 | Purchase_5 |
1057086 | 21-Aug-14 | 4 | 23 | 2 | 7 |
Notice I have kept the first purchase date.
Any help will be greatly appreciated. Thanks!
Hi Robert,
I tried to reconstruct your program. See if below will help you.
data have;
input CUSTOMER_ID date anydtdte23.;
format date date9.;
cards;
1057086 21AUG2014
1057086 25AUG2014
1057086 17SEP2014
1057086 17SEP2014
1057086 19SEP2014
1057086 26SEP2014
;
proc sort; by customer_id date; run;
data nhave;
set have;
by customer_id date;
if first.customer_id then output;
run;
data wantz;
if _N_ = 1 then set nhave(rename=(date = date1));
set have;
days_between = dif(date);
drop date;
rename date1 = date;
run;
proc transpose data = wantz out = nwant(drop = _name_ Purchase_1) prefix = Purchase_;
by customer_id date;
var days_between;
run;
Br,Amit
Change your INTCK argument order for the two dates and the results will go positive. Then have a look at using PROC TRANSPOSE in the SAS PROCEDURES documentation / examples provided.
Scott Barry
SBBWorks, Inc.
Thanks Sbb,
I'm good with the transposing its just keeping/ not transposing the First_Purchase Date that I can't figure out.
data set options can be your friend:
proc transpose data=have
out=want (drop=_name_ purchase_1 rename=(date=First_purchase_date) where=(Purchase_2 ne .))
prefix=Purchase_;
by customer_id;
copy date;
var days_between;
run;
I won't claim efficiency other than code isn't complex
Also, if you intend to have multiple Customer_ID values, the data should be sorted, and you will need BY CUSTOMER_ID; with some additional IF FIRST.CUSTOMER_ID THEN ....; code to conditionally use the LAG(date) variable result/value.
Also, the consideration for having PURCHASE_n variables with/using multiple output formats will need to be addressed with any TRANSPOSE-output file unless you do capture / maintain First_Purchase_Date separately -- possibly it could be that PURCHASE_1 would use DATE9 format and other PURCHASE_n variables having no output FORMAT defined.
Scott Barry
SBBWorks, Inc.
If you want it to be positive just use abs (absolute difference).
Hi Robert,
I tried to reconstruct your program. See if below will help you.
data have;
input CUSTOMER_ID date anydtdte23.;
format date date9.;
cards;
1057086 21AUG2014
1057086 25AUG2014
1057086 17SEP2014
1057086 17SEP2014
1057086 19SEP2014
1057086 26SEP2014
;
proc sort; by customer_id date; run;
data nhave;
set have;
by customer_id date;
if first.customer_id then output;
run;
data wantz;
if _N_ = 1 then set nhave(rename=(date = date1));
set have;
days_between = dif(date);
drop date;
rename date1 = date;
run;
proc transpose data = wantz out = nwant(drop = _name_ Purchase_1) prefix = Purchase_;
by customer_id date;
var days_between;
run;
Br,Amit
I am a little surprise . You didn't use DIF().
data have;
input CUSTOMER_ID date anydtdte23.;
format date date9.;
cards;
1057086 21AUG2014
1057086 25AUG2014
1057086 17SEP2014
1057086 17SEP2014
1057086 19SEP2014
1057086 26SEP2014
;
run;
data temp;
set have;
by CUSTOMER_ID;
length v id $ 20;
days_between=dif(date);
if first.CUSTOMER_ID then do;
n=1;
call missing(days_between);
v=put(date,date11.);
id='First_Purchase_date';
end;
else do;
n+1;
v=put(days_between,best8.);
id=cats('Purchase_',n);
end;
run;
proc transpose data=temp out=want(drop=_:) ;
by CUSTOMER_ID;
var v;
id id;
run;
Mistake. Updated the correct program. I picked the dataset from work.
Thanks, Amit
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.