Hi,
I’m trying to get the difference between dates by row.
Here is my data set:
data have;
input CUSTOMER_ID date anydtdte23.;
format date date9.;
cards;
1057086 21AUG2014
1057086 25AUG2014
1057086 17SEP2014
1057086 17SEP2014
1057086 19SEP2014
1057086 26SEP2014
;
And this is the program I have so far:
data want;
set have;
days_between = intck('days',date,lag(date)); run;
My results look like this :
CUSTOMER_ID | date | days_between |
1057086 | 21-Aug-14 | . |
1057086 | 25-Aug-14 | -4 |
1057086 | 17-Sep-14 | -23 |
1057086 | 19-Sep-14 | -2 |
1057086 | 26-Sep-14 | -7 |
What I need is the results to be positive values
CUSTOMER_ID | date | days_between |
1057086 | 21-Aug-14 | . |
1057086 | 25-Aug-14 | 4 |
1057086 | 17-Sep-14 | 23 |
1057086 | 19-Sep-14 | 2 |
1057086 | 26-Sep-14 | 7 |
And after that I need to transpose the values so they look like this:
CUSTOMER_ID | First_Purchase Date | Purchase_2 | Purchase_3 | Purchase_4 | Purchase_5 |
1057086 | 21-Aug-14 | 4 | 23 | 2 | 7 |
Notice I have kept the first purchase date.
Any help will be greatly appreciated. Thanks!
Hi Robert,
I tried to reconstruct your program. See if below will help you.
data have;
input CUSTOMER_ID date anydtdte23.;
format date date9.;
cards;
1057086 21AUG2014
1057086 25AUG2014
1057086 17SEP2014
1057086 17SEP2014
1057086 19SEP2014
1057086 26SEP2014
;
proc sort; by customer_id date; run;
data nhave;
set have;
by customer_id date;
if first.customer_id then output;
run;
data wantz;
if _N_ = 1 then set nhave(rename=(date = date1));
set have;
days_between = dif(date);
drop date;
rename date1 = date;
run;
proc transpose data = wantz out = nwant(drop = _name_ Purchase_1) prefix = Purchase_;
by customer_id date;
var days_between;
run;
Br,Amit
Change your INTCK argument order for the two dates and the results will go positive. Then have a look at using PROC TRANSPOSE in the SAS PROCEDURES documentation / examples provided.
Scott Barry
SBBWorks, Inc.
Thanks Sbb,
I'm good with the transposing its just keeping/ not transposing the First_Purchase Date that I can't figure out.
data set options can be your friend:
proc transpose data=have
out=want (drop=_name_ purchase_1 rename=(date=First_purchase_date) where=(Purchase_2 ne .))
prefix=Purchase_;
by customer_id;
copy date;
var days_between;
run;
I won't claim efficiency other than code isn't complex
Also, if you intend to have multiple Customer_ID values, the data should be sorted, and you will need BY CUSTOMER_ID; with some additional IF FIRST.CUSTOMER_ID THEN ....; code to conditionally use the LAG(date) variable result/value.
Also, the consideration for having PURCHASE_n variables with/using multiple output formats will need to be addressed with any TRANSPOSE-output file unless you do capture / maintain First_Purchase_Date separately -- possibly it could be that PURCHASE_1 would use DATE9 format and other PURCHASE_n variables having no output FORMAT defined.
Scott Barry
SBBWorks, Inc.
If you want it to be positive just use abs (absolute difference).
Hi Robert,
I tried to reconstruct your program. See if below will help you.
data have;
input CUSTOMER_ID date anydtdte23.;
format date date9.;
cards;
1057086 21AUG2014
1057086 25AUG2014
1057086 17SEP2014
1057086 17SEP2014
1057086 19SEP2014
1057086 26SEP2014
;
proc sort; by customer_id date; run;
data nhave;
set have;
by customer_id date;
if first.customer_id then output;
run;
data wantz;
if _N_ = 1 then set nhave(rename=(date = date1));
set have;
days_between = dif(date);
drop date;
rename date1 = date;
run;
proc transpose data = wantz out = nwant(drop = _name_ Purchase_1) prefix = Purchase_;
by customer_id date;
var days_between;
run;
Br,Amit
I am a little surprise . You didn't use DIF().
data have;
input CUSTOMER_ID date anydtdte23.;
format date date9.;
cards;
1057086 21AUG2014
1057086 25AUG2014
1057086 17SEP2014
1057086 17SEP2014
1057086 19SEP2014
1057086 26SEP2014
;
run;
data temp;
set have;
by CUSTOMER_ID;
length v id $ 20;
days_between=dif(date);
if first.CUSTOMER_ID then do;
n=1;
call missing(days_between);
v=put(date,date11.);
id='First_Purchase_date';
end;
else do;
n+1;
v=put(days_between,best8.);
id=cats('Purchase_',n);
end;
run;
proc transpose data=temp out=want(drop=_:) ;
by CUSTOMER_ID;
var v;
id id;
run;
Mistake. Updated the correct program. I picked the dataset from work.
Thanks, Amit
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.