Hi there,
I have a dataset that is structured the following way:
Customer_ID | Date_offended |
A | 19JUL2011:00:00:00.000 |
A | 29AUG2013:00:00:00.000 |
A | 29MAR2015:00:00:00.000 |
B | 05MAR2010:00:00:00.000 |
B | 14AUG2010:00:00:00.000 |
C | 06JUL2013:00:00:00.000 |
C | 02JUN2015:00:00:00.000 |
D | 21AUG2009:00:00:00.000 |
D | 02MAY2013:00:00:00.000 |
I want to be able to calculate the difference (in number of days or months) between the dates of offence for these customers. How do I go about it?
I appreciate your help in advance!
Krisztina
Assuming that your date_offended variable is a SAS datetime valued variable and your data is grouped by customer and sorted within customer by the date_offended:
data want;
set have;
by customer_id notsorted;
Intervaldays = intck('dtday',lag(Date_offended),Date_offended);
Intervalmonths= intck('dtmonth',lag(Date_offended),Date_offended);
if first.customer_id then do;
Intervaldays=.;
Intervalmonths=.;
end;
run;
Lag returns the value of the previous record. You will get a note in the log because the first record doesn't have a "previous" and the value is missing.
The Intck functions returns boundary interval counts so things like March 31 to April 1 still return 1 month. If you need more complex rules then details will need to be provided. The 'dtday' and 'dtmonth' say to look at the values of the varaibles in the function as datetime values and return the number of day or month boundaries.
Assuming that your date_offended variable is a SAS datetime valued variable and your data is grouped by customer and sorted within customer by the date_offended:
data want;
set have;
by customer_id notsorted;
Intervaldays = intck('dtday',lag(Date_offended),Date_offended);
Intervalmonths= intck('dtmonth',lag(Date_offended),Date_offended);
if first.customer_id then do;
Intervaldays=.;
Intervalmonths=.;
end;
run;
Lag returns the value of the previous record. You will get a note in the log because the first record doesn't have a "previous" and the value is missing.
The Intck functions returns boundary interval counts so things like March 31 to April 1 still return 1 month. If you need more complex rules then details will need to be provided. The 'dtday' and 'dtmonth' say to look at the values of the varaibles in the function as datetime values and return the number of day or month boundaries.
Thanks a lot! It worked. I was doing it in such an inefficient way. Creating separate columns for each differring date. This advice saved a lot of time and I will for sure use it another time. Thanks again!
For number of days:
data want;
set have;
by Customer_ID;
n_days = datepart(date_offended) - datepart(lag(date_offended));
if first.Customer_ID then n_days = .;
run;
Given that you have the number of days, you can convert that to months using whatever formula you would like. But if you have a different calculation in mind, SAS can probably do it. You just have to specify what your rules are.
There's a simpler formula for number of days that may work, but I'm unable to test it now:
n_days = dif(datepart(date_offended));
But however you do it, your DATE_OFFENDED variable is a datetime value, not a date value. So the DATEPART function will be needed.
You didn't post the output yet ?
data have;
infile cards expandtabs;
input Customer_ID $ Date_offended anydtdtm32.;
format Date_offended datetime.;
cards;
A 19JUL2011:00:00:00.000
A 29AUG2013:00:00:00.000
A 29MAR2015:00:00:00.000
B 05MAR2010:00:00:00.000
B 14AUG2010:00:00:00.000
C 06JUL2013:00:00:00.000
C 02JUN2015:00:00:00.000
D 21AUG2009:00:00:00.000
D 02MAY2013:00:00:00.000
;
run;
data want;
set have;
by Customer_ID ;
lag=lag(Date_offended);
dif_date=intck('dtday',lag,Date_offended,'c');
dif_month=intck('dtmonth',lag,Date_offended,'c');
dif_year=intck('dtyear',lag,Date_offended,'c');
if first.Customer_ID then call missing(of dif_:);
drop lag;
run;
Thank you! I think the first reply i got was the simplest and easiest way for what i will need my data for. But I appreciate your help. Thanks again.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.