Hi,
I have a dataset at the person/month/year level--ie, each row is one person for one mth/year combination. People occur multiple times in the dataset (but a new row would be a different mth/year). There are two date columns. date_col_1 is at the person level--ie, for each of the person's rows, date_col_1 will always be the same. date_col_2 reflects the mth/year that the event took place (format is yyyymm). Total is a sum variable reflecting the # of times an event happened within each mth/year for each person.
Example data:
ID date_col_1 date_col_2 total
1 1/3/2012 201203 5
1 1/3/2012 201204 7
1 1/3/2012 201205 3
2 10/5/2013 201201 1
2 10/5/2013 201202 12
3 3/4/2012 201301 11
3 3/4/2012 201305 4
3 3/4/2012 201401 5
3 3/4/2012 201403 10
What I want is a new variable that gives date_col_2 relative to date_col_1--that is, the number of months that date_col_2 occurs before or after date_col_1. For example, in the first row it would be +2 months because date_col_2 is two months after date_col_1; in the first row for ID 2 it would be -21 because date_col_2 is 21 months prior to date_col_1.
Any help is much appreciated.
Are your dates SAS date value variables or character? That is the first step then use the function INTCK.
months = intck('month', date_col_1, date_col_2);
How about something like this:
data have;
informat date_col_1 mmddyy10. date_col_2 yymmn6.;
format date_col_1 mmddyy10. date_col_2 yymmn6.;
input ID date_col_1 date_col_2 total;
diff_mths = (date_col_1 - date_col_2)/30.25;
datalines;
1 1/3/2012 201203 5
1 1/3/2012 201204 7
1 1/3/2012 201205 3
2 10/5/2013 201201 1
2 10/5/2013 201202 12
3 3/4/2012 201301 11
3 3/4/2012 201305 4
3 3/4/2012 201401 5
3 3/4/2012 201403 10
;
run;
proc print;run;
Anca.
I would look into the INTCK() function as you can subtract years, days, months, hours, etc.. from date variables. Below is what you're looking for:
Months=intck('MONTH',date_col_1,date_col_2);
Hope this helps!
But I see Ballardw beat me to it, so look into that function if both fields are date formats in your data set. ![]()
Also with INTCK you need to consider if you are counting month boundaries or month intervals.
The default method is month boundaries so intck('month', '30jun2015'd, '01jul2015') will count as 1 month as a boundary has been crossed.
If you want to count month intervals based on the start and end date add the METHOD parameter C for CONTINUOUS: Months=intck('MONTH',date_col_1,date_col_2, 'C');
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.