Hi...I am trying to calculate the number of days, weeks, months and years between successive claims for each ID. I am getting some unusual results. This is the code I am using:
proc sort data=Summary;
by ID Claim_Date;
run;
data pc1;
set Summary;
by ID;
if first.ID then do;
Number_Days = .;
Number_Weeks = .;
Number_Months = .;
Number_Years = .;
end;
else if ID = lag(ID) then do;
Number_Days = (intck('Day', lag(input(put(Claim_Date,8.),yymmdd8.)), input(put(Claim_Date,8.),yymmdd8.),'c'));
Number_Weeks = (intck('Week', lag(input(put(Claim_Date,8.),yymmdd8.)), input(put(Claim_Date,8.),yymmdd8.),'c'));
Number_Months = (intck('Month', lag(input(put(Claim_Date,8.),yymmdd8.)), input(put(Claim_Date,8.),yymmdd8.),'c'));
Number_Years = (intck('Year', lag(input(put(Claim_Date,8.),yymmdd8.)), input(put(Claim_Date,8.),yymmdd8.),'c'));
end;
run;
This is what I get:
Claim_Date | ID | Number_Days | Number_Weeks | Number_Months | Number_Years |
20141003 | 100001945 | . | . | . | . |
20090902 | 100002160 | . | . | . | . |
20090902 | 100002160 | . | . | . | . |
20090902 | 100002160 | . | . | . | . |
20090903 | 100002160 | 1 | 0 | 0 | 0 |
20111124 | 100002727 | . | . | . | . |
20111124 | 100002727 | . | . | . | . |
20111124 | 100002727 | 812 | 116 | 26 | 2 |
This is what I want:
Claim_Date | ID | Number_Days | Number_Weeks | Number_Months | Number_Years |
20141003 | 100001945 | . | . | . | . |
20090902 | 100002160 | . | . | . | . |
20090902 | 100002160 | 0 | 0 | 0 | 0 |
20090902 | 100002160 | 0 | 0 | 0 | 0 |
20090903 | 100002160 | 1 | 0 | 0 | 0 |
20111124 | 100002727 | . | . | . | . |
20111124 | 100002727 | 0 | 0 | 0 | 0 |
20111124 | 100002727 | 0 | 0 | 0 | 0 |
Thanks.
First, The point of BY groups is to not have to check for the same ID - that's what it does.
Replace
if id=lag(Id)
with only an Else.
Second - Lag doesn't work the way you think it does, and using it in an IF condition creates issues.
Calculate your lag value every iteration. So after BY
lag_claim= lag(claim_date); add any conversions necessary here as well.
And in in your if first.id set lag_claim to .
Replace lag in functions with lag_claim.
There red are papers galore on how the LAG functions work so I would recommend researching and reading one. Or remember - don't use it conditionally.
First, The point of BY groups is to not have to check for the same ID - that's what it does.
Replace
if id=lag(Id)
with only an Else.
Second - Lag doesn't work the way you think it does, and using it in an IF condition creates issues.
Calculate your lag value every iteration. So after BY
lag_claim= lag(claim_date); add any conversions necessary here as well.
And in in your if first.id set lag_claim to .
Replace lag in functions with lag_claim.
There red are papers galore on how the LAG functions work so I would recommend researching and reading one. Or remember - don't use it conditionally.
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.