Hello everyone,
So, I have these two datasets as below-
data test1;
input ssn loanNO effDate ddmmyy10. program $ ;
cards;
1234 1 01/01/2013 no
1234 2 01/05/2013 no
1234 3 01/01/2014 yes
5678 1 01/01/2012 no
9101 1 01/05/2014 yes
9101 2 01/09/2013 no
;
run;
data test2;
input ssn fico date ddmmyy10. ;
cards;
1234 600 01/02/2012
1234 580 01/05/2012
1234 700 01/08/2012
1234 680 01/11/2012
1234 500 01/02/2013
1234 580 01/05/2013
1234 700 01/08/2013
1234 680 01/11/2013
1234 600 01/02/2014
1234 580 01/05/2014
1234 600 01/08/2014
1234 690 01/11/2014
;
run;
I need to join only those records from first datset test1 where program is yes and then get corresponding fico scores for that person for that year, previous year and following year. So, for example, for ssn 1234, yes was for 1 jan 2014 and the fico score during that period was 680 (fico scores are for 4 periods for each year, feb, may, aug and nov and in case we are looking for month other than these 4 months, we use the previous one-for ex, in this case nov 2013 was the last one, so we use that). Similarly fico score for previous year during that period was also 680 for nov 2012 and for following year we use nov 2014 which was 690.
Thanks a lot in advance.
Hopefully, one of us will have correctly guessed the full requirements. I think the following is what you want. Note: I changed the fico scores so that we could identify where they came from. The 3 digit numbers I used represent the year for the first digit, and the next two represent the month. Those values aren't necessary for the code to run, but make it a lot easier to see which fico scores were used:
data test1; input ssn loanNO effDate ddmmyy10. program $ ; format effDate date9.; cards; 1234 1 01/01/2013 no 1234 2 01/05/2013 no 1234 3 01/01/2014 yes 5678 1 01/01/2012 no 9101 1 01/05/2014 yes 9101 2 01/09/2013 no ; run; data test2; input ssn fico date ddmmyy10. ; format date date9.; cards; 1234 202 01/02/2012 1234 205 01/05/2012 1234 208 01/08/2012 1234 211 01/11/2012 1234 302 01/02/2013 1234 305 01/05/2013 1234 308 01/08/2013 1234 311 01/11/2013 1234 402 01/02/2014 1234 405 01/05/2014 1234 408 01/08/2014 1234 411 01/11/2014 9101 202 01/02/2012 9101 205 01/05/2012 9101 208 01/08/2012 9101 211 01/11/2012 9101 302 01/02/2013 9101 305 01/05/2013 9101 308 01/08/2013 9101 311 01/11/2013 9101 402 01/02/2014 9101 405 01/05/2014 9101 408 01/08/2014 9101 411 01/11/2014 9101 502 01/02/2015 9101 505 01/05/2015 9101 508 01/08/2015 9101 511 01/11/2015 ; proc sql noprint; create table want as select a.*, b.fico,b.date, c.fico as fico2, d.fico as fico3 from test1 (where=(program eq 'yes')) a left join test2 b on a.ssn eq b.ssn and (a.effdate eq b.date or intnx('month',a.effdate,-1,'b') eq b.date or intnx('month',a.effdate,-2,'b') eq b.date) left join test2 c on a.ssn eq c.ssn and intnx('year',b.date,-1,'s') eq c.date left join test2 d on a.ssn eq d.ssn and intnx('year',b.date,1,'s') eq d.date ; quit;
Art, CEO, AnalystFinder.com
In case I understood your business logic correctly below should do:
proc sql feedback;
create table want as
select t1.*, t2.date, t2.fico
from
test1 as t1
left join
test2 as t2
on
t1.ssn=t2.ssn and
t2.date between intnx('year',t1.effDate,-1,'b') and intnx('year',t1.effDate,+1,'e')
where t1.program='yes'
order by t1.ssn, t2.date
;
quit;
Result:
If the code doesn't return what you want then please post a "desired result" table showing us the output you want based on your sample data.
sorry guys, should have been more clearer. Thanks for your codes, i guess my below updated data along with want dataset should make it more clear. Please use these numbers now, the data is not necessarily sorted with dates etc.
data test1;
input ssn loanNO effDate ddmmyy10. program $ ;
cards;
1234 1 01/01/2013 no
1234 2 01/05/2013 no
1234 3 01/01/2014 yes
5678 1 01/01/2012 no
9101 1 01/05/2014 yes
9101 2 01/09/2013 no
;
run;
data test2;
input ssn fico date ddmmyy10. ;
cards;
1234 600 01/02/2012
1234 580 01/05/2012
1234 700 01/08/2012
1234 680 01/11/2012
1234 500 01/02/2013
1234 580 01/05/2013
1234 700 01/08/2013
1234 680 01/11/2013
1234 600 01/02/2014
1234 580 01/05/2014
1234 600 01/08/2014
1234 680 01/11/2014
5678 600 01/02/2012
5678 580 01/05/2012
5678 700 01/08/2012
5678 680 01/11/2012
5678 500 01/02/2013
5678 580 01/05/2013
5678 700 01/08/2013
5678 680 01/11/2013
5678 600 01/02/2014
5678 580 01/05/2014
5678 600 01/08/2014
5678 680 01/11/2014
9101 600 01/02/2015
9101 690 01/05/2015
9101 670 01/08/2015
9101 680 01/11/2015
9101 550 01/02/2013
9101 580 01/05/2013
9101 700 01/08/2013
9101 680 01/11/2013
9101 600 01/02/2014
9101 680 01/05/2014
9101 700 01/08/2014
9101 680 01/11/2014
;
run;
data want;
input ssn loanno ficopresent ficoprevYr ficoNextYr;
cards;
1234 3 680 680 680
9101 1 680 580 690
;
run;
Try next, not tested code:
data test2a;
set test2;
by ssn;
date_from = lag(date);
if first.ssn then call missing(date_upto );
date_upto = date - 1;
if not missing(date_from) then output;
drop date;
run;
proc sql;
create table want as select t1.*, t2.fico
from test1 as t1
left join test2a as t2
on t1.ssn = t2.ssn and
t1.effdate between t2.date_from and t2.date_upto;
quit
Hopefully, one of us will have correctly guessed the full requirements. I think the following is what you want. Note: I changed the fico scores so that we could identify where they came from. The 3 digit numbers I used represent the year for the first digit, and the next two represent the month. Those values aren't necessary for the code to run, but make it a lot easier to see which fico scores were used:
data test1; input ssn loanNO effDate ddmmyy10. program $ ; format effDate date9.; cards; 1234 1 01/01/2013 no 1234 2 01/05/2013 no 1234 3 01/01/2014 yes 5678 1 01/01/2012 no 9101 1 01/05/2014 yes 9101 2 01/09/2013 no ; run; data test2; input ssn fico date ddmmyy10. ; format date date9.; cards; 1234 202 01/02/2012 1234 205 01/05/2012 1234 208 01/08/2012 1234 211 01/11/2012 1234 302 01/02/2013 1234 305 01/05/2013 1234 308 01/08/2013 1234 311 01/11/2013 1234 402 01/02/2014 1234 405 01/05/2014 1234 408 01/08/2014 1234 411 01/11/2014 9101 202 01/02/2012 9101 205 01/05/2012 9101 208 01/08/2012 9101 211 01/11/2012 9101 302 01/02/2013 9101 305 01/05/2013 9101 308 01/08/2013 9101 311 01/11/2013 9101 402 01/02/2014 9101 405 01/05/2014 9101 408 01/08/2014 9101 411 01/11/2014 9101 502 01/02/2015 9101 505 01/05/2015 9101 508 01/08/2015 9101 511 01/11/2015 ; proc sql noprint; create table want as select a.*, b.fico,b.date, c.fico as fico2, d.fico as fico3 from test1 (where=(program eq 'yes')) a left join test2 b on a.ssn eq b.ssn and (a.effdate eq b.date or intnx('month',a.effdate,-1,'b') eq b.date or intnx('month',a.effdate,-2,'b') eq b.date) left join test2 c on a.ssn eq c.ssn and intnx('year',b.date,-1,'s') eq c.date left join test2 d on a.ssn eq d.ssn and intnx('year',b.date,1,'s') eq d.date ; quit;
Art, CEO, AnalystFinder.com
Yes, sir, it seems to work. I actually posted my updated data again. Thanks so much1
Assuming I understand what you mean.
data test1;
input ssn loanNO effDate ddmmyy10. program $ ;
if program='yes';
cards;
1234 1 01/01/2013 no
1234 2 01/05/2013 no
1234 3 01/01/2014 yes
5678 1 01/01/2012 no
9101 1 01/05/2014 yes
9101 2 01/09/2013 no
;
run;
data test2;
input ssn fico date ddmmyy10. ;
if month(date) in (2 5 8 11);
cards;
1234 600 01/02/2012
1234 580 01/05/2012
1234 700 01/08/2012
1234 680 01/11/2012
1234 500 01/02/2013
1234 580 01/05/2013
1234 700 01/08/2013
1234 680 01/11/2013
1234 600 01/02/2014
1234 580 01/05/2014
1234 600 01/08/2014
1234 690 01/11/2014
;
run;
data temp;
set test1(in=ina rename=(effDate=date)) test2 ;
by ssn date;
retain prev prev_date;
if first.ssn then call missing(prev,prev_date);
if not missing(fico) then do;prev=fico;prev_date=date;end;
next_date=intnx('year',prev_date,1,'s');
if ina;
drop fico;
format date prev_date next_date ddmmyy10.;
run;
data want;
merge temp(in=ina) test2(rename=(date=next_date fico=next));
by ssn next_date;
if ina;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.