Hi all, Have been doing live testing of the code and it works but my big concern is the very large filesize that occurs. I then wondered if putting all the dates on a timeline might be a better way of doing it and have written the following code, is there a way to use this approach to get accurate dates? I've included the test data at the top. If this could be used then filesizes are dramatically reduced. Basically I've got this far but cant work out how to get the dates from it. data phone ; infile datalines delimiter=','; input clientid $ sharerid $ phone $ startdt enddt ; informat startdt enddt date9. ; format startdt enddt date9. ; datalines; client1,sharer1,555-6532,21Nov2011,10Dec2012 client1,sharer1,444-5655,29Nov2010,14Feb2011 client1,sharer1,333-1234,20May1993,17Aug1993 client1,sharer1,333-1234,08Sep1993,08Sep1993 client2,sharer2,666-6548,10Jul2001,12Nov2001 client2,sharer2,666-6548,10Apr2002,06Aug2002 client2,sharer2,111-5658,02Nov1992,12Aug1993 client2,sharer2,222-6589,10Jan2000,31Jan2000 client2,sharer2,777-8755,31Jan2000,03Feb2000 client2,sharer2,777-8755,25Jun2009,14Sep2009 client2,sharer2,321-6544,18Dec2003,08Apr2004 client2,sharer2,778-6589,07Jun2001,10Jul2001 client2,sharer2,999-9988,31Dec1993,26Mar1994 client2,sharer2,999-9988,28Mar1994,28Mar1994 client2,sharer2,888-7845,12Aug1993,23Aug1993 client2,sharer2,789-9876,10Aug1994,05Sep1994 client2,sharer2,789-9876,22Jun1995,10Jul1995 client2,sharer2,951-6235,08Apr2004,10Aug2004 client2,sharer2,753-1245,25Jan2007,18Jul2007 client2,sharer2,656-8989,12Nov1998,26Feb1999 client2,sharer2,656-8989,10Dec1999,10Jan2000 client2,sharer2,141-1414,23Aug2000,26Mar2001 client2,sharer2,141-1414,07Jun2001,10Jul2001 client2,sharer2,363-3636,19Jun1998,12Nov1998 client2,sharer2,852-8525,18Jun2009,02Jun2010 client2,sharer2,852-8525,20Oct2010,16May2011 client2,sharer2,852-8525,31May2012,10Dec2012 client2,sharer2,565-5656,05Sep1994,01Nov1994 client2,sharer2,565-5656,14Nov1994,14Nov1994 client2,sharer2,221-2212,01jan2012,30jan2012 ; data paid_periods; infile datalines delimiter=','; input clientid $ sharerid $ paystart payend ; informat paystart payend date9. ; format paystart payend date9. ; datalines; client2,sharer2,31Aug1992,23Aug1993 client2,sharer2,25Dec1993,26Mar1994 client2,sharer2,10Aug1994,01Nov1994 client2,sharer2,15Mar1995,20Mar1995 client2,sharer2,19Jun1998,18Feb1999 client2,sharer2,10Dec1999,31Jan2000 client2,sharer2,20Jun2001,12Nov2001 client2,sharer2,10Apr2002,27Jul2002 client2,sharer2,18Dec2003,06Aug2004 client2,sharer2,11Dec2006,17Jul2007 client2,sharer2,18Jun2009,01Jun2010 client2,sharer2,20Oct2010,13May2011 client2,sharer2,14jan2012,21jan2012 ; proc sql; create table ph as select distinct clientid,sharerid,phone from phone ; create table pays as select distinct rs.*,ph.phone from paid_periods rs, ph where rs.clientid=ph.clientid and rs.sharerid=ph.sharerid ; quit; data phprd(keep=phone clientid sharerid phone_flag date) ; set phone ; phone_flag=1; date=startdt; output; phone_flag=0; date=enddt; output; run; proc sort data=phprd; by phone clientid sharerid date; run; data payprd(keep = phone clientid sharerid pay_flag date) ; set pays; pay_flag=1; date=paystart; output; pay_flag=0; date=payend; output; run; proc sort data=payprd; by phone clientid sharerid date; run; data timeline ; set payprd phprd; format date ddmmyy10. clientid sharerid 12.; by phone clientid sharerid date; run;
... View more