I want to get a count for ID that is 7 apart from the initial date. | |||||
ID | Date | Payment | |||
A123 | Jan-12 | 250 | |||
A123 | Mar-12 | 25 | |||
A123 | Oct-12 | 350 | |||
A123 | Dec-12 | 300 | |||
A123 | Jan-13 | 450 | |||
A123 | Dec-13 | 650 | |||
A123 | Feb-14 | 300 | |||
A123 | Mar-14 | 300 | |||
A123 | Apr-14 | 300 | |||
A123 | May-14 | 300 | |||
A123 | Jun-14 | 300 | |||
A123 | Jul-14 | 300 | |||
A123 | Aug-14 | 300 | |||
A123 | Sep-14 | 300 | |||
A124 | Mar-12 | 300 | |||
A124 | Jun-12 | 25 | |||
A124 | Sep-12 | 30 | |||
A124 | Dec-12 | 63 | |||
A124 | Mar-13 | 15 | |||
A124 | Jun-13 | 30 | |||
A124 | Sep-13 | 21 | |||
Here is what I want | |||||
ID | Initial Date | Payment | New Date | Month Cnt | Cnt |
A123 | Jan-12 | 250 | Jan-12 | 1 | |
A123 | Mar-12 | 25 | Jan-12 | - | 0 |
A123 | Oct-12 | 350 | Oct-12 | 9 | 1 |
A123 | Dec-12 | 300 | Oct-12 | - | 0 |
A123 | May-13 | 450 | May-13 | 7 | 1 |
A123 | Dec-13 | 650 | Dec-13 | 7 | 1 |
A123 | Feb-14 | 300 | Dec-13 | - | 0 |
A123 | Mar-14 | 300 | Dec-13 | - | 0 |
A123 | Apr-14 | 300 | Dec-13 | - | 0 |
A123 | May-14 | 300 | Dec-13 | - | 0 |
A123 | Jun-14 | 300 | Dec-13 | - | 0 |
A123 | Jul-14 | 300 | Jul-14 | 7 | 1 |
A123 | Aug-14 | 300 | Jul-14 | - | 0 |
A123 | Sep-14 | 300 | Jul-14 | - | 0 |
A124 | Mar-12 | 300 | Mar-12 | 1 | |
A124 | Jun-12 | 25 | Mar-12 | - | 0 |
A124 | Sep-12 | 30 | Mar-12 | - | 0 |
A124 | Dec-12 | 63 | Dec-12 | 9 | 1 |
A124 | Mar-13 | 15 | Dec-12 | - | 0 |
A124 | Jul-13 | 30 | Jul-13 | 7 | 1 |
A124 | Sep-13 | 21 | Jul-13 | - | 0 |
Summary | Total Payment | Total Count | |||
A123 | 4,425 | 5 | |||
A124 | 484 | 3 |
Hi,
Can I sugesst that you post some test data, in the form of a datastep, so that we have something to work with. I do not have time to type all that in. Secondly your logic does not seem to be clear:
ID | Initial Date | Payment | New Date | Month Cnt | Cnt |
A123 | Jan-12 | 250 | Jan-12 | 1 | |
A123 | Mar-12 | 25 | Jan-12 | - | 0 |
A123 | Oct-12 | 350 |
Oct-12 |
What is new date, how is it derived? What is "start date". Why is jan12 counted as 1, but mar12 is not and then oct12 is, there appears to be no pattern. Generally speaking your logic would break down into three steps:
1) Ascertain your base date
2) Merge that onto your data, creating a field for months different
3) Counting where months <= 7
RW9 - THANKS YOUR INPUT.
BELOW IS THE SAMPLE DATA AND CODE I TRY OUT.
data sample;
input id$4. DATE PAYMENT ;
attrib ID format =$4. informat=$4.;
attrib DATE format =date9. informat=date9.;
attrib PAYMENT format =9.0 informat=9.0;
datalines;
A123 01JAN2012 250
A123 01MAR2012 25
A123 01OCT2012 350
A123 01DEC2012 300
A123 01JAN2013 450
A123 01DEC2013 650
A123 01DEC2013 30
A123 01DEC2013 50
A123 01FEB2014 300
A123 01MAR2015 650
A124 01DEC2012 250
A124 01MAR2013 25
A124 01OCT2013 350
A124 01DEC2013 300
A124 01JAN2014 450
A124 01DEC2014 650
A124 01DEC2014 30
A124 01DEC2014 50
A124 01FEB2015 300
A124 01MAR2015 650
A125 01DEC2012 250
A125 01MAR2013 25
A125 01OCT2013 350
A125 01DEC2013 300
A125 01JAN2014 450
A125 01DEC2014 650
A125 01DEC2014 30
A125 01DEC2014 50
A125 01FEB2015 300
A125 01MAR2015 650
;
run;
proc sort data=sample; by ID Date; run;
Data TEST;
set sample;
by ID;
*retain Date;
if first.ID then Do;
NewDate=date;
Cnt=1;
end; return;
format NewDate date9.;
run;
I AM INTERESTING TO GET A COUNT FOR EVERY ID THAT ARE 7 MONHTS PART FROM THE PRIOR DATE IF THEY ARE NOT 7 MONTH APART. THE DEFAULT COUNT IS 1 AND NEWDATE IS INITIAL DATE EVERY FIRST ID AND DATE.
EX. THE FIRST ID A123 AND DATE 01JAN2012, THE COUNT =1 AND NEWDATE = 01JAN2012. AFTER THAT I WANT TO FILL THE NEWDATE AND COUNT IF THEY ARE 7 MONTH APART. IF THE DATE IS NOT 7 MONTHS PART, THEN KEEP THE NEWDATE THE SAME. EX: (01MAR2012-01JAN2012)/365.25*12 NOT GREATER OR EQUAL TO 7 THEN NEWDATE=01JAN2012, COUNT=0. THEN MOVE TO THE NEXT ONE (01OCT2012-01JAN2012)/365.25*12 > 7, NEWDATE=01OCT12, COUNT=1 AND SO ON. NEXT (01DEC2012-01OCT2012)/365.25*12<7, NEWDATE=01OCT12, COUNT=0 AND SO ON. IT IS SO COMPLECATED. SO THE END RESULT I WANT IS LOOK LIKE THIS
id | DATE | PAYMENT | NewDate | Cnt |
A123 | 1/1/2012 | 250 | 1/1/2012 | 1 |
A123 | 3/1/2012 | 25 | 1/1/2012 | 0 |
A123 | 10/1/2012 | 350 | 10/1/2012 | 1 |
A123 | 12/1/2012 | 300 | 10/1/2012 | 0 |
A123 | 1/1/2013 | 450 | 10/1/2012 | 0 |
A123 | 12/1/2013 | 650 | 12/1/2013 | 1 |
A123 | 12/1/2013 | 30 | 12/1/2013 | 0 |
A123 | 12/1/2013 | 50 | 12/1/2013 | 0 |
A123 | 2/1/2014 | 300 | 12/1/2013 | 0 |
A123 | 3/1/2015 | 650 | 3/1/2015 | 1 |
A124 | 12/1/2012 | 250 | 12/1/2012 | 1 |
A124 | 3/1/2013 | 25 | 12/1/2012 | |
A124 | 10/1/2013 | 350 | 10/1/2013 | 1 |
A124 | 12/1/2013 | 300 | 10/1/2013 | 0 |
A124 | 1/1/2014 | 450 | 10/1/2013 | 0 |
A124 | 12/1/2014 | 650 | 12/1/2014 | 1 |
A124 | 12/1/2014 | 30 | 12/1/2014 | 0 |
A124 | 12/1/2014 | 50 | 12/1/2014 | 0 |
A124 | 2/1/2015 | 300 | 12/1/2014 | 0 |
A124 | 3/1/2015 | 650 | 12/1/2014 | 0 |
A125 | 12/1/2012 | 250 | 12/1/2012 | 1 |
A125 | 3/1/2013 | 25 | 12/1/2012 | 0 |
A125 | 10/1/2013 | 350 | 12/1/2012 | 1 |
A125 | 12/1/2013 | 300 | 12/1/2013 | 1 |
A125 | 1/1/2014 | 450 | 12/1/2013 | 0 |
A125 | 12/1/2014 | 650 | 12/1/2014 | 1 |
A125 | 12/1/2014 | 30 | 12/1/2014 | 0 |
A125 | 12/1/2014 | 50 | 12/1/2014 | 0 |
A125 | 2/1/2015 | 300 | 12/1/2014 | 0 |
A125 | 3/1/2015 | 650 | 12/1/2014 | 0 |
THANK YOU.
Hi,
Sorry, forgot to get back to this. Something like this should work:
data sample; input id$4. date payment ; attrib id format =$4. informat=$4.; attrib date format =date9. informat=date9.; attrib payment format =9.0 informat=9.0; datalines; A123 01JAN2012 250 A123 01MAR2012 25 A123 01OCT2012 350 A123 01DEC2012 300 A123 01JAN2013 450 A123 01DEC2013 650 A123 01DEC2013 30 A123 01DEC2013 50 A123 01FEB2014 300 A123 01MAR2015 650 A124 01DEC2012 250 A124 01MAR2013 25 A124 01OCT2013 350 A124 01DEC2013 300 A124 01JAN2014 450 A124 01DEC2014 650 A124 01DEC2014 30 A124 01DEC2014 50 A124 01FEB2015 300 A124 01MAR2015 650 A125 01DEC2012 250 A125 01MAR2013 25 A125 01OCT2013 350 A125 01DEC2013 300 A125 01JAN2014 450 A125 01DEC2014 650 A125 01DEC2014 30 A125 01DEC2014 50 A125 01FEB2015 300 A125 01MAR2015 650 ; run; proc sort data=sample; by id date; run; data want; set sample; retain newdate count; if _n_=1 then do; newdate=date; count=1; end; else if intck('month',newdate,date) > 7 then do; count=1; newdate=date; end; else count=0; format newdate date9.; run;
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.