Hi Team,
my database has millions of register with different IDs, i will show just an example with one ID.
data temp;
infile DATALINES dsd missover;
input DATE:DDMMYY10. value_payment ID;
FORMAT DATE DDMMYY10.;
CARDS;
16/11/2020,9000,1
02/12/2020,9000,1
03/12/2020,8000,1
03/12/2020,6000,1
03/12/2020,5000,1
03/12/2020,1000,1
07/12/2020,2000,1
29/12/2020,3000,1
29/12/2020,4000,1
29/12/2020,9999,1
29/12/2020,2000,1
26/01/2021,5000,1
26/01/2021,1111,1
26/01/2021,1000,1
01/02/2021,2000,1
01/02/2021,3000,1
01/02/2021,1000,1
02/02/2021,5000,1
02/02/2021,7000,1
02/02/2021,8000,1
02/02/2021,9000,1
08/02/2021,10000,1
08/02/2021,1000,1
08/02/2021,4444,1
01/06/2020,3000,1
;
run;
PROC SORT DATA=temp;
BY DATE ID;
RUN;
data test;
set temp;
FORMAT tempdate ddmmyy10.;
by ID DATE ;
tempdate=lag(DATE);
if first.ID then tempdate=DATE;
retain cflag;
if first.ID then cflag=0;
if DATE < INTNX('DAY',tempdate,5) then cflag=cflag+1;
run;
My difficulty is to count only if the ID is the same and the first date cannot be longer than 05 days compared to the second date, because i need to identify 4 or more payments within 5 or more days.
Result of my code:
the result that i need:
DATE value_payment ID tempdate cflag 01/06/2020 3000 1 01/06/2020 1 16/11/2020 9000 1 01/06/2020 1 02/12/2020 9000 1 16/11/2020 1 03/12/2020 8000 1 02/12/2020 2 03/12/2020 6000 1 03/12/2020 3 03/12/2020 5000 1 03/12/2020 4 03/12/2020 1000 1 03/12/2020 5 07/12/2020 2000 1 03/12/2020 6 29/12/2020 3000 1 07/12/2020 1 29/12/2020 4000 1 29/12/2020 2 29/12/2020 9999 1 29/12/2020 3 29/12/2020 2000 1 29/12/2020 4 26/01/2021 5000 1 29/12/2020 1 26/01/2021 1111 1 26/01/2021 2 26/01/2021 1000 1 26/01/2021 3 01/02/2021 2000 1 26/01/2021 1 01/02/2021 3000 1 01/02/2021 2 01/02/2021 1000 1 01/02/2021 3 02/02/2021 5000 1 01/02/2021 4 02/02/2021 7000 1 02/02/2021 5 02/02/2021 8000 1 02/02/2021 6 02/02/2021 9000 1 02/02/2021 7 08/02/2021 10000 1 02/02/2021 1 08/02/2021 1000 1 08/02/2021 2 08/02/2021 4444 1 08/02/2021 3
Tks!!!!!
Do I see corect that you need something like:
data test;
set temp;
by ID DATE ;
if first.ID then cflag=1;
else do;
if dif(date) < 5 then cflag+1;
else cflag=1;
end;
run;
?
Btw. I think sorting should be by ID and DATE, isn't it?
Bart
Do I see corect that you need something like:
data test;
set temp;
by ID DATE ;
if first.ID then cflag=1;
else do;
if dif(date) < 5 then cflag+1;
else cflag=1;
end;
run;
?
Btw. I think sorting should be by ID and DATE, isn't it?
Bart
Or even simpler I suppose:
data test;
set temp;
by ID DATE ;
if not first.ID and dif(date) < 5
then cflag+1;
else cflag=1;
run;
Bart
With this requirement:
My difficulty is to count only if the ID is the same and the first date cannot be longer than 05 days compared to the second date, because i need to identify 4 or more payments within 5 or more days.
what do you expect in your output when there do not exist "4 or more payments", such as may happen with a very new ID?
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.