Hi Team,
my database has millions of register with different IDs, i will show just an example with one ID.
data temp;
infile DATALINES dsd missover;
input DATE:DDMMYY10. value_payment ID;
FORMAT DATE DDMMYY10.;
CARDS;
16/11/2020,9000,1
02/12/2020,9000,1
03/12/2020,8000,1
03/12/2020,6000,1
03/12/2020,5000,1
03/12/2020,1000,1
07/12/2020,2000,1
29/12/2020,3000,1
29/12/2020,4000,1
29/12/2020,9999,1
29/12/2020,2000,1
26/01/2021,5000,1
26/01/2021,1111,1
26/01/2021,1000,1
01/02/2021,2000,1
01/02/2021,3000,1
01/02/2021,1000,1
02/02/2021,5000,1
02/02/2021,7000,1
02/02/2021,8000,1
02/02/2021,9000,1
08/02/2021,10000,1
08/02/2021,1000,1
08/02/2021,4444,1
01/06/2020,3000,1
;
run;
PROC SORT DATA=temp;
BY DATE ID;
RUN;
data test;
set temp;
FORMAT tempdate ddmmyy10.;
by ID DATE ;
tempdate=lag(DATE);
if first.ID then tempdate=DATE;
retain cflag;
if first.ID then cflag=0;
if DATE < INTNX('DAY',tempdate,5) then cflag=cflag+1;
run;
My difficulty is to count only if the ID is the same and the first date cannot be longer than 05 days compared to the second date, because i need to identify 4 or more payments within 5 or more days.
Result of my code:
the result that i need:
DATE value_payment ID tempdate cflag 01/06/2020 3000 1 01/06/2020 1 16/11/2020 9000 1 01/06/2020 1 02/12/2020 9000 1 16/11/2020 1 03/12/2020 8000 1 02/12/2020 2 03/12/2020 6000 1 03/12/2020 3 03/12/2020 5000 1 03/12/2020 4 03/12/2020 1000 1 03/12/2020 5 07/12/2020 2000 1 03/12/2020 6 29/12/2020 3000 1 07/12/2020 1 29/12/2020 4000 1 29/12/2020 2 29/12/2020 9999 1 29/12/2020 3 29/12/2020 2000 1 29/12/2020 4 26/01/2021 5000 1 29/12/2020 1 26/01/2021 1111 1 26/01/2021 2 26/01/2021 1000 1 26/01/2021 3 01/02/2021 2000 1 26/01/2021 1 01/02/2021 3000 1 01/02/2021 2 01/02/2021 1000 1 01/02/2021 3 02/02/2021 5000 1 01/02/2021 4 02/02/2021 7000 1 02/02/2021 5 02/02/2021 8000 1 02/02/2021 6 02/02/2021 9000 1 02/02/2021 7 08/02/2021 10000 1 02/02/2021 1 08/02/2021 1000 1 08/02/2021 2 08/02/2021 4444 1 08/02/2021 3
Tks!!!!!
Do I see corect that you need something like:
data test;
set temp;
by ID DATE ;
if first.ID then cflag=1;
else do;
if dif(date) < 5 then cflag+1;
else cflag=1;
end;
run;
?
Btw. I think sorting should be by ID and DATE, isn't it?
Bart
Do I see corect that you need something like:
data test;
set temp;
by ID DATE ;
if first.ID then cflag=1;
else do;
if dif(date) < 5 then cflag+1;
else cflag=1;
end;
run;
?
Btw. I think sorting should be by ID and DATE, isn't it?
Bart
Or even simpler I suppose:
data test;
set temp;
by ID DATE ;
if not first.ID and dif(date) < 5
then cflag+1;
else cflag=1;
run;
Bart
With this requirement:
My difficulty is to count only if the ID is the same and the first date cannot be longer than 05 days compared to the second date, because i need to identify 4 or more payments within 5 or more days.
what do you expect in your output when there do not exist "4 or more payments", such as may happen with a very new ID?
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.