Hello guys!
I'm struggling to achieve this, but I can't find the solution.
What I need to do, is to count the records where the inquiry_date is older than the per_obs, for each per_obs, and make it accumulative for each id.
So, it must restart when the ID changes. (the per_obs is set at the beginning of the month):
This is the original table:
ID | INQUIRY_DATE | PERIOD |
A | 9/12/2019 | 2019-12 |
A | 9/12/2019 | 2019-12 |
A | 9/12/2019 | 2019-12 |
A | 4/11/2019 | 2019-11 |
A | 19/10/2019 | 2019-10 |
A | 19/10/2019 | 2019-10 |
A | 8/10/2019 | 2019-10 |
A | 20/9/2019 | 2019-09 |
A | 21/8/2019 | 2019-08 |
A | 21/8/2019 | 2019-08 |
A | 28/7/2019 | 2019-07 |
A | 26/7/2019 | 2019-07 |
B | 23/8/2018 | 2018-08 |
B | 23/8/2018 | 2018-08 |
B | 9/8/2018 | 2018-08 |
B | 13/7/2018 | 2018-07 |
B | 25/6/2018 | 2018-06 |
B | 11/5/2018 | 2018-05 |
B | 18/4/2018 | 2018-04 |
This is what I need:
ID | INQUIRY_DATE | PERIOD | Accumulative Inquiry dates |
A | 9/12/2019 | 2019-12 | 9 |
A | 9/12/2019 | 2019-12 | 9 |
A | 9/12/2019 | 2019-12 | 9 |
A | 4/11/2019 | 2019-11 | 8 |
A | 19/10/2019 | 2019-10 | 5 |
A | 19/10/2019 | 2019-10 | 5 |
A | 8/10/2019 | 2019-10 | 5 |
A | 20/9/2019 | 2019-09 | 4 |
A | 21/8/2019 | 2019-08 | 2 |
A | 21/8/2019 | 2019-08 | 2 |
A | 28/7/2019 | 2019-07 | 0 |
A | 26/7/2019 | 2019-07 | 0 |
B | 23/8/2018 | 2018-08 | 3 |
B | 23/8/2018 | 2018-08 | 3 |
B | 9/8/2018 | 2018-08 | 3 |
B | 13/7/2018 | 2018-07 | 3 |
B | 25/6/2018 | 2018-06 | 2 |
B | 11/5/2018 | 2018-05 | 1 |
B | 18/4/2018 | 2018-04 | 0 |
Any help would be awesome, thanks a lot!
data new;
informat date ddmmyy10.;
input id $ date period $;
format date mmddyy10.;
cards;;;
A 9/12/2019 2019-12
A 9/12/2019 2019-12
A 9/12/2019 2019-12
A 4/11/2019 2019-11
A 19/10/2019 2019-10
A 19/10/2019 2019-10
A 8/10/2019 2019-10
A 20/9/2019 2019-09
A 21/8/2019 2019-08
A 21/8/2019 2019-08
A 28/7/2019 2019-07
A 26/7/2019 2019-07
B 23/8/2018 2018-08
B 23/8/2018 2018-08
B 9/8/2018 2018-08
B 13/7/2018 2018-07
B 25/6/2018 2018-06
B 11/5/2018 2018-05
B 18/4/2018 2018-04
;;;
run;
proc sort data=new;
by id period;
run;
data new2;
set new;
by id period;
retain cur_queries 0 tot_prev_queries 0;
if first.period then do;
tot_prev_queries = tot_prev_queries+cur_queries;
cur_queries = 0;
end;
cur_queries = cur_queries +1;
if first.id then tot_prev_queries =0;
drop cur_queris;
run;
proc sort data=new2;
by id descending period ;
run;
data new;
informat date ddmmyy10.;
input id $ date period $;
format date mmddyy10.;
cards;;;
A 9/12/2019 2019-12
A 9/12/2019 2019-12
A 9/12/2019 2019-12
A 4/11/2019 2019-11
A 19/10/2019 2019-10
A 19/10/2019 2019-10
A 8/10/2019 2019-10
A 20/9/2019 2019-09
A 21/8/2019 2019-08
A 21/8/2019 2019-08
A 28/7/2019 2019-07
A 26/7/2019 2019-07
B 23/8/2018 2018-08
B 23/8/2018 2018-08
B 9/8/2018 2018-08
B 13/7/2018 2018-07
B 25/6/2018 2018-06
B 11/5/2018 2018-05
B 18/4/2018 2018-04
;;;
run;
proc sort data=new;
by id period;
run;
data new2;
set new;
by id period;
retain cur_queries 0 tot_prev_queries 0;
if first.period then do;
tot_prev_queries = tot_prev_queries+cur_queries;
cur_queries = 0;
end;
cur_queries = cur_queries +1;
if first.id then tot_prev_queries =0;
drop cur_queris;
run;
proc sort data=new2;
by id descending period ;
run;
What's `per_obs`?
Bart
Hi Bart!
Sorry, PER_OB is the same as PERIOD.
Please explain further the calculations desired.
Why is this one a zero
A | 28/7/2019 | 2019-07 | 0 |
and this one a two?
A | 21/8/2019 | 2019-08 | 2 |
Paige,
The first one is 0 because the ID A doesn't have any inquiry_date older than the PERIOD/PER_OBS. (It has two on the same month, 28/7/2019 and 26/7/2019 , but I don't have to consider inquiry_dates on the same period).
Meanwhile, the 2019-08 has two inquiry_dates from the historical data, 28/7/2019 and 26/7/2019, so the counter is marking 2.
Thanks!
Hi @marianob Please try
data have;
input ID $ INQUIRY_DATE :ddmmyy10. _PERIOD :$7.;
period=input(catx('-',_period,'01'),yymmdd10.);
format period yymmdd10. INQUIRY_DATE ddmmyy10.;
cards;
A 9/12/2019 2019-12
A 9/12/2019 2019-12
A 9/12/2019 2019-12
A 4/11/2019 2019-11
A 19/10/2019 2019-10
A 19/10/2019 2019-10
A 8/10/2019 2019-10
A 20/9/2019 2019-09
A 21/8/2019 2019-08
A 21/8/2019 2019-08
A 28/7/2019 2019-07
A 26/7/2019 2019-07
B 23/8/2018 2018-08
B 23/8/2018 2018-08
B 9/8/2018 2018-08
B 13/7/2018 2018-07
B 25/6/2018 2018-06
B 11/5/2018 2018-05
B 18/4/2018 2018-04
;
data want ;
if _n_=1 then do;
dcl hash H (multidata:'y') ;
h.definekey ("_INQUIRY_DATE") ;
h.definedone () ;
dcl hiter hi('h');
end;
do until(last.id);
set have;
by id;
_INQUIRY_DATE=INQUIRY_DATE;
h.add();
end;
do until(last.id);
set have;
by id INQUIRY_DATE period notsorted;
if first.period then do;
count=0;
do while(hi.next()=0);
count=sum(_INQUIRY_DATE<period,count);
end;
end;
output;
end;
h.clear();
run;
ID | INQUIRY_DATE | _PERIOD | period | _INQUIRY_DATE | count |
---|---|---|---|---|---|
A | 09/12/2019 | 2019-12 | 2019-12-01 | 21841 | 9 |
A | 09/12/2019 | 2019-12 | 2019-12-01 | 21841 | 9 |
A | 09/12/2019 | 2019-12 | 2019-12-01 | 21841 | 9 |
A | 04/11/2019 | 2019-11 | 2019-11-01 | 21841 | 8 |
A | 19/10/2019 | 2019-10 | 2019-10-01 | 21841 | 5 |
A | 19/10/2019 | 2019-10 | 2019-10-01 | 21841 | 5 |
A | 08/10/2019 | 2019-10 | 2019-10-01 | 21841 | 5 |
A | 20/09/2019 | 2019-09 | 2019-09-01 | 21841 | 4 |
A | 21/08/2019 | 2019-08 | 2019-08-01 | 21841 | 2 |
A | 21/08/2019 | 2019-08 | 2019-08-01 | 21841 | 2 |
A | 28/07/2019 | 2019-07 | 2019-07-01 | 21841 | 0 |
A | 26/07/2019 | 2019-07 | 2019-07-01 | 21841 | 0 |
B | 23/08/2018 | 2018-08 | 2018-08-01 | 21292 | 4 |
B | 23/08/2018 | 2018-08 | 2018-08-01 | 21292 | 4 |
B | 09/08/2018 | 2018-08 | 2018-08-01 | 21292 | 4 |
B | 13/07/2018 | 2018-07 | 2018-07-01 | 21292 | 3 |
B | 25/06/2018 | 2018-06 | 2018-06-01 | 21292 | 2 |
B | 11/05/2018 | 2018-05 | 2018-05-01 | 21292 | 1 |
B | 18/04/2018 | 2018-04 | 2018-04-01 | 21292 | 0 |
Double DoW-loop and Hash Table in one, I love it! 😎
Bart
Again, my initial kudos goes to Krakow and Marcin 🙂 ( to whom and Poland I owe my start 🙂 )Btw @yabwon My colleagues love your packages albeit they are complaining that it's not accesible in work professional machines. Can you make it secure for us to access and download if you don't mind. I was gonna reach out to you offline for this request
Alright, let me explain over linkedin later and give you official email. Thank you for your attention!
It works perfectly! Thanks a lot for your time!
data new;
informat date ddmmyy10.;
input id $ date period $;
format date mmddyy10.;
cards;;;
A 9/12/2019 2019-12
A 9/12/2019 2019-12
A 9/12/2019 2019-12
A 4/11/2019 2019-11
A 19/10/2019 2019-10
A 19/10/2019 2019-10
A 8/10/2019 2019-10
A 20/9/2019 2019-09
A 21/8/2019 2019-08
A 21/8/2019 2019-08
A 28/7/2019 2019-07
A 26/7/2019 2019-07
B 23/8/2018 2018-08
B 23/8/2018 2018-08
B 9/8/2018 2018-08
B 13/7/2018 2018-07
B 25/6/2018 2018-06
B 11/5/2018 2018-05
B 18/4/2018 2018-04
;;;
run;
proc sort data=new;
by id period;
run;
data new2;
set new;
by id period;
retain cur_queries 0 tot_prev_queries 0;
if first.period then do;
tot_prev_queries = tot_prev_queries+cur_queries;
cur_queries = 0;
end;
cur_queries = cur_queries +1;
if first.id then tot_prev_queries =0;
drop cur_queris;
run;
proc sort data=new2;
by id descending period ;
run;
data new;
informat date ddmmyy10.;
input id $ date period $;
format date mmddyy10.;
cards;;;
A 9/12/2019 2019-12
A 9/12/2019 2019-12
A 9/12/2019 2019-12
A 4/11/2019 2019-11
A 19/10/2019 2019-10
A 19/10/2019 2019-10
A 8/10/2019 2019-10
A 20/9/2019 2019-09
A 21/8/2019 2019-08
A 21/8/2019 2019-08
A 28/7/2019 2019-07
A 26/7/2019 2019-07
B 23/8/2018 2018-08
B 23/8/2018 2018-08
B 9/8/2018 2018-08
B 13/7/2018 2018-07
B 25/6/2018 2018-06
B 11/5/2018 2018-05
B 18/4/2018 2018-04
;;;
run;
proc sort data=new;
by id period;
run;
data new2;
set new;
by id period;
retain cur_queries 0 tot_prev_queries 0;
if first.period then do;
tot_prev_queries = tot_prev_queries+cur_queries;
cur_queries = 0;
end;
cur_queries = cur_queries +1;
if first.id then tot_prev_queries =0;
drop cur_queris;
run;
proc sort data=new2;
by id descending period ;
run;
It works just great as @novinosrin code.
Thanks both for helping me out!
data new; infile cards expandtabs; informat date ddmmyy10.; input id $ date period $; format date mmddyy10.; cards;;; A 9/12/2019 2019-12 A 9/12/2019 2019-12 A 9/12/2019 2019-12 A 4/11/2019 2019-11 A 19/10/2019 2019-10 A 19/10/2019 2019-10 A 8/10/2019 2019-10 A 20/9/2019 2019-09 A 21/8/2019 2019-08 A 21/8/2019 2019-08 A 28/7/2019 2019-07 A 26/7/2019 2019-07 B 23/8/2018 2018-08 B 23/8/2018 2018-08 B 9/8/2018 2018-08 B 13/7/2018 2018-07 B 25/6/2018 2018-06 B 11/5/2018 2018-05 B 18/4/2018 2018-04 ;;; run; proc sql; create table want as select *,(select count(*) from new where id=a.id and date<input(a.period,anydtdte.)) as count from new as a; quit;
Thanks, Ksharp! Your code worked great too!
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.