BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Riri_V
Fluorite | Level 6

Hello everyone,

 

I've done something similar in the past, but now can't really remember how I did that.

Hope someone can help me with this.

 

Here's a sample of historical data and final result i'm aiming for

 

Sample 

customer_idreporting_date rating_18_20 (flag)
111111Apr-191
111111May-191
111111Jun-190
111111Jul-190
111111Aug-190
111111Sep-190
111111Oct-191
111111Nov-191
111111Dec-191
111111Jan-200
222222Apr-190
222222May-190
222222Jun-190
222222Jul-190
222222Aug-191
222222Sep-191
222222Oct-191
222222Nov-191
222222Dec-190
222222Jan-200
333333Apr-190
333333May-190
333333Jun-190
333333Jul-190
333333Aug-190
333333Sep-191
333333Oct-191
333333Nov-191
333333Dec-191
333333Jan-201
444444Apr-190
444444May-190
444444Jun-190
444444Jul-190
444444Aug-190
444444Sep-190
444444Oct-190
444444Nov-190
444444Dec-190
444444Jan-200
555555Apr-191
555555May-191
555555Jun-191
555555Jul-191
555555Aug-191
555555Sep-191
555555Oct-191
555555Nov-191
555555Dec-191
555555Jan-201

 

Needed table:

Data aggregated on customer level where reporting_date_rating is a date when customer got rating_18_20 =1 for the last time in history and period_rating_18_20 is amount of months the customer has been possessing the flag for the last period in history.

 

customer_idreporting_date_ratingperiod_rating_18_20
111111Oct-193
222222Aug-194
333333Sep-195
444444 0
555555Apr-1910

 

Here's a code that doesn't work for customers who didn't have rating_18_20 =1 (for example customer number 444444) and returns period_rating_18_20 = 10 instead of 0.

Could anyone please look into it and tell me where the logic is incorrect. 

Thanks.

 

/*18-20*/
proc sort data=WORK.SAMPLE (keep=customer_id reporting_date rating_18_20)
out=rating_flags_sort nodup;
by customer_id rating_18_20 ;
run;
data rating;
set rating_flags_sort;
format for_lag_dt lag_dt date9.;
by customer_id rating_18_20;
retain num_of_period for_lag_dt cnt_months;
if first.rating_18_20 then do;
num_of_period = 1;
for_lag_dt = reporting_date;
cnt_months= 1;
end;
else do;
lag_dt = for_lag_dt;
for_lag_dt = reporting_date;
a=intck('month', lag_dt, reporting_date);
if a>1 then do; num_of_period+1;
cnt_months =1;end;
else cnt_months + 1;
end;
/*drop lag_dt for_lag_dt;*/
run;

/*doesn't work for customers with 0 flags - returns max amount of months in history*/
data rating_2;
set rating;
by customer_id;
if last.customer_id;
keep customer_id rating_18_20 cnt_months;
run;

/*date of rating*/
proc sort data=rating out=rating_sort_2;
by customer_id descending rating_18_20 descending num_of_period reporting_date;
run;

data rating_3;
set rating_sort_2;
by customer_id descending rating_18_20 descending num_of_period reporting_date;
if first.customer_id and first.num_of_period;
keep customer_id reporting_date;
run;
data rating_total;
merge rating_2 rating_3;
by customer_id;
run;

 

1 ACCEPTED SOLUTION

Accepted Solutions
yabwon
Onyx | Level 15

Hi,

 

How about group by processing?

 

data have;
input customer_id : $	reporting_date monyy6.	rating_18_20;
format reporting_date yymmdd10.;
cards4;
111111 Apr-19 1
111111 May-19 1
111111 Jun-19 0
111111 Jul-19 0
111111 Aug-19 0
111111 Sep-19 0
111111 Oct-19 1
111111 Nov-19 1
111111 Dec-19 1
111111 Jan-20 0
222222 Apr-19 0
222222 May-19 0
222222 Jun-19 0
222222 Jul-19 0
222222 Aug-19 1
222222 Sep-19 1
222222 Oct-19 1
222222 Nov-19 1
222222 Dec-19 0
222222 Jan-20 0
333333 Apr-19 0
333333 May-19 0
333333 Jun-19 0
333333 Jul-19 0
333333 Aug-19 0
333333 Sep-19 1
333333 Oct-19 1
333333 Nov-19 1
333333 Dec-19 1
333333 Jan-20 1
444444 Apr-19 0
444444 May-19 0
444444 Jun-19 0
444444 Jul-19 0
444444 Aug-19 0
444444 Sep-19 0
444444 Oct-19 0
444444 Nov-19 0
444444 Dec-19 0
444444 Jan-20 0
555555 Apr-19 1
555555 May-19 1
555555 Jun-19 1
555555 Jul-19 1
555555 Aug-19 1
555555 Sep-19 1
555555 Oct-19 1
555555 Nov-19 1
555555 Dec-19 1
555555 Jan-20 1
;;;;
run;

proc sort data = have;
  by customer_id reporting_date;
run;

data want(rename=(rd=reporting_date));
  set have;
  by customer_id rating_18_20 notsorted;
  if first.customer_id then 
    do;
      rd = .; 
      cnt = 0;
    end;

  if first.rating_18_20 = 1 = rating_18_20 then 
    do;      
      rd=reporting_date;
      cnt = 0;
    end;
  cnt+rating_18_20;

  if last.customer_id then output;
  retain rd;
  drop reporting_date rating_18_20; 
  format rd monyy6.;
run;
proc print;
run;

All the best

Bart

_______________
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug

"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings

SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation



View solution in original post

7 REPLIES 7
novinosrin
Tourmaline | Level 20

Hi @Riri_V  Assuming I understand your requirement, the following should work

 




data have;
input customer_id	reporting_date $	rating_18_20;
cards;
111111	Apr-19	1
111111	May-19	1
111111	Jun-19	0
111111	Jul-19	0
111111	Aug-19	0
111111	Sep-19	0
111111	Oct-19	1
111111	Nov-19	1
111111	Dec-19	1
111111	Jan-20	0
222222	Apr-19	0
222222	May-19	0
222222	Jun-19	0
222222	Jul-19	0
222222	Aug-19	1
222222	Sep-19	1
222222	Oct-19	1
222222	Nov-19	1
222222	Dec-19	0
222222	Jan-20	0
333333	Apr-19	0
333333	May-19	0
333333	Jun-19	0
333333	Jul-19	0
333333	Aug-19	0
333333	Sep-19	1
333333	Oct-19	1
333333	Nov-19	1
333333	Dec-19	1
333333	Jan-20	1
444444	Apr-19	0
444444	May-19	0
444444	Jun-19	0
444444	Jul-19	0
444444	Aug-19	0
444444	Sep-19	0
444444	Oct-19	0
444444	Nov-19	0
444444	Dec-19	0
444444	Jan-20	0
555555	Apr-19	1
555555	May-19	1
555555	Jun-19	1
555555	Jul-19	1
555555	Aug-19	1
555555	Sep-19	1
555555	Oct-19	1
555555	Nov-19	1
555555	Dec-19	1
555555	Jan-20	1
;

data want;
 do until(last.customer_id);
  do _n_=1 by 1 until(last.rating_18_20);
   set have ;
   by customer_id rating_18_20 notsorted;
   if rating_18_20 then do;
    if _n_=1 then reporting_date_rating=reporting_date;
    period_rating_18_20=_n_;
   end;
  end;
 end;
 period_rating_18_20=max(period_rating_18_20,0);
 keep customer_id	reporting_date_rating	period_rating_18_20; 
run;

customer_id reporting_date_rating period_rating_18_20
111111 Oct-19 3
222222 Aug-19 4
333333 Sep-19 5
444444   0
555555 Apr-19 10

 

 

Riri_V
Fluorite | Level 6

Thanks, works perfectly on the sample! But on the whole table discounted flags from past years. 

ed_sas_member
Meteorite | Level 14

Hi @Riri_V 

 

here is an attempt to achieve this:

data have;
	infile datalines dlm="09"x;
	input customer_id	reporting_date :monyy6.	rating_18_20;
	format reporting_date date9.;
	datalines;
111111	Apr-19	1
111111	May-19	1
111111	Jun-19	0
111111	Jul-19	0
111111	Aug-19	0
111111	Sep-19	0
111111	Oct-19	1
111111	Nov-19	1
111111	Dec-19	1
111111	Jan-20	0
222222	Apr-19	0
222222	May-19	0
222222	Jun-19	0
222222	Jul-19	0
222222	Aug-19	1
222222	Sep-19	1
222222	Oct-19	1
222222	Nov-19	1
222222	Dec-19	0
222222	Jan-20	0
333333	Apr-19	0
333333	May-19	0
333333	Jun-19	0
333333	Jul-19	0
333333	Aug-19	0
333333	Sep-19	1
333333	Oct-19	1
333333	Nov-19	1
333333	Dec-19	1
333333	Jan-20	1
444444	Apr-19	0
444444	May-19	0
444444	Jun-19	0
444444	Jul-19	0
444444	Aug-19	0
444444	Sep-19	0
444444	Oct-19	0
444444	Nov-19	0
444444	Dec-19	0
444444	Jan-20	0
555555	Apr-19	1
555555	May-19	1
555555	Jun-19	1
555555	Jul-19	1
555555	Aug-19	1
555555	Sep-19	1
555555	Oct-19	1
555555	Nov-19	1
555555	Dec-19	1
555555	Jan-20	1
;
run;

/* Customers with at least one flag = 1*/

proc sort data=have out=have_sorted_d;
	by customer_id descending reporting_date;
run;

data have2;
	set have_sorted_d;
	by customer_id rating_18_20 notsorted;
	if first.rating_18_20 then count + 1;
run;

proc sql;
	create table have3 as
	select *, count(count) as period_rating_18_20
	from have2
	where rating_18_20 = 1
	group by customer_id, count
	order by customer_id, count, reporting_date asc;
quit;

data have4;
	set have3;
	by customer_id;
	if first.customer_id then output;
	drop count rating_18_20;
run;

/* Customers with no flag = 1*/

proc sql;
	create table have_noflag as
	select distinct customer_id
	from have
	group by customer_id
	having sum(rating_18_20) = 0;
quit;

/* Final table */

data want;
	merge have4 have_noflag;
	by customer_id;
	if period_rating_18_20=. then period_rating_18_20=0;
run;
yabwon
Onyx | Level 15

Hi,

 

How about group by processing?

 

data have;
input customer_id : $	reporting_date monyy6.	rating_18_20;
format reporting_date yymmdd10.;
cards4;
111111 Apr-19 1
111111 May-19 1
111111 Jun-19 0
111111 Jul-19 0
111111 Aug-19 0
111111 Sep-19 0
111111 Oct-19 1
111111 Nov-19 1
111111 Dec-19 1
111111 Jan-20 0
222222 Apr-19 0
222222 May-19 0
222222 Jun-19 0
222222 Jul-19 0
222222 Aug-19 1
222222 Sep-19 1
222222 Oct-19 1
222222 Nov-19 1
222222 Dec-19 0
222222 Jan-20 0
333333 Apr-19 0
333333 May-19 0
333333 Jun-19 0
333333 Jul-19 0
333333 Aug-19 0
333333 Sep-19 1
333333 Oct-19 1
333333 Nov-19 1
333333 Dec-19 1
333333 Jan-20 1
444444 Apr-19 0
444444 May-19 0
444444 Jun-19 0
444444 Jul-19 0
444444 Aug-19 0
444444 Sep-19 0
444444 Oct-19 0
444444 Nov-19 0
444444 Dec-19 0
444444 Jan-20 0
555555 Apr-19 1
555555 May-19 1
555555 Jun-19 1
555555 Jul-19 1
555555 Aug-19 1
555555 Sep-19 1
555555 Oct-19 1
555555 Nov-19 1
555555 Dec-19 1
555555 Jan-20 1
;;;;
run;

proc sort data = have;
  by customer_id reporting_date;
run;

data want(rename=(rd=reporting_date));
  set have;
  by customer_id rating_18_20 notsorted;
  if first.customer_id then 
    do;
      rd = .; 
      cnt = 0;
    end;

  if first.rating_18_20 = 1 = rating_18_20 then 
    do;      
      rd=reporting_date;
      cnt = 0;
    end;
  cnt+rating_18_20;

  if last.customer_id then output;
  retain rd;
  drop reporting_date rating_18_20; 
  format rd monyy6.;
run;
proc print;
run;

All the best

Bart

_______________
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug

"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings

SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation



Riri_V
Fluorite | Level 6

@yabwon  Thaaank you so much! exactly what was needed! 

yabwon
Onyx | Level 15

Hi @Riri_V ,

 

glad I could help 🙂

 

All the best

Bart

_______________
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug

"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings

SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation



PeterClemmensen
Tourmaline | Level 20

An alternative..

 

data have;
input customer_id $ reporting_date $ rating_18_20;
datalines;
111111 Apr-19 1
111111 May-19 1
111111 Jun-19 0
111111 Jul-19 0
111111 Aug-19 0
111111 Sep-19 0
111111 Oct-19 1
111111 Nov-19 1
111111 Dec-19 1
111111 Jan-20 0
222222 Apr-19 0
222222 May-19 0
222222 Jun-19 0
222222 Jul-19 0
222222 Aug-19 1
222222 Sep-19 1
222222 Oct-19 1
222222 Nov-19 1
222222 Dec-19 0
222222 Jan-20 0
333333 Apr-19 0
333333 May-19 0
333333 Jun-19 0
333333 Jul-19 0
333333 Aug-19 0
333333 Sep-19 1
333333 Oct-19 1
333333 Nov-19 1
333333 Dec-19 1
333333 Jan-20 1
444444 Apr-19 0
444444 May-19 0
444444 Jun-19 0
444444 Jul-19 0
444444 Aug-19 0
444444 Sep-19 0
444444 Oct-19 0
444444 Nov-19 0
444444 Dec-19 0
444444 Jan-20 0
555555 Apr-19 1
555555 May-19 1
555555 Jun-19 1
555555 Jul-19 1
555555 Aug-19 1
555555 Sep-19 1
555555 Oct-19 1
555555 Nov-19 1
555555 Dec-19 1
555555 Jan-20 1
;

data want (keep=customer_id r period_rating_18_20);
    if _N_=1 then do;                            
        declare hash h(ordered : 'D', multidata : 'Y');
        h.definekey('customer_id');
        h.definedata('customer_id', 'period_rating_18_20', 'r');
        h.definedone();
        declare hiter hi ('h');
    end;
 
    do period_rating_18_20 = 1 by 1 until (last.customer_id | last.rating_18_20);             
        set have end=lr;
        by customer_id rating_18_20 notsorted;
        if first.customer_id then h.add(key: customer_id, data: customer_id, data: 0, data: " ");
        if first.rating_18_20 then r = reporting_date;
    end;

    if rating_18_20=1 then h.add();

    if lr;
    do while (hi.last()=0);
        output;
        rc=hi.next();
        h.remove();
    end;
run;

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 7 replies
  • 1072 views
  • 4 likes
  • 5 in conversation