Write and run SAS programs in your web browser

How to create a dichotomous variable on SAS University Edition

Accepted Solution Solved
Reply
Contributor
Posts: 25
Accepted Solution

How to create a dichotomous variable on SAS University Edition

Hi all,

I am stuck with a problem that I hope you will help me to solve. 

I am using SAS University Edition v.9.4. and I am working on a panel data that includes 162 variables dealing with the transaction volume and behavior of roughly 130000 bank customers. I want to build a predictive model using logistic regression that can forecast the probability to churn customer have. So dependent variable is 1 if Churn and 0 if Not Churn.

Now, I have to build a logistic regression having as dependent variable the dichotomous variable 1 (churn) and 0 ( No Churn). I have to build it according to this definition:
"We define that the customer who did not do any transaction across all of his accounts and product areas on his own initiative during the last three months is a churner"

My problem is: How can I tell to SAS that if variable X is 0 for three following months- i.e. Jan, Feb,Mar- then my variable "Churn" must be 1, otherwise 0? In other words, if customer Z has been active for example from January to June then my dichotomous variable must be 0 during those months. However, if from July to September this customer made no transaction at all then from October my dichotomous variable must be 1. And it must be one until the end of the observation period.

I know I must use a conditional statement to build this new variable, but I do not know how to tell SAS that it must put 1 to that specific customer if and only if he/she has done no transaction across all of his accounts and product areas during the last three months. 

Please note that I have a variable called "trans_numb" that tells me how many transaction the customer has done across all of his accounts and products areas during a specific month. So, I will use this variable as a benchmark for then constructing my dichotomous variable.

I hope I've given you the idea of what I need and hopefully you can help.

Thank you in advance!


Accepted Solutions
Solution
‎07-13-2017 04:23 AM
Super User
Posts: 7,809

Re: How to create a dichotomous variable on SAS University Edition

[ Edited ]

Here's my take:

data have;
input cust_id year month trans_numb;
cards;
1 2017 1 123
1 2017 2 456
1 2017 3 76
1 2017 4 0
1 2017 5 0
1 2017 6 0
1 2017 7 0
1 2017 8 0
1 2017 9 0
1 2017 10 0
1 2017 11 0
1 2017 12 0
2 2017 1 23
2 2017 2 15
2 2017 3 9
2 2017 4 1
2 2017 5 0
2 2017 6 0
2 2017 7 0
2 2017 8 0
2 2017 9 0
2 2017 10 1
2 2017 11 0
2 2017 12 0
3 2017 1 0
3 2017 2 0
3 2017 3 0
3 2017 4 0
3 2017 5 0
3 2017 6 0
3 2017 7 0
3 2017 8 0
3 2017 9 0
3 2017 10 0
3 2017 11 0
3 2017 12 0
4 2017 1 123
4 2017 2 456
4 2017 3 76
4 2017 4 65
4 2017 5 88
4 2017 6 90
4 2017 7 78
4 2017 8 76
4 2017 9 65
4 2017 10 89
4 2017 11 86
4 2017 12 79
5 2017 1 345
5 2017 2 456
5 2017 3 125
5 2017 4 65
5 2017 5 78
5 2017 6 100
5 2017 7 76
5 2017 8 78
5 2017 9 98
5 2017 10 99
5 2017 11 65
5 2017 12 45
6 2017 1 0
6 2017 2 0
6 2017 3 2
6 2017 4 0
6 2017 5 0
6 2017 6 0
6 2017 7 1
6 2017 8 0
6 2017 9 0
6 2017 10 0
6 2017 11 0
6 2017 12 0
7 2017 1 45
7 2017 2 36
7 2017 3 10
7 2017 4 7
7 2017 5 0
7 2017 6 0
7 2017 7 0
7 2017 8 0
7 2017 9 0
7 2017 10 0
7 2017 11 0
7 2017 12 0
;
run;

data want1; /* "simple" method */
set have;
by cust_id year month;
retain cust_count old_churn;
if first.cust_id
then cust_count = 1;
else cust_count + 1;
trans1 = lag(trans_numb);
trans2 = lag2(trans_numb);
if cust_count < 3
then churn = 0;
else do;
  if trans1 or trans2 or trans_numb
  then churn = 0;
  else churn = 1;
end;
old_churn = churn;
* drop old_churn cust_count trans1 trans2;
run;

proc print data=want1 noobs;
run;

data want2; /* "complex" method */
set have;
by cust_id year month;
retain cust_count old_churn;
if first.cust_id
then cust_count = 1;
else cust_count + 1;
trans1 = lag(trans_numb);
trans2 = lag2(trans_numb);
if cust_count < 3
then churn = 0;
else do;
  if not trans1 and not trans2 and not trans_numb
  then churn = 1;
  else if trans1 and trans2 and trans_numb
  then churn = 0;
  else if old_churn
  then churn = 1;
  else churn = 0;
end;
old_churn = churn;
* drop old_churn cust_count trans1 trans2;
run;

proc print data=want2 noobs;
run;

Note that I named the first dataset "have"; I'm so used to name datasets in examples have and want that I keep it that way. Smiley Wink

Most of the logic is done by using the lag() functions and a counter for the current customer ID. Adding year and month in the by statement is done to make sure that the dataset is correctly sorted (if not, the step throws an ERROR).

 

Once the code runs to your satisfaction, activate the drop statement to get rid of the helper variables.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers

View solution in original post


All Replies
Super User
Posts: 7,809

Re: How to create a dichotomous variable on SAS University Edition

What if a customer makes a transaction after a three-month pause? Should churn be kept at 1, or be reset to 0?

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Contributor
Posts: 25

Re: How to create a dichotomous variable on SAS University Edition

Posted in reply to KurtBremser

Thank you for the kind and prompt reply @KurtBremser! I apologize for not having provided an example of my dataset, but I'm a newbie in SAS community and I still have to learn how to properly ask questions. However, my dataset looks like the example you provided in your second comment. 

The objective of my work is to develop a predictive model that can help a bank to detectlikely churners before they leave in order to finally retain them. In my dataset, the clients associated to a transaction volume of 0 for more than three-months usually make only one or two transactions in a random subsequent month. And not all of them do it- most of them simply keep doing no transactions. If a customer only has one bank, it is highly uncommon that he makes no or very few transactions across accounts or financial products in an extended timeframe- i.e. 3-month-. For this reason, I would still regard them as being prone to churn. So I would say that if a customer makes a transaction after a three-month pause the churn variable should still be kept at 1.

 

However, if I'm not asking too much, can you please provide the SAS code for both scenarios? 


1- if customer makes a transaction after a three-month pause AND churn is kept 1

AND

2-  if customer makes a transaction after a three-month pause AND churn is reset to 0

Thank you a lot!!

Super User
Posts: 7,809

Re: How to create a dichotomous variable on SAS University Edition

I need some example data to play around with. Use the macro provided in https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... to convert your data into a data step that can be posted here, or write a data step that recreates your data (just as in the example I gave you).

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Contributor
Posts: 25

Re: How to create a dichotomous variable on SAS University Edition

Posted in reply to KurtBremser

@KurtBremser Unfortunately, I cannot post the full data step of my dataset due to size and confidentiality matters. However, below a very small data step of six customers that includes all the possible scenarios I have in my full dataset:

 

data bank;
input cust_id year month trans_numb;
cards;
1 2017 1 123
1 2017 2 456
1 2017 3 76
1 2017 4 0
1 2017 5 0
1 2017 6 0
1 2017 7 0
1 2017 8 0
1 2017 9 0
1 2017 10 0
1 2017 11 0
1 2017 12 0
2 2017 1 23
2 2017 2 15
2 2017 3 9
2 2017 4 1
2 2017 5 0
2 2017 6 0
2 2017 7 0
2 2017 8 0
2 2017 9 0
2 2017 10 1
2 2017 11 0
2 2017 12 0
3 2017 1 0
3 2017 2 0
3 2017 3 0
3 2017 4 0
3 2017 5 0
3 2017 6 0
3 2017 7 0
3 2017 8 0
3 2017 9 0
3 2017 10 0
3 2017 11 0
3 2017 12 0
4 2017 1 123
4 2017 2 456
4 2017 3 76
4 2017 4 65
4 2017 5 88
4 2017 6 90
4 2017 7 78
4 2017 8 76
4 2017 9 65
4 2017 10 89
4 2017 11 86
4 2017 12 79
5 2017 1 345
5 2017 2 456
5 2017 3 125
5 2017 4 65
5 2017 5 78
5 2017 6 100
5 2017 7 76
5 2017 8 78
5 2017 9 98
5 2017 10 99
5 2017 11 65
5 2017 12 45
6 2017 1 0
6 2017 2 0
6 2017 3 2
6 2017 4 0
6 2017 5 0
6 2017 6 0
6 2017 7 1
6 2017 8 0
6 2017 9 0
6 2017 10 0
6 2017 11 0
6 2017 12 0
7 2017 1 45
7 2017 2 36
7 2017 3 10
7 2017 4 7
7 2017 5 0
7 2017 6 0
7 2017 7 0
7 2017 8 0
7 2017 9 0
7 2017 10 0
7 2017 11 0
7 2017 12 0
;
run;

 

As you can see, I have four types of customers:

1- Customers that simply remain active for the whole period (see cust_id 4). In this case, the new variable churn should be 0 for the whole observation period

2- Customers that first are active and then stop doing transactions until the end of the observation period (see cust_id 7). In this case, the new variable churn should first be 0, then it should turn into 1 after a three-month pause.

3- Customers that do not do any transactions for the whole observation period (See cust_id 3). In this case, the variable churn should be 1 after the three-month pause.

4- Customers that are first active then take a three-month pause and then do 1 or 2 transactions on a random month and then return inactive (See cust_id 2). In this case if you can give me two options. 1st option--> the variable churn remains 1; 2nd option--> the variable churn reset and return being 1 again after the second three-month pause.

I hope I've given you an idea you can play around with. 

Thank you so much!!!

Super User
Posts: 7,809

Re: How to create a dichotomous variable on SAS University Edition

Exactly what the doctor needed! Will work on it when I'm back at the office.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Solution
‎07-13-2017 04:23 AM
Super User
Posts: 7,809

Re: How to create a dichotomous variable on SAS University Edition

[ Edited ]

Here's my take:

data have;
input cust_id year month trans_numb;
cards;
1 2017 1 123
1 2017 2 456
1 2017 3 76
1 2017 4 0
1 2017 5 0
1 2017 6 0
1 2017 7 0
1 2017 8 0
1 2017 9 0
1 2017 10 0
1 2017 11 0
1 2017 12 0
2 2017 1 23
2 2017 2 15
2 2017 3 9
2 2017 4 1
2 2017 5 0
2 2017 6 0
2 2017 7 0
2 2017 8 0
2 2017 9 0
2 2017 10 1
2 2017 11 0
2 2017 12 0
3 2017 1 0
3 2017 2 0
3 2017 3 0
3 2017 4 0
3 2017 5 0
3 2017 6 0
3 2017 7 0
3 2017 8 0
3 2017 9 0
3 2017 10 0
3 2017 11 0
3 2017 12 0
4 2017 1 123
4 2017 2 456
4 2017 3 76
4 2017 4 65
4 2017 5 88
4 2017 6 90
4 2017 7 78
4 2017 8 76
4 2017 9 65
4 2017 10 89
4 2017 11 86
4 2017 12 79
5 2017 1 345
5 2017 2 456
5 2017 3 125
5 2017 4 65
5 2017 5 78
5 2017 6 100
5 2017 7 76
5 2017 8 78
5 2017 9 98
5 2017 10 99
5 2017 11 65
5 2017 12 45
6 2017 1 0
6 2017 2 0
6 2017 3 2
6 2017 4 0
6 2017 5 0
6 2017 6 0
6 2017 7 1
6 2017 8 0
6 2017 9 0
6 2017 10 0
6 2017 11 0
6 2017 12 0
7 2017 1 45
7 2017 2 36
7 2017 3 10
7 2017 4 7
7 2017 5 0
7 2017 6 0
7 2017 7 0
7 2017 8 0
7 2017 9 0
7 2017 10 0
7 2017 11 0
7 2017 12 0
;
run;

data want1; /* "simple" method */
set have;
by cust_id year month;
retain cust_count old_churn;
if first.cust_id
then cust_count = 1;
else cust_count + 1;
trans1 = lag(trans_numb);
trans2 = lag2(trans_numb);
if cust_count < 3
then churn = 0;
else do;
  if trans1 or trans2 or trans_numb
  then churn = 0;
  else churn = 1;
end;
old_churn = churn;
* drop old_churn cust_count trans1 trans2;
run;

proc print data=want1 noobs;
run;

data want2; /* "complex" method */
set have;
by cust_id year month;
retain cust_count old_churn;
if first.cust_id
then cust_count = 1;
else cust_count + 1;
trans1 = lag(trans_numb);
trans2 = lag2(trans_numb);
if cust_count < 3
then churn = 0;
else do;
  if not trans1 and not trans2 and not trans_numb
  then churn = 1;
  else if trans1 and trans2 and trans_numb
  then churn = 0;
  else if old_churn
  then churn = 1;
  else churn = 0;
end;
old_churn = churn;
* drop old_churn cust_count trans1 trans2;
run;

proc print data=want2 noobs;
run;

Note that I named the first dataset "have"; I'm so used to name datasets in examples have and want that I keep it that way. Smiley Wink

Most of the logic is done by using the lag() functions and a counter for the current customer ID. Adding year and month in the by statement is done to make sure that the dataset is correctly sorted (if not, the step throws an ERROR).

 

Once the code runs to your satisfaction, activate the drop statement to get rid of the helper variables.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Contributor
Posts: 25

Re: How to create a dichotomous variable on SAS University Edition

Posted in reply to KurtBremser

@KurtBremser I can't thank you enough!! You're my savior Smiley Tongue

Contributor
Posts: 25

Re: How to create a dichotomous variable on SAS University Edition

[ Edited ]
Posted in reply to KurtBremser

Dear @KurtBremser,

Apologies for partially reopening this post but I need your precious help once more. In the codes your created, the variable churn is equal to 1 if the customer has done no transactions (so trans_numb=0) during the last three months. And this is right!

 

However, I initially told you that if i.e. customer 1 has done no transations across all his accounts during the months January, February and March then the variable churn should start being one from the month March. Now, I need the following code edit: 

"if i.e. customer 1 has done no transations across all his accounts during the months January, February and March then the variable churn should start being 1 NOT from the month March but from January".

 

Following a small datastep that describes how I would like my variable churn to look like:

 

data have;
input cust_id year month trans_numb churn;
cards;
1 2017 1 123 0
1 2017 2 456 0
1 2017 3 76 0
1 2017 4 0 1
1 2017 5 0 1
1 2017 6 0 1
1 2017 7 0 1
1 2017 8 0 1
1 2017 9 0 1
1 2017 10 0 1
1 2017 11 0 1
1 2017 12 0 1
2 2017 1 23
2 2017 2 15
2 2017 3 9
2 2017 4 1
2 2017 5 0 1
2 2017 6 0 1
2 2017 7 0 1
2 2017 8 0 1
2 2017 9 0 1
2 2017 10 1 0
2 2017 11 0 0
2 2017 12 0 0
3 2017 1 2 0
3 2017 2 0 0
3 2017 3 0 0
3 2017 4 1 0
3 2017 5 0 1
3 2017 6 0 1
3 2017 7 0 1
3 2017 8 0 1
3 2017 9 0 1
3 2017 10 2 0
3 2017 11 0 0
3 2017 12 0 0
4 2017 1 123 0
4 2017 2 456 0
4 2017 3 76 0
4 2017 4 65 0
4 2017 5 88 0
4 2017 6 90 0
4 2017 7 78 0
4 2017 8 76 0
4 2017 9 65 0
4 2017 10 0 0
4 2017 11 85 0
4 2017 12 79 0
5 2017 1 345 0
5 2017 2 456 0
5 2017 3 125 0
5 2017 4 65 0
5 2017 5 78 0
5 2017 6 100 0
5 2017 7 76 0
5 2017 8 32 0
5 2017 9 15 0
5 2017 10 0 1
5 2017 11 0 1
5 2017 12 0 1
;
run;

Can you help me again? If you have any further questions do not hesitate to ask!

Thank you a lot in advance!!! 

Super User
Posts: 7,809

Re: How to create a dichotomous variable on SAS University Edition

Since we can't "go back" in a current data step, there's usually two solutions: reverse sort the dataset so you can again use the lag() functions, or keep a flag in an extra dataset and remerge that.

I selected the second option:

data have;
input cust_id year month trans_numb;
cards;
1 2017 1 123
1 2017 2 456
1 2017 3 76
1 2017 4 0
1 2017 5 0
1 2017 6 0
1 2017 7 0
1 2017 8 0
1 2017 9 0
1 2017 10 0
1 2017 11 0
1 2017 12 0
2 2017 1 23
2 2017 2 15
2 2017 3 9
2 2017 4 1
2 2017 5 0
2 2017 6 0
2 2017 7 0
2 2017 8 0
2 2017 9 0
2 2017 10 1
2 2017 11 0
2 2017 12 0
3 2017 1 0
3 2017 2 0
3 2017 3 0
3 2017 4 0
3 2017 5 0
3 2017 6 0
3 2017 7 0
3 2017 8 0
3 2017 9 0
3 2017 10 0
3 2017 11 0
3 2017 12 0
4 2017 1 123
4 2017 2 456
4 2017 3 76
4 2017 4 65
4 2017 5 88
4 2017 6 90
4 2017 7 78
4 2017 8 76
4 2017 9 65
4 2017 10 89
4 2017 11 86
4 2017 12 79
5 2017 1 345
5 2017 2 456
5 2017 3 125
5 2017 4 65
5 2017 5 78
5 2017 6 100
5 2017 7 76
5 2017 8 78
5 2017 9 98
5 2017 10 99
5 2017 11 65
5 2017 12 45
6 2017 1 0
6 2017 2 0
6 2017 3 2
6 2017 4 0
6 2017 5 0
6 2017 6 0
6 2017 7 1
6 2017 8 0
6 2017 9 0
6 2017 10 0
6 2017 11 0
6 2017 12 0
7 2017 1 45
7 2017 2 36
7 2017 3 10
7 2017 4 7
7 2017 5 0
7 2017 6 0
7 2017 7 0
7 2017 8 0
7 2017 9 0
7 2017 10 0
7 2017 11 0
7 2017 12 0
;
run;

data want2 start_ctl (keep=cust_id); /* "complex" method */
set have;
by cust_id year month;
retain cust_count old_churn;
if first.cust_id
then cust_count = 1;
else cust_count + 1;
trans1 = lag(trans_numb);
trans2 = lag2(trans_numb);
if cust_count < 3
then churn = 0;
else do;
  if not trans1 and not trans2 and not trans_numb
  then do;
    churn = 1;
    if cust_count = 3 then output start_ctl;
  end;
  else if trans1 and trans2 and trans_numb
  then churn = 0;
  else if old_churn
  then churn = 1;
  else churn = 0;
end;
old_churn = churn;
drop old_churn cust_count trans1 trans2;
output want2;
run;

data want;
merge
  want2 (in=a)
  start_ctl (in=b)
;
by cust_id;
if a;
if first.cust_id
then counter = 1;
else counter + 1;
if b and counter < 3 then churn = 1;
drop counter;
run;

proc print data=want noobs;
run;
---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Super User
Posts: 7,809

Re: How to create a dichotomous variable on SAS University Edition

PS consider supplying some example data in a form like this:

data have;
input cust_id year month trans_numb;
cards;
1 2017 1 123
1 2017 2 456
1 2017 3 76
1 2017 4 0
1 2017 5 0
1 2017 6 0
;
run;
---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 10 replies
  • 332 views
  • 4 likes
  • 2 in conversation