juanvg1972
Pyrite | Level 9
 

Hi,

 

I am using PROC FREQ to find out whether there is a dependency between two variables in a SAS dataset.

This is my proc:
 
 
PROC FREQ DATA = tabsas.customer_churn_telco4;
  TABLES rango_tenure * Churn / CHISQ;
RUN;
 
I attached the output of the freq. Looking at it, I see a dependency between the 'Churn' and 'rango_tenure' variables: the distribution of 'Churn' changes across the groups of 'rango_tenure' values.

The p-value is below 0.05, so can I say that there is a dependency?

The value of the chi-square statistic is high. Can I say that the bigger the value, the stronger the dependency?

If I test more variables against 'Churn' in the same way, does the variable with the biggest chi-square value have the strongest dependency (that is, is it the most important variable)?
 
Thanks in advance
4 REPLIES
Reeza
Super User
Churn is usually measured with a survival-type analysis, but it's possible you have restrictions that make this impossible. In that case, I think the Cochran-Armitage test, which looks for a trend, would still be a more appropriate test to use. A chi-square test just tells you that the distribution of the data differs at some levels, not that there's necessarily any trend.

There's an example in the documentation that is similar to this.

However, I will come back to the survival aspect of churn. If you don't have event data for a portion of your population, it makes sense to use survival-type analysis, especially if they can leave at some point later on and basically just have not yet.
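
As a minimal sketch of the Cochran-Armitage suggestion above: PROC FREQ can request this trend test with the TREND option on the TABLES statement. The dataset and variable names are taken from the question. The sketch assumes that 'Churn' has exactly two levels (the test needs a 2 x C or R x 2 table) and that the levels of 'rango_tenure' sort in their natural tenure order; neither assumption is confirmed in the thread.

```sas
/* Sketch only: the same table as in the question, with TREND added.     */
/* Assumption: Churn has exactly two levels (e.g. Yes/No).               */
/* Assumption: the rango_tenure levels sort in increasing tenure order.  */
proc freq data=tabsas.customer_churn_telco4;
   tables rango_tenure * Churn / chisq trend;
run;
```

The trend test is reported in its own table, separate from the chi-square results. If the 'rango_tenure' levels do not sort in tenure order by default, the ORDER= option of PROC FREQ or a format on the variable would be needed so that the trend is tested across the groups in the intended order.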
juanvg1972
Pyrite | Level 9

Thanks Reeza,

 

I will consider survival analysis in the future, but for now I am interested in understanding the chi-square test in PROC FREQ.

Can you help me with my questions about the chi-square results?

 

Thanks

juanvg1972
Pyrite | Level 9

I think 'survival analysis' is not my solution, because I am trying to get the likelihood of churn (not the years until churn) based on input variables. I am considering logistic regression or decision trees, so the dependencies and the results of the chi-square test are very important for me.
