Toni2
Lapis Lazuli | Level 10

Hi, I want to estimate the Population Stability Index (PSI) for numeric variables.

I have seen some people estimate PSI separately for continuous and discrete variables. However, I am not sure why they use different estimation approaches, given that there is only one PSI formula.

My question is: do we need to use a different approach when estimating PSI for continuous versus discrete variables? If yes, why?

sbxkoenk
SAS Super FREQ

Hello,

As far as I know: no!

But you must discretise (bin) continuous variables first.
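
For reference, here is the usual way the PSI is written. The symbols are my own notation and are not taken from the paper linked in this reply: $e_i$ is the share of the baseline sample falling in bin $i$, $a_i$ is the share of the comparison sample in the same bin, and $k$ is the number of bins.

$$\mathrm{PSI} = \sum_{i=1}^{k} (a_i - e_i)\,\ln\frac{a_i}{e_i}$$

Because the sum runs over bins, a continuous variable needs to be cut into bins before the formula can be applied, whereas the distinct values of a discrete variable can serve as the bins directly.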

 

See here:

Examining Distributional Shifts by Using Population Stability Index (PSI) for Model Validation and Diagnosis
Alec Zhixiao Lin, LoanDepot, Foothill Ranch, CA
https://www.lexjansen.com/wuss/2017/47_Final_Paper_PDF.pdf

and here:

[Image: sbxkoenk_0-1651504868991.png]

Koen

jorgecarballo93
Calcite | Level 5

Hi!

Thank you for your question about estimating the Population Stability Index (PSI) for numeric variables. You are correct that the PSI formula is the same, but the preparation step differs slightly between continuous and discrete variables.

For continuous variables, you typically need to bin the data first, meaning you group the continuous values into intervals or "bins". The same bin boundaries are then applied to both periods or datasets so that their distributions can be compared bin by bin.

For discrete or categorical variables, the values are already in distinct categories, so the PSI calculation can be applied directly to these categories without binning.

The reason for the different approaches lies in the nature of the data. PSI compares the share of observations that falls into each bin, so a continuous variable must first be cut into bins, while the distinct values of a discrete or categorical variable already act as the bins.
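
To make the difference concrete, here is a minimal sketch in plain Python (not SAS, and not the code from the article mentioned in this reply). All function names, the choice of approximate quantile bins taken from the baseline sample, the default of 10 bins, and the small floor `eps` used for empty cells are assumptions made for illustration only. The point is that both paths end in the same formula and only the counting step differs.

```python
import math
from bisect import bisect_right
from collections import Counter


def psi_from_counts(expected, actual, eps=1e-6):
    """PSI from two mappings of bin/category label -> count."""
    labels = set(expected) | set(actual)
    n_exp = sum(expected.values())
    n_act = sum(actual.values())
    total = 0.0
    for label in labels:
        # Floor empty cells at eps so the logarithm stays defined
        # (an illustrative choice, not a standard).
        e = max(expected.get(label, 0) / n_exp, eps)
        a = max(actual.get(label, 0) / n_act, eps)
        total += (a - e) * math.log(a / e)
    return total


def quantile_edges(values, n_bins=10):
    """Interior cut points: approximate quantiles of the baseline sample."""
    s = sorted(values)
    return sorted({s[(len(s) * k) // n_bins] for k in range(1, n_bins)})


def bin_counts(values, edges):
    """Map each value to a bin index using the given edges, then count."""
    return Counter(bisect_right(edges, v) for v in values)


def psi_continuous(baseline, current, n_bins=10):
    # Continuous case: derive bins from the baseline, apply them to both samples.
    edges = quantile_edges(baseline, n_bins)
    return psi_from_counts(bin_counts(baseline, edges), bin_counts(current, edges))


def psi_discrete(baseline, current):
    # Discrete case: the distinct values themselves are the bins.
    return psi_from_counts(Counter(baseline), Counter(current))
```

With two identical samples both functions return 0, and the value grows as the two distributions move apart. A discrete numeric variable with very many distinct values could also be passed through `psi_continuous` if counting every value separately gives bins that are too sparse.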

If you want to dive deeper into this topic and see worked examples for both numeric and categorical variables in SAS, I invite you to read my article titled "Implementación en SAS del Population Stability Index (PSI)" on my website SAS desde Cero. The article provides a step-by-step guide on how to calculate PSI for both types of variables using SAS.

You can read the full article here: https://www.sasdesdecero.com/implementacion-en-sas-del-population-stability-index/

Please note that the article is in Spanish.

 

I hope you find this information helpful. Good luck with your analysis!

 

Best regards,

Jorge Carballo
