Hi,
I've gone through a few articles and haven't found proper information on how to interpret the chi-square output in SAS EM.
Here are my results
Can someone please give a detailed explanation of how this would be analyzed? I have the gist of it but would like to understand it better.
Can you briefly describe the problem, and what you did to get this far?
This is just the initial data exploration. I simply imported my file and connected the StatExplore node to the original dataset (https://github.com/washingtonpost/data-police-shootings).
I'm using StatExplore to check for missing values and to look at the chi-square to see the relationship between each input variable and the target variable. That is how the graphs above were generated.
It is telling you which variables are "good predictors" of the target variable.
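If the PROB is < 0.05, then the association between the predictor and the target is statistically significant (in other words, the predictor has some predictive ability). A bigger Chi-Square value generally means a stronger association, but the values are only directly comparable between variables with the same DF, because the Chi-Square value tends to grow as the DF grows.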
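To make the Chi-Square and DF columns less of a black box, here is a minimal Python sketch of the standard Pearson chi-square calculation on a small cross-tabulation. The counts are made up for illustration and are not from your data, the function name `pearson_chi_square` is hypothetical, and I am assuming StatExplore reports the usual Pearson statistic.

```python
def pearson_chi_square(table):
    """Return (chi_square, df) for a table of observed counts.

    table: list of rows; each row holds the counts for one level of the
    input variable, with one column per level of the target variable.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    chi_square = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if input and target were independent
            expected = row_totals[i] * col_totals[j] / n
            chi_square += (observed - expected) ** 2 / expected

    # DF depends only on the number of levels, not on the counts
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi_square, df


# Hypothetical counts: 2 input levels (rows) by 2 target levels (columns)
observed = [[30, 10],
            [20, 40]]
chi_square, df = pearson_chi_square(observed)

# 3.841 is the 5% critical value of the chi-square distribution with 1 DF
significant_at_5_percent = df == 1 and chi_square > 3.841
print(round(chi_square, 3), df, significant_at_5_percent)
```

The further the observed counts are from the counts expected under independence, the larger the statistic and the smaller the PROB.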
For DF, I can see that some of my variables show large values; for example, city shows 16500. For reference, the race target currently has 7 levels, so is this being taken into account for DF?
When I get the chi-square for another binary target variable, the df is significantly smaller.
Since I don't have your data, I don't know. Maybe you should LOOK AT the data with your own eyes and see if you can see a reason why data set 1 has 300 df for variable STATE, while data set 2 has 50 df for variable STATE.
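For what it is worth, the numbers mentioned in this thread are consistent with the standard formula DF = (input levels - 1) x (target levels - 1). The sketch below just does that arithmetic. It assumes StatExplore uses the standard formula, the helper names are hypothetical, and how missing values are counted as levels may differ in your run, so check the level counts in your own data.

```python
def chi_square_df(input_levels, target_levels):
    """Degrees of freedom for a chi-square test of independence."""
    return (input_levels - 1) * (target_levels - 1)


def implied_input_levels(df, target_levels):
    """Work backwards from a reported DF to the number of input levels."""
    quotient, remainder = divmod(df, target_levels - 1)
    if remainder:
        raise ValueError("DF is not consistent with this number of target levels")
    return quotient + 1


# City against the 7-level race target: DF of 16500 implies 2751 distinct cities
city_levels = implied_input_levels(16500, 7)

# The same city variable against a binary target gives a much smaller DF
city_df_binary = chi_square_df(city_levels, 2)

# The STATE example: 300 DF with 7 target levels, 50 DF with a binary target
state_levels = implied_input_levels(300, 7)
state_df_binary = chi_square_df(state_levels, 2)

print(city_levels, city_df_binary, state_levels, state_df_binary)
```

So the number of target levels is taken into account: with the same input variable, going from a 7-level target to a binary target divides the DF by 6.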