Interpretation of Chi-Square Stat and Variable Worth

Posted 06-20-2021 03:54 PM
(2242 views)

Hi,

I've gone through a few articles and haven't found proper information on how to analyze the chi-square in SAS EM.

Here are my results

Can someone please give a detailed explanation of how this would be analyzed? I have the gist of it but would like to understand it better!

5 REPLIES

Can you briefly describe the problem, and what you did to get this far?

--

Paige Miller

This is just the initial data exploration. I simply imported my file and connected the StatExplore node to the original dataset (https://github.com/washingtonpost/data-police-shootings).

I'm using StatExplore to check for missing values and to look at the chi-square statistics to see the relationship between each input and the target variable. The graphs above were generated as a result.

It is telling you which variables are "good predictors" of the target variable.

If PROB is < 0.05, the predictor is statistically significant (in other words, it has some predictive ability). The bigger the Chi-Square value, the better the predictive value of the variable.
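As a rough illustration of what the Chi-Square and DF columns measure, here is a minimal sketch of Pearson's chi-square statistic for a small table of counts, in plain Python. The counts are made up for illustration (they are not from the police shootings data), and the function name `chi_square` is hypothetical. StatExplore may treat missing values and rare levels differently, so don't expect this to reproduce its numbers exactly.

```python
def chi_square(table):
    """Pearson chi-square statistic and DF for a table of counts.

    `table` is a list of rows: one row per input level,
    one column per target level.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if input and target were unrelated.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected

    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


# Made-up counts: 2 input levels (rows) x 2 target levels (columns).
related = [[30, 10], [10, 30]]
unrelated = [[20, 20], [20, 20]]

print(chi_square(related))    # (20.0, 1)
print(chi_square(unrelated))  # (0.0, 1)
```

With 1 DF the 0.05 critical value is about 3.84, so the first table would come out as significant and the second would not.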

--

Paige Miller

For DF, I can see that some of my variables show large values; for example, city shows 16500. For reference, the race target currently has 7 levels, so is this being taken into account for DF?

When I get the chi-square for another, binary target variable, the DF is significantly smaller.

Since I don't have your data, I don't know. Maybe you should LOOK AT the data with your own eyes and see if you can see a reason why data set 1 has 300 df for variable STATE, while data set 2 has 50 df for variable STATE.
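One thing to check, assuming StatExplore uses the usual chi-square test of independence: its DF is (input levels - 1) * (target levels - 1), so the same input gets a larger DF against a target with more levels. The sketch below just does that arithmetic. The helper name `chi_square_df` is hypothetical, and the 51 levels for STATE (50 states plus DC) are an assumption, not something read from the data.

```python
def chi_square_df(input_levels, target_levels):
    """DF of a chi-square test of independence: (r - 1) * (c - 1)."""
    return (input_levels - 1) * (target_levels - 1)


# STATE with an assumed 51 levels.
print(chi_square_df(51, 7))  # 7-level target -> 300
print(chi_square_df(51, 2))  # binary target  -> 50

# Working backwards: a DF of 16500 against a 7-level target
# implies this many distinct city values.
print(16500 // (7 - 1) + 1)  # 2751
```

If the level counts in your data do not match these figures, missing values counted as a level (or levels dropped by the node) would be the first thing to look at.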

--

Paige Miller
