What are some predictive models I can do with these target variables? (Race and M.I)

b_smsha · Posted 06-06-2021 03:46 PM

Hi Everyone,

Over the past few days I have gone through a lot of questions and answers here, and they helped me greatly. But I would like to ask a few more questions.

My target variables are Race and Mental Illness.

I've merged the dataset with certain factors such as county, and preprocessed some information such as region, day, month and year in SAS Studio.

At the moment, in SAS Enterprise Miner, I've built a decision tree for each target variable, and I need one more model.

The regression model for M.I comes out fine (I haven't analyzed it yet, but it runs properly). However, when I produce one for Race, the output looks weird and just doesn't look right. I have also gone through countless searches to find out whether there is a way to model Race using logistic regression, but it seems impossible unless the target has 2 or 3 levels.

Can someone please suggest an example of a model I can create in SAS EM that works for both target variables? Please and thank you!

PaigeMiller · Posted 06-06-2021 03:54 PM

@b_smsha wrote:

However, when I produce one for Race, the output looks weird and just doesn't look right. I have also gone through countless searches to find out whether there is a way to model Race using logistic regression, but it seems impossible unless the target has 2 or 3 levels.

Logistic regression is not limited to three levels of the target variable. See https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.4/statug/statug_logistic_examples03.htm. Maybe E-Miner has such a limitation, I don't know.
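
To illustrate the point that logistic regression extends to a target with more than two levels, here is a minimal sketch of a multinomial (softmax) logistic regression in plain Python. It is not SAS or E-Miner code, and it is not meant to reproduce what PROC LOGISTIC does internally. The four-level target, the two inputs, the data points, and all function names are made up for illustration.

```python
import math

def softmax(scores):
    """Turn one score per target level into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fit_multinomial_logistic(X, y, n_classes, lr=0.5, epochs=2000):
    """Fit one intercept-plus-weights vector per target level by
    batch gradient descent on the multinomial log-loss."""
    n_features = len(X[0])
    W = [[0.0] * (n_features + 1) for _ in range(n_classes)]
    for _ in range(epochs):
        grad = [[0.0] * (n_features + 1) for _ in range(n_classes)]
        for xi, yi in zip(X, y):
            row = [1.0] + list(xi)  # leading 1.0 is the intercept term
            probs = softmax([sum(w * v for w, v in zip(W[k], row))
                             for k in range(n_classes)])
            for k in range(n_classes):
                err = probs[k] - (1.0 if k == yi else 0.0)
                for j, v in enumerate(row):
                    grad[k][j] += err * v
        for k in range(n_classes):
            for j in range(n_features + 1):
                W[k][j] -= lr * grad[k][j] / len(X)
    return W

def predict_proba(W, xi):
    """Predicted probability of each target level for one observation."""
    row = [1.0] + list(xi)
    return softmax([sum(w * v for w, v in zip(wk, row)) for wk in W])

def predict(W, xi):
    """Index of the most probable target level."""
    probs = predict_proba(W, xi)
    return probs.index(max(probs))

# Hypothetical toy data: two numeric inputs, a nominal target with 4 levels.
X = [
    (0.0, 0.1), (0.1, 0.0), (0.1, 0.1),  # level 0
    (0.9, 0.0), (1.0, 0.1), (0.9, 0.1),  # level 1
    (0.0, 0.9), (0.1, 1.0), (0.1, 0.9),  # level 2
    (0.9, 1.0), (1.0, 0.9), (0.9, 0.9),  # level 3
]
y = [0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3]

W = fit_multinomial_logistic(X, y, n_classes=4)
print([predict(W, xi) for xi in X])
```

Nothing in the method caps the number of levels: each level simply gets its own set of coefficients. In PROC LOGISTIC, a nominal target with more than two levels is requested with the LINK=GLOGIT option on the MODEL statement.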

Can someone please suggest an example of a model I can create in SAS EM that works for both target variables?

I'm afraid this question isn't clear to me. Do you want the same model method (logistic, decision tree, neural network, etc.) to work on both target variables? Or do you want the same model fit to apply to both target variables?

--
Paige Miller

b_smsha · Posted 06-06-2021 05:20 PM

By "limited in the levels of the target variable", I mean that if the target is nominal and has more than 2 levels, SAS Enterprise Miner will not do multinomial regression. According to SAS, it says this:

But in his book "Predictive Modeling with SAS Enterprise Miner", Mr. Kattamuri says this:

As for your second question:

At the moment I've created a decision tree for each target variable, like this:

So I believe I mean that I want to be able to do the same model for each target variable.

PaigeMiller · Posted 06-06-2021 06:03 PM

That's very unfortunate if E-Miner does not allow more than two levels of a categorical target variable. Especially since PROC LOGISTIC does allow more than two levels. Both are SAS products, but one has a limitation.

I find there are a number of shortcomings in E-Miner, which to me don't seem to have an obvious rationale. The other big shortcoming in the E-Miner modeling is that there are no features to handle problems with multiple Y variables, even though other SAS software such as JMP and PROC GLM do allow this.

Both of these limitations prevent certain real-world situations from being properly analyzed by E-Miner.

--
Paige Miller

ballardw · Posted 06-06-2021 05:13 PM

You say:

" when I produce one for Race, it looks weird and just doesn't look right".

Why doesn't it look right? Some code (generated or otherwise) might help.

If you try to predict race from other variables you might be looking at the equivalent of trying to predict package color from the contents of a package.

I can see a model using race as an independent variable, in which case "looking right" can depend a lot on how well other data as well as "race" is collected and used. Race should almost never be a dependent variable.

b_smsha · Posted 06-06-2021 05:28 PM

Hi,

Yes, here I have listed the code and the property settings I used for my regression node.

I've included the last part of the analysis of maximum likelihood and the summary. It is turning out somewhat like this.

The reason Race is one of the target variables is that I'm doing a report analyzing how mental illness and race affect police shootings. I'm still a student and am working under a supervisor; however, I'm mostly learning or finding everything on my own, like supervising myself... 😕

PaigeMiller · Posted 06-06-2021 06:13 PM

@ballardw wrote:

You say:

" when I produce one for Race, it looks weird and just doesn't look right".

Why doesn't it look right? Some code (generated or otherwise) might help.

If you try to predict race from other variables you might be looking at the equivalent of trying to predict package color from the contents of a package.

I can see a model using race as an independent variable, in which case "looking right" can depend a lot on how well other data as well as "race" is collected and used. Race should almost never be a dependent variable.

Actually, I don't find this to be a problem at all. For example, suppose you have found an actual skeleton and, by measuring the bones, you want to determine gender, or age, or race. One famous case: bones found on a South Pacific island in 1940, near the known flight path of Amelia Earhart, were determined to likely be from a female of European descent of approximately the same height and age as Earhart, and there are no other known females of European descent who were lost in this area of the South Pacific.

These are all real-world problems that use discriminant analysis (PROC DISCRIM) to determine a model which can be used on skeletons found in the future (or past). And of course, the problem isn't really limited to skeletons. Whether it makes sense to do a logistic regression or decision tree or discriminant analysis in the EXACT situation that @b_smsha faces, well I don't know, but I don't have a problem with the concept.

Your example of predicting the color of a package by knowing the contents is somewhat spurious because the color of the package is likely uncorrelated with the contents. The race of a skeleton may be (I don't know, I'm not an anthropologist) correlated with the physical dimensions of a skeleton.

--
Paige Miller
