Varying intercept and slope in regression analyses

06-08-2017 04:16 PM

Hello there,

I'm trying to understand how to write the code for a regression analysis depending on whether the intercept and/or the slope varies across the levels of a grouping variable. Below is a hypothetical dataset with 15 students nested in 3 schools. Whether a student passes an exam is related to family income and to the school the student attends, which is the nesting variable.

| Student_ID | PASS | FAMILY_INCOME | SCHOOL |
|---|---|---|---|
| 1 | 1 | 120000 | sch1 |
| 2 | 1 | 170000 | sch1 |
| 3 | 0 | 90000 | sch1 |
| 4 | 0 | 91500 | sch1 |
| 5 | 0 | 93000 | sch1 |
| 6 | 1 | 180000 | sch1 |
| 7 | 1 | 225000 | sch1 |
| 8 | 1 | 150000 | sch1 |
| 9 | 1 | 90000 | sch2 |
| 10 | 1 | 92000 | sch2 |
| 11 | 1 | 94000 | sch2 |
| 12 | 0 | 78000 | sch2 |
| 13 | 1 | 110000 | sch3 |
| 14 | 0 | 50000 | sch3 |
| 15 | 1 | 140000 | sch3 |

If I want to write a regression analysis with __constant intercept and slope for each school:__

I would:

PROC LOGISTIC DATA = TEST;

MODEL PASS = FAMILY_INCOME;

RUN;

If I want to write a regression analysis with __varying intercept but constant slope for each school:__

I would:

PROC LOGISTIC DATA = TEST;

CLASS SCHOOL (ref="sch1");

MODEL PASS = FAMILY_INCOME SCHOOL;

RUN;

If I want to write a regression analysis with __varying intercept and slope for each school__:

Now, my questions are:

1. Did I get the varying-intercept, constant-slope model in PROC LOGISTIC right?

2. How would I write the PROC LOGISTIC code to accommodate a varying intercept and slope for each school?

Thanks a lot in advance!

Recep

Accepted Solutions

Solution

06-14-2017
01:06 PM

Posted in reply to Recep

06-08-2017 04:20 PM - edited 06-08-2017 04:25 PM

1. Yes.
2. MODEL PASS = FAMILY_INCOME|SCHOOL; (the vertical bar creates the main effects and their interaction, which can also be written as MODEL PASS=FAMILY SCHOOL FAMILY*INCOME; )

--

Paige Miller
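
For readers who want to try the accepted answer end to end, here is a minimal sketch. The DATA step simply retypes the table from the question; the `PARAM=REF` option and the `(event='1')` response option are additions that were not in the thread, chosen on the assumption that the goal is to model the probability of PASS = 1 and to read the SCHOOL terms as differences from sch1.

```sas
/* Sketch only: rebuilds the hypothetical TEST dataset from the question. */
data test;
  input Student_ID PASS FAMILY_INCOME SCHOOL $;
  datalines;
1 1 120000 sch1
2 1 170000 sch1
3 0 90000 sch1
4 0 91500 sch1
5 0 93000 sch1
6 1 180000 sch1
7 1 225000 sch1
8 1 150000 sch1
9 1 90000 sch2
10 1 92000 sch2
11 1 94000 sch2
12 0 78000 sch2
13 1 110000 sch3
14 0 50000 sch3
15 1 140000 sch3
;
run;

/* Varying intercept and varying slope, all as fixed effects in one model. */
proc logistic data=test;
  /* PARAM=REF (an assumption, not from the thread) uses reference-cell coding */
  class school (ref="sch1") / param=ref;
  /* event='1' models the probability that PASS = 1 (an assumption about intent) */
  model pass(event='1') = family_income school family_income*school;
run;
```

With reference coding, each SCHOOL coefficient is an intercept difference from sch1 and each FAMILY_INCOME*SCHOOL coefficient is a slope difference from sch1. On a dataset this small the procedure may warn about separation of the data points, so the estimates from this toy example should not be taken at face value.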

All Replies

Posted in reply to Recep

06-08-2017 04:18 PM

2 - BY statement.

proc logistic ...;

by school hospital;

model .....;

run;

Posted in reply to Reeza

06-08-2017 04:21 PM

Reeza wrote:

2 - BY statement.

proc logistic ...;

by school hospital;

model .....;

run;

No, you wouldn't do this unless you had a very good reason. You want all terms in one model, which gives you a better estimate of the overall variability than you get if you fit separate models with a BY statement.

--

Paige Miller
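
For comparison, the BY-statement approach discussed above would look roughly like the sketch below. It assumes the TEST dataset from the question and uses SCHOOL as the only BY variable; the `(event='1')` option is an added assumption. Each school gets its own completely separate fit, which is why nothing is pooled across schools.

```sas
/* BY-group processing needs the data sorted by the BY variable. */
proc sort data=test;
  by school;
run;

/* Fits one independent logistic model per school (no pooling across schools). */
proc logistic data=test;
  by school;
  model pass(event='1') = family_income;
run;
```

The single-model version with FAMILY_INCOME|SCHOOL estimates the same kind of school-specific intercepts and slopes, but within one model, so the school differences can be tested directly.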

Posted in reply to PaigeMiller

06-08-2017 06:39 PM

I think I may have misunderstood the question. I would definitely trust @PaigeMiller's solution over mine!

Posted in reply to PaigeMiller

06-09-2017 04:49 PM

Thanks a lot for your response Paige!

Did you mean:

MODEL PASS = FAMILY_INCOME SCHOOL FAMILY_INCOME*SCHOOL;

?

Cheers,

Recep

Posted in reply to Recep

06-14-2017 01:11 PM

Yes, that's what I meant.

--

Paige Miller

Posted in reply to Recep

06-09-2017 09:53 AM - edited 06-09-2017 11:30 PM

That will lead you to a generalized linear mixed model. Try GLIMMIX.

PROC GLIMMIX DATA = TEST;

class school;

MODEL PASS = FAMILY_INCOME/dist=binomial;

random intercept FAMILY_INCOME /subject=school;

RUN;

Posted in reply to Ksharp

06-14-2017 01:13 PM - edited 06-14-2017 01:15 PM

Ksharp wrote:

That will lead you to a generalized linear mixed model. Try GLIMMIX.

PROC GLIMMIX DATA = TEST;

class school;

MODEL PASS = FAMILY_INCOME/dist=binomial;

random intercept FAMILY_INCOME /subject=school;

RUN;

I don't see this as answering the original question. My interpretation is that the original question was not asking for RANDOM effects to be included in the model. If you make something a RANDOM effect, then you won't get an estimate of the slopes, and you won't get estimates of the intercepts.

I would not have a problem with this as a solution to the problem:

proc glimmix data=test;

class school;

model pass=family_income|school/dist=binomial;

run;

--

Paige Miller
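
A slightly expanded sketch of the fixed-effects GLIMMIX call above, assuming the same TEST dataset. The `(event='1')`, `LINK=LOGIT`, and `SOLUTION` pieces are additions not shown in the thread: `SOLUTION` asks GLIMMIX to print the fixed-effect parameter estimates, which is where the school-specific intercept and slope terms would appear.

```sas
proc glimmix data=test;
  class school;
  /* No RANDOM statement: SCHOOL and its interaction are fixed effects. */
  /* SOLUTION requests the table of fixed-effect parameter estimates.   */
  model pass(event='1') = family_income|school
        / dist=binomial link=logit solution;
run;
```

GLIMMIX sets the last level of a CLASS variable to zero by default, so in this sketch the SCHOOL terms would be read as differences from sch3 instead of sch1.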

Posted in reply to PaigeMiller

06-14-2017 02:47 PM

This discussion points to the root of my original problem: I think I'm having a hard time understanding the difference between a "varying slope" and "random effects" for a particular variable (the school in my example). So, Paige, would you say the LOGISTIC regression you proposed still treats SCHOOL as a fixed effect?

@Ksharp, thanks for your response as well! The code you provided did not work for my data (maybe just because my made-up data was not big or versatile enough for GLIMMIX), but how would you differentiate a "varying slope" from "random effects"?

Thanks a lot in advance to both of you...

Recep

Posted in reply to Recep

06-14-2017 03:19 PM - edited 06-14-2017 03:25 PM

In statistical terminology:

If you are only interested in these specific schools, and you want to estimate a different slope and estimate a different intercept for each school, these are FIXED effects.

If you are interested in the entire population of schools and you have randomly selected these schools and you want to know the variability of the intercept across schools, or the variability of slopes across schools (in other words, a standard deviation of the intercepts or a standard deviation of the slopes), then you have a RANDOM effect.

The SAS procedure GLIMMIX will not estimate the slopes for you if you put SCHOOL in a RANDOM statement; it will give you a variance of the slopes.

As I understand your original question, and the models which you carefully wrote out, it seems to me the slopes and intercepts are FIXED effects, and thus a RANDOM statement is not needed here (in fact, in my opinion, a RANDOM statement is incorrect in this situation).

--

Paige Miller
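
The distinction drawn above can be written out as two model equations. This is one common notation, not something taken from the thread: $i$ indexes students, $j$ indexes schools, and the symbols $\alpha_j$, $\gamma_j$, $u_{0j}$, $u_{1j}$ are labels introduced here for illustration. The random-effects version assumes independent normal deviations, which matches a RANDOM statement with separate variance components; a covariance between the two could also be modeled.

```latex
% Fixed effects: one intercept shift and one slope shift per school j,
% each an unknown constant that is estimated directly
\operatorname{logit}\Pr(\text{PASS}_{ij}=1)
  = \beta_0 + \alpha_j + (\beta_1 + \gamma_j)\,\text{INCOME}_{ij},
\qquad \alpha_j,\ \gamma_j \text{ unknown constants}

% Random effects: school-level deviations are draws from a distribution,
% and the quantities estimated are the variances
\operatorname{logit}\Pr(\text{PASS}_{ij}=1)
  = (\beta_0 + u_{0j}) + (\beta_1 + u_{1j})\,\text{INCOME}_{ij},
\qquad u_{0j}\sim N(0,\sigma_0^2),\quad u_{1j}\sim N(0,\sigma_1^2)
```

In the first form the output contains an estimate for each school's $\alpha_j$ and $\gamma_j$ (relative to a reference school); in the second the focus is on $\sigma_0^2$ and $\sigma_1^2$, the variability of intercepts and slopes across the population of schools.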
