## Simulating logistic regression data with correlated observations

Regular Contributor
Posts: 182

Hello all

I wish to simulate data from a logistic regression model, with the following elements:

1. A binary response (1 = Treatment worked, 0 = Treatment didn't work)

2. A binary independent variable (1 = Treatment, 0 = Control)

3. For each subject in the sample, there are 2 observations, which are assumed to be correlated.

The purpose is to do a power analysis for a given sample size. Before I attempt the power analysis, though, I need to work out how to simulate a single data set of this kind.

I have a copy of Rick Wicklin's book on simulation, and I saw the code he wrote for logistic regression. The independent variables there are continuous and there is no correlation (no clusters); the clusters come later, with a normal dependent variable. I am not sure how to merge the two examples.

One more comment: I would prefer to do it using the DATA step rather than IML, if possible.

Any assistance will be much appreciated!

(When helping, you can make up any correlation and proportion you like; I can always change them later.)

Super User
Posts: 10,787

## Re: Simulating logistic regression data with correlated observations

1) Why not post it in the IML forum, since you have already mentioned him? I think he can also do it with the DATA step rather than IML.

2) I am curious: what are your dependent variables? Only a binary response variable? That could not be possible.

3) What do you mean by 'correlated'? Each subject has two observations; maybe one is for P and the other is for 1-P.

4) What do you mean by 'clusters'? Do you mean using the STRATA statement in PROC LOGISTIC? You can also use a CLASS variable instead of a STRATA variable.

Regular Contributor
Posts: 182

## Re: Simulating logistic regression data with correlated observations

1) I didn't, because posting a non-IML question in the IML forum would violate forum rules :-)

2) My response is binary (1/0)

3) By correlated I mean, for example, ear drops. They are applied to both ears of each patient, but the two ears of the same patient are assumed to be correlated (or can't be assumed to be independent). The patient is then a cluster.
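
One common way to generate this kind of within-patient correlation is a random-intercept logistic model: each patient gets a normally distributed effect that is added to the linear predictor for both ears. The following is a minimal sketch of that idea, not a definitive implementation. It is written in Python purely for illustration; the same nested-loop structure (patient loop, ear loop, one Bernoulli draw per ear) carries over to a DATA step. Every name and number here is an assumption made up for the example: `simulate_clustered_binary`, the coefficients `beta0` and `beta1`, the random-effect standard deviation `sigma_u`, the 50/50 allocation, and the choice to assign treatment per patient so that both ears share it.

```python
import math
import random


def simulate_clustered_binary(n_patients=200, beta0=-0.5, beta1=1.0,
                              sigma_u=1.0, seed=12345):
    """Simulate one data set: two correlated binary outcomes per patient.

    All default values are made-up placeholders, not taken from the thread.
    """
    rng = random.Random(seed)
    rows = []
    for patient in range(1, n_patients + 1):
        # Assumption: treatment is assigned per patient (both ears share it).
        treatment = 1 if rng.random() < 0.5 else 0
        # Shared patient effect; this is what correlates the two ears.
        u = rng.gauss(0.0, sigma_u)
        for ear in (1, 2):
            eta = beta0 + beta1 * treatment + u   # linear predictor
            p = 1.0 / (1.0 + math.exp(-eta))      # inverse logit
            y = 1 if rng.random() < p else 0      # Bernoulli draw
            rows.append({"patient": patient, "ear": ear,
                         "treatment": treatment, "y": y})
    return rows


rows = simulate_clustered_binary()
print(len(rows), "rows; first row:", rows[0])
```

A larger `sigma_u` gives a stronger correlation between the two ears, and `sigma_u = 0` gives independent observations. Note that `beta1` is the patient-specific (conditional) log odds ratio; the population-averaged effect that a marginal analysis estimates is closer to zero when `sigma_u > 0`.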

SAS Super FREQ
Posts: 4,245

## Re: Simulating logistic regression data with correlated observations

I think the post is appropriate here, although the Statistical Procedures Community also gets lots of questions like this.

You don't need to cross-post to get a particular person to see the post; just use the "at sign" (@) to "mention" them in the post. For example, I can get a user to see a post by typing '@xia', which brings up a menu that prompts me to select his name.

For clarity, let's continue this discussion at https://communities.sas.com/message/286841#286841

Super User
Posts: 10,787

## Re: Simulating logistic regression data with correlated observations

Sorry. What are your independent variables (X variables)? Only a binary independent variable? That could not be possible.

From your description, I think this may be a PROC SURVEYLOGISTIC question?

Simulating data is definitely a good topic for @Rick Wicklin.
