Hi all,

I have a dataset with a binary outcome (prescription of drug A, yes/no). The dataset is at the patient level, meaning each row is a unique patient. We wanted to consider how neighborhoods could affect the use of drug A, so we merged our data by census tract with neighborhood-level factors (the proportion in the census tract living below the poverty level and the proportion with a high school degree). The dataset is now set up so that patients in the same census tract have the same neighborhood-level values (see the dataset below).

I want to run a logistic regression to predict the use of drug A, but I would like to account for the repeated values that result from the census tract. How do I do this?

PatientID  age  sex     diabetes  arthritis  Drug A  census_tract  prop_below_poverty  prop_with_highschool
1          47   male    0         1          1       47157002000   15.0                47.0
2          51   female  0         0          1       47157002000   15.0                47.0
3          34   female  1         1          0       47157002000   15.0                47.0
4          65   male    1         0          0       47157008500   8.6                 75.0
5          27   male    1         0          1       47157008500   8.6                 75.0
6          34   male    0         0          0       47157008500   8.6                 75.0
7          70   female  1         1          1       47157008500   8.6                 75.0
8          62   male    1         0          1       47157021136   12.1                62.0

Drug A = dependent variable/outcome (determined at the patient level)
diabetes (0 = no diabetes, 1 = has diabetes)
arthritis (0 = no arthritis, 1 = has arthritis)
prop_below_poverty and prop_with_highschool (continuous variables calculated as percentages)

Thank you.
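To make the nesting concrete, here is a minimal sketch of the structure described above, using only the eight example rows from the post. Python is an assumption on my part (the post does not name any software), `drug_a` is just a code-friendly renaming of the "Drug A" column, and the helper names `group_by_tract` and `tract_values_constant` are hypothetical. It only shows how patients cluster within census tracts; it does not fit a model.

```python
from collections import defaultdict

# Column names follow the table in the post; "Drug A" is renamed to drug_a
# here purely for convenience (an assumption, not part of the original data).
COLUMNS = [
    "PatientID", "age", "sex", "diabetes", "arthritis", "drug_a",
    "census_tract", "prop_below_poverty", "prop_with_highschool",
]

# The eight example rows, copied from the post.
ROWS = [
    (1, 47, "male",   0, 1, 1, "47157002000", 15.0, 47.0),
    (2, 51, "female", 0, 0, 1, "47157002000", 15.0, 47.0),
    (3, 34, "female", 1, 1, 0, "47157002000", 15.0, 47.0),
    (4, 65, "male",   1, 0, 0, "47157008500", 8.6, 75.0),
    (5, 27, "male",   1, 0, 1, "47157008500", 8.6, 75.0),
    (6, 34, "male",   0, 0, 0, "47157008500", 8.6, 75.0),
    (7, 70, "female", 1, 1, 1, "47157008500", 8.6, 75.0),
    (8, 62, "male",   1, 0, 1, "47157021136", 12.1, 62.0),
]

patients = [dict(zip(COLUMNS, row)) for row in ROWS]


def group_by_tract(records):
    """Collect patient records into clusters keyed by census tract."""
    clusters = defaultdict(list)
    for rec in records:
        clusters[rec["census_tract"]].append(rec)
    return dict(clusters)


def tract_values_constant(clusters):
    """Return True if both neighborhood variables are identical within every tract."""
    for members in clusters.values():
        values = {
            (m["prop_below_poverty"], m["prop_with_highschool"]) for m in members
        }
        if len(values) != 1:
            return False
    return True


clusters = group_by_tract(patients)
cluster_sizes = {tract: len(members) for tract, members in clusters.items()}

print(cluster_sizes)                    # number of patients per census tract
print(tract_values_constant(clusters))  # neighborhood values repeat within tract
```

In this layout `census_tract` is the grouping variable: the outcome and the patient covariates vary from row to row, while the two neighborhood variables vary only between tracts. That identifier is what a clustered analysis would need to be told about.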