Hello all,
I have a lot of questions, and any help would be greatly appreciated.
I have a project to perform a data analysis on data from the internet. I chose the YRBS 2021 data. I imported it into SAS and applied their formats along with some of my own. I have both SAS coding questions and statistical analysis questions.
The YRBS data also include a binary version of each question, and those are essentially all I'm using. I'm using these:
Variable | Type | Len | Format | Label |
Mentalhealth | Char | 8 | | Poor Mental Health |
Q1 | Char | 1 | $AGEF. | How old are you |
Q2 | Char | 1 | $GENDERF. | What is your sex |
QN18 | Num | 8 | VIOLENCEF. | Ever saw someone get physically attacked, beaten, stabbed, or shot in their neighborhood |
QN69 | Num | 8 | NOFRUITF. | Did not eat fruit |
QN85 | Num | 8 | MENTALHF. | Reported that their mental health was most of the time or always not good |
QN86 | Num | 8 | SLEEPF. | Got 8 or more hours of sleep |
QNILLICT | Num | 8 | DRUGSF. | Ever used select illicit drugs |
QNOBESE | Num | 8 | OBESEF. | Had obesity |
QNPA0DAY | Num | 8 | NOPAF. | Did not participate in at least 60 minutes of physical activity on at least 1 day |
QNVEG0 | Num | 8 | NOVEGF. | Did not eat vegetables |
RACEETH | Char | 2 | $RACEF. | Race/Ethnicity |
Mentalhealth is the only variable I created myself; it is created from QN85. It is my dependent variable, which I want to be binary. Something odd is that the CDC coded all the binary variables as numeric in SAS. Is that right?
I was under the impression that all of these variables would be character variables, since they are categorical.
-In addition, I am using binary logistic regression. Is that right?
-Do I need to recode all these variables to be character variables?
-When I try to add a format to my new variable Mentalhealth, it doesn't stick and the whole column goes blank, but all my other formats work.
-What do I use to detect outliers since it is all qualitative data?
-How do I perform summary statistics on all qualitative data?
I would appreciate any help. I'm totally lost. We've only gone over quantitative variables in class, and I'm not sure how we are expected to know how to do this. I've been researching online for a week now.
Thanks again!
Mentalhealth is the only variable I created myself; it is created from QN85. It is my dependent variable, which I want to be binary. Something odd is that the CDC coded all the binary variables as numeric in SAS. Is that right?
I was under the impression that all of these variables would be character variables, since they are categorical.
Apparently, most of them are numeric. If you created the response variable from QN85, then QN85 should not be in the model.
-In addition, I am using binary logistic regression. Is that right? Yes.
-Do I need to recode all these variables to be character variables? Not necessary. If they truly are categories but are coded as numbers, you can use a CLASS statement in PROC LOGISTIC to handle them as categories; there is no need to convert them to character.
-When I try to add a format to my new variable Mentalhealth, it doesn't stick and the whole column goes blank, but all my other formats work. I don't understand the question.
-What do I use to detect outliers since it is all qualitative data? If the independent and dependent variables are all categories, there are no outliers.
-How do I perform summary statistics on all qualitative data? Look at the proportion reporting poor mental health at each level of each qualitative variable. You could use PROC FREQ.
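As a rough illustration of the PROC FREQ and CLASS statement suggestions above, here is a minimal sketch. The data set name work.yrbs2021 is hypothetical (it is not given in this thread), and the code assumes the QN* variables are coded 1 = yes and 2 = no, which should be checked against the YRBS documentation. It uses QN85 directly as the response, so no variable derived from QN85 appears among the predictors.

```sas
/* Hypothetical data set name: work.yrbs2021 (not stated in the thread).  */
/* Assumption: numeric QN* variables are coded 1 = yes, 2 = no.           */

/* Summary statistics for categorical data: counts and percentages.       */
proc freq data=work.yrbs2021;
    /* One-way frequency tables; MISSING shows missing values as a level  */
    tables QN85 QN18 Q1 Q2 RACEETH QNOBESE QN86 QNILLICT
           QN69 QNVEG0 QNPA0DAY / missing;
    /* Two-way table: row percentages give the distribution of QN85       */
    /* within each level of QN18; CHISQ adds a chi-square test            */
    tables QN18*QN85 / nocol nopercent chisq;
run;

/* Binary logistic regression with numeric codes treated as categories.   */
proc logistic data=work.yrbs2021;
    /* A FORMAT statement with no format name removes the formats for     */
    /* this step only, so EVENT= can refer to the raw code                 */
    format QN85 QN18 QNOBESE QN86 QNILLICT QN69 QNVEG0 QNPA0DAY;
    /* PARAM=REF uses reference-cell coding; the reference level defaults */
    /* to the last ordered level (code 2 under the assumed coding)        */
    class QN18 Q1 Q2 RACEETH QNOBESE QN86 QNILLICT
          QN69 QNVEG0 QNPA0DAY / param=ref;
    /* Model the probability that QN85 = 1 (the assumed "yes" code)       */
    model QN85(event='1') = QN18 Q1 Q2 RACEETH QNOBESE QN86 QNILLICT
                            QN69 QNVEG0 QNPA0DAY;
run;
```

The character variables (Q1, Q2, RACEETH) keep their formats here, so their levels are reported by formatted value and the reference level is the last formatted level.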
I added the listing of variables above and forgot to include my hypothesis statement and the variable roles. (I can add the code I have so far if needed.)
H0: βi = 0: Experiencing violence is irrelevant in predicting poor mental health in high school students in grades 9-12 in the United States
Role | Variable | Type |
Dependent (outcome/response) variable Y | Mental illness | Dichotomous (binary) |
Independent (predictor) variable X | Experienced violence | Dichotomous (binary) |
Covariate | Age | Categorical |
Covariate | Race | Categorical |
Covariate | Sex | Dichotomous |
Covariate | Obesity | Dichotomous |
Covariate | Less than 7-8h sleep | Dichotomous |
Covariate | Use drugs | Dichotomous |
Covariate | Eats fruits & vegetables | Dichotomous |
Covariate | Get physical activity | Dichotomous |
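Written out, the hypothesis and variable roles above correspond to a logistic model of roughly the following form. The symbols are introduced here for illustration only: p is the probability of poor mental health, x_viol is the indicator for experienced violence, the z_j are the indicators for the covariate levels, and beta_1 plays the role of the βi in the hypothesis.

```latex
\operatorname{logit}(p) = \ln\frac{p}{1-p}
  = \beta_0 + \beta_1 x_{\text{viol}} + \sum_{j} \gamma_j z_j,
\qquad
H_0\colon \beta_1 = 0 \quad \text{vs.} \quad H_a\colon \beta_1 \neq 0
```

Under H0, the odds of poor mental health are the same with and without experienced violence once the covariates are held fixed, since exp(beta_1) is the adjusted odds ratio.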
Yes, I had not removed the QN85 variable yet. Thank you. I guess I will remove the new variable I created. I thought it might cause problems in SAS if it stayed numeric. I guess I didn't understand how the categories were coded. Having done that, I no longer need help with the format, because the format worked on QN85.
That is what I thought about the outliers, but I wasn't positive. Thank you.
This is very helpful. Thank you very much.