Hello,
I am having trouble figuring out how to pull out a specific variable to run my scatterplot for. I have to run the code for the entire data set, and then for two variables.
Do I need to use a WEIGHT statement or a FREQ statement? I've tried both, but I'm not sure if I'm on the right track.
This is the code I used for my original data set:
proc corr data=dmft plots=(scatter);
Var industrial sugar dmft;
Title 'correlation of sugar and decaying teeth in industrial and nonindustrial countries';
run;
I then need to run the code specifically for the industrial and nonindustrial groups, but they are set up in my code as: if industrial=2 then industrial=0.
Is there a link to more information on this somewhere?
A FREQ statement is used when you have a variable indicating that each record in your data set represents more than one identical record.
A weight statement would be used when you have a variable that indicates each record should have a weight applied, such as from a selection probability.
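To make the difference concrete, here is a rough sketch of both statements in Proc Corr. The data set dmft_demo and the variables n_countries and samp_wt are made up for illustration; nothing in the original post says the dmft data contains either kind of variable.

```sas
/* Hypothetical data: n_countries and samp_wt are invented for this sketch */
data dmft_demo;
  input sugar dmft n_countries samp_wt;
  datalines;
20 1.5 3 1.2
35 2.8 1 0.8
50 4.1 2 2.5
28 2.0 4 1.0
;
run;

/* FREQ: each record is treated as n_countries identical observations */
proc corr data=dmft_demo;
  var sugar dmft;
  freq n_countries;
run;

/* WEIGHT: each record contributes in proportion to samp_wt */
proc corr data=dmft_demo;
  var sugar dmft;
  weight samp_wt;
run;
```

If your data has one record per country and no count or weight variable like these, neither statement is likely to be what you need.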
Proc Corr expects variables that are continuous, or at least roughly continuous. If a variable has only 2 levels, it is likely inappropriate to include it on a VAR statement.
I suspect that you might be asking for analysis of different levels of the Industrial variable so you can see the differences of the other variables between the levels.
That would be BY group processing.
Sort your data by Industrial.
Then add a BY statement with the Industrial variable.
Proc sort data=dmft;
by industrial;
run;
proc corr data=dmft plots=(scatter);
by industrial;
Var sugar dmft;
Title 'correlation of sugar and decaying teeth in industrial and nonindustrial countries';
run;
title;
BY group processing lets you repeat analysis or other actions in many SAS procedures by the values of one or more variables but requires sorting of the data.
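If you only want one group at a time and would rather not sort, a WHERE statement is another option. This is just a sketch: it assumes that after your recode the industrial variable is 1 for industrialized countries and 0 for nonindustrialized ones, so check the actual values in your data first.

```sas
/* Assumption: industrial=1 means industrialized, industrial=0 means nonindustrialized */
proc corr data=dmft plots=(scatter);
  where industrial = 1;   /* keep only the industrialized countries */
  var sugar dmft;
  title 'Sugar and DMFT: industrialized countries only';
run;

proc corr data=dmft plots=(scatter);
  where industrial = 0;   /* keep only the nonindustrialized countries */
  var sugar dmft;
  title 'Sugar and DMFT: nonindustrialized countries only';
run;
title;
```

The BY statement gives you both groups in one step; WHERE is handy when you need just one subset or want a different title for each.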
I'm having a great deal of difficulty understanding the question. It's not clear what the desired output is. Can you show us screen captures? (Use the camera icon to include the screen capture in your reply)