You want to fit a model to the training data set and then apply that fitted model to the validation data set. That is not what you have done: you have fit a whole new model to the validation data set.
Here is an example of how to apply the fitted model to the validation data set: http://support.sas.com/kb/39/724.html
I hope this is what you mean, because I am getting lost past this step (using the SAS support article).
/* using the SAS support article as a guide */
/* 1. fit the model to the training data set */
/* 2. Include a SCORE statement to apply the fitted model to VALID*/
ods graphics on;
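proc logistic data=train;
   /* fit on TRAIN; OUTROC= saves the training ROC coordinates in TROC */
   model camp_flag(event="1") = rit / outroc=troc;
   /* apply the fitted model to VALID; save predictions (VALPRED) and ROC coordinates (VROC) */
   score data=valid out=valpred outroc=vroc;
   roc; roccontrast;
run;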
Looks good to me.
I hope we're near a crescendo. Can I take a data set like the example below and use one of these existing data sets (valpred, vroc or troc) to fill in the missing values with a predicted value, or to give a probability that the event will occur?
data newstuff;
input RIT camp_flag;
datalines;
240 .
200 .
150 .
;
As I said earlier, the SCORE statement will give you predicted values on the new data set. There is an example in the documentation.
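A minimal sketch of that idea, assuming the TRAIN and NEWSTUFF data sets from earlier in this thread (the output name NEWPRED is made up for illustration). The SCORE statement does not need a nonmissing response in the data being scored, so the missing camp_flag values are not a problem:

```sas
/* Refit the model on TRAIN and score NEWSTUFF in the same step. */
proc logistic data=train;
   model camp_flag(event="1") = rit;
   /* NEWPRED is a hypothetical output data set name */
   score data=newstuff out=newpred;
run;

/* P_1 is the predicted probability that camp_flag = 1;        */
/* I_camp_flag is the response level each row is classified into. */
proc print data=newpred;
   var rit p_1 i_camp_flag;
run;
```

The existing VALPRED, VROC and TROC data sets hold predictions and ROC coordinates, not the fitted model, so they cannot score new rows by themselves.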
Do you have a favorite article on this topic you could point me to?
@Reeza wrote:
You use PROC SCORE or PROC PLS to score your new data set. PLS has more options these days as it's the 'newest' procedure. Remember to specify the option for logistic regression, though; otherwise it doesn't exponentiate the estimate.
Do you mean PROC PLM?
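If PROC PLM is what was meant, a rough sketch would look like the following. It assumes the TRAIN and NEWSTUFF data sets from this thread; the item store name CAMPMODEL, the output data set NEWPRED and the variable P_CAMP are made-up names. The ILINK option is the one that matters for logistic regression: without it the scored value is on the logit scale rather than the probability scale.

```sas
/* 1. Fit the model once and save it in an item store. */
proc logistic data=train;
   model camp_flag(event="1") = rit;
   store work.campmodel;   /* hypothetical item store name */
run;

/* 2. Restore the saved model and score new data without refitting. */
proc plm restore=work.campmodel;
   /* ILINK applies the inverse link, so P_CAMP is a probability */
   score data=newstuff out=newpred predicted=p_camp / ilink;
run;
```

The advantage over the SCORE statement in PROC LOGISTIC is that the model is stored, so new data can be scored later without access to the training data.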
So if a student has already attended they cannot attend again or they’re not going to be recommended to attend even if their test scores warrant it?
@GreggB wrote:
They would attend only once. To be sure, I can de-duplicate by Student ID.
I think I read about what you're saying - the data is divided into 2 sets using ranuni. One set is used to create the model and the other half is used for prediction?
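A split like that could be sketched as follows. The input data set name STUDENTS, the seed and the 50/50 proportion are assumptions for illustration, not something taken from this thread:

```sas
/* Randomly assign each row to TRAIN or VALID, roughly half each. */
data train valid;
   set students;                      /* hypothetical source data set */
   /* a fixed positive seed makes the split reproducible */
   if ranuni(12345) < 0.5 then output train;
   else output valid;
run;
```

The model is then fit on TRAIN and applied to VALID, as in the PROC LOGISTIC step above.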
The summer camp is for grade 3 only. The only way a student would attend twice would be if they are retained in grade 3 and they score low enough both times to be flagged for attendance at the summer camp. Since all the data sets have a unique student ID, I can easily find scenarios like this if they occurred.