05-04-2017 01:57 AM
I have a data set that contains the variables "eligibility" and "predicted eligibility", each taking the value 1 or 0. I wish to find the percentage of observations on which the two variables match, i.e. (the number with eligibility=0 and predicted eligibility=0, plus the number with eligibility=1 and predicted eligibility=1) divided by the total. This is to be called the match rate. The eligibility variable always remains the same; however, the predicted eligibility variable depends on a probability (called prob) that is used in assigning the predicted eligibility as either 0 (if found_predict>=prob) or 1 (if found_predict<prob), where found_predict is a constant variable. I want to set up an array in a do loop that finds the match rate for each probability. That is, I want code that looks at values of prob between 0.3 and 0.6, say (in increments of 0.01), and determines the match rate for each.
05-04-2017 02:39 AM
When you posted your question, right below the message window you could see this:
Stop right there! Before pressing POST, tick off this checklist. Does your post …
✔ Have a descriptive subject line, i.e., How do I ‘XYZ’?
✔ Use simple language and provide context? Definitely mention what version you’re on.
✔ Include code and example data? Consider using the SAS Syntax feature.
Please respect #3.
A macro that converts a SAS dataset into datastep code for posting can be found here.
05-04-2017 05:49 PM - edited 05-04-2017 05:50 PM
Nice. This information should automatically come to posters' attention.
Is there a way to make Kurt's recommendations appear whenever someone opens a new thread seeking programming help, like a checklist that users must confirm they have taken into account before they can click "Post"?
05-04-2017 09:11 AM
It is a bit confusing to figure out what you are trying to do. It is conceivable that arrays are a good tool for the job, or an irrelevant tool for the job. At any rate, here are a couple of pieces of the puzzle. To create your flag indicating whether actual eligibility matches predicted:
match = (eligibility = predicted_eligibility);
To get its average value across observations (many ways to do this):
proc summary data=want;
var match;
output out=stats (keep=match_rate) mean=match_rate;
run;
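Putting the two pieces together for a range of cutoffs, here is a minimal sketch rather than a tested solution. It assumes an input data set called have (a hypothetical name) with a 0/1 variable eligibility and a numeric variable found_predict on every observation; the names long, i and stats are also just illustrative. Instead of an array, it writes one row per observation per value of prob and lets PROC SUMMARY average the flag within each prob:

```sas
/* Assumption: HAVE contains eligibility (0/1) and found_predict (numeric). */
data long;
   set have;
   /* Loop over whole percents (30..60) rather than stepping by 0.01,
      so the prob values do not accumulate floating-point error. */
   do i = 30 to 60;
      prob = i / 100;
      /* Rule from the question: 1 if found_predict < prob, otherwise 0. */
      predicted_eligibility = (found_predict < prob);
      /* 1 when actual and predicted agree, 0 otherwise. */
      match = (eligibility = predicted_eligibility);
      output;
   end;
   keep prob match;
run;

/* The mean of a 0/1 flag is the proportion of matches; NWAY keeps only
   the rows for each individual value of prob. */
proc summary data=long nway;
   class prob;
   var match;
   output out=stats (keep=prob match_rate) mean=match_rate;
run;
```

The stats data set should then hold 31 rows, one per value of prob from 0.30 to 0.60, with match_rate as a proportion (multiply by 100 for a percentage). Two caveats: a missing found_predict compares as less than any number in SAS, so such rows would be assigned a predicted eligibility of 1 unless filtered out first, and the intermediate data set is 31 times the size of the input, which may matter for large tables.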