Did you start with a basic linear regression, imputing the missing values with the mean, or with random draws so you get a distribution around your estimate? That's probably a good starting point to get you a baseline.
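To make the baseline concrete, here is a minimal sketch in plain Python (not SAS) of mean imputation followed by a one-predictor least-squares fit. The data and the names `impute_mean` and `fit_line` are made up for illustration; your real model would have more predictors and would normally be run in a proper regression procedure.

```python
from statistics import mean

def impute_mean(values):
    """Replace None entries with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]

def fit_line(x, y):
    """Ordinary least squares with one predictor; returns (slope, intercept)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

# Hypothetical data: one predictor with a gap, and a fully observed outcome.
predictor = [1.0, 2.0, None, 4.0, 5.0]
outcome = [2.0, 4.0, 6.0, 8.0, 10.0]

filled = impute_mean(predictor)          # the gap is filled with the observed mean
slope, intercept = fit_line(filled, outcome)
print(filled, slope, intercept)
```

Mean imputation shrinks the variance of the imputed variable, which is one reason to also try random draws and look at the spread of the resulting estimates.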
You will need to standardize your data. In particular, enrollment needs to be standardized against the local population. A store in NY will by default sign up more people than one in Kentucky, so you need to make some decisions on how to scale that.
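One simple way to do that scaling is enrollment per 1,000 residents. The sketch below is only an illustration in Python: the store names, counts, and the `enrollment_rate` helper are hypothetical, and the right denominator (city population, trade-area population, foot traffic) is a decision you would have to make.

```python
def enrollment_rate(enrolled, population, per=1000):
    """Enrollments per `per` residents of the store's area."""
    if population <= 0:
        raise ValueError("population must be positive")
    return enrolled * per / population

# Hypothetical stores: raw counts favour the big market.
stores = [
    {"store": "NY", "enrolled": 500, "population": 100_000},
    {"store": "KY", "enrolled": 80, "population": 10_000},
]

rates = {s["store"]: enrollment_rate(s["enrolled"], s["population"]) for s in stores}
print(rates)
```

With these made-up numbers the NY store has far more enrollments in absolute terms, but the Kentucky store has the higher rate once population is taken into account.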
Then you need to decide whether to account for missing values or to score them as zero. I lean towards scoring them as zero because that's a nudge to those teams to increase their data quality, BUT if people are less likely to give up information in certain stores it seems wrong to penalize those stores. Also, different jurisdictions could have different rules around what you can collect. I have no idea where your stores are, but basically the context of the problem does matter. Would you still penalize stores for missing values in these cases?
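To see how much that choice matters, here is a small Python sketch that scores the same store two ways: treating missing items as zero, and averaging only over what was observed. The scoring rule (a plain average) and the data are assumptions for illustration, not your actual metric.

```python
def score_missing_as_zero(values):
    """Average over all items, counting missing (None) as zero."""
    return sum(0 if v is None else v for v in values) / len(values)

def score_observed_only(values):
    """Average over observed items only; None if nothing was observed."""
    observed = [v for v in values if v is not None]
    return sum(observed) / len(observed) if observed else None

# Hypothetical stores with identical observed performance.
complete_store = [4, 4, 4, 4]
gappy_store = [4, None, 4, None]

zero_scores = (score_missing_as_zero(complete_store), score_missing_as_zero(gappy_store))
observed_scores = (score_observed_only(complete_store), score_observed_only(gappy_store))
print(zero_scores, observed_scores)
```

Under zero-scoring the store with gaps drops to half the score of the complete one; under observed-only scoring they tie. Which of those is fair depends on why the data is missing.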
After linear regression I would probably try PROC PLS next, but it's a bit more complex, so if the gain in accuracy isn't there I'd pick the simpler model.
@Picanion wrote:
I tried factor analysis, but it's not giving any conclusive result. Can you elaborate on how you meant to use it?
I have used enrollment and the other variables in absolute terms.