10-20-2016 10:38 PM
I have data on customer purchase history. I want to score each of these customers based on their attributes. For this, I want to calculate the score by assigning weights to the variables (e.g., 10% to v1, 20% to v2, 50% to v3, etc.) and then summing the weighted values. The resulting score should tell me how good a customer is. For instance, a score above 500 means they are good/loyal customers and we can expect good sales from them next time. While the threshold can be decided once we get a score, I want to know how I can approach this problem.
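As a rough illustration of the weighted-sum idea, here is a minimal Python sketch (not SAS). The 10%, 20% and 50% weights come from the post; the fourth variable v4, its 20% weight, and the customer data are made up so that the example runs. Standardizing each variable first is an assumption, added so that a variable measured in large units does not dominate the sum.

```python
from statistics import mean, pstdev

# 10%, 20%, 50% are from the post; v4 and its 20% are hypothetical,
# chosen only so that the weights sum to 1.
WEIGHTS = {"v1": 0.10, "v2": 0.20, "v3": 0.50, "v4": 0.20}


def standardize(values):
    """Return z-scores for one variable; a constant column maps to zeros."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(x - mu) / sigma for x in values]


def weighted_scores(customers, weights):
    """Weighted sum of standardized variables, one score per customer."""
    columns = {
        name: standardize([c[name] for c in customers]) for name in weights
    }
    return [
        sum(weights[name] * columns[name][i] for name in weights)
        for i in range(len(customers))
    ]


# Made-up customers, for illustration only.
customers = [
    {"v1": 3, "v2": 120.0, "v3": 12, "v4": 1},
    {"v1": 1, "v2": 40.0, "v3": 2, "v4": 0},
    {"v1": 5, "v2": 300.0, "v3": 20, "v4": 2},
]
scores = weighted_scores(customers, WEIGHTS)
```

Scores built this way are centered on zero, so a threshold such as 500 would only make sense after rescaling them to whatever range is wanted.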
I decided to run PCA, so that I can use the coefficients of a principal component as the weights.
For example, if I select the first principal component and take its coefficients,
then by replacing v1, v2, v3 with the values of the attributes, I can get a score for each observation.
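A small sketch of that idea in Python with NumPy (not SAS), assuming all variables are numeric and none is constant. The function name and the data are hypothetical. One thing to watch: the sign of a principal component is arbitrary, so the sketch flips it to make the coefficients sum to a positive number, which is just one possible convention.

```python
import numpy as np


def first_pc_scores(X):
    """Score each row on the first principal component of standardized X."""
    X = np.asarray(X, dtype=float)
    # Standardize the columns (assumes no column is constant).
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    # Rows of Vt are the component coefficient vectors, ordered by the
    # amount of variance they explain.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    coefficients = Vt[0]
    # The sign of a component is arbitrary; this flip is an assumed convention.
    if coefficients.sum() < 0:
        coefficients = -coefficients
    # Score = coefficients applied to the standardized v1, v2, v3 values.
    return Z @ coefficients, coefficients


# Made-up data: one row per customer, columns v1, v2, v3.
X = [
    [3, 120.0, 12],
    [1, 40.0, 2],
    [5, 300.0, 20],
    [2, 90.0, 7],
    [4, 210.0, 15],
]
pc1_scores, pc1_coefficients = first_pc_scores(X)
```

The coefficients have unit length rather than summing to 100%, so they are not percentages in the sense of the hand-picked weights.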
I am not sure if this is a clever approach. Is there a better way to optimize the weights and calculate the score of each customer? Any thoughts are appreciated.
10-20-2016 11:31 PM
That is a clever approach.
But PCA applies only to continuous variables.
And you also left out the second principal component, which may account for a large share of the variance in the data.
Maybe you could include the first two or three principal components.
Suppose the first PC accounts for 60% of the variance,
and the second PC accounts for 40%.
Then the final score might be: Y = 0.6*Y1 + 0.4*Y2 ?
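A sketch of this combination in Python with NumPy (not SAS), where each component score is weighted by its share of the total variance, as in Y = 0.6*Y1 + 0.4*Y2. The function name, the number of components kept, and the data are assumptions for illustration. Because component signs are arbitrary, they would need to be fixed to a meaningful direction before the composite can be read as "higher is better".

```python
import numpy as np


def composite_pc_score(X, n_components=2):
    """Combine the leading PC scores, weighted by their share of variance."""
    X = np.asarray(X, dtype=float)
    # Standardize the columns (assumes no column is constant).
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    # With standardized columns this is the correlation matrix.
    corr = np.cov(Z, rowvar=False, bias=True)
    eigvals, eigvecs = np.linalg.eigh(corr)
    # eigh returns ascending eigenvalues; keep the largest n_components.
    order = np.argsort(eigvals)[::-1][:n_components]
    shares = eigvals[order] / eigvals.sum()   # e.g. 0.6 and 0.4 in the post
    pc_scores = Z @ eigvecs[:, order]         # columns are Y1, Y2, ...
    # Note: the sign of each column of pc_scores is arbitrary.
    return pc_scores @ shares, shares


# Made-up data: one row per customer, columns v1, v2, v3.
X = [
    [3, 120.0, 12],
    [1, 40.0, 9],
    [5, 300.0, 4],
    [2, 90.0, 15],
    [4, 210.0, 7],
]
composite, shares = composite_pc_score(X, n_components=2)
```

With more than two variables the shares of the first two components generally sum to less than 1, unlike the 60%/40% example.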
10-20-2016 11:58 PM
Or you could use a log-linear model.
Check the documentation of PROC CATMOD
Example 32.4: Log-Linear Model, Three Dependent Variables
Note: remove the non-significant variables before applying your model.
10-21-2016 03:35 AM
Or check this:
Overview: PRINQUAL Procedure
The PRINQUAL procedure performs principal component analysis (PCA) of qualitative, quantitative, or
mixed data. PROC PRINQUAL is based on the work of Kruskal and Shepard (1974); Young, Takane, and