04-16-2018 05:41 AM
I would like to perform a basic statistical test to establish whether certain customer segments are more price sensitive than others. For each customer segment (CustomerSegmentId) I have samples of how many units of one specific product were bought (NumberOfUnits) at each price (Price). The data structure is as follows:
CustomerSegmentId Price ProductId NumberOfUnits
Certain customer segments have far fewer observations than others, making this an unbalanced problem. This means that I should use PROC GLM rather than PROC ANOVA, using code along these lines:
proc glm data = SomeData;
  class CustomerSegmentId ProductId;
  model NumberOfUnits = Price CustomerSegmentId ProductId;
run;
quit;
I know that this community does not exist to answer statistical questions, but the only other site I am aware of, Cross Validated, is not very responsive (please suggest other sites).
Is the above a good starting point? Also how do I perform post hoc tests to answer questions as to whether CustomerSegmentId=1 is more price sensitive than CustomerSegmentId=2?
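One possible way to phrase the segment comparison in PROC GLM, as a rough sketch only: if "price sensitive" is read as the slope of NumberOfUnits on Price, the slope can be allowed to differ by segment through a Price*CustomerSegmentId interaction, and two slopes can then be compared with an ESTIMATE statement. The interaction term and the reading of price sensitivity as a slope are assumptions added here, not part of the model above. The coefficients 1 -1 assume that segments 1 and 2 are the first two levels of CustomerSegmentId in sort order.

```sas
/* Sketch: separate price slopes per customer segment.
   Dataset and variable names are taken from the post;
   the interaction term is an assumption added for illustration. */
proc glm data = SomeData;
  class CustomerSegmentId ProductId;
  model NumberOfUnits = Price CustomerSegmentId ProductId
                        Price*CustomerSegmentId / solution;
  /* Difference between the price slopes of the first two segment levels.
     Coefficients for any remaining levels default to zero. */
  estimate 'Price slope: segment 1 vs segment 2'
           Price*CustomerSegmentId 1 -1;
run;
quit;
```

The Type III test of Price*CustomerSegmentId would indicate whether the slopes differ across segments at all, while the ESTIMATE row gives the difference for the one pair of interest.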
I also had a look at choice set approaches, which use for example logistic regression. Unfortunately, I only have observational data in this format:
TargetProductId  ComparableProductId  TargetPriceProductPrice  ComparableProductPrice  CustomerSegmentId  TargetProductBought
1                2                    23                       25                      1                  0
1                3                    23                       25.50                   1                  0
1                4                    23                       21                      2                  1
Here we look at one target product at a time and can establish whether the customer also viewed another, comparable product. We know the price of the target product and of the comparable product. We also know whether the target product was bought by the customer, who belongs to a certain segment (TargetProductBought is binary).
Perhaps one could fit a logistic regression model using these product-pair data (there would also be independent variables for each customer segment etc.)? I am aware of the great publications by Warren F. Kuhfeld, but I am not sure whether my data described above could be used with those methods.
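For concreteness, the product-pair idea could be sketched as an ordinary binary logistic regression in PROC LOGISTIC. This is only an illustration under assumptions: the dataset name PairData is hypothetical, the column names are copied from the layout above, and the interaction of the target price with the segment is added here as one way to let price sensitivity vary by segment. It is a plain logistic model, not a choice model of the kind described in the choice-set literature, and it treats the rows as independent even though several rows can belong to the same target product.

```sas
/* Sketch: binary logistic regression on the product-pair rows.
   PairData is a hypothetical dataset name; variable names follow the post. */
proc logistic data = PairData;
  class CustomerSegmentId / param = ref;
  /* Model the probability that the target product was bought */
  model TargetProductBought(event = '1') =
        TargetPriceProductPrice
        ComparableProductPrice
        CustomerSegmentId
        TargetPriceProductPrice*CustomerSegmentId;
run;
```

With reference coding, each interaction coefficient would be the difference in the target-price effect (on the log-odds scale) between that segment and the reference segment.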
Any feedback would be very much appreciated. Thanks!