Thanks a lot for all the useful comments. I agree that it is dangerous to remove points without scientific grounds. I would like to propose a different approach and see whether we agree on it.

First, I want to emphasize that, for our problem, the coefficient A must be negative; otherwise it makes no business sense. The data still yield a positive A because some unknown factor is affecting them.

Second, points such as (low X value, low Y value) and (high X value, high Y value) are usually the most influential ones pushing A positive. For our problem these points do not follow business common sense, because a high X value should go with a low Y value. They nevertheless exist in the data because the unknown factor has a major impact on them.

So, instead of removing these points, I am wondering whether we could add a new event variable (a 0/1 indicator) for each of them to account for the unknown factor. By adding these event variables to the model one at a time, starting with the most influential point, we should eventually get a negative coefficient A. Does anyone know of an existing method for this kind of work? Thanks a lot.
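
To make the idea concrete, here is a minimal sketch in Python with NumPy, assuming a simple model Y = B + A*X. Everything specific in it is my own assumption for illustration, not an established method: the toy data, the function names `slope_with_dummies` and `add_event_dummies`, the stopping rule, and the way influence is ranked (the point whose event variable lowers A the most is added next; a standard measure such as Cook's distance could be swapped in).

```python
import numpy as np

def slope_with_dummies(x, y, flagged):
    """Fit y = B + A*x + one event variable per flagged row; return A."""
    n = len(x)
    cols = [np.ones(n), x]
    for i in flagged:
        d = np.zeros(n)
        d[i] = 1.0  # event variable: 1 only for observation i
        cols.append(d)
    X = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1]

def add_event_dummies(x, y, max_dummies=None):
    """Greedily add event variables until the slope A turns negative."""
    n = len(x)
    if max_dummies is None:
        max_dummies = n - 3  # assumption: keep at least 3 unflagged points
    flagged = []
    slope = slope_with_dummies(x, y, flagged)
    while slope >= 0 and len(flagged) < max_dummies:
        candidates = [i for i in range(n) if i not in flagged]
        # Trial fit for each candidate: the slope if it also got an event variable
        trial = {i: slope_with_dummies(x, y, flagged + [i]) for i in candidates}
        best = min(trial, key=trial.get)  # candidate that lowers A the most
        flagged.append(best)
        slope = trial[best]
    return flagged, slope

# Hypothetical toy data: rows 1..8 follow y = 10 - 0.5*x,
# row 0 is a (low X, low Y) point and row 9 is a (high X, high Y) point.
x = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=float)
y = np.array([0, 9.5, 9, 8.5, 8, 7.5, 7, 6.5, 6, 20], dtype=float)

flagged, slope = add_event_dummies(x, y)
print("rows given an event variable:", flagged)
print("final coefficient A:", round(slope, 3))
```

One thing to keep in mind with this design: an event variable that equals 1 for a single observation fits that observation exactly, so the estimate of A is the same as if the point had been left out of the fit. The difference is that the point stays in the data set and its event coefficient shows how large the unexplained effect is.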