Thank you for your reply Paige. As far as I know (correct me if I am wrong), but since the row rank of a matrix always equals to the column rank, I guess collinear observations and collinear columns are actually one thing. In this case, thinking in terms of columns, I am sure that collinearity comes from categorical variables. However, given that the original dataset is already large enough, I cannot manually generate all categorical variables and then check their collinearity. In addition, I have tried to run the Stata code in the original post in a small sample of ~8.0M observations. There are no missing values, and also no duplicate observations. When I run the regression there are only ~7.8M obs left. So I am pretty sure that it is possible.
... View more