09-09-2015 10:18 PM
My long term question is how to duplicate a particular analysis in R but first I'm trying to understand the output from SAS. Here is the analysis code:
* Blackcap example from the corrHLfit function in the R spaMM library;
* Input data;
data test;
input latitude longitude migStatus means pos;
cards;
36.1291 -5.3469 0.0 161.4000 1
15.0522 -23.6010 0.0 162.2857 2
43.5226 4.7189 1.0 162.6111 3
28.6742 -17.7859 0.5 163.4400 4
32.6743 -16.9105 0.5 162.6667 5
28.1600 -17.1980 0.5 163.6667 6
28.4772 -16.4479 0.5 163.4000 7
41.5335 2.2991 1.0 162.6818 8
40.6653 -4.0871 1.5 162.3667 9
48.2850 16.9086 2.0 162.8800 10
-0.1671 37.0154 2.5 163.5000 11
47.8160 8.9887 2.0 163.7049 12
41.7425 12.4035 2.0 163.1667 13
55.7559 37.6197 2.5 163.8333 14
run;
* Run analysis;
proc mixed data=test method=ml;
model migStatus = means / solution;
repeated / type=sp(matern)(latitude longitude);
run;
/* Output:
Covariance Parameter
Estimates
Cov Parm Estimate
SP(MATERN) 1.0000
Smoothness 0.5000
Residual 0.5263
*/
I don't understand the "Cov Parm" estimates. Even when I change the data, I get the same estimates for SP(MATERN) and Smoothness. And given that the values certainly don't look like typical estimates (1.0 and 0.5), I'm wondering if these are just set to a default value and not fitted using maximum likelihood. Can someone explain this output to me?
Thanks,
Jim
09-10-2015 08:48 AM
I would never have guessed this, but try
proc mixed data=test method=reml;
model migStatus = means / solution;
repeated /subject=intercept type=sp(matern)(latitude longitude);
run;
This seems to solve both problems of moving off the initial values and getting rid of the non-positive definite Hessian output warning. A subject= option is definitely needed, it was just lucky that I looked at the anisotropic spherical example above the Matern examples in the documentation.
Steve Denham
09-10-2015 08:33 AM
Some quick testing seems to indicate that since all of the observations are "lumped" into the same subject, there is no movement away from the initial values. I tried the following:
data test;
input latitude longitude migStatus means pos;
id=mod(pos,7);
cards;
36.1291 -5.3469 0.0 61.4000 1
15.0522 -23.6010 0.0 162.2857 2
43.5226 4.7189 1.0 162.6111 3
28.6742 -17.7859 0.5 163.4400 4
32.6743 -16.9105 0.5 162.6667 5
28.1600 -17.1980 0.5 163.6667 6
28.4772 -16.4479 0.5 163.4000 7
41.5335 2.2991 1.0 162.6818 8
40.6653 -4.0871 1.5 162.3667 9
48.2850 16.9086 2.0 162.8800 10
-0.1671 37.0154 2.5 163.5000 11
47.8160 8.9887 2.0 163.7049 12
41.7425 12.4035 2.0 163.1667 13
55.7559 37.6197 2.5 163.8333 14
run;
* Run analysis;
proc mixed data=test method=reml;
class id;
model migStatus = means / solution;
repeated /subject=id type=sp(matern)(latitude longitude);
run;
This introduces subject to subject variability (note that it is all made up at this point). The results from this were:
Covariance Parameter Estimates | ||
Cov Parm | Subject | Estimate |
SP(MATERN) | id | 0.9001 |
Smoothness | id | 0.5211 |
Residual |
| 0.7293 |
09-10-2015 08:48 AM
I would never have guessed this, but try
proc mixed data=test method=reml;
model migStatus = means / solution;
repeated /subject=intercept type=sp(matern)(latitude longitude);
run;
This seems to solve both problems of moving off the initial values and getting rid of the non-positive definite Hessian output warning. A subject= option is definitely needed, it was just lucky that I looked at the anisotropic spherical example above the Matern examples in the documentation.
Steve Denham
09-10-2015 10:07 AM
Excellent! Thank you! That makes everything match up between SAS and R.
For others who might have this issue I offer the following Rosetta Stone:
SAS parameter = corrHLfit parameter
SP(MATERN) = 1/rho
Smoothness = nu
Residual = lambda
SAS counts the fixed effects and the covariance terms as parameters in the construction of the AIC statistic whereas the corrHLfit function in R only considers the fixed effects (i.e., adding in twice the number of parameters).
Thanks again!
Jim
09-11-2015 09:20 AM
With a repeated statement: If you do not specify a subject, then each observation is considered a subject. Thus, there could be no correlation structure. If you specify subject=intercept, the entire dataset is considered to be one large subject, so that every observation could be correlated (depending on the correlation structure parameters).
Need further help from the community? Please ask a new question.