My long term question is how to duplicate a particular analysis in R but first I'm trying to understand the output from SAS. Here is the analysis code:
* Blackcap example from the corrHLfit function in the R spaMM library;
* Input data;
data test;
input latitude longitude migStatus means pos;
cards;
36.1291 -5.3469 0.0 161.4000 1
15.0522 -23.6010 0.0 162.2857 2
43.5226 4.7189 1.0 162.6111 3
28.6742 -17.7859 0.5 163.4400 4
32.6743 -16.9105 0.5 162.6667 5
28.1600 -17.1980 0.5 163.6667 6
28.4772 -16.4479 0.5 163.4000 7
41.5335 2.2991 1.0 162.6818 8
40.6653 -4.0871 1.5 162.3667 9
48.2850 16.9086 2.0 162.8800 10
-0.1671 37.0154 2.5 163.5000 11
47.8160 8.9887 2.0 163.7049 12
41.7425 12.4035 2.0 163.1667 13
55.7559 37.6197 2.5 163.8333 14
run;
* Run analysis;
proc mixed data=test method=ml;
model migStatus = means / solution;
repeated / type=sp(matern)(latitude longitude);
run;
/* Output:
Covariance Parameter
Estimates
Cov Parm Estimate
SP(MATERN) 1.0000
Smoothness 0.5000
Residual 0.5263
*/
I don't understand the "Cov Parm" estimates. Even when I change the data, I get the same estimates for SP(MATERN) and Smoothness. And given that the values certainly don't look like typical estimates (1.0 and 0.5), I'm wondering if these are just set to a default value and not fitted using maximum likelihood. Can someone explain this output to me?
Thanks,
Jim
I would never have guessed this, but try
proc mixed data=test method=reml;
model migStatus = means / solution;
repeated /subject=intercept type=sp(matern)(latitude longitude);
run;
This seems to solve both problems of moving off the initial values and getting rid of the non-positive definite Hessian output warning. A subject= option is definitely needed, it was just lucky that I looked at the anisotropic spherical example above the Matern examples in the documentation.
Steve Denham
Some quick testing seems to indicate that since all of the observations are "lumped" into the same subject, there is no movement away from the initial values. I tried the following:
data test;
input latitude longitude migStatus means pos;
id=mod(pos,7);
cards;
36.1291 -5.3469 0.0 61.4000 1
15.0522 -23.6010 0.0 162.2857 2
43.5226 4.7189 1.0 162.6111 3
28.6742 -17.7859 0.5 163.4400 4
32.6743 -16.9105 0.5 162.6667 5
28.1600 -17.1980 0.5 163.6667 6
28.4772 -16.4479 0.5 163.4000 7
41.5335 2.2991 1.0 162.6818 8
40.6653 -4.0871 1.5 162.3667 9
48.2850 16.9086 2.0 162.8800 10
-0.1671 37.0154 2.5 163.5000 11
47.8160 8.9887 2.0 163.7049 12
41.7425 12.4035 2.0 163.1667 13
55.7559 37.6197 2.5 163.8333 14
run;
* Run analysis;
proc mixed data=test method=reml;
class id;
model migStatus = means / solution;
repeated /subject=id type=sp(matern)(latitude longitude);
run;
This introduces subject to subject variability (note that it is all made up at this point). The results from this were:
Covariance Parameter Estimates | ||
Cov Parm | Subject | Estimate |
SP(MATERN) | id | 0.9001 |
Smoothness | id | 0.5211 |
Residual |
| 0.7293 |
I would never have guessed this, but try
proc mixed data=test method=reml;
model migStatus = means / solution;
repeated /subject=intercept type=sp(matern)(latitude longitude);
run;
This seems to solve both problems of moving off the initial values and getting rid of the non-positive definite Hessian output warning. A subject= option is definitely needed, it was just lucky that I looked at the anisotropic spherical example above the Matern examples in the documentation.
Steve Denham
Excellent! Thank you! That makes everything match up between SAS and R.
For others who might have this issue I offer the following Rosetta Stone:
SAS parameter = corrHLfit parameter
SP(MATERN) = 1/rho
Smoothness = nu
Residual = lambda
SAS counts the fixed effects and the covariance terms as parameters in the construction of the AIC statistic whereas the corrHLfit function in R only considers the fixed effects (i.e., adding in twice the number of parameters).
Thanks again!
Jim
With a repeated statement: If you do not specify a subject, then each observation is considered a subject. Thus, there could be no correlation structure. If you specify subject=intercept, the entire dataset is considered to be one large subject, so that every observation could be correlated (depending on the correlation structure parameters).
Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.
Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.
ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.
Find more tutorials on the SAS Users YouTube channel.