My long term question is how to duplicate a particular analysis in R but first I'm trying to understand the output from SAS. Here is the analysis code:
* Blackcap example from the corrHLfit function in the R spaMM library;
* Input data;
data test;
input latitude longitude migStatus means pos;
cards;
36.1291 -5.3469 0.0 161.4000 1
15.0522 -23.6010 0.0 162.2857 2
43.5226 4.7189 1.0 162.6111 3
28.6742 -17.7859 0.5 163.4400 4
32.6743 -16.9105 0.5 162.6667 5
28.1600 -17.1980 0.5 163.6667 6
28.4772 -16.4479 0.5 163.4000 7
41.5335 2.2991 1.0 162.6818 8
40.6653 -4.0871 1.5 162.3667 9
48.2850 16.9086 2.0 162.8800 10
-0.1671 37.0154 2.5 163.5000 11
47.8160 8.9887 2.0 163.7049 12
41.7425 12.4035 2.0 163.1667 13
55.7559 37.6197 2.5 163.8333 14
run;
* Run analysis;
proc mixed data=test method=ml;
model migStatus = means / solution;
repeated / type=sp(matern)(latitude longitude);
run;
/* Output:
Covariance Parameter
Estimates
Cov Parm Estimate
SP(MATERN) 1.0000
Smoothness 0.5000
Residual 0.5263
*/
I don't understand the "Cov Parm" estimates. Even when I change the data, I get the same estimates for SP(MATERN) and Smoothness. And given that the values certainly don't look like typical estimates (1.0 and 0.5), I'm wondering if these are just set to a default value and not fitted using maximum likelihood. Can someone explain this output to me?
Thanks,
Jim
I would never have guessed this, but try
proc mixed data=test method=reml;
model migStatus = means / solution;
repeated /subject=intercept type=sp(matern)(latitude longitude);
run;
This seems to solve both problems of moving off the initial values and getting rid of the non-positive definite Hessian output warning. A subject= option is definitely needed, it was just lucky that I looked at the anisotropic spherical example above the Matern examples in the documentation.
Steve Denham
Some quick testing seems to indicate that since all of the observations are "lumped" into the same subject, there is no movement away from the initial values. I tried the following:
data test;
input latitude longitude migStatus means pos;
id=mod(pos,7);
cards;
36.1291 -5.3469 0.0 61.4000 1
15.0522 -23.6010 0.0 162.2857 2
43.5226 4.7189 1.0 162.6111 3
28.6742 -17.7859 0.5 163.4400 4
32.6743 -16.9105 0.5 162.6667 5
28.1600 -17.1980 0.5 163.6667 6
28.4772 -16.4479 0.5 163.4000 7
41.5335 2.2991 1.0 162.6818 8
40.6653 -4.0871 1.5 162.3667 9
48.2850 16.9086 2.0 162.8800 10
-0.1671 37.0154 2.5 163.5000 11
47.8160 8.9887 2.0 163.7049 12
41.7425 12.4035 2.0 163.1667 13
55.7559 37.6197 2.5 163.8333 14
run;
* Run analysis;
proc mixed data=test method=reml;
class id;
model migStatus = means / solution;
repeated /subject=id type=sp(matern)(latitude longitude);
run;
This introduces subject to subject variability (note that it is all made up at this point). The results from this were:
Covariance Parameter Estimates | ||
Cov Parm | Subject | Estimate |
SP(MATERN) | id | 0.9001 |
Smoothness | id | 0.5211 |
Residual |
| 0.7293 |
I would never have guessed this, but try
proc mixed data=test method=reml;
model migStatus = means / solution;
repeated /subject=intercept type=sp(matern)(latitude longitude);
run;
This seems to solve both problems of moving off the initial values and getting rid of the non-positive definite Hessian output warning. A subject= option is definitely needed, it was just lucky that I looked at the anisotropic spherical example above the Matern examples in the documentation.
Steve Denham
Excellent! Thank you! That makes everything match up between SAS and R.
For others who might have this issue I offer the following Rosetta Stone:
SAS parameter = corrHLfit parameter
SP(MATERN) = 1/rho
Smoothness = nu
Residual = lambda
SAS counts the fixed effects and the covariance terms as parameters in the construction of the AIC statistic whereas the corrHLfit function in R only considers the fixed effects (i.e., adding in twice the number of parameters).
Thanks again!
Jim
With a repeated statement: If you do not specify a subject, then each observation is considered a subject. Thus, there could be no correlation structure. If you specify subject=intercept, the entire dataset is considered to be one large subject, so that every observation could be correlated (depending on the correlation structure parameters).
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.
Find more tutorials on the SAS Users YouTube channel.