04-05-2024
WillTheKiwi
Pyrite | Level 9
Member since
10-28-2013
- 120 Posts
- 0 Likes Given
- 10 Solutions
- 5 Likes Received
Latest posts by WillTheKiwi
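Subject | Views | Posted
Re: No standard error for the overdispersion factor in a simple Poisson regression with Proc Glimmix | 616 | 04-04-2024 07:21 PM
No standard error for the overdispersion factor in a simple Poisson regression with Proc Glimmix | 716 | 04-02-2024 08:39 PM
Re: how to use non-parametric way to analysis one sample? | 1972 | 02-28-2024 02:06 PM
Re: SAS Studio interface responding too slowly | 2461 | 02-23-2024 03:10 PM
Re: SAS Studio interface responding too slowly | 2619 | 02-22-2024 05:50 PM
Re: SAS Studio interface responding too slowly | 2652 | 02-22-2024 04:40 PM
SAS Studio interface responding too slowly | 2667 | 02-22-2024 04:14 PM
Re: Bug in SGPLOT: symbol= does not work in markerattrs=(symbol=) | 1234 | 10-19-2022 09:03 PM
Re: Bug in SGPLOT: symbol= does not work in markerattrs=(symbol=) | 1258 | 10-19-2022 05:45 PM
Re: Bug in SGPLOT: symbol= does not work in markerattrs=(symbol=) | 1283 | 10-19-2022 05:24 PM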
Activity Feed for WillTheKiwi
- Posted Re: No standard error for the overdispersion factor in a simple Poisson regression with Proc Glimmix on Statistical Procedures. 04-04-2024 07:21 PM
- Posted No standard error for the overdispersion factor in a simple Poisson regression with Proc Glimmix on Statistical Procedures. 04-02-2024 08:39 PM
- Posted Re: how to use non-parametric way to analysis one sample? on Statistical Procedures. 02-28-2024 02:06 PM
- Got a Like for Re: SAS Studio interface responding too slowly. 02-24-2024 07:34 AM
- Got a Like for Re: SAS Studio interface responding too slowly. 02-23-2024 03:14 PM
- Posted Re: SAS Studio interface responding too slowly on SAS Software for Learning Community. 02-23-2024 03:10 PM
- Posted Re: SAS Studio interface responding too slowly on SAS Software for Learning Community. 02-22-2024 05:50 PM
- Posted Re: SAS Studio interface responding too slowly on SAS Software for Learning Community. 02-22-2024 04:40 PM
- Posted SAS Studio interface responding too slowly on SAS Software for Learning Community. 02-22-2024 04:14 PM
- Posted Re: Bug in SGPLOT: symbol= does not work in markerattrs=(symbol=) on Statistical Procedures. 10-19-2022 09:03 PM
- Posted Re: Bug in SGPLOT: symbol= does not work in markerattrs=(symbol=) on Statistical Procedures. 10-19-2022 05:45 PM
- Posted Re: Bug in SGPLOT: symbol= does not work in markerattrs=(symbol=) on Statistical Procedures. 10-19-2022 05:24 PM
- Posted Bug in SGPLOT: symbol= does not work in markerattrs=(symbol=) on Statistical Procedures. 10-19-2022 04:39 PM
- Posted Re: Interpreting the random-effect solution in a mixed model on Statistical Procedures. 12-18-2021 02:00 PM
- Posted Re: Interpreting residuals when the count is zero in a Poisson regression with Proc Glimmix on Statistical Procedures. 12-02-2021 05:31 PM
- Posted Interpreting residuals when the count is zero in a Poisson regression with Proc Glimmix on Statistical Procedures. 12-01-2021 10:25 PM
- Posted Temporary extension of available time in SAS Studio ODA on SASware Ballot Ideas. 11-09-2021 11:20 AM
- Got a Like for Re: Interpreting the random-effect solution in a mixed model. 11-09-2021 10:18 AM
- Posted Re: Interpreting the random-effect solution in a mixed model on Statistical Procedures. 11-08-2021 07:43 PM
- Posted Re: Which estimation method for missing data using PROC GLIMMIX? on Statistical Procedures. 10-25-2021 02:19 PM
My Liked Posts
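Subject | Likes | Posted
Re: SAS Studio interface responding too slowly | 3 | 02-23-2024 03:10 PM
Re: Interpreting the random-effect solution in a mixed model | 1 | 11-08-2021 07:43 PM
 | 1 | 10-18-2016 04:23 PM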
04-04-2024
07:21 PM
Thanks for your feedback, Steve. I meant to say that I had already tried random _residual_/group=Sex, and weirdly, Glimmix ignored the statement altogether by not estimating the inflation factors and by making no warning or other statement in the LOG about ignoring the statement. However, your suggestion of adding covtest did the trick. So I used:
random _residual_/group=Sex;
covtest homogeneity/cl(alpha=0.1);
ods output covparms=cov;
The cov dataset had the inflation factors and their standard errors and confidence limits.
I tried to get a single inflation factor by dropping off the group=Sex, but it didn't work, and again there was no indication of why in the LOG. So I tricked it by making a dummy class variable that had the same value for every observation in the data set, then using random _residual_/group=Dummy. It worked perfectly. But it seems to me that something needs to be done to Proc Glimmix and to its documentation so others don't have this problem.
I also tried the analysis with Proc Genmod, as you suggested, but it gave a different value for the scale parameter, way too large, and squaring it didn't make it right. Maybe it was because I actually have under-dispersion with these data, but the values given by Glimmix were correct, because they were exactly the same as the variance divided by the mean for the males and for the females. So again, THANK YOU!
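A brief note on why the Glimmix inflation factors can equal the variance divided by the mean. This is only a sketch, assuming that the scale estimate is the Pearson chi-square divided by its degrees of freedom within each group, and that the group (Sex) is the only predictor, so that the fitted mean for group g is just the sample mean. The symbols n_g, \bar{y}_g and s_g^2 (group size, sample mean and sample variance) are notation introduced for this note and do not come from the SAS output.

```latex
\[
\hat{\phi}_g
  = \frac{1}{n_g - 1}\sum_{i=1}^{n_g}\frac{(y_{ig}-\hat{\mu}_g)^2}{\hat{\mu}_g}
  = \frac{1}{\bar{y}_g}\cdot\frac{\sum_{i=1}^{n_g}(y_{ig}-\bar{y}_g)^2}{n_g-1}
  = \frac{s_g^2}{\bar{y}_g},
\qquad \text{with } \hat{\mu}_g = \bar{y}_g .
\]
```

Under that assumption, a value below 1 corresponds to under-dispersion, which is what the post reports for these data.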
04-02-2024
08:39 PM
I have previously used Proc Glimmix with repeated measurement, hence I have used random SubjectID; or similar, and I have allowed for overdispersion with random _residual_;. Previously the overdispersion factor has appeared in the covariance parameters with a standard error (and confidence limits), but now I am just running a simple Poisson regression with one observation per subject, so there are no random effects. The overdispersion factor now appears in the parameter estimates as "Residual", along with the fixed effects, but it does not have a standard error. How come, and what can I do to get a standard error (and, of course, confidence limits)?
Here's my code:
proc glimmix data=dat2;
class Sex;
model AntiN = Sex/s noint link=log dist=poisson;
estimate "Mean Males" Sex 1 0 /exp alpha=0.1;
estimate "Mean Females" Sex 0 1 /exp alpha=0.1;
estimate "Effect of Sex, F/M" Sex -1 1 /exp alpha=0.1;
random _residual_;
run;
I seem to recall encountering this problem many years ago, but I can't find anything in my files about it. Thank you! Will
02-28-2024
02:06 PM
I stopped using non-parametric analyses about 25 years ago, when I realized that only severe non-normality of the dependent variable is an issue, and in such cases it can be addressed with transformation (usually log) or a generalized linear model. Normality of the sampling distribution of the outcome statistic is an issue, if you use a t statistic to make inferences, but normality is practically guaranteed by the Central Limit Theorem, and you can't test for it. I laugh out loud whenever I see an author reporting that they tested their data for normality, and getting significance, opted for non-parametric analyses. Will
02-23-2024
03:10 PM
3 Likes
Running Chrome in incognito mode solved the problem, so thanks heaps to Greg Wootton. I then tried switching off all browser extensions. The culprit was EndNote Click. Presumably it was added or updated lately and I didn't notice. Whatever, it's incompatible with the CODE window in Studio. Thanks again, Greg. Someone jumped the gun by accepting as a solution a previous message suggesting to contact SASoda@sas.com. Will
02-22-2024
05:50 PM
Thanks for the quick reply. I just by-passed my router by using my cellphone's hotspot, but no joy. I can't try another computer, but my European colleague has the same problem. I'll try technical support. There is no link to it from the SAS site, not that I can find, but I have an email address from some years ago.
02-22-2024
04:40 PM
This is URGENT. It's practically unusable. While you're at it, you could try fixing another sporadic problem, whereby when attempting to highlight several lines in a program, all lines below get highlighted. Oh, and another thing: sometimes you can't see where the cursor is, nor can you see the highlighting on code that is nevertheless highlighted.
02-22-2024
04:14 PM
Not sure if this is the right community. I have noticed in the last day or two that the CODE window in SAS Studio is very slow to respond to mouse clicks and scrolling. I tried it on Chrome and Edge, and I tried rebooting my laptop, all to no avail. I am on the Asia/Pacific server. A colleague working on a European server has noticed the same problem. Will
10-19-2022
09:03 PM
Thanks, Reeza. Maybe the documentation should be clearer on this point. Getting filled outlined symbols is something we do often. "If you want filled outlined markers, you must use filledoutlinedmarkers, and markerattrs will then work only for size and symbol, but not color. Use markeroutlineattrs=(color=black) markerfillattrs=(color=whatever) to specify outline and fill colors."
10-19-2022
05:45 PM
Thanks heaps for the quick reply. I tried deleting filledoutlinedmarkers and adding markeroutlineattrs and markerfillattrs, but I got a symbol filled with black:
scatter x=&X.Delta y=&Y.DeltaCon/markerattrs=(symbol=trianglefilled size=12) markeroutlineattrs=(color=black) markerfillattrs=(color=white)
But it didn't quite work. When I added filledoutlinedmarkers back in, it worked:
scatter x=&X.Delta y=&Y.DeltaCon/filledoutlinedmarkers markerattrs=(symbol=trianglefilled size=12) markeroutlineattrs=(color=black) markerfillattrs=(color=white)
So problem solved, but the documentation needs updating. In spite of what the documentation states, symbol= does not work within markerattrs=. Will
10-19-2022
05:24 PM
Sorry, in case it matters, I am using SAS Studio in SAS ODA.
10-19-2022
04:39 PM
According to the documentation at https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/grstatproc/p0i3rles1y5mvsn1hrq3i2271rmi.htm I can write:
markerattrs=(color=lightgreen size=10 symbol=trianglefilled)
When I do this in my program, I have:
[screen capture of the code]
Notice that I have inserted it as a screen capture, and that the word "symbol" has not rendered in blue. Sure enough, SGPLOT has not recognized this option word and has not plotted filled triangles. I am running it with styleattrs, but starring that off makes no difference. This appears to be a pretty fundamental bug, or maybe I am doing something wrong, but I can't figure it out and I don't know what else to try. I have searched the web and the FAQs before posting this help request. Will
12-18-2021
02:00 PM
I have done further simulations to check what happens when I have another variable that tracks the change scores. The simulations are for real data, where the sport scientists at Olympiatoppen are monitoring changes in biomechanical measures in one kind of test (on a dynamometer) in their athletes, and they want to know the extent to which the changes in those measures track changes in the more usual fitness or performance tests (jump height, sprint speed and so on). As I noted previously, the random-effect solutions provide measures of change scores, and their SD were smaller (sometimes by quite a lot) than the SD coming from the covparms. But surprisingly, when I used the random-effect solutions as predictors in a model with the performance test measure as the dependent, I got unbiased estimates of the relationships that I had set up in the data between the biomech and performance measures. So I was wrong about needing to correct for attenuation.
12-02-2021
05:31 PM
Thanks, Steve. Yes, I was thinking of getting around to playing with Proc FMM, but first I want to see if including all my predictors reduces the overdispersion enough not to have to worry about it. Certainly, at the moment the overdispersion is huge, owing partly to the high proportion of zeros. Meantime, as per your suggestion, I got it going in Proc Genmod. There was no "straight" residual in Genmod, I got the same predicted values as in Glimmix (with or without overdispersion) and the Pearson residual in Genmod was the same as the chi-squared residual in Glimmix (without overdispersion) for all observations, including those with zero counts (which differed between observations with zero counts). The Pearson or chi-squared residuals are derived presumably by dividing -1 by the appropriate sampling SD. It looks like -1 is the lowest possible value for the straight resid; even one medal for a country with a huge population gives a value a bit less negative than -1. So the value of -1 might be something to do with getting the right estimate for the Pearson residual when the count is zero, but it's still weird, because you would think that the residual when the observed count is zero should be more negative for larger predicted values. So I am stumped.
12-01-2021
10:25 PM
I've got a really simple model, in which I am predicting total counts (of countries' medals at Pyeongchang) with countries' populations. I am interested in interpreting the residuals as a measure of "sportiness". (I will also be adjusting for GDP per capita, latitude, and a few other things, but let's keep it simple right now.)
Here's the essential code:
proc glimmix data=all;
class Mean Country;
model Total=Mean &pred/s noint link=log dist=Poisson;
random _residual_/subject=Country;
lsmeans Mean/ilink cl alpha=&alpha;
estimate "&pred %/%" &pred 1/alpha=&alpha;
output out=pred resid=Resid resid(ilink)=ResidBT student=StudentResid predicted=Pred;
ods output covparms=cov;
I've got lots of zero medal counts for countries that sent a team to the Olympics but got no medals. The countries with zero counts have a residual of -1. I cannot find anything in SAS or on the web to explain why that value has been chosen and what it means. Here's the first few lines of the pred dataset to illustrate:
[listing of the pred dataset]
ResidBT is the difference between the observed count and the back-transformed predicted count, and I have checked manually with the parameter estimates that it's correct. What I can't figure out is how to work with Resid: it is not exactly the same as log(Total) minus Pred, and, of course, you can't use log(Total) anyway when Total=0. But why -1, and how are the other resids estimated for non-zero counts? Thanks, guys. Will
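One possible explanation for the -1, offered as a sketch rather than a definitive answer. It assumes that the default resid= in the output statement is the residual of the linearized pseudo-data on the link scale, and that the link is log, so that the derivative of the mean with respect to the linear predictor is the mean itself. The symbols y (observed count), \hat{\mu} (back-transformed predicted count), \hat{\eta} (linear predictor) and \hat{\phi} (overdispersion factor) are notation introduced for this note.

```latex
\[
\begin{aligned}
r_{\text{link}}
  &= \frac{y-\hat{\mu}}{\left.\partial\mu/\partial\eta\right|_{\hat{\eta}}}
   = \frac{y-\hat{\mu}}{\hat{\mu}}
   = \frac{y}{\hat{\mu}} - 1
  &&\Longrightarrow\quad r_{\text{link}} = -1 \ \text{ when } y = 0, \\[6pt]
r_{\text{Pearson}}
  &= \frac{y-\hat{\mu}}{\sqrt{\hat{\phi}\,\hat{\mu}}}
  &&\Longrightarrow\quad r_{\text{Pearson}} = -\sqrt{\hat{\mu}/\hat{\phi}} \ \text{ when } y = 0 .
\end{aligned}
\]
```

Under that assumption, Resid is a relative residual that cannot go below -1, and a single medal for a country with a large predicted count gives a value just above -1. The back-transformed residual (ResidBT, which is y minus the predicted count) and the Pearson residual are the ones that become more negative for larger predicted values when the count is zero.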
11-08-2021
07:43 PM
1 Like
In case anyone is still interested in this topic, I have resolved the problem of the variance of the random-effect solution (as given by ods output solutionr= in Proc Mixed) being somewhat (and sometimes a lot) less than the variance given by its corresponding covparm (as given by ods output covparms=). I did it by developing another simulation for a study of repeated measurements, where each subject (athlete) has several tests in one season followed by several tests in the next season. The random-effect structure is given by random int Season/subject=Athlete;. This model estimates differences between athletes (the solution for the random Intercept/subject=Athlete), changes within athletes between seasons (the solution for random Season/subject=Athlete), and changes within athletes within seasons (the residuals).
The individual solution values for the random effects all have standard errors, and it turns out that the variance of the solution plus the mean standard error squared is equal to the variance given by the covparms. Even the residuals must have standard errors, but I don't know how to estimate those and I assume they account for the difference between the variance of the residuals and residual variance given by the covparms.
See below for the code for this simulation. I included lots of code to check that the random-effect solution produces unbiased estimates, and that the 90% compatibility intervals for each solution value did indeed include the true (simulated) value 90% of the time. I could have done it all with the previous code for individual responses, but I wanted to simulate repeated testing of athletes for a project I am helping with at Olympiatoppen (the Norwegian Olympic Sports Center).
I'll be using the residuals and possibly the random-effect solution for Season as linear predictors in another mixed model. Their linear effects will be attenuated by the standard errors in each value. These standard errors vary a little bit, depending on the number of repeated measurements each athlete has, but the mean will be given by the difference in the variances described above. I don't have to calculate it, because all I am interested in is the correction for attenuation of a linear predictor that has error; the correction factor is given by 1/ICC (ICC = intraclass correlation coefficient), and the ICC is given by (pure variance)/(observed variance), which here is the variance of the random-effect solution divided by the covparm variance. I won't be allowing negative variances with the upcoming data, so there won't be any problem with interpreting the random-effect solution. Will
*simulates repeated tests within each of two seasons;
*some of these macro variables are irrelevant here, copied from another program;
%let Ssize=100; *number of subjects;
%let BtwnSD=4; *true differences between subjects, to be estimated by a random effect;
%let Mean=100; *true grand mean, irrelevant here;
%let WthnErr=2; *error within subjects between tests within seasons, to be estimated by the residual;
%let BtwnErr=1; *extra error within subjects between seasons, to be estimated by a random effect;
%let MaxTestsWthn=4; *maximum number of tests in each season;
%let alpha=0.1;
%let nob=; *make it nob=nobound to allow negative variance;
%let convcrit=CONVH=1E-8 convf=1E-8; *make the values 1E-7, 1E-6 or larger if it fails to converge;
%let tvalue=3.5; *residual standardized threshold for outliers;
%let logflag=0; *set to 1 for log transformation of the dependent;
%let deceff=1; *decimal places for effects and SDs;
data dat1;
TestID=0;
do Athlete=1 to &Ssize;
YtrueRand=&BtwnSD*rannor(0);
do Season=1;
BtwnSeasErr=&BtwnErr*rannor(0);
do TestNo=1 to 2+int((&MaxTestsWthn-1)*ranuni(0));
WthnSeasErr=rannor(0)*&WthnErr;
Yobsvd=&Mean+YtrueRand+BtwnSeasErr+WthnSeasErr;
TestID=TestID+1;
output;
end;
end;
do Season=2;
BtwnSeasErr=&BtwnErr*rannor(0);
if ranuni(0)>0.5 then do; *this gives 50% chance of any values in Season 2;
*if ranuni(0)>0 then do; *always have values in Season 2;
*do TestNo=1 to 2+int((&MaxTestsWthn-1)*ranuni(0)); *on average, as many as in Season 1;
do TestNo=1 to 1+int((&MaxTestsWthn-1)*ranuni(0)); *sometimes only one test in Season 2;
WthnSeasErr=rannor(0)*&WthnErr;
Yobsvd=&Mean+YtrueRand+BtwnSeasErr+WthnSeasErr;
TestID=TestID+1;
output;
end;
end;
end;
end;
title "Values for first three athletes";
proc print data=dat1;
where Athlete<4;
format YtrueRand BtwnSeasErr WthnSeasErr Yobsvd 5.2;
run;
ods select none;
title "Repeated tests over two seasons, mixed model";
proc mixed data=dat1 covtest cl alpha=&alpha &nob &convcrit;
class Athlete Season;
model Yobsvd=/s outp=pred residual alphap=&alpha ddfm=sat ;
random int Season/subject=Athlete s cl alpha=&alpha; * type=un;*random intercepts and slopes model;
*lsmeans Mean Gender/cl alpha=&alpha;
*estimate "Male/female" Gender 1 -1/cl alpha=&alpha;
ods output covparms=cov;
ods output estimates=est;
ods output lsmeans=lsm;
ods output solutionr=solr;
ods output solutionf=solf;
ods output classlevels=clev;
*by Sport;
*where Sport="Endur";
run;
ods select all;
data cov1;
set cov;
DegFree=2*Zvalue**2;
title2 "Random effects as variances";
title3 "True values: Intercept=&BtwnSD**2; Season=&BtwnErr**2; Residual=&WthnErr**2";
proc print data=cov1;
format _numeric_ 5.&deceff DegFree 5.0;
run;
title2 "Variance of residuals";
proc means data=pred noprint;
var resid;
output out=pred1(drop=_type_ _freq_) n=NoOfObs var=Variance;
run;
proc sort data=solr;
by Effect;
title2 "First 10 observations of Intercept random-effect solution";
proc print data=solr(obs=10);
where Effect="Intercept";
run;
title2 "First 10 observations of Season random-effect solution";
proc print data=solr(obs=10);
where Effect="Season";
run;
data solr0;
set solr;
SEsq=StdErrPred**2;
title2 "Variance of random-effect solutions";
proc means data=solr0 noprint;
var estimate SEsq;
by Effect;
output out=solr1(drop=_type_ _freq_ d) n=NoOfObs var=Variance mean=d SEsq;
where estimate ne 0;
run;
*proc print;run;
data solresid;
set solr1 pred1;
if Effect="" then Effect="Residual";
data all;
merge solresid cov1(keep=estimate degfree rename=(Estimate=CovParmVariance));
VarianceAdjustDF=Variance*(NoOfObs-1)/DegFree;
VarPlusSEsq=Variance+SEsq;
EstimatedSEsq=CovParmVariance-Variance;
CorrAttenFactor=CovParmVariance/Variance;
title2 "Variance of random-effect solutions and residuals, and covparm variance";
title3 "SEsq is the mean of StdErrPred squared of the solutions";
title4 "The covparm variance is evidently the sum of the variance and SEsq";
title5 "Hence the estimate of SEsq for the residual variance (I can't estimate error in each residual)";
title6 "When a random-effect solution or residuals are used as linear predictors, their effects";
title7 "will need to be corrected upwards by CorrAttenFactor = CovParmVariance/Variance = 1/ICC.";
title8 "Variance adjusted for DF does not work: it's always greater than the covparm variance";
proc print;
var Effect NoOfObs Variance SEsq VarPlusSEsq CovParmVariance EstimatedSEsq
DegFree VarianceAdjustDF CorrAttenFactor;
format _numeric_ 5.&deceff CorrAttenFactor 5.2 DegFree 5.0;
run;
title2 "Check individual true values of Season against random-effect solution";
*proc print data=solr;run;
data season;
set solr;
if Effect="Season";
keep Athlete Estimate Season Lower Upper;
*proc print;run;
proc sort data=dat1;
by Athlete Season;
data dat2;
set dat1;
if lag(Athlete) ne Athlete or lag(Season) ne Season;
keep Athlete Season BtwnSeasErr;
*proc print;
run;
data season1;
merge season dat2;
by Athlete Season;
if BtwnSeasErr ne .; *needed when a subject has no test in Season 2;
if BtwnSeasErr<Lower or BtwnSeasErr>Upper then Type0error="***";
*proc print;run;
title3 "Type0 error (i.e., when 90%CI for individual estimates does not include true value)";
title4 "Expected value is 10%";
proc freq;
tables Type0error/missing nocum;
run;
title3 "Expected true BtwnSeasErr: StdDev=&BtwnErr; variance=&BtwnErr**2";
proc means n mean std var maxdec=&deceff;
var Estimate BtwnSeasErr;
run;
title3 "True value (BtwnSeasErr) vs solution (Estimate)";
title4 "Line of identity shown in black, indicating no bias in the estimated value";
title5 "(Rerun the analysis to show that any apparent bias is just sampling variation.)";
ods graphics / reset width=14cm height=12cm imagemap attrpriority=none;
proc sgplot data=season1; *uniform=all;
styleattrs
datacolors=(red blue white)
datacontrastcolors=(black)
datasymbols=(circlefilled); *needs a group= for styleattrs to work;
*scatter x=pred y=Resid/markerattrs=(size=6 symbol=circlefilled) filledoutlinedmarkers
group=PossessionType;
scatter x=Estimate y=BtwnSeasErr/markerattrs=(size=8 color=black symbol=circlefilled) filledoutlinedmarkers MARKERFILLATTRS=(color=lightgreen); * group=Gender;
scatter x=BtwnSeasErr y=BtwnSeasErr/markerattrs=(size=3 color=black symbol=circlefilled);
*scatter x=pred y=Resid/markerattrs=(size=8 color=black symbol=circlefilled) filledoutlinedmarkers MARKERFILLATTRS=(color=lightgreen);
*scatter x=StrokeNo y=Resid/markerattrs=(size=3 color=black symbol=circlefilled) filledoutlinedmarkers MARKERFILLATTRS=(color=black);
refline 0/axis=y lineattrs=(pattern=dot thickness=1 color=black);
refline 0/axis=x lineattrs=(pattern=dot thickness=1 color=black);
reg x=Estimate y=BtwnSeasErr/degree=1 lineattrs=(pattern=solid thickness=1 color=blue) legendlabel="linear" nomarkers;
*reg x=pred y=Resid/degree=2 lineattrs=(thickness=1 color=red) legendlabel="quadratic" nomarkers;
*by Sport;
run;
ods graphics / reset;
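The relations that this simulation checks can be summarized as follows. This only restates what the post and the code describe, in notation introduced here: \hat{u}_i is the random-effect solution value for athlete i, SE_i is its StdErrPred, \hat{\sigma}_u^2 is the corresponding covparm variance, and \beta is the slope obtained when the solution is used as a linear predictor.

```latex
\[
\operatorname{Var}(\hat{u}_i) + \overline{SE_i^{2}} \approx \hat{\sigma}_u^{2},
\qquad
\mathrm{ICC} = \frac{\operatorname{Var}(\hat{u}_i)}{\hat{\sigma}_u^{2}},
\qquad
\beta_{\text{corrected}}
  = \frac{\beta_{\text{observed}}}{\mathrm{ICC}}
  = \beta_{\text{observed}}\cdot\frac{\hat{\sigma}_u^{2}}{\operatorname{Var}(\hat{u}_i)} .
\]
```

In the code these quantities correspond to VarPlusSEsq, CovParmVariance and CorrAttenFactor. The later post of 12-18-2021 reports that the correction turned out not to be needed when the random-effect solutions were used as predictors.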