BookmarkSubscribeRSS Feed
Rahulpaul
Calcite | Level 5

I am trying to Simulate a Device Performance by checking the Rate ratio between the two periods. The assumption is that 

 90% of patients will achieve performance success in both Evaluable Periods of this study.  With a sample of 50 subjects with at least one Kt/V measure in each Evaluable Period, McNemar’s test will provide 80% power to ensure the lower limit of the observed one-sided 97.5% confidence interval for the absolute difference in the proportion of subjects who achieve performance success (P2 – P1) will be greater than -17% when the expected difference between periods is zero and the proportion of subjects who achieve performance success in one period only (discordant) is 0.1. Here for the NLMIXED procedure I am getting an error of

"ERROR: Quadrature accuracy of 0.000100 could not be achieved with 31 points. The achieved accuracy was 1.000000."

 

The code is :

title2 "nlmixed";
proc nlmixed data=selected_at_w_1_3_6;
/* by parameter_scenario_no;*/
where also parameter_scenario_no=2;
where also simul_seqno=1;
parms mu1=1.72 mu2=1.28 g11=0.1 g21=0.47 g22=0.1; /* for starting value consider using data set */
* other parameterisation than glimmix;
eta = b1*(1-IsPeriod2) + b2*IsPeriod2;
expeta = exp(eta);
p = expeta/(1+expeta);
model aval01 ~ binomial(1,p);
random b1 b2 ~ normal([mu1,mu2],[g11,g21,g22]) subject=subjidn out = check;
run;

 

Any help in this regards will be great.

 

Thanks 

 The data is likedata.JPG

 

 

9 REPLIES 9
Rick_SAS
SAS Super FREQ

Could you please post the sample data as a DATA step instead of as an image? Thanks.

Rahulpaul
Calcite | Level 5

Hi @Rick_SAS sorry for not doing the same earlier here is the simulated data step;

 

options msglevel=i;
options nocenter;
options formchar= "|----|+|---+=|-/\<>*";
options linesize=200 pagesize=100;
options mprint;
options source source2;
options compress=yes;

libname this "."; /* Current directory - used for storing temporary datasets to be used across programs */

%let gl_outputstem=%scan(%sysfunc(getoption(sysin)),-2,%str(./\));
%let gl_outputstem=&gl_outputstem.-dt-%sysfunc(putn(%sysfunc(date()),e8601da.));

ods noproctitle;


options orientation=portrait papersize=A4;
ods graphics on / width=18cm height=22cm ANTIALIASMAX=1000 border=no /* IMAGEFMT=EMF */; /* size for portrait */

*options orientation=landscape papersize=A4;
*ods graphics on / width=25cm height=12cm ANTIALIASMAX=1000 border=no /* IMAGEFMT=EMF */; /* size for landscape */

footnote "DRAFT, CONFIDENTIAL - %scan(%sysfunc(getoption(sysin)),-2,%str(./\))";


ods pdf file="&gl_outputstem..pdf" style=pearl;
ods pdf exclude all;

options linesize=180;

data parameter_scenarios;
parameter_scenario_no=0;

parameter_scenario_no+1;

N_subject= 50;
N_simul= 500;

mu1= 2.97;
mu2= 2.97;
mu1= 2.3;
mu2= 2.3;
/* The square root of the variance */
sigma11= (3.11 - 2.97) / 1.96 * sqrt(22); /* page 4 in Bernardo, 2016 - around 0.34 */
sigma22= sigma11;
/* Assuming independence between subjects measurements, so no parameter correlation */
/* This is to simple, but will have to do for now.
Better: simulate person parameters period 1 and period 2 with
correlation between period 1 and 2. Add variation for each measurement.
Use Cholevsky decomposition of variance matrix to simulate 2-dim normal distributed data.
*/
output;

parameter_scenario_no+1;
mu2= 2.00;
output;

run;

proc print data=parameter_scenarios;
run;

data simul;
if _n_=1 then do;
call streaminit( 22021561 );
end;

set parameter_scenarios;

do simul_seqno=1 to N_simul;
do subjidn=1 to N_subject;
do PeriodN=1 to 2;
do visitn=PeriodN*100+1 to PeriodN*100+21;
if periodN=1 then do;
aval= RAND('NORMAL', mu1, sigma11 ) ;
end;
else do;
aval= RAND('NORMAL', mu2, sigma22 ) ;
end;
keep parameter_scenario_no;
keep simul_seqno;
keep subjidn;
keep PeriodN;
keep visitn;
keep aval;
output;
end;
end;
end;
end;
run;

data selected_at_w_1_3_6;
set simul;
by parameter_scenario_no simul_seqno subjidn PeriodN visitn;
where (periodN=1 and visitn in (%eval(101+3), %eval(101+10), %eval(101+20))) or
(periodN=2 and visitn in (%eval(201+3), %eval(201+10), %eval(201+20)));
IsPeriod2= (periodN=2);
Succes_limit=2; /* <-- change here MJA 2018-04-27 */
aval01= aval>= Succes_limit;
run;

 

Any Help in this regards will be very kind of you!!

 

Rick_SAS
SAS Super FREQ

I don't know why your model does not converge, but here are a few thoughts:

1. Usually the simulation generates data from the model that is being fit.  You define the response as  

aval01= (aval>= 2);

whereas I expected to see

aval01 = logistic(b1*(1-IsPeriod2) + b2*IsPeriod2);

with known values of b1 and b2. By using a hard cutoff value, you are trying to fit a model the does not fit the data, which can often lead to lack of convergence.

2. You should use the BOUNDS statement to makes sure g11 ad g22 > 0.

3. You can use Start to Stop by Increment statement to try to find a good starting parameter value. For example:

parms mu1=0.1 to 1.8 by 0.2
mu2=0.1 to 1.4 by 0.2
g11=0.1
g21=-0.2 to 0.2 by 0.1
g22=0.1;

 

As I said, I don't know why your model doesn't converge, but in general there is no reason to expect that the model will converge when the data are not drawn from the model..

Rahulpaul
Calcite | Level 5

HI @Rick_SAS thanks for your inputs on the code however here are my reasoning for the data simulation:

1) With aval01= (aval>= 2) I am creating a binary variable which is based on the aval which has a success limit of 2.

 

 2) I have used the same data with proc glimmix the code below:

proc print data=selected_at_w_1_3_6 width=min;
where also subjidn=1;
where also simul_seqno=1;
by parameter_scenario_no;
id parameter_scenario_no;
var subjidn visitn aval IsPeriod2 aval01;
run;

proc sgpanel data=selected_at_w_1_3_6;
where also simul_seqno=1;
panelby parameter_scenario_no / novarname LAYOUT=COLUMNLATTICE;
scatter x=visitn y=subjidn / group=aval01;
run;

title2 "glimix";
proc glimmix data=selected_at_w_1_3_6 METHOD=QUAD;
by parameter_scenario_no;
where also parameter_scenario_no=2;
where also simul_seqno=1;
class subjidn;
class PeriodN;
model aval01= IsPeriod2 / dist=binomial link=logit solution;;
random PeriodN / subject=subjidn type=un;
run;

 

Here I am able to get the estimates of the covariances and I want to use these estimates to calculate the power and sample size of the study? Any Idea? 

 

 

Rick_SAS
SAS Super FREQ

If you have faith in the GLIMMIX model, you can try using the parameter estimates from that procedure in the NLMIXED procedure.

However, when I run your code I get

NOTE: Estimated G matrix is not positive definite

and see covariance parameter estimates that look suspect:

 

Covariance Parameter Estimates
Cov Parm Subject Estimate Standard
Error
UN(1,1) subjidn 0 .
UN(2,1) subjidn -0.3560 .
UN(2,2) subjidn 0

.

 

You might want to look at the paper "Tips and Strategies for Mixed Modeling with SAS/STAT® Procedures" to see if any of the ideas in there are applicable to your situation.

 

I don't have any additional ideas, but maybe someone like @sld will be able to provide insight.

sld
Rhodochrosite | Level 12 sld
Rhodochrosite | Level 12

I agree with @Rick_SAS that the covariance parameter estimates indicate a serious problem with the model and/or data. Missing SEs are always a bad sign.

 

Consider the output from this code:

 

data test;
 set selected_at_w_1_3_6;
 where also parameter_scenario_no=2;
 where also simul_seqno=1;
 run;
proc tabulate data=test;
class subjidn periodn isperiod2 aval01;
table periodn, isperiod2;
table subjidn, periodn*isperiod2;
table subjidn, periodn*aval01;
run;
proc glimmix data=test method=laplace;
class subjidn periodn isperiod2;
model aval01 = isperiod2 / dist=binary;
random intercept isperiod2 / subject=subjidn;
lsmeans isperiod2 / ilink;
run;

PeriodN and IsPeriod2 contain the same information, just named differently; using both in the model specification seems unnecessary. I would not assume by default that you need a covariance structure for heterogeneous variances; for the binary distribution, variance is a function of the mean, and you would have to ponder what heterogenous variances between periods might mean before you built it into the model.

 

There are 3 observations for each subjidn at each level of PeriodN or ISPeriod2.  I'm guessing that you have 3 binary trials clustered within each period in each subject. You could restructure your dataset so that you could use a binomial distribution (n_successes/n_events) or specify clustering of the binary observations as in the code above.

 

I'm not inclined to spend time working my way through your simulation code, but my suspicion is that you have not incorporated variance at the subjidn level or at the period-within-subjidn level, and consequently the variance estimates for these two design units are nonexistent and thus set to zero by GLIMMIX. If my assumption is not correct, you can let me know and I'll give it some more thought. 

 

Rick_SAS
SAS Super FREQ

@sld nailed it. I hadn't noticed the form of the GLIMMIX model because the NLMIXED model didn't have that mistake. 

 

My last advice: Running a simulation should start with writing down the model you are trying to generate in terms of [i,jk] notation, such as the response for the i_th patient in the j_th period and the k_th visit. That model will help you to write the simulation as well as to write the syntax to fit the model. Then generate ONE sample and work out all the difficulties BEFORE you add the outermost 'simulation loop'.  That is the approach I advocate in my book Simulating Data with SAS, although I only provide a few examples for mixed models.

Rahulpaul
Calcite | Level 5

Hi, @Rick_SAS thanks for all the help regarding the model and error. Really appreciate it a lot.

Rahulpaul
Calcite | Level 5

Hi @sld Thanks for the insight on the simulation data, I have taken your advice and have tried to change the data now. Hope to get a positive result now. 

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is ANOVA?

ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 9 replies
  • 2507 views
  • 1 like
  • 3 in conversation