BookmarkSubscribeRSS Feed
R_Fraser
Calcite | Level 5

Dear ALL,

I am testing some code to compute the variance via Taylor series linearization for a stratified cluster design. However my answer is only correct up to 2 significant digits. Can anyone offer suggestions as to why my answer is not exactly the same as when done with PROC SURVEYREG. See link for formulas SAS/STAT(R) 9.22 User's Guide. I initially thought that the inv( ) function was the culprit but that does not seem to be the case here.

/* TEST CODE */

proc surveyreg data=exexx;

  model ptincom = cage white;

  weight suppwgt;

  cluster keyfitz;

  strata strata;

  output out = simex residual = res;

run;

proc iml;

use simex;

read all var {cage white} into X;

read all var {ptincom} into y;

read all var {suppwgt} into w;

read all var {strata} into strata;

read all var {keyfitz} into psu;

read all var {res} into e;

close simex;

X = J(nrow(y),1) || X;

M = J(3,3,0);

ustrata = unique(strata);

do h = 1 to ncol(ustrata);                /* strata index*/

       hx = loc(strata = ustrata);

       ww = w[hx];

       ee = e[hx];

       XX = X[hx,];

       clx = psu[hx];

       uclx = unique(clx);

       U = J(3,ncol(uclx),.);

       do i = 1 to ncol(uclx);           /* cluster (PSU) index*/

            dx = loc(clx = uclx);

            d = ww[dx]#ee[dx];

            P = t(d#XX[dx,]);

            eb = P[,+];                /* (p x 1) sum columns */

            U[,i] = eb;

       end;

  Z = U[,:];             /* row means */

  nh = ncol(U);           /* number of PSUs in each stratum */

  R = nh/(nh - 1)*(U - Z)*t(U - Z);

  M = M + R;

end;

n = nrow(X);

M = (n-1)/(n-3)*M;

B = t(X)*(w#X); /* X' D X where D = diag(w) */

V = inv(B)*M*inv(B); /* covariance estimation  */

se = sqrt(vecdiag(V)); /* estimated standard error */

print se;

quit;

Regards,

Raphael

3 REPLIES 3
Rick_SAS
SAS Super FREQ

Interesting program.   I am not an expert in survey statistics, but I don't see anything obviously wrong.  Do you have any missing values in your data? Any degenerate strata? That might affect DOF computations.

There is a SAS macro that does computations similar to SURVEYREG. You might try the macro and see what results it gives.  If it matches the PROC, then look at the code it calls and see if you can find where it differs from your code.

You didn't include data, so no way for anyone to run your code.  The following simulation code creates a data set that may or may not be similar to your data. For this data set, the standard errors from SURVEYREG and IML match, so I assume that there is something special about your data.

data exexx;

call streaminit(1234);

do strata = 1 to 5;

   do keyfitz = 1 to 4;

      do i = 1 to 3+rand("poisson",5);

         cage=rand("Normal");

         white=rand("Normal");

         ptincom=3+0.5*cage-1.9*white + rand("Normal");

         suppwgt=strata+10;

         output;

      end;

   end;

end;

run;

R_Fraser
Calcite | Level 5

It turns that the problem is with the data and not the code. Thank you so much.

Raphael

ballardw
Super User

Do any of the variables used in the surveyreg except the weight variable have a format assigned? If so, might it cause some grouping?

If you have values in the data of 12345.678 and 12345.999 and the format is f5. then both could be treated as 12345 in any calculations.

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

Multiple Linear Regression in SAS

Learn how to run multiple linear regression models with and without interactions, presented by SAS user Alex Chaplin.

Find more tutorials on the SAS Users YouTube channel.

From The DO Loop
Want more? Visit our blog for more articles like these.
Discussion stats
  • 3 replies
  • 1008 views
  • 7 likes
  • 3 in conversation