05-18-2015 11:59 AM

Dear ALL,

I am testing some code that computes the variance via Taylor series linearization for a stratified cluster design. However, my answer agrees with PROC SURVEYREG only to 2 significant digits. Can anyone suggest why my answer is not exactly the same? The formulas are in the SAS/STAT(R) 9.22 User's Guide. I initially thought that the inv( ) function was the culprit, but that does not seem to be the case here.
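For reference, the estimator that the test code below appears to implement can be written out as follows. This is a sketch read off the code rather than a quotation from the User's Guide; it assumes no finite population correction (no RATE= or TOTAL= option), and the symbols $r_{hij}$, $e_{hi\cdot}$ and $G$ are labels chosen here for illustration. In the code, $p = 3$, the columns of `U` hold the $e_{hi\cdot}$, `Z` is $\bar{e}_{h\cdot\cdot}$, `M` is $G$, and `B` is $X^\top W X$.

```latex
\[
\begin{aligned}
r_{hij} &= y_{hij} - x_{hij}^\top \hat\beta, \qquad
e_{hi\cdot} = \sum_{j} w_{hij}\, r_{hij}\, x_{hij}, \qquad
\bar{e}_{h\cdot\cdot} = \frac{1}{n_h} \sum_{i=1}^{n_h} e_{hi\cdot}, \\
G &= \frac{n-1}{n-p} \sum_{h} \frac{n_h}{n_h - 1}
     \sum_{i=1}^{n_h} \left(e_{hi\cdot} - \bar{e}_{h\cdot\cdot}\right)
                      \left(e_{hi\cdot} - \bar{e}_{h\cdot\cdot}\right)^\top, \\
\hat{V}(\hat\beta) &= \left(X^\top W X\right)^{-1} G \left(X^\top W X\right)^{-1},
\qquad W = \operatorname{diag}(w_{hij}).
\end{aligned}
\]
```

Here $h$ indexes strata, $i$ indexes PSUs within a stratum, $j$ indexes observations within a PSU, $n_h$ is the number of PSUs in stratum $h$, and $n$ is the total number of observations.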

/* TEST CODE */

proc surveyreg data=exexx;
   model ptincom = cage white;
   weight suppwgt;
   cluster keyfitz;
   strata strata;
   output out = simex residual = res;
run;

proc iml;
use simex;
read all var {cage white} into X;
read all var {ptincom} into y;
read all var {suppwgt} into w;
read all var {strata} into strata;
read all var {keyfitz} into psu;
read all var {res} into e;
close simex;

X = J(nrow(y),1) || X;            /* add intercept column */
M = J(3,3,0);
ustrata = unique(strata);

do h = 1 to ncol(ustrata);        /* strata index */
   hx = loc(strata = ustrata[h]); /* rows in stratum h */
   ww = w[hx];
   ee = e[hx];
   XX = X[hx,];
   clx = psu[hx];
   uclx = unique(clx);
   U = J(3,ncol(uclx),.);
   do i = 1 to ncol(uclx);        /* cluster (PSU) index */
      dx = loc(clx = uclx[i]);    /* rows in PSU i of stratum h */
      d = ww[dx]#ee[dx];
      P = t(d#XX[dx,]);
      eb = P[,+];                 /* (p x 1) sum columns */
      U[,i] = eb;
   end;
   Z = U[,:];                     /* row means */
   nh = ncol(U);                  /* number of PSUs in each stratum */
   R = nh/(nh - 1)*(U - Z)*t(U - Z);
   M = M + R;
end;

n = nrow(X);
M = (n-1)/(n-3)*M;
B = t(X)*(w#X);                   /* X' D X where D = diag(w) */
V = inv(B)*M*inv(B);              /* covariance estimation */
se = sqrt(vecdiag(V));            /* estimated standard error */
print se;
quit;

Regards,

Raphael


Posted in reply to R_Fraser

05-19-2015 03:07 PM

Interesting program. I am not an expert in survey statistics, but I don't see anything obviously wrong. Do you have any missing values in your data? Any degenerate strata? That might affect DOF computations.
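One rough way to check for both conditions is sketched below. It assumes the data set and variable names from the original post (exexx, ptincom, cage, white, suppwgt, strata, keyfitz) and that the analysis variables are numeric; n_psu is a name made up for this example.

```sas
/* Count nonmissing and missing values for the numeric analysis variables */
proc means data=exexx n nmiss min;
   var ptincom cage white suppwgt;
run;

/* List strata that contain fewer than two distinct PSUs.        */
/* In the IML code such a stratum makes nh/(nh - 1) divide by 0. */
proc sql;
   select strata, count(distinct keyfitz) as n_psu
      from exexx
      group by strata
      having count(distinct keyfitz) < 2;
quit;
```

If the PROC SQL step selects no rows, every stratum has at least two PSUs. The MIN column from PROC MEANS also shows whether any weights are zero or negative.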

There is a SAS macro that does computations similar to SURVEYREG. You might try the macro and see what results it gives. If it matches the PROC, then look at the code it calls and see if you can find where it differs from your code.

You didn't include data, so no way for anyone to run your code. The following simulation code creates a data set that may or may not be similar to your data. For this data set, the standard errors from SURVEYREG and IML match, so I assume that there is something special about your data.

data exexx;
call streaminit(1234);
do strata = 1 to 5;
   do keyfitz = 1 to 4;
      do i = 1 to 3+rand("poisson",5);
         cage=rand("Normal");
         white=rand("Normal");
         ptincom=3+0.5*cage-1.9*white + rand("Normal");
         suppwgt=strata+10;
         output;
      end;
   end;
end;
run;
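To compare the two sets of standard errors beyond the digits shown in the default listing, something like the following may help. It is only a sketch: pe is a data set name chosen for this example, and it relies on PROC SURVEYREG writing its estimates to an ODS table named ParameterEstimates.

```sas
/* Capture the parameter-estimates table from PROC SURVEYREG */
ods output ParameterEstimates=pe;
proc surveyreg data=exexx;
   model ptincom = cage white;
   weight suppwgt;
   cluster keyfitz;
   strata strata;
run;

/* Print every numeric column with as many digits as BEST16. allows */
proc print data=pe noobs;
   format _numeric_ best16.;
run;
```

On the IML side, `print se[format=best16.];` shows the hand-computed standard errors at the same precision, which makes it easier to see where the two results first differ.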


Posted in reply to Rick_SAS

05-31-2015 10:50 AM

It turns out that the problem is with the data and not the code. Thank you so much.

Raphael


Posted in reply to R_Fraser

05-19-2015 03:37 PM

Do any of the variables used in the surveyreg except the weight variable have a format assigned? If so, might it cause some grouping?

If you have values in the data of 12345.678 and 12345.999 and the format is f5., then both display as 12346 (the format rounds), so a procedure that groups by formatted values could treat them as the same level.
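A quick way to test this idea is sketched below, assuming the data set name exexx from the original post; exexx_nofmt is a name made up for this example. As far as I know, PROC SURVEYREG forms STRATA and CLUSTER levels from formatted values, whereas the READ statement in PROC IML reads the raw values, so a format on either variable could make the two computations group the data differently.

```sas
/* Show which variables have a format attached */
proc contents data=exexx;
run;

/* Make a copy with all format associations removed */
data exexx_nofmt;
   set exexx;
   format _all_;
run;
```

Rerunning PROC SURVEYREG on exexx_nofmt and comparing the standard errors with the original run would show whether a format was changing the grouping.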