Negative binomial in IML, and rounding error difficulties when computing GINV (and thus SVD)

New Contributor
Posts: 4

All,

I'm trying to extend the following macro (from Peter Song, Univ Michigan; http://www-personal.umich.edu/~pxsong/qif_package/QIFv02.sas), which fits a quadratic inference function (QIF; a type of marginal generalized linear model akin to generalized estimating equations), to accommodate a negative binomial distribution. The macro is not too long, but I won't copy it here since I've provided a link. Fair warning: I'm brand spanking new to the world of IML, so (1) please pardon my ignorance, and (2) it's okay to let me know I'm in over my head...

Problem 1:  Implementing a negative binomial distribution QIF

Proposed solution: The negative binomial application should be similar to the Poisson except for the specification of the variance function. Thus, in every place where a Poisson distribution is referenced in the macro, I've added a new section corresponding to the NEGBIN. These sections are identical in all cases (e.g., the calculation of Pearson and deviance residuals, the calculation of ui) except in how the variance function is defined. For the Poisson, this particular section of the macro looks like this:

            %else %if &dist = POISSON %then %do;
                ui = exp( (xi*beta) );
                fui = log(ui);
                fui_dev = diag(ui);
                vui = diag(sqrt(1/ui));
            %end;

I think it should look like this for NB:

            %else %if &dist = NEGBIN %then %do;
                ui = exp( (xi*beta) );
                fui = log(ui);
                fui_dev = diag(ui)*diag(1+ui);
                vui = diag(sqrt(1/ui))*diag(sqrt(1/(1+ui)));
            %end;

Problem 2: Rounding errors preclude calculation of SVD and GINV

Proposed solution: ??? 

Some (hopefully) relevant log output...

ERROR: No convergence of singular value decomposition due to rounding errors.
ERROR: Execution error as noted previously. (rc=100)
operation : GINV at line 5471 column 1
operands  : arsumc
arsumc     30 rows     30 cols    (numeric)
statement : ASSIGN at line 5471 column 1

I've attached a *.SAS file containing the modified macro, my data, and the particular QIF call that produced the above error.  I used the Poisson in this call to avoid any potential errors in my adaptation to the negative binomial...

Thanks very much for any help,


Adam Smith

Department of Natural Resources Science

University of Rhode Island

All Replies
SAS Super FREQ
Posts: 3,386

The negative binomial (NB) model should have a dispersion parameter, k, so I don't think your formula is correct.

Your first question is statistical, rather than having to do with matrices and linear algebra. I'm not an expert in GEEs or using generalized linear models, but I discussed this with a colleague. As best we can tell, the macro writer is using the following definitions for a model with link function g():

Ui: mean

Fui: g(mean), i.e. the linear predictor

Fui_dev: diagonal matrix of weights for Fisher scoring = 1/(v(mu)*dg(mu)**2) (not sure about this one: gamma has negative sign?)

Vui: diagonal matrix of inverse of square root of variance.
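
As a sanity check on these readings, the Poisson branch of the macro with a log link can be worked through by hand. The symbols V, g, and w below are notation introduced for this check, not names from the macro.

```latex
\begin{aligned}
g(\mu) &= \log\mu, \qquad g'(\mu) = \frac{1}{\mu}, \qquad V(\mu) = \mu \\
w(\mu) &= \frac{1}{V(\mu)\, g'(\mu)^{2}} = \frac{1}{\mu \cdot \mu^{-2}} = \mu
  && \text{matches } \texttt{fui\_dev = diag(ui)} \\
V(\mu)^{-1/2} &= \frac{1}{\sqrt{\mu}}
  && \text{matches } \texttt{vui = diag(sqrt(1/ui))}
\end{aligned}
```

So the readings of Fui_dev and Vui are at least consistent with the Poisson code; this check says nothing about the gamma branch.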

If these are right, for a log-linked NB with dispersion parameter k, you might try

Ui = exp(xi*beta)

Fui = log(ui)

Fui_dev = diag(ui/(1+k*ui))

Vui = diag( 1/sqrt(ui+k*ui##2))

As I've said, this is a guess. You might get better answers from the SAS Discussion Forum on SAS/STAT and Statistical Procedures.
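
The guess above can be tried outside the macro as a small stand-alone IML sketch. The values of xi, beta, and k here are made up for illustration, and k is simply assumed known (the thread does not say how the macro would estimate it). The mean is computed as exp(xi*beta), as in the macro's Poisson branch.

```sas
proc iml;
/* Hypothetical inputs, for illustration only */
xi   = {1 0.5,
        1 1.0,
        1 1.5};          /* design matrix for one cluster */
beta = {0.2, 0.3};       /* regression coefficients */
k    = 0.5;              /* NB dispersion, ASSUMED known here */

ui      = exp(xi*beta);                 /* mean under the log link */
fui     = log(ui);                      /* linear predictor */
fui_dev = diag(ui/(1 + k*ui));          /* guessed Fisher-scoring weights */
vui     = diag(1/sqrt(ui + k*ui##2));   /* 1/sqrt(variance) */

/* Cross-check the weights against 1/(V(mu)*g'(mu)**2), with g'(mu) = 1/mu */
v       = ui + k*ui##2;
w       = 1/(v # (1/ui)##2);
maxDiff = max(abs(vecdiag(fui_dev) - w));   /* should be near zero */

print ui fui, fui_dev, vui, maxDiff;
quit;
```

Setting k = 0 in this sketch reduces fui_dev and vui to the Poisson expressions, which is a quick way to confirm the two branches agree where they should.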

For your second question, I'm away from my office, so can't reproduce the error. Perhaps when you correctly specify the NB model, this second error will go away. Or someone else might be able to help.

New Contributor
Posts: 4

Hi Rick,

Thanks very much for your reply.  I'm hoping I can feel my way through this...

The negative binomial (NB) model should have a parameter, k, so I don't think your formula is correct.

Typically, yes, I agree.  However, when the REPEATED statement is used in GENMOD (invoking the GEE), ML estimates of the scale (or dispersion for NegBin) disappear, perhaps into the "nuisance" variation associated with the clustering?   If k is required, however, I don't have any idea how to get ML estimates of it in IML?

As best we can tell, the macro writer is using the following definitions for a model with link function g():

Ui: mean

Fui: g(mean), i.e. the linear predictor

Fui_dev: diagonal matrix of weights for Fisher scoring = 1/(v(mu)*dg(mu)**2) (not sure about this one: gamma has negative sign?)

Vui: diagonal matrix of inverse of square root of variance.

Agree on Ui and Fui. I'll have to defer to you on the others, although the negative sign in the gamma confused me as well.

If these are right, for a log-linked NB with dispersion parameter k, you might try

Ui = exp(xi*beta)

Fui = log(ui)

Fui_dev = diag(ui/(1+k*ui))

Vui = diag( 1/sqrt(ui+k*ui##2))

Tried it, and understandably it's looking for a matrix "k".  I'll run it by the other forum as well.

For your second question, I'm away from my office, so can't reproduce the error. Perhaps when you correctly specify the NB model, this second error will go away. Or someone else might be able to help.

Alas, no.  It's not specific to the NegBin, but occurs with other distributions (e.g., Poisson) as well.

Thanks again,
Adam

Solution
‎01-10-2012 03:35 PM
SAS Super FREQ
Posts: 3,386

In the example that you provide, the arsumc matrix is a 30x30 matrix with most elements about 1E130.

That's the source of the error reported by GINV.

if iteration=1 then do;
   _min = min(arsumc);
   _max = max(arsumc);
   _mean = arsumc[:];
   print _min _max _mean;
end;

_min=6113822.2

_max=1.589E133

_mean=2.959E131
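
To make the scale problem concrete, here is a minimal IML sketch of the same kind of check on a made-up matrix. The matrix A is a hypothetical stand-in for arsumc (its values only mimic the magnitudes printed above), and comparing the spread in decades against the number of double-precision digits is a rough heuristic, not a formal condition-number test.

```sas
proc iml;
/* Hypothetical stand-in for arsumc: symmetric, with a huge dynamic range */
A = {6.1e6   2e60    1e100,
     2e60    3e131   1e132,
     1e100   1e132   1.6e133};

absA = abs(A);
nz   = absA[loc(absA > 0)];              /* ignore exact zeros before taking logs */
_min = min(nz);
_max = max(nz);
decades = log10(_max) - log10(_min);     /* orders of magnitude spanned */

/* Double precision carries roughly 16 significant decimal digits */
digits = -log10(constant('MACEPS'));
print _min _max decades digits;

if decades > digits then
   print "Element magnitudes span more decades than double precision can resolve.";
quit;
```

For the values reported above (a minimum near 6E6 and a maximum near 1.6E133), the spread is about 126 decades, far beyond what double precision can represent in a single sum or product.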

New Contributor
Posts: 4

Thanks Rick... arsumc gets out of hand quickly, on iteration 1 in fact. I've been running through the meat of the macro variable by variable and have looked into the math behind QIF as best I can (link here, if anyone is interested). I don't think that (1) the macro is up to my specific GEE model (i.e., NegBin with an offset term) or (2) I've got enough of a handle on the linear algebra or IML coding to tackle it. Guess I'll look into alternatives.

Thanks so much for the help.

SAS Super FREQ
Posts: 3,386

One general bit of advice: it is usually a poor idea to interlace macro code and IML. It is almost always unnecessary, and it makes debugging a real pain. By restructuring the logic, you can usually avoid %IF/%THEN and other macro logic and use IML statements instead.
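
As a rough illustration of that restructuring, the distribution-specific section quoted in the original question could be moved into an IML module that branches with ordinary IF/THEN statements. The module name MeanPieces, the test inputs, and the NEGBIN branch (which reuses the guessed formulas from earlier in this thread, with k assumed known) are all hypothetical; only the POISSON branch is taken from the macro.

```sas
proc iml;
/* Hypothetical module: fills in the distribution-specific pieces.
   The first four arguments are outputs; the rest are inputs. */
start MeanPieces(ui, fui, fui_dev, vui, dist, xi, beta, k);
   ui  = exp(xi*beta);                  /* mean, log link */
   fui = log(ui);                       /* linear predictor */
   if upcase(dist) = "POISSON" then do;
      fui_dev = diag(ui);
      vui     = diag(sqrt(1/ui));
   end;
   else if upcase(dist) = "NEGBIN" then do;
      /* guessed NB formulas from this thread; k is assumed known */
      fui_dev = diag(ui/(1 + k*ui));
      vui     = diag(1/sqrt(ui + k*ui##2));
   end;
   else do;
      print "Unsupported distribution:" dist;
      return;
   end;
finish;

/* Made-up inputs for a quick test */
dist = "POISSON";        /* a macro wrapper could set this once from its parameter */
xi   = {1 0.5,
        1 1.0,
        1 1.5};
beta = {0.2, 0.3};
k    = 0;                /* not used by the POISSON branch */

run MeanPieces(ui, fui, fui_dev, vui, dist, xi, beta, k);
print ui fui, fui_dev, vui;
quit;
```

With this layout the macro only needs to pass the distribution name into IML once as a character value, and each branch can be stepped through and printed like any other IML code.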
