AdamSmith
Calcite | Level 5

All,

I'm trying to extend the following macro (from Peter Song, Univ Michigan; http://www-personal.umich.edu/~pxsong/qif_package/QIFv02.sas), which fits a quadratic inference function (QIF; a type of marginal generalized linear model akin to generalized estimating equations), to accommodate a negative binomial distribution. The macro is not too long, but I won't copy it here since I've provided a link. Fair warning: I'm brand spanking new to the world of IML, so (1) please pardon my ignorance, and (2) it's okay to let me know I'm in over my head...

Problem 1:  Implementing a negative binomial distribution QIF

Proposed solution: The negative binomial application should be similar to the Poisson except for the specification of the variance function. Thus, in every place where a Poisson distribution is referenced in the macro, I've added a new section corresponding to NEGBIN. These sections are identical in all cases (e.g., the calculation of Pearson and deviance residuals, the calculation of ui) except in how the variance function is defined. For the Poisson, this particular section of the macro looks like this:

            %else %if &dist = POISSON %then %do;
                ui = exp( (xi*beta) );
                fui = log(ui);
                fui_dev = diag(ui);
                vui = diag(sqrt(1/ui));
            %end;

I think it should look like this for NB:

            %else %if &dist = NEGBIN %then %do;
                ui = exp( (xi*beta) );
                fui = log(ui);
                fui_dev = diag(ui)*diag(1+ui);
                vui = diag(sqrt(1/ui))*diag(sqrt(1/(1+ui)));
            %end;

Problem 2: Rounding errors preclude calculation of SVD and GINV

Proposed solution: ??? 

Some (hopefully) relevant log output...

ERROR: No convergence of singular value decomposition due to rounding errors.

ERROR: Execution error as noted previously. (rc=100)

operation : GINV at line 5471 column 1

operands  : arsumc

arsumc     30 rows     30 cols    (numeric)

statement : ASSIGN at line 5471 column 1

I've attached a *.SAS file containing the modified macro, my data, and the particular QIF call that produced the above error.  I used the Poisson in this call to avoid any potential errors in my adaptation to the negative binomial...

Thanks very much for any help,


Adam Smith

Department of Natural Resources Science

University of Rhode Island

Rick_SAS
SAS Super FREQ

The negative binomial (NB) model should have a parameter, k, so I don't think your formula is correct.

Your first question is statistical, rather than having to do with matrices and linear algebra. I'm not an expert in GEEs or using generalized linear models, but I discussed this with a colleague. As best we can tell, the macro writer is using the following definitions for a model with link function g():

Ui: mean

Fui: g(mean), i.e. linear predictor

Fui_dev: diagonal matrix of weights for Fisher scoring = 1/(v(mu)*dg(mu)**2) (not sure about this one: gamma has negative sign?)

Vui: diagonal matrix of inverse of square root of variance.

If these are right, for a log-linked NB with dispersion parameter k, you might try

Ui = exp(xi*beta)

Fui = log(ui)

Fui_dev = diag(ui/(1+k*ui))

Vui = diag( 1/sqrt(ui+k*ui##2))

As I've said, this is a guess. You might get better answers from the SAS Discussion Forum on SAS/STAT and Statistical Procedures.

For your second question, I'm away from my office, so I can't reproduce the error. Perhaps when you correctly specify the NB model, this second error will go away. Or someone else might be able to help.
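
Under the definitions guessed above, the suggested negative binomial quantities can be checked by hand. The derivation below is a sketch that assumes the usual NB variance function V(mu) = mu + k*mu^2 and the log link. Whether the macro really uses Fui_dev as a Fisher-scoring weight is the same guess made in the post, not something confirmed from the macro source.

```latex
\begin{align*}
g(\mu) &= \log\mu, \qquad g'(\mu) = \frac{1}{\mu}, \qquad V(\mu) = \mu + k\mu^{2} \\[4pt]
\text{Fui\_dev:}\quad \frac{1}{V(\mu)\,g'(\mu)^{2}}
  &= \frac{\mu^{2}}{\mu + k\mu^{2}} = \frac{\mu}{1 + k\mu} \\[4pt]
\text{Vui:}\quad V(\mu)^{-1/2}
  &= \frac{1}{\sqrt{\mu + k\mu^{2}}}
\end{align*}
```

Setting k = 0 recovers the Poisson block of the macro (weight ui, inverse root variance 1/sqrt(ui)). The vui in the original NEGBIN attempt, sqrt(1/ui)*sqrt(1/(1+ui)), equals 1/sqrt(ui + ui##2), which is this formula with k fixed at 1.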

AdamSmith
Calcite | Level 5

Hi Rick,

Thanks very much for your reply.  I'm hoping I can feel my way through this...

> The negative binomial (NB) model should have a parameter, k, so I don't think your formula is correct.

Typically, yes, I agree. However, when the REPEATED statement is used in GENMOD (invoking the GEE), ML estimates of the scale (or dispersion for NegBin) disappear, perhaps into the "nuisance" variation associated with the clustering? If k is required, however, I have no idea how to get ML estimates of it in IML.

> As best we can tell, the macro writer is using the following definitions for a model with link function g():
>
> Ui: mean
>
> Fui: g(mean), i.e. linear predictor
>
> Fui_dev: diagonal matrix of weights for Fisher scoring = 1/(v(mu)*dg(mu)**2) (not sure about this one: gamma has negative sign?)
>
> Vui: diagonal matrix of inverse of square root of variance.

Agree on Ui and Fui. I'll have to defer to you on the others, although the negative sign in the gamma confused me as well.

> If these are right, for a log-linked NB with dispersion parameter k, you might try
>
> Ui = exp(xi*beta)
>
> Fui = log(ui)
>
> Fui_dev = diag(ui/(1+k*ui))
>
> Vui = diag( 1/sqrt(ui+k*ui##2))

Tried it, and understandably it's looking for a matrix "k".  I'll run it by the other forum as well.

> For your second question, I'm away from my office, so I can't reproduce the error. Perhaps when you correctly specify the NB model, this second error will go away. Or someone else might be able to help.

Alas, no.  It's not specific to the NegBin, but occurs with other distributions (e.g., Poisson) as well.

Thanks again,
Adam
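
The complaint about a matrix "k" arises because k is never assigned before it is used in the suggested formulas. A minimal sketch of one way around that, assuming the dispersion is treated as a known constant rather than estimated inside IML, is shown below. The values of xi, beta, and k are made up for illustration, and 0.5 is only a placeholder; how k should actually be obtained is not settled in this thread.

```sas
proc iml;
   /* Toy inputs: every value here is invented for illustration */
   xi   = {1 0.5, 1 1.0, 1 1.5};   /* 3 x 2 design matrix for one cluster */
   beta = {0.2, 0.4};              /* 2 x 1 coefficient vector            */
   k    = 0.5;                     /* NB dispersion, assumed fixed/known  */

   ui      = exp(xi*beta);                 /* mean under the log link          */
   fui     = log(ui);                      /* linear predictor                 */
   fui_dev = diag(ui / (1 + k*ui));        /* weights as guessed in the thread */
   vui     = diag(1 / sqrt(ui + k*ui##2)); /* inverse sqrt of NB variance      */

   print ui fui, fui_dev, vui;
quit;
```

Inside the macro, the same idea would amount to assigning k once (for example from an extra macro parameter) before the NEGBIN block runs; that parameter is hypothetical and not part of the original macro.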

Rick_SAS
SAS Super FREQ

In the example that you provide, the arsumc matrix is a 30x30 matrix in which most elements are on the order of 1E130.

That extreme scale is the source of the error reported by GINV. Printing summary statistics for arsumc on the first iteration shows it:

if iteration=1 then do;
   _min = min(arsumc);
   _max = max(arsumc);
   _mean = arsumc[:];
   print _min _max _mean;
end;

_min=6113822.2

_max=1.589E133

_mean=2.959E131
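
The same min/max/mean check can be applied to upstream quantities to see where the magnitude first appears. The sketch below wraps the check in a hypothetical helper module, Summarize, and runs it on made-up values of xi and beta. It only illustrates that, with a log link, a linear predictor near 300 already gives means of roughly 1E130; it does not claim that this is what happens with the data in this thread.

```sas
proc iml;
   /* Hypothetical helper: print min, max, and mean of a matrix */
   start Summarize(x, label);
      _min  = min(x);
      _max  = max(x);
      _mean = x[:];
      print label _min _max _mean;
   finish;

   /* Made-up inputs chosen so that the linear predictor is large */
   xi   = {1 100, 1 200, 1 300};
   beta = {0.1, 1};

   eta = xi*beta;      /* linear predictor              */
   ui  = exp(eta);     /* mean under the log link       */

   run Summarize(eta, "eta = xi*beta");
   run Summarize(ui,  "ui = exp(eta)");
quit;
```

Calling the helper on each intermediate matrix in turn (ui, then whatever feeds arsumc) narrows down the first statement at which the values leave a reasonable range.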

AdamSmith
Calcite | Level 5

Thanks Rick... arsumc gets out of hand quickly, on iteration 1 in fact. I've been running through the meat of the macro variable by variable and have looked into the math behind QIF as best I can (link here, if anyone is interested). I don't think that (1) the macro is up to my specific GEE model (i.e., NegBin with an offset term) or (2) I've got enough of a handle on the linear algebra or IML coding to tackle it. Guess I'll look into alternatives.

Thanks so much for the help.

Rick_SAS
SAS Super FREQ

One general bit of advice: it is usually a poor idea to interlace macro code and IML. It is almost always unnecessary, and it makes debugging a real pain. By restructuring the logic, you can usually avoid %IF/%THEN and other macro logic and use IML statements instead.
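
As an illustration of that advice, the distribution can be copied from the macro variable into an IML character variable once and then tested with ordinary IML IF/THEN statements. This is only a sketch: the module name InvRootVar, the toy means, and the placeholder dispersion k are assumptions, and the negative binomial formula is the one guessed earlier in the thread.

```sas
%let dist = NEGBIN;            /* macro variable, like the macro's &dist */

proc iml;
   /* Hypothetical module: the distribution is an ordinary IML argument */
   start InvRootVar(ui, dist, k);
      if dist = "POISSON" then
         vui = diag(1/sqrt(ui));
      else if dist = "NEGBIN" then
         vui = diag(1/sqrt(ui + k*ui##2));
      else
         vui = .;                 /* unsupported distribution */
      return(vui);
   finish;

   dist = "&dist";             /* resolve the macro value once       */
   ui   = {1.5, 2.0, 4.0};     /* made-up means                      */
   k    = 0.5;                 /* placeholder dispersion             */

   vui = InvRootVar(ui, dist, k);
   print dist, vui;
quit;
```

With the branching done in IML, the macro language is needed only to pass the value in, so the statements inside PROC IML can be run and debugged on their own.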
