Statistical programming, matrix languages, and more

Should the COMB function allow r>n?

Reply
Frequent Contributor
Posts: 122

Should the COMB function allow r>n?

Just for fun I wrote the IML module below to return an n x n Pascal matrix.  The lower triangular part of the matrix contains Pascal's triangle, and it is zero above the main diagonal.

start Pascal( n );
  a = row( j(n, n) ) >= col( j(n, n) );
  return( a # comb( (row(a) - 1) # a, (col(a) - 1) # a ) );
finish;

It occurred to me that the COMB function makes this much harder than it should be, as ideally I would like COMB(n,r) to return zero when r>n, rather than returning a missing value. Compare this to Wolfram Alpha which does return zero, as it evaluates factorials of negative integers to be infinite (Google does the same when you enter something like '1 choose 2').

If you would like to try the module, the Pascal matrix certainly has an interesting inverse and its back-diagnonals sum to the Fibonacci numbers.

SAS Super FREQ
Posts: 3,234

Re: Should the COMB function allow r>n?

An interesting question. Thanks for asking. I will let others weigh in before I post my opinions.

Although you do not have a programming question, I also think that your function is very interesting. You have created a single elegant matrix expression that evaluates n##2 quantities. However, in this case the elegance is also causing a problem, since you have to use extra care to avoid invalid arguments.

Although I like your concise formulation, an alternative programming solution is to use a simple loop, which avoids the situation that you have broached:

start PascalTriangle(n);

   m = j(n,n,0);    /* pad with zeros */

   do k = 1 to n;

      m[k,1:k] = comb(k-1, 0:k-1);  /* fill nonzero elements */

   end;

   return(m);

finish;

There is also a way to form the triangle by using a single vector computation without a loop:

start BinomCoef(n);

   m = shape(1:n##2, n, n);

   idx = vech(m);

   row = row(m)-1;

   col = col(m)-1;

   P = j(n,n,0);

   P[idx] = comb( row[idx], col[idx] );

   return(P);

finish;

Both algorithms are very fast. Notice that you can't compute a Pascall triangle with more than 1029 rows because the binomial coefficients overflow.

Grand Advisor
Posts: 9,748

Re: Should the COMB function allow r>n?

I would say NO to the question. The COMB function is basically the N choose R. This would be a different function entirely as I'm not sure how a negative factorial should be calculated (in the classic N choose R definition).

And what is the interpretation of the function when R>N? Possibly a new function but the utility in general is ??

SAS Super FREQ
Posts: 3,234

Re: Should the COMB function allow r>n?

I think Ian's interpretation is the r_th binomial coefficient. That is, the coefficient of x^r in the expansion of (x+1)^n.

Frequent Contributor
Posts: 122

Re: Should the COMB function allow r>n?

Yes, I do like Rick's suggestion of the zero coefficients for the higher powers, plus there is a completely natural way of getting there using the gamma function. This, I imagine, is the mechanism behind the Wolfram Alpha and Google implementations of the n choose r function.  If I write my own version of the COMB function:

start icomb(n, r);
  return( round( gamma(n + 1) / (gamma(r + 1) # gamma(n - r + 1 + 1E-12))));
finish;

all I need to do is add on a tiny amount to prevent any attempt at evaluating the gamma function exactly at zero, or at one of the negative integers, round the result and I have a function that can handle r>n.  Then I can have that elusive one line solution for the generation of the Pascal matrix.

Start Pascal( n );
  return( icomb( row(j(n,n)) - 1, col(j(n,n)) - 1 ) );
finish;

print ( Pascal(12) ) [f=3.0];

As for utility of the modified function, I admit it is limited as I have been using the COMB function for some years and this is the first time I have tried using it with r>n, but my expectation was that it would perform like ICOMB above.

ICOMB may not be robust and only work with smaller n, and my original Pascal function contains 3n^2 unnecessary multiplications in its desperate attempt to retain 'elegance' while circumventing the invalid argument issues, so thanks to Rick for the alternative suggestions.  Actually I prefer the shorter function with the loop rather than the fully vectorized version!

Post a Question
Discussion Stats
  • 4 replies
  • 695 views
  • 1 like
  • 3 in conversation