## Vectorized solution to finding *first* element in a row meeting criterion

Greetings!

I have an n x p matrix of probabilities called Z.

I'd like to identify the first column within each row of Z where the element < 0.1.

These column subscripts will be output to an n x 1 vector called R, where each entry is the column number of the first element in that row of Z meeting the criterion.

If the criterion is not met within a row, then insert a 0 for that row.

Example:

Z = {0.2 0.3 0.05,
0.01 0.01 0.01,
0.2 0.3 0.5};

R = {3, 1, 0};

I can think of do-loop approaches, but Z is potentially very large (millions) and I would like a vectorized solution. The origin of this problem is Bayesian sample size analysis for sequential updating, as in http://www.fharrell.com/post/bayes-seq/

Garnett


## Re: Vectorized solution to finding *first* element in a row meeting criterion

Interesting question. You can form the binary indicator matrix for the condition you want to detect, then use the index-of-maximum subscript reduction operator (<:>) to return the column of the first '1' in each row.

``````
proc iml;
Z = {0.2 0.3 0.05,
     0.01 0.01 0.01,
     0.2 0.3 0.5};
G = (Z < 0.1); /* binary indicator matrix */
R = G[ , <:>]; /* index of max value in each row */
print R;
quit;
``````

This should easily handle millions of rows.
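To make the behavior of <:> on a 0/1 matrix concrete, here is a minimal sketch. The matrix `G` below is a made-up indicator matrix (not from the original data), chosen to include a row with ties and a row with no 1s. As far as I know, <:> reports the first column at which the row maximum occurs, so an all-zero row comes back as 1.

``````
proc iml;
/* hypothetical indicator matrix: row 2 has ties, row 3 has no 1s */
G = {0 0 1,
     1 1 1,
     0 0 0};
idx = G[ , <:>];   /* expected {3, 1, 1}: the all-zero row also yields 1 */
print G idx;
quit;
``````

The last row is the case to watch: a result of 1 can mean either "first column meets the criterion" or "no column does".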

## Re: Vectorized solution to finding *first* element in a row meeting criterion

Thanks, Rick!

It looks like the default behavior of <:> where there are 'ties' across columns is to index the first column in which the value appears.

Is that correct? If so, it's just what I need.

Also, the approach you give returns 1 for a row where none of the columns meets the criterion. I can work with this, but it will be difficult to distinguish rows where the first column truly is the first to meet the criterion from rows where no column meets it.

Thanks again!

## Re: Vectorized solution to finding *first* element in a row meeting criterion

Sorry about the emoji, I meant to write < : >

## Re: Vectorized solution to finding *first* element in a row meeting criterion

There's probably a more efficient solution, but this resets the rows of R to 0 where no element is < 0.1:

``````
proc iml;
Z = {0.2 0.3 0.05,
     0.01 0.01 0.01,
     0.2 0.3 0.5,
     0.2 0.3 0.5};         /* added extra row with no elements < 0.1 */

G = (Z < 0.1);             /* binary indicator matrix */
zrows = loc(G[ , +] = 0);  /* vector of rows in G with no elements < 0.1 */
R = G[ , <:>];             /* index of max value in each row */

/* loc returns an empty matrix when every row has a hit, so guard the assignment */
if ncol(zrows) > 0 then R[zrows, ] = 0;
print R;
quit;
``````
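One possible way to skip the `loc` step, offered as a sketch rather than a tested recommendation: multiply the index from <:> by the row maximum of `G`. Since `G` holds only 0s and 1s, the row maximum `G[ , <>]` is 1 when some element meets the criterion and 0 otherwise, so the elementwise product zeroes out exactly the rows with no hits.

``````
proc iml;
Z = {0.2 0.3 0.05,
     0.01 0.01 0.01,
     0.2 0.3 0.5,
     0.2 0.3 0.5};

G = (Z < 0.1);               /* binary indicator matrix */
R = G[ , <:>] # G[ , <>];    /* first-hit index times (1 if any hit, else 0) */
print R;                     /* {3, 1, 0, 0} for this Z */
quit;
``````

This needs no special handling for the case where every row has a hit, because nothing is indexed by a possibly empty vector.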

## Re: Vectorized solution to finding *first* element in a row meeting criterion

That's it!

I'm more or less a novice at IML, and really need a better understanding of subscript reduction operators.

Thanks!
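For anyone else new to subscript reduction operators, here is a small sketch of the common ones applied to a toy matrix. The matrix `x` and the variable names are made up for illustration. An operator in the column position reduces across columns (one result per row); in the row position it reduces down rows (one result per column).

``````
proc iml;
x = {1 2 3,
     4 5 6};

rowSum  = x[ , +];    /* sum of each row:                      {6, 15} */
colSum  = x[+, ];     /* sum of each column:                   {5 7 9} */
rowMax  = x[ , <>];   /* maximum of each row:                  {3, 6}  */
rowMin  = x[ , ><];   /* minimum of each row:                  {1, 4}  */
rowMean = x[ , :];    /* mean of each row:                     {2, 5}  */
idxMax  = x[ , <:>];  /* column index of each row's maximum:   {3, 3}  */
idxMin  = x[ , >:<];  /* column index of each row's minimum:   {1, 1}  */

print rowSum rowMax rowMin rowMean idxMax idxMin;
print colSum;
quit;
``````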
