Turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Options

- RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page

☑ This topic is **solved**.
Need further help from the community? Please
sign in and ask a **new** question.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Posted 03-29-2022 11:23 AM
(589 views)

I am comparing arrays and need to take values from array A and find the index of the matching value or next lower value in array B. B is ordered if that helps the solution.

So using LOC would give the the index of the exact match, but if there isn't an exact match, I want the next lower value.

A = {10, 3, 19}

B= {1, 3, 7, 42}

Wanted Result = {3, 2, 3}

1 ACCEPTED SOLUTION

Accepted Solutions

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

OK, then you want the largest index of B for which B <= A[i] for each element of A.

This might be empty, so you also need to handle that case:

```
proc iml;
A = {10, 3, 19, 0};
B= {1, 3, 7, 42}; /* assume B is sorted in increasing order */
Result = j(nrow(A), 1, .);
do i = 1 to nrow(A);
idx = loc(B <= A[i]);
if ncol(idx) > 0 then /* did we find any? */
Result[i] = idx[ ncol(idx) ]; /* keep the largest */
end;
print A Result;
```

7 REPLIES 7

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

OK, then you want the largest index of B for which B <= A[i] for each element of A.

This might be empty, so you also need to handle that case:

```
proc iml;
A = {10, 3, 19, 0};
B= {1, 3, 7, 42}; /* assume B is sorted in increasing order */
Result = j(nrow(A), 1, .);
do i = 1 to nrow(A);
idx = loc(B <= A[i]);
if ncol(idx) > 0 then /* did we find any? */
Result[i] = idx[ ncol(idx) ]; /* keep the largest */
end;
print A Result;
```

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

I think you can avoid the need to check for an empty result if you vectorize it as follows:

```
A = {10, 3, 19} ;
B= {1, 3, 7, 42};
result = (t(B) <= repeat(A,1,nrow(B)))[,+];
print result;
```

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Yes and no. Since 0 is not a valid index, you still have to check if you intend to use these values to index into the B vector.

I thought about vectorization. It requires more memory instead of the loop, so it really depends on the size of these vectors. Here's the code that I wrote but did not post:

```
/* replace with loops with matrix computations (requires more memory */
MA = repeat(A`, nrow(B), 1); /* j_th column is A[j] */
MB = repeat(B, 1, nrow(A)); /* i_th row is B[i] */
L = (MB <= MA); /* 1 iff B[i] <= A[j] */
print L;
Result = L[+,]; /* sum the elements for which B[i] <= A[j] */
print Result;
```

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Indeed size matters, so you need to be aware that nrow(A)*nrow(B) is not too large if vectorization is to be used. I compared

`result = (t(B) <= repeat(A,1,nrow(B)))[,+];`

to

`result = (repeat(t(B),nrow(A),1) <= repeat(A,1,nrow(B)))[,+];`

using FULLSTIMER and I am seeing that the latter uses about 50% more memory for a moderately large problem (10000 x 32), so there is some saving comparing vector to matrix, over matrix to matrix.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Thank you! Hours spent on this, but alot of learning...

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

@Rick_SAS so the original requirement shifted slightly, wondering if it can be a slight mod to this. I am taking these index values and using them to find the value in another vector at those indecies. Now, the requirement is if there isn't an exact match, use the higher and lower value in the other vector.

So for example, since 10 and 19 are not exact matches, I would average the 3rd and 4th values in the other vector. Can you please help with that?

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

I suspect you can solve the new problem if you give it some effort.

There are two "edge cases" that you need to worry about because there are no "adjacent values" to average:

- What happens if an element of A is less than all elements of B? Is the result missing? Or maybe the 1st element of the Other vector?
- What happens if an element of A is greater all elements of B? Is the result missing? Or maybe the last element of the Other vector?

In the program below, I've coded one way to handle these edge cases, but you might want to ask your client what they prefer.

```
proc iml;
A = {10, 3, 19, 0, 45};
B= {1, 3, 7, 42}; /* assume B is sorted in increasing order */
Other = {100, 200, 300, 400, 500}; /* assume Other is same size as B */
nB = nrow(B);
nA = nrow(A);
Result = j(nA, 1, .);
do i = 1 to nA;
idx = loc(B = A[i]);
if ncol(idx) > 0 then do; /* did we find an exact match? */
Result[i] = Other[idx]; /* if so, index into the Other vec */
end;
else do; /* not an exact match; average adjacent vals */
idx = loc(B <= A[i]);
/* 3 cases: EMPTY, the last index, or valid index that is not last */
if ncol(idx)=0 then /* A[i] < all elements of B */
Result[i] = .; /* or you could use Other[1] instead */
else do;
if all(idx < nB) then do; /* average the Other elements */
j = idx[ ncol(idx) ];
Result[i] = (Other[j] + Other[j+1]) / 2;
end;
else
Result[i] = Other[nB]; /* or missing? */
end;
end;
end;
print A Result;
```

Registration is open! SAS is returning to Vegas for an AI and analytics experience like no other! Whether you're an executive, manager, end user or SAS partner, SAS Innovate is designed for everyone on your team. Register for just $495 by 12/31/2023.

**If you are interested in speaking, there is still time to submit a session idea. More details are posted on the website. **

Multiple Linear Regression in SAS

Learn how to run multiple linear regression models with and without interactions, presented by SAS user Alex Chaplin.

Find more tutorials on the SAS Users YouTube channel.