turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

08-03-2017 08:43 AM

I need to look-up the values from matrix 'a' in the first column of matrix 'b' and return the values in the *second* column of matrix 'b' to a new matrix named 'c'. The rows in matrix 'b' are a key-value pairs, basically.

This code achieves that, but I would like to know whether there is a vectorized aproach to this instead of a do loop.

```
proc iml;
a = {100 202 303,
107 200 303,
101 205 302,
102 200 300};
b = {100 1,
101 8,
102 1,
104 6,
107 4,
108 9,
200 4,
202 6,
205 2,
300 10,
301 5,
302 2,
303 8,
304 10};
/*the Do LOOP way*/
c = j(4,3,.);
do i = 1 to 4;
do j = 1 to 3;
idx = loc(b[,1] = a[i,j]);
c[i,j] = b[idx,2];
end;
end;
print c;
quit;
```

The expected output in matrix 'c':

c 1 6 8 4 4 8 8 2 2 1 4 10

Accepted Solutions

Solution

08-03-2017
09:50 AM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to MDaniel

08-03-2017 09:48 AM

Ian deduced the problem correctly. I assumed that the number of elements that match was always NCOL(a).. In your random data, that assumption does not hold. If you need to handle the case where there are fewer elements that match than there are columns of a, use the following:

```
do i = 1 to nrow(a);
idx = loc( element(b[,1], a[i,]) ); /* location of elements in b[,1] that match a[i,] */
if ncol(idx)>0 then
c[i,1:ncol(idx)] = T( b[idx,2] ); /* copy elements from b[,2] */
end;
```

If you can have MORE elements that match than columns of a, then you either need to truncate the results to NCOL(a) elements of make some other modification.

All Replies

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to MDaniel

08-03-2017 08:53 AM

Yes. You want to use the ELEMENT function to find the vector of locations in b[,1] that match the elements in a[i,] for each row of a.

You can read about the LOC-ELEMENT technique in the article "Finding observations that satisfy multiple conditions: The LOC-ELEMENT technique"

```
c = j(nrow(a),ncol(a),.);
do i = 1 to nrow(a);
idx = loc( element(b[,1], a[i,]) ); /* location of elements in b[,1] that match a[i,] */
c[i,] = T( b[idx,2] ); /* copy elements from b[,2] */
end;
print c;
```

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Rick_SAS

08-03-2017 09:38 AM

Thank you for you answer, Rick.

I am trying to extend and test your solution for a larger number of rows and columns for matrix 'a' and I ran into a problem. For 100 rows nad 5 columns, I receive this error: **ERROR: (execution) Matrices do not conform to the operation.**

Please see below for code that generates some random data - I'm still not fully familiar with ELEMENT function, but I can't find anything wrong with the code.

```
proc iml;
rows = 100;
cols = 5;
min = 100;
max = 1000;
call randseed(555);
a = j(rows,cols,.);
call randgen(a, "Uniform"); /* ~ U(0,1) */
a = min + floor( (1+Max-Min)*a );
x1 = min:max;
x1 = shape(x1,max-min+1,1);
x2 = j(max-min+1,1,.);
call randgen(x2, "Uniform"); /* ~ U(0,1) */
x2 = 10 + floor( (1+50-10)*x2 );
b = x1||x2;
/*vectorization*/
c = j(nrow(a),ncol(a),.);
do i = 1 to nrow(a);
idx = loc( element(b[,1], a[i,]) ); /* location of elements in b[,1] that match a[i,] */
c[i,] = T( b[idx,2] ); /* copy elements from b[,2] */
end;
quit;
```

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to MDaniel

08-03-2017 09:43 AM

I think Rick's solution assumed that each row of the matrix a, has unique key values that are strictly increasing. If that's likely to be true for your real data set, then you may still be OK.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to IanWakeling

08-03-2017 09:49 AM - edited 08-03-2017 09:49 AM

I've identified the problem - if a value appears more than once per line in matrix 'a', the error will be generated. The real dataset definitely has distinct values per row, so this will work.

Solution

08-03-2017
09:50 AM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to MDaniel

08-03-2017 09:48 AM

Ian deduced the problem correctly. I assumed that the number of elements that match was always NCOL(a).. In your random data, that assumption does not hold. If you need to handle the case where there are fewer elements that match than there are columns of a, use the following:

```
do i = 1 to nrow(a);
idx = loc( element(b[,1], a[i,]) ); /* location of elements in b[,1] that match a[i,] */
if ncol(idx)>0 then
c[i,1:ncol(idx)] = T( b[idx,2] ); /* copy elements from b[,2] */
end;
```

If you can have MORE elements that match than columns of a, then you either need to truncate the results to NCOL(a) elements of make some other modification.