MDaniel
Obsidian | Level 7

I have a question that seems really simple, but I've been struggling with it a lot: how can one perform, in IML, the equivalent of Excel's VLOOKUP without using a loop?

For instance, I'd like to find the index of elements of a in b, and use that index to return the corresponding element in c.

proc iml;
	a = {104,106,101,104};
	b = {101,102,103,104,105,106};
	c = {"A", "B","C","D","E","F"};
quit;

Expected output is a column matrix containing elements: {"D","F","A","D"}. I can get this result using a loop, but for very large data it slows the IML process to a screeching halt, as loops usually do.

3 REPLIES
Rick_SAS
SAS Super FREQ

You didn't show how you are using the loop, nor did you specify the size of the A and B vectors. In general, the operation you are requesting is of order N*M, where A has N elements and B has M elements, because for each element of A you have to search through all elements of B.

If you can treat the vectors as sets in which order doesn't matter, you can use the LOC and ELEMENT functions to obtain the values in C whose keys in B appear in A:

proc iml;
a = {104,106,101,104};
b = {101,102,103,104,105,106};
c = {"A", "B","C","D","E","F"};

idx = element(b, a);
ans = c[loc(idx)];
print ans;    /* answer as a set; order does not matter */

However, if you want to preserve order and permit duplicate values, then the following loop is probably the method I'd use:

ans = j(nrow(a), 1, " ");
do i = 1 to nrow(a);
   ans[i] = c[ loc(a[i] = b) ];
end;
print ans;
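
For completeness, here is a loop-free sketch of the same lookup. It is only an illustration under two assumptions that the thread does not state: every element of a occurs in b, and the values in b are unique. It builds an N x M indicator matrix with the REPEAT function, so it trades the loop for memory and may not be practical for very large vectors.

```sas
proc iml;
a = {104,106,101,104};
b = {101,102,103,104,105,106};
c = {"A", "B","C","D","E","F"};

/* m[i,j] = 1 when a[i] equals b[j]; both sides are expanded to N x M */
m = ( repeat(a, 1, nrow(b)) = repeat(b`, nrow(a), 1) );

/* With unique b and every a[i] present, each row of m holds exactly one 1,
   so multiplying by the column 1..M returns the matching position in b */
idx = m * t(1:nrow(b));

ans = c[idx];      /* order of a is preserved; duplicates are allowed */
print ans;
quit;
```

If an element of a is missing from b, the corresponding entry of idx is 0 and the subscript fails, so that case would need to be handled before indexing c.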

Another efficient approach would be to sort A and B (and C, sorted by B) and then do a match merge in Base SAS. That would probably be the fastest.
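
The sort-and-merge idea could look roughly like the following. This is a sketch, not a tuned implementation: the data set names (lookup, keys, matched), the shared variable name key, and the helper variable seq are all made up for the illustration. The seq variable records the original order of A so that it can be restored after the merge.

```sas
/* Lookup table: key (the B values) and c (the C values) from the example */
data lookup;
   input key c $;
   datalines;
101 A
102 B
103 C
104 D
105 E
106 F
;
run;

/* Values to look up (the A values); seq remembers their original order */
data keys;
   input key;
   seq = _n_;
   datalines;
104
106
101
104
;
run;

proc sort data=keys;   by key; run;
proc sort data=lookup; by key; run;

/* Match merge: keep only the rows that came from KEYS */
data matched;
   merge keys(in=inKeys) lookup;
   by key;
   if inKeys;
run;

/* Restore the original order of the A values */
proc sort data=matched; by seq; run;

proc print data=matched noobs;
   var key c;
run;
```

Because each key appears only once in the lookup table, duplicate values in KEYS (such as 104) each pick up the same value of c.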

Tom
Super User

If you want to translate 104 to 'D' then use a format.

proc format ;
  value lookup
   101 = 'A'
   102 = 'B'
   103 = 'C'
   104 = 'D'
   105 = 'E'
   106 = 'F'
  ;
run;

proc iml;
 a = {104,106,101,104};
 ans =putn(a,'lookup.');
 print ans ;
quit;

Result:

ans

D
F
A
D

It is easy to use the CNTLIN= option on PROC FORMAT to generate a format from a dataset.
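
As a rough sketch of that approach, assuming the lookup pairs are stored in a data set called lookup with variables b and c (both names are made up for this example), the control data set needs the variables FMTNAME, START and LABEL, plus TYPE to mark the format as numeric:

```sas
/* Hypothetical lookup data set holding the B and C values from the question */
data lookup;
   input b c $;
   datalines;
101 A
102 B
103 C
104 D
105 E
106 F
;
run;

/* Control data set in the layout that PROC FORMAT expects for CNTLIN= */
data cntl;
   set lookup;
   retain fmtname 'lookup' type 'N';   /* numeric format named LOOKUP. */
   start = b;                          /* value to be formatted */
   label = c;                          /* text the value maps to */
   keep fmtname type start label;
run;

proc format cntlin=cntl;
run;
```

After this step, the putn(a,'lookup.') call shown above should work without typing the VALUE statement by hand.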

Ksharp
Super User

Yes, I am also looking for such a function in IML. Is this the DO loop you mean?

proc iml;
a = {104,106,101,104,102,102,102,103,105,105};
b = {101,102,103,104,105,106};
c = {"A", "B","C","D","E","F"};

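want=j(nrow(a),1,'                   ');   /* blank result, one row per element of a */
do i=1 to nrow(b);
 idx=loc(a=b[i]);                          /* positions in a that equal b[i] */
 if ncol(idx)>0 then want[idx]=c[i];       /* skip values of b that never occur in a */
end;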

print want; 
quit;
