Statistical programming, matrix languages, and more

Searching for values from a smaller set to fill in a vector

Contributor
Posts: 22

Hello. I was wondering whether there is a shortcut for what I am trying to do, and whether it is advisable to avoid the DO loop.

I was thinking that the SETDIF or CHOOSE functions could help.

Thank you for any suggestions.

proc iml;

*one of two smaller sets;

one={1,3,6,8};

*union of two sets;

two={1,2,3,4,5,6,7,8};

*variable related to 'one';

three={0.1,0.2,0.3,0.4};

newmat=j(nrow(two),1,0);

do j=1 to nrow(two);

do k=1 to nrow(one);

*new set created with the same number of rows as set 'two'

containing either set three or filled in with zeros;

if two[j,]=one[k,] then newmat[j,]=three[k,];

end;

end;

print newmat;

quit;

run;

SAS Super FREQ
Posts: 3,418

Re: Searching for values from a smaller set to fill in a vector

It looks like the logic is "If the j_th element of Two is in the set One, then use the elements in Three." Try using the ELEMENT function:

newmat = j(nrow(Two),1,0);
idx = loc( element(Two, One) ); /* indices of elements in Two that match (make sure ncol(idx)>0) */
v = Two[idx];                   /* values of matching elements */
do i = 1 to ncol(idx);          /* for each value...  */
   n = idx[i];                  /* location in NewMat to put answer */
   /* look up index in One. Use value from Three */
   newmat[n] = Three[ loc(One=v[i]) ];
end;
print newmat;

For more on SAS/IML functions that operate on sets, see "Testing for equality of sets" on The DO Loop blog.

And the LOC function is always your friend when you are looking up values: see "Finding data that satisfy a criterion", also on The DO Loop.
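Since the original question asked about avoiding the DO loop, here is a minimal loop-free sketch of the same lookup. It is only one possible approach. It assumes the values in One are unique (a duplicated value would have its entries from Three summed) and that an nrow(Two) x nrow(One) indicator matrix fits in memory. The name "match" is introduced here for illustration.

```sas
proc iml;
One   = {1,3,6,8};               /* lookup keys */
Two   = {1,2,3,4,5,6,7,8};       /* values to search for */
Three = {0.1,0.2,0.3,0.4};       /* Three[k] is the value associated with One[k] */

/* match[j,k] = 1 if Two[j] equals One[k], and 0 otherwise */
match = ( repeat(Two, 1, nrow(One)) = repeat(t(One), nrow(Two), 1) );

/* each row of match has at most one 1, so the product picks out */
/* the matching entry of Three, or 0 when there is no match      */
newmat = match * Three;
print newmat;
quit;
```

For vectors of this size the loop and the matrix product should be equally fast; the matrix form mainly trades memory for brevity.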

Occasional Contributor
Posts: 10

Re: Searching for values from a smaller set to fill in a vector

Hi Rick,

 

I came across your answer because I am facing a similar issue. I can easily adapt this to character data, with column 2 of matrix 'two' containing the values I would like to look up.

 

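proc iml;
   one = {a, c, e, h};
   two = {a aa, b bb, c cc, d dd, e ee, f ff, g gg, h hh};

   keys = two[,1];                    /* lookup keys are in column 1 of Two */
   idx = loc( element(keys, one) );   /* rows of Two whose key is in One (make sure ncol(idx)>0) */
   newmat = two[idx, 2];              /* matching values from column 2, in the row order of Two */

print newmat;
quit;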

 

However, my real-world data looks different:

- 'one' contains strings, e.g.:

one = {'hat', 'cap', 'bottle', 'bottleholder'};

 

- 'two' contains even longer strings and multiple matches, e.g.:

two = {'mouse with glasses' '223 567', 'house with bottleholder in black' '345 987', 'baseballcap in blue' '678 912', 'pink baseballcap' '345 123', 'cap from plant material' '678 123' };

 

Now I would like to find in 'two' all the "number strings" connected to the "word strings" which match the strings in 'one', including "baseballcap ..." such that the result would look something like

newmat_result = {'hat' '' '' '', 'cap' '678 912' '345 123' '678 123', 'bottle' '' '' '', 'bottleholder' '345 987' '' ''};

 

I worked with regular expressions before, but those were handwritten and not extracted from one matrix to match another matrix.

 

After days of searching and trying, I hope that I find somebody with an idea to solve this issue.

 

Thank you for any pointers!

 

Regards from Germany

Gerit

PS: I made this reply into a post

 

Contributor
Posts: 22

Re: Searching for values from a smaller set to fill in a vector

Thank you for your help, Rick.  I need to get the union of two sets first.  So, my code structure actually looks like the following:

proc iml;

*one of two smaller sets;

one={1,3,6,8};

*two of two smaller sets;

two={2,4,5,7};

*union of two sets;

uonetwo=t(union(one,two));

*weight from set one;

wone={0.1,0.2,0.3,0.4};

*weight from set two;

wtwo={0.5,0.6,0.7,0.8};

*new matrices of zeros from union set;

newmat1=j(nrow(uOneTwo),1,0);

newmat2=j(nrow(uOneTwo),1,0);

idx1 = loc( element(uOneTwo, One) );/* indices of elements in uOneTwo that match (make sure ncol(idx)>0) */

idx2 = loc( element(uOneTwo, Two) );/* indices of elements in uOneTwo that match (make sure ncol(idx)>0) */

v1 = uOneTwo[idx1];                   /* values of matching elements */

v2 = uOneTwo[idx2];                   /* values of matching elements */

do i = 1 to ncol(idx1);          /* for each value...  */
   n = idx1[i];                  /* location in NewMat1 to put answer */
   /* look up index in One. Use value from wOne */
   newmat1[n] = wOne[ loc(One=v1[i]) ];
end;

do i = 1 to ncol(idx2);          /* for each value...  */
   n = idx2[i];                  /* location in NewMat2 to put answer */
   /* look up index in Two. Use value from wTwo */
   newmat2[n] = wTwo[ loc(Two=v2[i]) ];
end;

print NewMat1;

print NewMat2;

quit;

run;

SAS Super FREQ
Posts: 3,418

Re: Searching for values from a smaller set to fill in a vector

I can't tell from your latest response: is this question answered or do you still have a question?

Contributor
Posts: 22

Re: Searching for values from a smaller set to fill in a vector

I think this question is answered.  Thank you for your help!
