DATA Step, Macro, Functions and more

SET on multiple indices

Reply
N/A
Posts: 0

SET on multiple indices

Hi all,

I have a problem:
I have a sas dataset, say A which has 2 indices defined on say Var1 and Var2.
Now I am merging it with dataset B on those 2 variables using the indices, basically my intent is:
Data new;
merge A (in=a) B(in=b);
by Var1 Var2;
if b;
run;


I am trying like:
Data new;
set B;
set A key=(Var1 Var2);
if _IORC_ = 0;
run;


The above code gives error, after some research I found that key=< > can only have a single index.

The other option would be using the DBKEY but again that works only for database tables and not for sas datasets.

Please let me know whether this is possible, that is using SET with multiple indices in the Key= parameter, otherwise what else can be done.

Thanks in advance.

PS I am not using merge because the indexed approach is more efficient. Message was edited by: tangentray
Respected Advisor
Posts: 3,799

Re: SET on multiple indices

Posted in reply to deleted_user
You have the right idea but the wrong syntax.

Create compound index on B.

data b(index=(v12=(var1 var2)));

then set use

set b key=v12/unique;

also you will need to set _ERROR_=0 after the set to suppress inplied put _ALL_; produce when _ERROR_=1.
Contributor
Posts: 31

Re: SET on multiple indices

Posted in reply to deleted_user
Hey tangentray,
_Null_ is right about the compound index thing. But if you have these 2 indexes already created and maintained and you're not allowed to create extra indexes, then i believe creating compound index of both the variables for just this thing/matching would be a problem for you.
If you're doing a lookup on a large dataset ( data B) then i believe format technique of lookup can help( check out paper on SGF 2009 :: Proc Format, a Speedy Alternative to Sort Sort Merge :: http://support.sas.com/resources/papers/proceedings09/064-2009.pdf )

Back when i was learning about the set with Key= option, I wrote something that matches with your multiple index problem ( though not exactly the solution you would want as KEY= Multiple simple indexS is not wht i wrote, but KEY=single simple index with multiple SET statement ). I believe this code would increase the CPU time( if you've read the set ith key= theory you would knw why the increase in time), but the increase is depended on the data and the index variables being sorted/unsorted before they were made index.

data master(index=(PartNumber quantity ));
input PartNumber quantity;
datalines;
100 10
200 20
300 30
400 40
500 50
;

data description(index=(PartNumber quantity ));
input PartNumber quantity partdescription $;
datalines;
400 40 Nuts
300 30 Bolts
200 10 Screws
600 90 Washers
500 50 Bashers
;

data extract;
set description; /* smaller dsn, transaction dsn*/
set master(drop=quantity) key=PartNumber ; /* larger dsn*/

length errormessage $200.;
drop errormessage;

select (_iorc_);
when(%sysrc(_sok)) do; /* A match was found */
PartNumberX=PartNumber;
set master key=quantity ;
select (_iorc_);
when(%sysrc(_sok)) do; /* A match was found */
if PartNumberx=PartNumber then
output;
else delete;
end;
when (%sysrc(_dsenom)) do; /* No match was found */
_error_ = 0;
end;
otherwise do;
errormessage = iorcmsg();
put "ATTENTION: unknown error condition: "
errormessage;
end;
end;
end;
when (%sysrc(_dsenom)) do; /* No match was found */
_error_ = 0;
end;
otherwise do;
errormessage = iorcmsg();
put "ATTENTION: unknown error condition: "
errormessage;
end;
end;
run;
proc print;run;

Thanks!
Ask a Question
Discussion stats
  • 2 replies
  • 153 views
  • 0 likes
  • 3 in conversation