Index key lookup

ScottBass · Posted 12-20-2013 01:17 AM

Hi,

Please see SAS(R) 9.3 Statements: Reference, examples #7 and #8.

Sample Code:

data dups (index=(x / nomiss));

do x=1 to 10;

y=x*2;

z=x**2;

output;

end;

run;

data nodups (index=(x));

do x=2 to 10 by 2;

output;

end;

run;

data final1;

set nodups;

set dups key=x;

run;

data final2;

set nodups;

set dups key=x / unique;

run;

data final3;

set dups;

set nodups key=x;

foo=0+y;

run;

data final4;

set dups;

set nodups key=x / unique;

foo=0+y;

run;

I quote Example 8:

This example uses the KEY= option to perform a table lookup. The DATA step reads a primary data set that is named INVTORY, which is indexed on PARTNO, and a lookup data set named PARTCODE. PARTCODE contains quantities of new stock (variable NEW_STK). The UNIQUE option ensures that, if there are any duplicate observations in INVTORY, values of NEW_STK are added only to the first observation of the group.

I'm probably just misunderstanding the documentation, but shouldn't I expect different results for final3 and final4? If these results are as you expect, can you further explain the underlined bit above?

Thanks,

Scott

Please post your question as a self-contained data step in the form of "have" (source) and "want" (desired results).
I won't contribute to your post if I can't cut-and-paste your syntactically correct code into SAS.

data_null__ · Posted 12-20-2013 07:45 AM

If you add a variable to NODUPS you will see a difference.

ScottBass · Posted 12-20-2013 07:56 PM

Thanks Data.

I changed my test code as below and now see the difference. From reading the example in the doc, I would have expected to see the difference in Final3 and Final4, not Final1 and Final2. Am I just mis-reading the doc, or is the example wrong?

I was pleased to see that repeated calls to set dups key=x retrieved the next record matching the key, rather than continually retrieving the first record. This, coupled with _iorc_, would allow building a Cartesian product if you know or expect the lookup dataset to have non-unique keys.

With my "real" data, I've got a small transaction dataset, but millions of records in SCD2 dimension tables. I also need this lookup to be fast, as it's for an online (stored process) application. This approach was blindingly quick - much faster than hash object join or SQL (even with indexes). Like 0.5 seconds vs. 45 seconds using hash objects.

I've been programming SAS a while, and this technique has been in the doc even longer - too bad I never connected the dots until now . As you say in your profile, I'm (still) learning SAS.

I think this approach will be very useful for our ETL/Data Warehouse processing.

Thanks again Data.

data dups (index=(x / nomiss));

do x=1 to 10;

y=x*2;

z+1;output;

end;

run;

data nodups (index=(x));

do x=2 to 10 by 2;

a=x**4;

output;

end;

run;

data final1;

set nodups;

do i=1 to 4;

set dups key=x;

rc=_iorc_;

output;

end;

run;

data final2;

set nodups;

do i=1 to 4;

set dups key=x / unique;

rc=_iorc_;

output;

end;

run;

data final3;

set dups;

set nodups key=x;

foo=0+y;

bar=0+a;

run;

data final4;

set dups;

set nodups key=x / unique;

foo=0+y;

bar=0+a;

run;

Please post your question as a self-contained data step in the form of "have" (source) and "want" (desired results).
I won't contribute to your post if I can't cut-and-paste your syntactically correct code into SAS.

Index key lookup

Re: Index key lookup

Re: Index key lookup

Catch up on SAS Innovate 2026

Index key lookup

Re: Index key lookup

Re: Index key lookup

Catch up on SAS Innovate 2026

SAS Training: Just a Click Away