DATA Step, Macro, Functions and more

How to use two sets to lookup when it have repeated observations

Accepted Solution Solved
Reply
Contributor
Posts: 25
Accepted Solution

How to use two sets to lookup when it have repeated observations

data m01(index=(a));

input a b;

cards;

1 3

2 4

3 5

;

run;

data m02(index=(a));

input a c;

cards;

1 6

2 7

2 8

;

run;

data m03 m04;

     set m01;

     set m02 key=a;

     select(_iorc_);

          when(%sysrc(_sok)) do;

                    output m03;

          end;

          when(%sysrc(_dsenom)) do;

                   _error_=0;

                   output m04;

          end;

          otherwise do;

                  _error_=0;

                  stop;

          end;

     end;

run;

proc print data=m03;run;

The result is:

                     obs  a  b  c

                         1  1  3  6

                         2  2  4  7

But,what I want is this:

                    obs  a  b  c

                        1  1  3  6

                        2  2  4  7

                        3  2  4  8

Well,I cannot find the way to the programming problem.

Please.No MERGE,SQL,or HASH,Just sets;

I am Jacob from China,from Shanghai.

/*    Add at 21 March  2012     */

In fact,I met a problem at work.I have a SAS data set,it has 15 million observations and it has duplicate observations.

I can not use the HASH,for out of memory;

I do not want to use MERGE,for the SORT procedure cost too much time;

and it is forbid to write SQL-code to submit to DB2.

So I am trying to find another efficient lookup method.


Accepted Solutions
Solution
‎03-20-2012 08:35 PM
Super User
Super User
Posts: 7,039

Re: How to use two sets to lookup when it have repeated observations

Look at example 13.3 in this support file for Michael Raithel's book The Complete Guide to SAS Indexes.

http://ftp.sas.com/samples/A60409

Here is how you could adjust his example to your datasets.

/*-------------------------------------------------------------------------------*/

/* Example 13.3:  Duplicate Index Key Variable Values in the Master SAS Data Set */

/*-------------------------------------------------------------------------------*/

data m03;

  set m01 ;

  do until (_iorc_=%sysrc(_dsenom));

    set m02 key=a ;

    select (_iorc_);

      when (%sysrc(_sok)) output;

      when (%sysrc(_dsenom)) _error_ = 0;

      otherwise do;

        errormessage = iorcmsg();

        put "ATTENTION: unknown error condition: " errormessage;

      end;

    end;

  end;

run;

View solution in original post


All Replies
Respected Advisor
Posts: 3,156

How to use two sets to lookup when it have repeated observations

I didn't see you don't like Merge. Anyway, here is my brute force solution with only 'set';

data m01;

input a b;

cards;

1 3

2 4

3 5

;

run;

data m02;

input a c;

cards;

1 6

2 7

2 8

;

data want (drop=a1);

set m02;

do i=1 to nobs;

set m01 (rename=a=a1) point=i nobs=nobs;

if a=a1 then

   do;

    output;

return;

end;

end;

run;

proc print;run;

Regards,

Haikuo

Super User
Posts: 5,498

Re: How to use two sets to lookup when it have repeated observations

If you want an easy solution, change how you construct M02, limiting it to one observation per value of A.  For example:

data m02 (index=(a));

   set m02;

   by a;

   if last.a;

run;

Then your final step should work just fine.

Note that there are complications that you have not even tried to illustrate, using your current approach.  For example, try constructing M01 having 2 observations with A=2, and examine the values you get for C.  (Then try it with 3 observations with A=2.)  In your final solution, you may end up adding the UNIQUE option after KEY=A.

Good luck.

...

Now that I've re-read the problem, this solution won't work.  The poster who suggested you use MERGE is correct.  This problem is ideal for MERGE, possibly adding the use of IN= variables.  In the long run, you can't avoid learning MERGE when you program in SAS, so this might be the right time when you have a problem that requires it.  (Or is the whole point of the problem to learn just how cumbersome the solution would be if you are not allowed to use MERGE when it is the right tool for the job?)

Respected Advisor
Posts: 3,156

How to use two sets to lookup when it have repeated observations

I must have missed something, it seems to me that using %sysrc(_sok) is straightforward to get what you want;

data m01 (index=(a));

input a b;

cards;

1 3

2 4

3 5

;

run;

data m02;

input a c;

cards;

1 6

2 7

2 8

;

data want ;

set m02;

set m01 key=a;

if %sysrc(_sok)=0 then output ;

run;

Contributor
Posts: 25

How to use two sets to lookup when it have repeated observations

Great! I'II try "if %sysrc(_sok)=0 then do;_error_=0;output;end;",and test the efficiency.

Thank you very very much!

Reagrds,

Jacob.

Super User
Super User
Posts: 7,039

How to use two sets to lookup when it have repeated observations

The macro %SYSRC is just a tool for translating a mnemonic character string into the corresponding error code.

The error code for _SOK is 0.

So %SYSRC(_SOK)=0 will always by true.

Respected Advisor
Posts: 3,156

How to use two sets to lookup when it have repeated observations

Thanks, Tom. One thing learned.

Regards,

Haikuo

Solution
‎03-20-2012 08:35 PM
Super User
Super User
Posts: 7,039

Re: How to use two sets to lookup when it have repeated observations

Look at example 13.3 in this support file for Michael Raithel's book The Complete Guide to SAS Indexes.

http://ftp.sas.com/samples/A60409

Here is how you could adjust his example to your datasets.

/*-------------------------------------------------------------------------------*/

/* Example 13.3:  Duplicate Index Key Variable Values in the Master SAS Data Set */

/*-------------------------------------------------------------------------------*/

data m03;

  set m01 ;

  do until (_iorc_=%sysrc(_dsenom));

    set m02 key=a ;

    select (_iorc_);

      when (%sysrc(_sok)) output;

      when (%sysrc(_dsenom)) _error_ = 0;

      otherwise do;

        errormessage = iorcmsg();

        put "ATTENTION: unknown error condition: " errormessage;

      end;

    end;

  end;

run;

Contributor
Posts: 25

How to use two sets to lookup when it have repeated observations

Tom,

Thanks.

This is it!

《The Complete Guide to SAS Indexes》is very,very good!

Best Wish!

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 8 replies
  • 259 views
  • 6 likes
  • 4 in conversation