
HASH Join on Table Containing Duplicate values

Occasional Contributor
Posts: 16

Hi,

 

I'm joining two tables, A and B. B contains duplicate records. First, I need the rows to repeat, so the duplicates themselves are not a problem. Once the join is done, I have to check some conditions and, if they are met, delete those rows. I'm using the code below, but it is not deleting the right rows.

 

The first condition is on one variable, EQ. If EQ=0, I want to delete that row from the hash.

The second condition is: if EQ>10, keep the first row but delete any other duplicate rows.

 

set WORK.lookup_2 ;
do _iorc_ = b.find(key:A,key:B) by 0 while (_iorc_ = 0) ;
  if eq=0 then _iorc_ = b.removedup();
  _iorc_ = b.find_next() ;

Respected Advisor
Posts: 4,797

Re: HASH Join on Table Containing Duplicate values

@jpm2478

I'm not really sure what you're asking for, but maybe the code below will give you some ideas.

data lookup;
  key=1;
  do value=2 to 10 by 2;
    output;
  end;
  stop;
run;

data _null_;
  if 0 then set lookup;
  dcl hash h1(dataset:'lookup', multidata:'y');
  h1.defineKey('key');
  h1.defineData('value');
  h1.defineDone();
  dcl hash h2(dataset:'lookup', multidata:'y');
  h2.defineKey('key');
  h2.defineData('value');
  h2.defineDone();
  
  key=1;

  /* remove the item with value=6 */
  do while(h1.do_over()=0);
    if value=6 then
      do;
        h1.removedup();
        leave;
      end;
  end;
  h1.output(dataset:'h1');

  /* remove items with value <> 6 */
  do while(h2.do_over()=0);
    if value ne 6 then
      do;
        h2.removedup();
      end;
  end;
  h2.output(dataset:'h2');

  stop;
run;
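
If the end goal is just the joined output with the two EQ rules applied, one option is to leave the hash untouched and filter while looping over the duplicates. The step below is only a sketch under several assumptions: WORK.table_b is a hypothetical name for table B, A and B are the key variables as in the original post, EQ comes from table B, and "keep the first row" is read as "keep the first EQ>10 match per key". The names joined, rc and seen_gt10 are made up for illustration.

```sas
data work.joined;
  if _n_ = 1 then do;
    /* put the key and data variables of table B in the PDV (assumed names) */
    if 0 then set work.table_b(keep=A B eq);
    dcl hash b(dataset:'work.table_b', multidata:'y');
    b.defineKey('A', 'B');
    b.defineData('eq');
    b.defineDone();
  end;

  set work.lookup_2;

  seen_gt10 = 0;                /* no EQ>10 match kept yet for this row */
  rc = b.find();                /* first match for the current A, B     */
  do while (rc = 0);
    if eq ne 0 then do;         /* rule 1: rows with EQ=0 are skipped   */
      if eq > 10 then do;       /* rule 2: keep only the first EQ>10    */
        if not seen_gt10 then output;
        seen_gt10 = 1;
      end;
      else output;              /* all other matches repeat as usual    */
    end;
    rc = b.find_next();         /* next duplicate for the same key      */
  end;

  drop rc seen_gt10;
run;
```

Because nothing is removed from the hash, the FIND / FIND_NEXT traversal is never disturbed by a deletion, and the same lookup item stays available for later rows of WORK.lookup_2. If the items really must disappear from the hash itself, the REMOVEDUP pattern above is the place to start.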