Using hash to aggregate strings

Super Contributor
Posts: 340

Hello,

I would like to know how a hash object can be used to concatenate a character variable across rows within a BY group. That is, I am looking for the hash version of this program:

Data A;
  Input Nr N $;
  Datalines;
1 AA
1 BB
2 AA
2 CC
2 DD
3 A_
3 B_
3 C_
;

Data B;
  Length Text $200;
  Retain Text;
  Do i=1 By 1 Until (Last.Nr);
    Set A;
    By Nr;
    If First.Nr Then Text=N;
    Else Text=Catt(Text,N);
  End;
Run;

Thanks and kind regards


Accepted Solutions
Solution
‎01-22-2015 08:23 AM
Super Contributor
Posts: 298

Re: Using hash to aggregate strings

Posted in reply to user24feb

Actually, you may not need a hash at all. But with a hash, the data set does not have to be sorted by Nr beforehand. See the code below.

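data _null_;
   if _n_ = 1 then do;
      length Text $100;
      if 0 then set A;                  /* add Nr and N to the PDV without reading a row */
      declare hash h(ordered:'a');      /* keep keys in ascending Nr order */
      h.definekey('Nr');
      h.definedata('Nr','Text');
      h.definedone();
   end;

   do until(last);
      set A end = last;
      if h.find() ^= 0 then Text = '';  /* Nr not in the hash yet: start a new string */
      call catx('|', Text, N);          /* append N to Text, separated by '|' */
      h.replace();                      /* store the updated string under Nr */
   end;

   if last then h.output(dataset:'Want');  /* write one row per Nr */
   stop;
run;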


All Replies
Super User
Posts: 5,431

Re: Using hash to aggregate strings

Posted in reply to user24feb

Why?

Do you have a problem with the current solution?

Data never sleeps
Super Contributor
Posts: 340

Re: Using hash to aggregate strings

Actually, I am experimenting a little. However, it is not entirely "fun and games" for me: I have to do this with a large volume of data, and a hash might give some speed advantage over the plain DATA step. I am not sure of that, of course, and it would have to be tested first. I am also aware that SAS is optimized for DATA steps, but I would like to try it out, and I am not sure what the hash version would look like.

Super Contributor
Posts: 308

Re: Using hash to aggregate strings

Posted in reply to user24feb

Hello,

data b;
  length nrn $ 20;
  if 0 then set a;

  if _n_ = 1 then do;
    declare hash a(dataset: "a", multidata: 'Y');
    a.definekey('nr');
    a.definedata('n');
    a.definedone();
  end;

  set a;
  by nr;

  if first.nr then do;
    if a.find() = 0 then do;
      nrn = n;
      a.has_next(result: r);
      do while (r ne 0);
        rc = a.find_next();
        nrn = cats(nrn, n);
        a.has_next(result: r);
      end;
    end;
    output;
  end;
run;

Super User
Posts: 10,035

Re: Using hash to aggregate strings

Posted in reply to user24feb

Data A;
  Input Nr N $;
  Datalines;
1 AA
1 BB
2 AA
2 CC
2 DD
3 A_
3 B_
3 C_
;
run;

data want;
  if _n_ eq 1 then do;
    length text $ 200;
    if 0 then set a;
    declare hash h();
    h.definekey('Nr');
    h.definedata('text');
    h.definedone();
  end;
  set a;
  by Nr;
  rc = h.find();
  text = catt(text, N);
  h.replace();
  if last.Nr then do;
    output;
    h.clear();
  end;
  drop n rc;
run;

Xia Keshan
