Using hash to aggregate strings

Super Contributor
Posts: 336

Hello,

I would like to know how a hash object can be used to aggregate character variables across rows. That is, I am looking for the hash version of this program:

Data A;
  Input Nr N $;
  Datalines;
1 AA
1 BB
2 AA
2 CC
2 DD
3 A_
3 B_
3 C_
;

Data B;
  Length Text $200;
  Retain Text;
  Do i=1 By 1 Until (Last.Nr);
    Set A;
    By Nr;
    If First.Nr Then Text=N;
    Else Text=Catt(Text,N);
  End;
Run;

Thanks and kind regards


Accepted Solutions
Solution
‎01-22-2015 08:23 AM
Super Contributor
Posts: 254

Re: Using hash to aggregate strings

Actually, you may not need a hash at all. With a hash, however, you do not need to sort the data set first. See the code below.

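data _null_;
   if _n_ = 1 then do;
      length Text $100;
      if 0 then set A;                   /* add A's variables to the PDV without reading rows */
      declare hash h(ordered:'a');       /* output in ascending key order */
      h.definekey('Nr');
      h.definedata('Nr','Text');
      h.definedone();
   end;
   do until(last);
      set A end = last;
      if h.find() ^= 0 then Text = '';   /* first time this Nr is seen: start empty */
      call catx('|', Text, N);           /* append N to Text, separated by '|' */
      h.replace();                       /* store the updated string under Nr */
   end;
   if last then h.output(dataset:'Want');
   stop;
run;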
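
To illustrate the point about sorting, here is a minimal sketch that runs the same step on shuffled input. The data set names A_unsorted and Want_unsorted are assumptions made for this example and do not appear in the original post.

```sas
/* Hypothetical shuffled copy of A (the name A_unsorted is an assumption) */
data A_unsorted;
   input Nr N $;
   datalines;
3 A_
1 AA
2 AA
3 B_
2 CC
1 BB
2 DD
3 C_
;

/* Same logic as the step above, reading the unsorted data */
data _null_;
   if _n_ = 1 then do;
      length Text $100;
      if 0 then set A_unsorted;
      declare hash h(ordered:'a');
      h.definekey('Nr');
      h.definedata('Nr','Text');
      h.definedone();
   end;
   do until(last);
      set A_unsorted end = last;
      if h.find() ^= 0 then Text = '';
      call catx('|', Text, N);
      h.replace();
   end;
   h.output(dataset:'Want_unsorted');
   stop;
run;

proc print data=Want_unsorted noobs;
run;
```

With this input, Want_unsorted should hold one row per Nr in ascending key order (AA|BB, AA|CC|DD, A_|B_|C_). Within each group the values follow the order in which the rows were read, so the input order still matters for the string itself even though no PROC SORT is needed.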


All Replies
Super User
Posts: 5,256

Re: Using hash to aggregate strings

Why?

Do you have a problem with the current solution?

Data never sleeps
Super Contributor
Posts: 336

Re: Using hash to aggregate strings

Actually, I am experimenting a little, but it is not entirely "fun and games" for me. I have to do this with a large volume of data, and a hash might give some speed advantage over the plain data step. I am not sure of that and would have to test it first. I am aware that SAS is optimized for data steps, but I would like to try it out. I am also not sure what the hash version would look like.
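
One way to test the speed question is to time both variants on generated data. The following is only a sketch under assumptions: the data set names Big, B_by and B_hash and the test size of 1,000,000 rows are made up for illustration, and the hash step uses CATT so that both outputs are comparable.

```sas
options fullstimer;   /* write real time, CPU time and memory for each step to the log */

/* Hypothetical test data: 200,000 keys with 5 rows each, already sorted by Nr */
data Big;
   length N $8;
   do Nr = 1 to 200000;
      do j = 1 to 5;
         N = cats('X', j);
         output;
      end;
   end;
   drop j;
run;

/* Variant 1: BY-group processing (requires sorted input) */
data B_by;
   length Text $200;
   do until (last.Nr);
      set Big;
      by Nr;
      if first.Nr then Text = N;
      else Text = catt(Text, N);
   end;
   keep Nr Text;
run;

/* Variant 2: hash object (input order does not matter) */
data _null_;
   length Text $200;
   if 0 then set Big;
   declare hash h(ordered:'a');
   h.definekey('Nr');
   h.definedata('Nr', 'Text');
   h.definedone();
   do until (last);
      set Big end = last;
      if h.find() ^= 0 then Text = '';
      Text = catt(Text, N);
      h.replace();
   end;
   h.output(dataset:'B_hash');
   stop;
run;
```

The log then shows the timings for each step side by side. Note that the hash variant holds one entry per key in memory, so its memory use grows with the number of distinct Nr values.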

Super Contributor
Posts: 305

Re: Using hash to aggregate strings

Hello,

data b;

length nrn $ 20;

if 0 then set a;

if _n_=1 then
do;
  declare hash a(dataset:"a", multidata: 'Y');
  a.definekey('nr');
  a.definedata('n');
  a.definedone();
end;

set a;
by nr;

if first.nr then do;
  if a.find() = 0 then do;
    nrn=n;
    a.has_next(result: r);
    do while(r ne 0);
      rc = a.find_next();
      nrn=cats(nrn) || n;
      a.has_next(result: r);
    end;
  end;
  output;
end;

run;

Super User
Posts: 9,681

Re: Using hash to aggregate strings

Data A;
  Input Nr N $;
  Datalines;
1 AA
1 BB
2 AA
2 CC
2 DD
3 A_
3 B_
3 C_
;
run;
data want;
  if _n_ eq 1 then do;
    length text $ 200;
    if 0 then set a;
    declare hash h();
    h.definekey('Nr');
    h.definedata('text');
    h.definedone();
  end;
  set a;
  by Nr;
  rc=h.find();
  text=catt(text,N);
  h.replace();
  if last.Nr then do;
    output;
    h.clear();
  end;
  drop n rc;
run;

Xia Keshan
