BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
Cornelis
Fluorite | Level 6

Hello SAS users,

 

I have a data set where I want to group together.

There are two input columns: Start and End (times). 

Row 1 and 2 are connected with each other because Start (row 2) = End (row 1). The desired output must have a column 'Group' with value 2 where '2' refers to _N_ = 2. 

The Start and End in row 3 cannot be grouped with the values in observation 4 and therefore have Group = 3.

Row 4 - 7 can be grouped together, therefore Group = 7.

Table.png

I suppose that Retain may help, but it does not provide me the solution, so far.

 

Could you demonstrate the solution?

 

Best regards,

 

Cornelis

 

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Patrick
Opal | Level 21

If your actual data is as simple as you show us then something like below should do. If it is more like what @Ksharp anticipates then things will require a bit more coding.

data have;
  input from to;
datalines;
1 2
2 3
5 7
8 11
11 12
12 14
14 15
16 19
21 22
;

data inter;
  set have;
  group_id + from ne lag(to); 
run;

proc sql;
  select 
    from
    ,to
    ,group_id
    ,count(*) as n_rows
  from inter
  group by group_id
  order by from
  ;
quit;

 

View solution in original post

4 REPLIES 4
Ksharp
Super User
data have;
input from to;
obs+1;
datalines;
69 77
69 77
31 12
16 29
52 21
52 97
52 97
99 37
99 37
68 24
;
run;

data key;
 set have;
 key=from;output;
 key=to;output;
 keep obs key;
run;



data full;
  set have end=last;
  if _n_ eq 1 then do;
   declare hash h();
    h.definekey('node');
     h.definedata('node');
     h.definedone();
  end;
  output;
  node=from; h.replace();
  from=to; to=node; output;
  node=from; h.replace();
  if last then h.output(dataset:'node');
  drop node;
run;


data want(keep=node household );
declare hash ha(ordered:'a');
declare hiter hi('ha');
ha.definekey('count');
ha.definedata('last');
ha.definedone();
declare hash _ha(hashexp: 16);
_ha.definekey('key');
_ha.definedone();

if 0 then set full;
declare hash from_to(dataset:'full(where=(from is not missing and to is not missing))',hashexp:20,multidata:'y');
 from_to.definekey('from');
 from_to.definedata('to');
 from_to.definedone();

if 0 then set node;
declare hash no(dataset:'node');
declare hiter hi_no('no');
 no.definekey('node');
 no.definedata('node');
 no.definedone();
 

do while(hi_no.next()=0);
 household+1; output;
 count=1;
 key=node;_ha.add();
 last=node;ha.add();
 rc=hi.first();
 do while(rc=0);
   from=last;rx=from_to.find();
   do while(rx=0);
     key=to;ry=_ha.check();
      if ry ne 0 then do;
       node=to;output;rr=no.remove(key:node);
       key=to;_ha.add();
       count+1;
       last=to;ha.add();
      end;
      rx=from_to.find_next();
   end;
   rc=hi.next();
end;
ha.clear();_ha.clear();
end;
stop;
run;


data key2;
 if _n_=1 then do;
  if 0 then set want;
  declare hash k(dataset:'want');
  k.definekey('node');
  k.definedata('household');
  k.definedone();
 end;
set key;
call missing(household);
rc=k.find(key:key);
drop rc node;
run;
proc sql;
create table key3 as
select *,max(obs) as group from key2
 group by household;
quit;

data final_want;
if _n_=1 then do;
 if 0 then set key3;
 declare hash k(dataset:'key3');
 k.definekey('key');
 k.definedata('group');
 k.definedone();
end;
set have;
call missing(group);
rc=k.find(key:from);
keep obs from to group;
run;

Cornelis
Fluorite | Level 6
Hi Kscharp, great that you have demonstrated a code which can offer complex situation. Not very efficient in my situation, but I appreciate your effort very much and see this as a good learning point for myself.
Patrick
Opal | Level 21

If your actual data is as simple as you show us then something like below should do. If it is more like what @Ksharp anticipates then things will require a bit more coding.

data have;
  input from to;
datalines;
1 2
2 3
5 7
8 11
11 12
12 14
14 15
16 19
21 22
;

data inter;
  set have;
  group_id + from ne lag(to); 
run;

proc sql;
  select 
    from
    ,to
    ,group_id
    ,count(*) as n_rows
  from inter
  group by group_id
  order by from
  ;
quit;

 

Cornelis
Fluorite | Level 6
Thank you for your fast reply, that is a very efficient code.

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 1035 views
  • 2 likes
  • 3 in conversation