BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
Cornelis
Fluorite | Level 6

Hello SAS users,

 

I have a data set where I want to group together.

There are two input columns: Start and End (times). 

Row 1 and 2 are connected with each other because Start (row 2) = End (row 1). The desired output must have a column 'Group' with value 2 where '2' refers to _N_ = 2. 

The Start and End in row 3 cannot be grouped with the values in observation 4 and therefore have Group = 3.

Row 4 - 7 can be grouped together, therefore Group = 7.

Table.png

I suppose that Retain may help, but it does not provide me the solution, so far.

 

Could you demonstrate the solution?

 

Best regards,

 

Cornelis

 

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Patrick
Opal | Level 21

If your actual data is as simple as you show us then something like below should do. If it is more like what @Ksharp anticipates then things will require a bit more coding.

data have;
  input from to;
datalines;
1 2
2 3
5 7
8 11
11 12
12 14
14 15
16 19
21 22
;

data inter;
  set have;
  group_id + from ne lag(to); 
run;

proc sql;
  select 
    from
    ,to
    ,group_id
    ,count(*) as n_rows
  from inter
  group by group_id
  order by from
  ;
quit;

 

View solution in original post

4 REPLIES 4
Ksharp
Super User
data have;
input from to;
obs+1;
datalines;
69 77
69 77
31 12
16 29
52 21
52 97
52 97
99 37
99 37
68 24
;
run;

data key;
 set have;
 key=from;output;
 key=to;output;
 keep obs key;
run;



data full;
  set have end=last;
  if _n_ eq 1 then do;
   declare hash h();
    h.definekey('node');
     h.definedata('node');
     h.definedone();
  end;
  output;
  node=from; h.replace();
  from=to; to=node; output;
  node=from; h.replace();
  if last then h.output(dataset:'node');
  drop node;
run;


data want(keep=node household );
declare hash ha(ordered:'a');
declare hiter hi('ha');
ha.definekey('count');
ha.definedata('last');
ha.definedone();
declare hash _ha(hashexp: 16);
_ha.definekey('key');
_ha.definedone();

if 0 then set full;
declare hash from_to(dataset:'full(where=(from is not missing and to is not missing))',hashexp:20,multidata:'y');
 from_to.definekey('from');
 from_to.definedata('to');
 from_to.definedone();

if 0 then set node;
declare hash no(dataset:'node');
declare hiter hi_no('no');
 no.definekey('node');
 no.definedata('node');
 no.definedone();
 

do while(hi_no.next()=0);
 household+1; output;
 count=1;
 key=node;_ha.add();
 last=node;ha.add();
 rc=hi.first();
 do while(rc=0);
   from=last;rx=from_to.find();
   do while(rx=0);
     key=to;ry=_ha.check();
      if ry ne 0 then do;
       node=to;output;rr=no.remove(key:node);
       key=to;_ha.add();
       count+1;
       last=to;ha.add();
      end;
      rx=from_to.find_next();
   end;
   rc=hi.next();
end;
ha.clear();_ha.clear();
end;
stop;
run;


data key2;
 if _n_=1 then do;
  if 0 then set want;
  declare hash k(dataset:'want');
  k.definekey('node');
  k.definedata('household');
  k.definedone();
 end;
set key;
call missing(household);
rc=k.find(key:key);
drop rc node;
run;
proc sql;
create table key3 as
select *,max(obs) as group from key2
 group by household;
quit;

data final_want;
if _n_=1 then do;
 if 0 then set key3;
 declare hash k(dataset:'key3');
 k.definekey('key');
 k.definedata('group');
 k.definedone();
end;
set have;
call missing(group);
rc=k.find(key:from);
keep obs from to group;
run;

Cornelis
Fluorite | Level 6
Hi Kscharp, great that you have demonstrated a code which can offer complex situation. Not very efficient in my situation, but I appreciate your effort very much and see this as a good learning point for myself.
Patrick
Opal | Level 21

If your actual data is as simple as you show us then something like below should do. If it is more like what @Ksharp anticipates then things will require a bit more coding.

data have;
  input from to;
datalines;
1 2
2 3
5 7
8 11
11 12
12 14
14 15
16 19
21 22
;

data inter;
  set have;
  group_id + from ne lag(to); 
run;

proc sql;
  select 
    from
    ,to
    ,group_id
    ,count(*) as n_rows
  from inter
  group by group_id
  order by from
  ;
quit;

 

Cornelis
Fluorite | Level 6
Thank you for your fast reply, that is a very efficient code.

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 507 views
  • 2 likes
  • 3 in conversation