BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
PhilC
Rhodochrosite | Level 12

I have data serialized by a variable N, and I have a dataset that is virtually an incidence matrix.  How might I find the eleven (11) groupings of records in HAVE that are connected to common undirected graphs defined by the incidence_matrix dataset?  

Caveats:

  1. I only have SAS STAT.  Using any other SAS package is not an option, but please include your idea anyway.
  2. Do not assume big data.  Expect less than 10k rows for HAVE in real life, and a matrix about as sparse.  

 

data have ;
  do N=1 to 15;
    output;
  end;
stop;

data incidence_matrix(index=(N));
  input N N2;
datalines;
1 2
2 10
5 6
5 7
;
run;

 

 want:  a column that groups the records.  (Answering @Reeza) User should be able to perform sorts and use by statements with the new column.  An example dataset: 

N X
1 20
2 20
3 3
4 4
5 30
6 30
7 30
8 8
9 9
10 20

:

:

15 15
1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User
data have ;
  do N=1 to 15;
    output;
  end;
stop;
run;
data incidence_matrix(index=(N));
  input N N2;
datalines;
1 2
2 10
5 6
5 7
;
run;
data have;
 set have incidence_matrix;
 rename n=from n2=to;
run;



/***************************************/
data full;
  set have end=last;
  if _n_ eq 1 then do;
   declare hash h();
    h.definekey('node');
     h.definedata('node');
     h.definedone();
  end;
  output;
  node=from; h.replace();
  from=to; to=node;
  output;
  node=from; h.replace();
  if last then h.output(dataset:'node');
  drop node;
run;


data want(keep=node household);
declare hash ha(ordered:'a');
declare hiter hi('ha');
ha.definekey('count');
ha.definedata('last');
ha.definedone();
declare hash _ha(hashexp: 20);
_ha.definekey('key');
_ha.definedone();

if 0 then set full;
declare hash from_to(dataset:'full(where=(from is not missing and to is not missing))',hashexp:20,multidata:'y');
 from_to.definekey('from');
 from_to.definedata('to');
 from_to.definedone();

if 0 then set node;
declare hash no(dataset:'node');
declare hiter hi_no('no');
 no.definekey('node');
 no.definedata('node');
 no.definedone();
 

do while(hi_no.next()=0);
 household+1; output;
 count=1;
 key=node;_ha.add();
 last=node;ha.add();
 rc=hi.first();
 do while(rc=0);
   from=last;rx=from_to.find();
   do while(rx=0);
     key=to;ry=_ha.check();
      if ry ne 0 then do;
       node=to;output;rr=no.remove(key:node);
       key=to;_ha.add();
       count+1;
       last=to;ha.add();
      end;
      rx=from_to.find_next();
   end;
   rc=hi.next();
end;
ha.clear();_ha.clear();
end;
stop;
run;

View solution in original post

7 REPLIES 7
Reeza
Super User
What do you expect as output from this as input?
Ksharp
Super User
data have ;
  do N=1 to 15;
    output;
  end;
stop;
run;
data incidence_matrix(index=(N));
  input N N2;
datalines;
1 2
2 10
5 6
5 7
;
run;
data have;
 set have incidence_matrix;
 rename n=from n2=to;
run;



/***************************************/
data full;
  set have end=last;
  if _n_ eq 1 then do;
   declare hash h();
    h.definekey('node');
     h.definedata('node');
     h.definedone();
  end;
  output;
  node=from; h.replace();
  from=to; to=node;
  output;
  node=from; h.replace();
  if last then h.output(dataset:'node');
  drop node;
run;


data want(keep=node household);
declare hash ha(ordered:'a');
declare hiter hi('ha');
ha.definekey('count');
ha.definedata('last');
ha.definedone();
declare hash _ha(hashexp: 20);
_ha.definekey('key');
_ha.definedone();

if 0 then set full;
declare hash from_to(dataset:'full(where=(from is not missing and to is not missing))',hashexp:20,multidata:'y');
 from_to.definekey('from');
 from_to.definedata('to');
 from_to.definedone();

if 0 then set node;
declare hash no(dataset:'node');
declare hiter hi_no('no');
 no.definekey('node');
 no.definedata('node');
 no.definedone();
 

do while(hi_no.next()=0);
 household+1; output;
 count=1;
 key=node;_ha.add();
 last=node;ha.add();
 rc=hi.first();
 do while(rc=0);
   from=last;rx=from_to.find();
   do while(rx=0);
     key=to;ry=_ha.check();
      if ry ne 0 then do;
       node=to;output;rr=no.remove(key:node);
       key=to;_ha.add();
       count+1;
       last=to;ha.add();
      end;
      rx=from_to.find_next();
   end;
   rc=hi.next();
end;
ha.clear();_ha.clear();
end;
stop;
run;
PhilC
Rhodochrosite | Level 12

Thanks for your time @Ksharp.  Hash tables are probably the best way to do this given all of my limitations, but, more than this, the coders I work here with are not ready for hash tables, to be honest.  I want to avoid it. 

 

But I did. My original problem was a time period overlap problem, and I had experience working in other platforms with graph theory and network propagation so I found my self asking this question.  Work you did with @yabwon in September 2019 helped me in the end.  thanks again.

 

Ksharp
Super User

If you don't like hash table. I remembered @PGStats  wrote a macro about it . searching it at this communities.

Or @Reeza know it also .

PhilC
Rhodochrosite | Level 12

Challenge to the next poster on this topic.  Offer example data referring to this site's coverage of the same topic.

 

Solved: Get a list of all relationships - SAS Support Communities

Solved: Is family construction possible? - SAS Support Communities

Reeza
Super User
SAS/OR BOM also offers a solution.

sas-innovate-wordmark-2025-midnight.png

Register Today!

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.


Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 1307 views
  • 4 likes
  • 3 in conversation