DATA Step, Macro, Functions and more

Strongly connected component in oriented graph.Procedures(tools) on SAS?

Accepted Solution Solved
Reply
Regular Contributor
Posts: 161
Accepted Solution

Strongly connected component in oriented graph.Procedures(tools) on SAS?

Hi all,

Does SAS has existing procedures(tools) that perform searching in Oriented graph, that can has infinity loops?

More about this garphs:

http://en.wikipedia.org/wiki/Strongly_connected_component

Such algoritms have already realized on Java(C++), for example:

http://algowiki.net/wiki/index.php?title=Kosaraju%27s_algorithm

but I did'nt find any solutions ON SAS.

I've found macroes(that mostly used data step hash tables) that perform DFS(deep deep first search) in oriented graph without infinity loops, but it's not solwe the task.

Now I'm almost sure that I'll need implement self-made macro that will use Data step hash table etc. for achiving these goals, but I just wont to be sure that such algoritm isn't realised yet on SAS(becouse if it is realized , I'll spend a lot of time to develop bicycle , that is not very goodSmiley Happy ).

Thank you in adwancedSmiley Happy!


Accepted Solutions
Solution
‎05-02-2012 05:39 PM
Regular Contributor
Posts: 241

Re: Strongly connected component in oriented graph.Procedures(tools) on SAS?

@Yura2301 I would love to see your implementation of Tarjan's. In any case, here is a much easier way -- just calling the strongComp() function in RBGL package of R. I am running sas 9.3 and R 2.13.0 on windows 7.

  data one;
    input (from to) (:$1.) @@;
    weight = 1;
  cards;
  1 2  2 6  2 8  6 2  6 8  6 9
  3 3  4 4  5 7  8 2  8 6  9 6
  ;
  run;

  proc iml;
    run ExportDataSetToR("work.one", "edgelist");
    submit / r;
      library("graph")
      library("RBGL")

      g <- graphBAM(edgelist, edgemode="directed")

      # find the strong components and return a data.frame
      sc <- strongComp(g)
      cs <- c()
      ns <- c()
      for (i in 1:length(sc)) {
         len <- length(sc[])
         cs <- c(cs, rep(i, times=len))
         ns <- c(ns, sc[])
      }
      strongComp <- data.frame(node=ns, comp=cs)
    endsubmit;

    call ImportDataSetFromR("work.two", "strongComp");
  quit;

  /* check */
  proc print data=two noobs;
  run;
  /* on lst
  node    comp
    2       1
    6       1
    8       1
    9       1
    1       2
    3       3
    4       4
    7       5
    5       6
  */

View solution in original post


All Replies
Contributor
Posts: 30

Re: Strongly connected component in oriented graph.Procedures(tools) on SAS?

Hi Yura, I expect this is too late to be useful, but I recently wrote a SAS implementation of Kosaraju's algorithm using IML.

I can't post it publicly because it's a homework assignment for an ongoing course, but if you email me I'd be happy to send you a copy. My username is geoffrey dot brent and domain is abs gov au.


Regular Contributor
Posts: 161

Re: Strongly connected component in oriented graph.Procedures(tools) on SAS?

Posted in reply to GeoffreyBrent

Hi Geofferey,

First of all thanks for your answer.

I've already succeed in implementation Tarjan's algorithm(it's a little bit improved version of Kosaraju's algorithm.)

I realized the task not using IML tool with it benefits, I didn't know about the tool before, so I've implemented the task using standard SAS language elements + recursion.

But I'll with please will look to your version of implementation of the algorithm using IML.

For me interesting is performance side, lines of codes that takes your version etc.Actually SAS isn't very useful for such kind of tasks that needs  stack, lists, recursion etc., but realize such tasks is real but with worse performance then ,for example, C,java or javascript provide.

When I'll have free time i'll look to your version.

But for me it was real surprise that sas hasn't realized at list DFS(Depth-first search), becouse sas is the BI tool and such kind of tasks , as for me, is a common in many business process, it's not only theoretical-mathematical tasks, but it's my opinion base on my own work backlog...

Anyway-thanks!

Contributor
Posts: 30

Re: Strongly connected component in oriented graph.Procedures(tools) on SAS?

The course has finished now, so I'll attach the code to this post.

It's a bit scrappy since I was writing it for a one-use assignment, but it should be enough to give you the idea. I tested it on a large data set (~ 1 million edges) and it produced the correct answer, although it took several hours to run, much slower than the students who were working in Python etc. I'd be interested to hear how it compares with your implementation.

Attachment
Regular Contributor
Posts: 161

Re: Strongly connected component in oriented graph.Procedures(tools) on SAS?

Posted in reply to GeoffreyBrent

Hi Geofferey,

Thanks for your code, sorry for such late reply, I had a lot of tasks, as usual.

I compared my and your codes, my(using sas procuderes, macros etc.) has less lines, but yours(proc iml) works fuster, but in my case it was'nt very critical becouse source relation table isn't bigSmiley Happy

I also was surprized that you realized the task without recursion, but then I have found that proc iml doesn't support recursion at all so it's ok))).

Unfortunatelly I can't show you my code becouse of security rules in our company but shortly - My code is just set of macros plus recursion plus I made some kind of "Checkpoints" in order to provide possibility debug the macro a little bit.

Thanks!

Solution
‎05-02-2012 05:39 PM
Regular Contributor
Posts: 241

Re: Strongly connected component in oriented graph.Procedures(tools) on SAS?

@Yura2301 I would love to see your implementation of Tarjan's. In any case, here is a much easier way -- just calling the strongComp() function in RBGL package of R. I am running sas 9.3 and R 2.13.0 on windows 7.

  data one;
    input (from to) (:$1.) @@;
    weight = 1;
  cards;
  1 2  2 6  2 8  6 2  6 8  6 9
  3 3  4 4  5 7  8 2  8 6  9 6
  ;
  run;

  proc iml;
    run ExportDataSetToR("work.one", "edgelist");
    submit / r;
      library("graph")
      library("RBGL")

      g <- graphBAM(edgelist, edgemode="directed")

      # find the strong components and return a data.frame
      sc <- strongComp(g)
      cs <- c()
      ns <- c()
      for (i in 1:length(sc)) {
         len <- length(sc[])
         cs <- c(cs, rep(i, times=len))
         ns <- c(ns, sc[])
      }
      strongComp <- data.frame(node=ns, comp=cs)
    endsubmit;

    call ImportDataSetFromR("work.two", "strongComp");
  quit;

  /* check */
  proc print data=two noobs;
  run;
  /* on lst
  node    comp
    2       1
    6       1
    8       1
    9       1
    1       2
    3       3
    4       4
    7       5
    5       6
  */

Regular Contributor
Posts: 161

Re: Strongly connected component in oriented graph.Procedures(tools) on SAS?

Posted in reply to chang_y_chung_hotmail_com

Hi Chang,

Unfortunately I am working on SAS 9.1Smiley Happy

So your solutions in my case looks like unreachable for a long time)))

Anyway thanks!

Frequent Contributor
Posts: 99

Re: Strongly connected component in oriented graph.Procedures(tools) on SAS?

Hi Yura,

              I am trying to implement Tarjan's algorithm in SAS. Is their a way for you to share your code. Or provide more guidance so I can create the code.

  I appreciate all your help.

Regards,

Amit

Regular Contributor
Posts: 161

Re: Strongly connected component in oriented graph.Procedures(tools) on SAS?

Hi Amit,

I've sent request to my Bosses regarding permission of sharing the code, but suppose answer will be negative.

In any case I'll be able to provide you simple guidance how to realize the algorithm on SAS.

First of all, if you use SAS 9.3 - there are realized solution that very simplify the task,Chang demonstrate how to use it in post higher.

On earler version of SAS(9.1.3) it doesn't realized(my case).

Secondly Geofferey realized the same algorithm using proc IML, he attached his code in one of the post higher, long time ago I run it(just to compare performance side with my variant), and looks Geofferies code works correct, but again I didn't tested it carefully. Regarding performance - on small graphs execution time is approximately the same, in huge graphs Geofferies algorithm works a little bit fuster, but my solution use recursion instead of hardiocded max iteration amount in Geofferies variant(proc IML doesn't support recursion),so our realizations are very very different.

My implementation use SAS macro loops, data manipulation ,recursion etc.,so as I wrote higher -  if I'll not be able post the code I'll share guidance hot to realize it quick and simple.
Actually ,as I remember, It takes approximately 4 days(poor realization and tests). More time I spent on collecting source data and loads resutls data etc., but it's another question.

Thanks!

Regular Contributor
Posts: 161

Re: Strongly connected component in oriented graph.Procedures(tools) on SAS?

Hi again Amit,

Unfortunately I can't share the code but I'll shortly describe you main stages of implementation the macro on SAS.

So first of all algorithm is already realized on many programing languages, I used realization on Javascript as template:

https://gist.github.com/1440602

You can easy integrate this realization into some simple HTML page and use it for checking /testing your realization on SAS.

Regarding SAS realization - I just substitute standard operations on Stack(push, pop etc.) and javascript array functions (contains) by SAS macro with needed parameters and output as sas data set(Stack I replaced by sas table with two values - pos and value, using them easy make push pop etc.).

Main challenge was realize recursion - the problem is that SAS recursion isn't standard recursion as in java script, SAS just executes same code but previously set macrovars isn't cleans when new recursive execution starts etc. So I often use macro names like &&i&vertexName etc...

I also often used goto statement,I know it bad practice but as I remember in some steps it was almost impossible realize needed functionality without it.

Also , to simplify testing before starts algorithm I renamed vertex(nodes) in graph by simple name as 1,2,3 etc., as it is in javascript realization.

Testing also takes also a lot efforts - because of recursion using log isn't very helpful, even small graph generates huge log(if you'll use mprint mlogic symbolgen options), so I used html output in needed checkpoints...

Results of macro execution I compared with result that output javascript code that I described higher, I also changed a little bit this javascript code – it output not only targrt results but also some checkpoints info(stacks contents after each recursions execution etc.)

As I wrote on my previous post  - pure implementation takes not so much time, but testing and integration with business requirement took a lot.

If you'll have any question regarding concrete steps –write and I’ll try to answer…

Thanks!

Frequent Contributor
Posts: 99

Re: Strongly connected component in oriented graph.Procedures(tools) on SAS?

Hi Yura,

              Thanks for the guidance. Will try to replicate the java implementation.

Amit

Super User
Posts: 10,035

Re: Strongly connected component in oriented graph.Procedures(tools) on SAS?

Maybe I can do it ,or maybe not .

Can you post some sample data and what result you want.

Trusted Advisor
Posts: 1,301

Re: Strongly connected component in oriented graph.Procedures(tools) on SAS?

You should be able to get the information from PROC OPTNET using the concomp statement.  You will need to have SAS/OR installed as well as SAS network algortihms for this to work.  Concomp works with two types of algorithms, dfs and union-find.  Tarjan's algorithm is a dfs style search algorithm.

http://support.sas.com/documentation/cdl/en/ornoaug/65289/PDF/default/ornoaug.pdf

data LinkSetIn;

input from $ to $ @@;

datalines;

A B A C B C C H D E D F D G F E G I K L

;

proc optnet

data_links = LinkSetIn

out_nodes = NodeSetOut;

concomp;

run;

node concomp

A 1

B 1

C 1

H 1

D 2

E 2

F 2

G 2

I 2

K 3

L 3

Frequent Contributor
Posts: 99

Re: Strongly connected component in oriented graph.Procedures(tools) on SAS?

Hi ,

      I appreciate your advice and guidance. But  I dont have SAS /OR hence I cant implement your solution.

Super User
Posts: 10,035

Re: Strongly connected component in oriented graph.Procedures(tools) on SAS?

I think I can do it .I used Broad First Search algorithm

data have;
    input (_start _end) (:$1.) @@;
  cards;
  1 2  2 6  2 8  6 2  6 8  6 9
  3 3  4 4  5 7  8 2  8 6  9 6
  ;
  run;

data want(keep=path);
if _n_ eq 1 then do;
if 0 then set have;
length path _path $ 100;
declare hash ha(hashexp:20,dataset:'have',multidata:'Y');
ha.definekey('_start');
ha.definedata('_start','_end');
ha.definedone();

declare hash pa(ordered:'Y');
declare hiter hi_path('pa');
pa.definekey('count');
pa.definedata('path');
pa.definedone();
end;

set have;


count=1;
path=catx(' ',_start,_end);
pa.add();
do while(hi_path.next()=0);
_path=path;
_start=scan(path,-1,' '); 
rc=ha.find();
 do while(rc=0);
  if not find(path,strip(_end)) then do;
                                      count+1;
                                      path=catx(' ',path,_end);
                                      pa.add(); 
                                      path=_path;
                                        end;
     else  output;
  rc=ha.find_next();
 end;
end;

pa.clear();
run;
proc sort data=want nodupkey; by path;run;
data want;
 set want;
 array a{100} $ a1-a100;
 do i=1 to 100;
  a{i}=scan(path,i);
 end;
 call sortc(of a{*});
 want=catx(' ',of a{*});
 keep path want ;
run;
proc sort data=want ; by want;run;
proc sql;
create table x as
 select want from want group by want having count(*)=countw(want);
quit; 
data xx(keep=want );
if _n_ eq 1 then do;
if 0 then set x(rename=(want=_want));
declare hash ha(hashexp:20,dataset:'x(rename=(want=_want))');
declare hiter hi('ha');
ha.definekey('_want');
ha.definedata('_want');
ha.definedone();
end;
set x;
found=1;
do while(hi.next()=0);
do i=1 to countw(_want);
 if not find(want,strip(scan(_want,i))) then found=0;
end;
end;
if found then output;
run;
 



proc sort data=have out=temp nodupkey;by _start;run;
data final_want(keep=want group);
 set temp end=last;
 do i=1 to nobs;
  set xx nobs=nobs point=i;
  if not find(want,strip(_start)) then do;group+1;want=_start;output;end; 
 end;
 if last then do;
                 do j=1 to nobs;
                 set xx nobs=nobs point=j;
                 group+1;output;  stop;
                 end;
               end;
run;

Ksharp

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 18 replies
  • 1440 views
  • 3 likes
  • 6 in conversation