how to find the shortest path between two nodes without using proc optgraph?

Reply
Regular Contributor
Posts: 152

how to find the shortest path between two nodes without using proc optgraph?

[ Edited ]

Hi Experts,

 

If I have dataset like below

 start end

A      B

B      C

C      D

D      E

D      B

B      E

 

 

 

If I want to know the shortest path of any start and end node like A to E, how do I achieve that? (The sample data above consist of infinite loop from A--->B--->C--->D--->B---->E   (BCDBCDBCD loop)

the output table suppose to be something like:

start          end           

A                B

B                E

 

 


Thanks

Super User
Posts: 19,057

Re: how to find the shortest path between two nodes without using proc optgraph?

What does your data look like? Please post sample data and expected output.

 

I'm fairly certain this question has also been asked and answered on here in the past year so some searching may get you answers. 

Regular Contributor
Posts: 152

Re: how to find the shortest path between two nodes without using proc optgraph?

Hi expert, I have modified the question. thanks

Respected Advisor
Posts: 4,811

Re: how to find the start node and end node between two nodes without using proc optgraph?

What is the underlying problem?

 

Why should

 

start end
A B
B C
C A

 

not give

 

start end count
B A 2
A B 1

 

What output would you expect for
start end
A B
B C
B D
C F
D E
E F

PG
Regular Contributor
Posts: 152

Re: how to find the start node and end node between two nodes without using proc optgraph?

Hi PG,

 

The output would you expect for
start end
A B
B C
B D
C F
D E
E F

 

is

 

start end   count of links
A F                (3 because A---B---C---F)

A F                 (4  A--B---D--E---F)

Super User
Posts: 19,057

Re: how to find the start node and end node between two nodes without using proc optgraph?

Can we assume data is ordered as in example? That's a big issue. 

Super Contributor
Posts: 339

Re: how to find the start node and end node between two nodes without using proc optgraph?

I would use Proc Optnet instead:

 


Data A;
  Input start $ end $;
  Datalines;
A B
B C
C D
A F
F G
M N
N M
;

* Find loops;
Proc Optnet Data_Links=A (Rename=(start=from end=to)) Graph_Direction=directed;
  Cycle MinLength=2 MaxLength=5 Out=A_Cycles;
Run;

* Find distances;
Proc OptNet Data_Links=A (Rename=(start=from end=to)) Graph_Direction=directed;
  Shortpath Out_Paths=A_ShortPath;
Run;

Proc SQL;
  Create Table A_ShortPath_Short As Select source,sink,Max(order) as count From A_ShortPath Group By source,sink;
Quit;
Super User
Posts: 9,865

Re: how to find the start node and end node between two nodes without using proc optgraph?

OK. How about this one :





data have;
input _start $ _end $;
cards;
A      B
B      C
C      D
A      F
F      G
;
run;
 
proc sql;
create table ancient as
 select _start,_end from have
  where _start not in (select distinct _end from have);
run;
data want(keep=path);
if _n_ eq 1 then do;
length path _path  $ 400 ;
if 0 then set have;
declare hash ha(hashexp:20,dataset:'have(where=(_start is not missing and _end is not missing))',multidata:'y');
ha.definekey('_start');
ha.definedata('_end');
ha.definedone();

declare hash pa(ordered:'y');
declare hiter hi_path('pa');
pa.definekey('n');
pa.definedata('n','path');
pa.definedone();

end;


set ancient;
count=1;n=1;_n=1;
path=catx('|',_start,_end);
   
pa.add();
do while(hi_path.next()=0);
 if n ne 1 then pa.remove(key:_n);_n=n;
 _path=path;   
 _start=scan(path,-1,'|');
 rc=ha.find();if rc ne 0 then output;
 do while(rc=0);
  if not findw(path,strip(_end),'|') then do;
   if length(path)+length(_end)+1 gt lengthc(path) then do;
    putlog 'ERROR: The length of path and _path are set too short';
    stop;
   end;
   
   count+1;n=count;
   path=catx('|',path,_end);
   pa.add();
   path=_path;
 end;
  rc=ha.find_next();
end;
end;
pa.clear();
run;
data want;
 set want;
 start=scan(path,1,'|');
 end=scan(path,-1,'|');
 count=countw(path,'|')-1;
run;


Regular Contributor
Posts: 152

Re: how to find the start node and end node between two nodes without using proc optgraph?

Thanks for your reply, I have modified the question a bit. Is there any good to achieve that?

Super User
Posts: 9,865

Re: how to find the shortest path between two nodes without using proc optgraph?


SURE.






data have;
input _start $ _end $;
cards;
A      B
B      C
A      D
D      C
A      F
F      G
G      J
J      C
;
run;
 
proc sql;
create table ancient as
 select _start,_end from have
  where _start not in (select distinct _end from have);
quit;
data want(keep=path);
if _n_ eq 1 then do;
length path _path  $ 800 ;
if 0 then set have;
declare hash ha(hashexp:20,dataset:'have(where=(_start is not missing and _end is not missing))',multidata:'y');
ha.definekey('_start');
ha.definedata('_end');
ha.definedone();

declare hash pa(ordered:'y');
declare hiter hi_path('pa');
pa.definekey('n');
pa.definedata('n','path');
pa.definedone();

declare hash want(multidata:'y');
want.definekey('n_word');
want.definedata('path');
want.definedone();
end;


set ancient end=last;
retain min 999999;
count=1;n=1;_n=1;
path=catx('|',_start,_end);
   
pa.add();
do while(hi_path.next()=0);
 if n ne 1 then pa.remove(key:_n);_n=n;
 _path=path;   
 _start=scan(path,-1,'|');
 rc=ha.find();
 if rc ne 0 then do;
  found=1;
  n_word=countw(path,'|');want.add();
  if n_word lt min then min=n_word;
 end;
 
if not found then do;
 do while(rc=0);
  if not findw(path,strip(_end),'|') then do;
   if length(path)+length(_end)+1 gt lengthc(path) then do;
    putlog 'ERROR: The length of path and _path are set too short';
    stop;
   end;
   
   count+1;n=count;
   path=catx('|',path,_end);
   pa.add();
   path=_path;
  end;
  rc=ha.find_next();
 end;
end;

end;
pa.clear();

if last then do;
 n_word=min;
 rc=want.find();
 do while(rc=0);
  output;
  rc=want.find_next();
 end;
end;
run;


Regular Contributor
Posts: 152

Re: how to find the shortest path between two nodes without using proc optgraph?

Hi Ksharp,

tested you program on much larger datasets however it does not work as expected.

the following program produce 1 milliton records dummary data

 

%let NObs = 1000000;

data network1(keep=from i);

call streaminit(123);

Min = 1;

Max = 10000;

do i = 1 to &NObs;

u = rand("Uniform"); /* decimal values in (0,1) */

from = floor( (1+Max)*u ); /* integer values in 0..Max */

output;

end;

run;

data network2(keep=to i);

call streaminit(811);

Min = 1;

Max = 10000;

do i = 1 to &NObs;

u = rand("Uniform"); /* decimal values in (0,1) */

to = floor( (1+Max)*u ); /* integer values in Min..Max */

output;

end;

run;

data network;

merge network1 network2;

by i;

_start=compress(put(from,8.)); _end=compress(put(to,8.));

drop i from to;

run;

Super User
Posts: 9,865

Re: how to find the shortest path between two nodes without using proc optgraph?

When you say that does not work, Can you post where is wrong? 
Point me which the shortest path is not right ?

Let's Take a little small table as an example 1000.
Which shortest path is not right ?

%let NObs = 1000;
data network1(keep=from i);
call streaminit(123);
Min = 1;
Max = 10000;
do i = 1 to &NObs;
u = rand("Uniform"); /* decimal values in (0,1) */
from = floor( (1+Max)*u ); /* integer values in 0..Max */
output;
end;
run;
data network2(keep=to i);
call streaminit(811);
Min = 1;
Max = 10000;
do i = 1 to &NObs;
u = rand("Uniform"); /* decimal values in (0,1) */
to = floor( (1+Max)*u ); /* integer values in Min..Max */
output;
end;
run;
data have;
merge network1 network2;
by i;
_start=compress(put(from,8.)); _end=compress(put(to,8.));
drop i from to;
run;

Super User
Posts: 9,865

Re: how to find the shortest path between two nodes without using proc optgraph?

If you check LOG you will see:

NOTE: There were 0 observations read from the data set WORK.ANCIENT.

That means you don't have any START value , you have loop .
Therefore, you can't get any paths .

Regular Contributor
Posts: 152

Re: how to find the shortest path between two nodes without using proc optgraph?

my mistake, yes it works if the data not in a infinite loop. just wondering is there any way to identify the shortest path for particular starting and end node as input data?
Regular Contributor
Posts: 152

Re: how to find the shortest path between two nodes without using proc optgraph?

I modified my question a bit further, If I want to check the shortest path between node A to B or mutiple node pairs, Is it possible to do in hash table?

Ask a Question
Discussion stats
  • 24 replies
  • 795 views
  • 4 likes
  • 8 in conversation