turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- SAS Programming
- /
- General Programming
- /
- how to find the shortest path between two nodes w...

Topic Options

- RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

09-09-2016 12:22 AM - edited 09-12-2016 07:49 PM

Hi Experts,

If I have dataset like below

start end

A B

B C

C D

D E

D B

B E

If I want to know the shortest path of any start and end node like A to E, how do I achieve that? (The sample data above consist of infinite loop from A--->B--->C--->D--->B---->E (BCDBCDBCD loop)

the output table suppose to be something like:

start end

A B

B E

Thanks

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to gyambqt

09-09-2016 12:28 AM

What does your data look like? Please post sample data and expected output.

I'm fairly certain this question has also been asked and answered on here in the past year so some searching may get you answers.

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Reeza

09-09-2016 12:39 AM

Hi expert, I have modified the question. thanks

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to gyambqt

09-09-2016 12:58 AM

What is the underlying problem?

Why should

start end

A B

B C

C A

not give

start end count

B A 2

A B 1

What output would you expect for

start end

A B

B C

B D

C F

D E

E F

PG

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to PGStats

09-09-2016 01:09 AM

Hi PG,

The output would you expect for

start end

A B

B C

B D

C F

D E

E F

is

start end count of links

A F (3 because A---B---C---F)

A F (4 A--B---D--E---F)

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to gyambqt

09-09-2016 01:28 AM

Can we assume data is ordered as in example? That's a big issue.

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to gyambqt

09-09-2016 03:01 AM

I would use Proc Optnet instead:

```
Data A;
Input start $ end $;
Datalines;
A B
B C
C D
A F
F G
M N
N M
;
* Find loops;
Proc Optnet Data_Links=A (Rename=(start=from end=to)) Graph_Direction=directed;
Cycle MinLength=2 MaxLength=5 Out=A_Cycles;
Run;
* Find distances;
Proc OptNet Data_Links=A (Rename=(start=from end=to)) Graph_Direction=directed;
Shortpath Out_Paths=A_ShortPath;
Run;
Proc SQL;
Create Table A_ShortPath_Short As Select source,sink,Max(order) as count From A_ShortPath Group By source,sink;
Quit;
```

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to gyambqt

09-09-2016 05:27 AM

OK. How about this one : data have; input _start $ _end $; cards; A B B C C D A F F G ; run; proc sql; create table ancient as select _start,_end from have where _start not in (select distinct _end from have); run; data want(keep=path); if _n_ eq 1 then do; length path _path $ 400 ; if 0 then set have; declare hash ha(hashexp:20,dataset:'have(where=(_start is not missing and _end is not missing))',multidata:'y'); ha.definekey('_start'); ha.definedata('_end'); ha.definedone(); declare hash pa(ordered:'y'); declare hiter hi_path('pa'); pa.definekey('n'); pa.definedata('n','path'); pa.definedone(); end; set ancient; count=1;n=1;_n=1; path=catx('|',_start,_end); pa.add(); do while(hi_path.next()=0); if n ne 1 then pa.remove(key:_n);_n=n; _path=path; _start=scan(path,-1,'|'); rc=ha.find();if rc ne 0 then output; do while(rc=0); if not findw(path,strip(_end),'|') then do; if length(path)+length(_end)+1 gt lengthc(path) then do; putlog 'ERROR: The length of path and _path are set too short'; stop; end; count+1;n=count; path=catx('|',path,_end); pa.add(); path=_path; end; rc=ha.find_next(); end; end; pa.clear(); run; data want; set want; start=scan(path,1,'|'); end=scan(path,-1,'|'); count=countw(path,'|')-1; run;

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Ksharp

09-11-2016 11:40 PM

Thanks for your reply, I have modified the question a bit. Is there any good to achieve that?

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to gyambqt

09-12-2016 02:06 AM

SURE. data have; input _start $ _end $; cards; A B B C A D D C A F F G G J J C ; run; proc sql; create table ancient as select _start,_end from have where _start not in (select distinct _end from have); quit; data want(keep=path); if _n_ eq 1 then do; length path _path $ 800 ; if 0 then set have; declare hash ha(hashexp:20,dataset:'have(where=(_start is not missing and _end is not missing))',multidata:'y'); ha.definekey('_start'); ha.definedata('_end'); ha.definedone(); declare hash pa(ordered:'y'); declare hiter hi_path('pa'); pa.definekey('n'); pa.definedata('n','path'); pa.definedone(); declare hash want(multidata:'y'); want.definekey('n_word'); want.definedata('path'); want.definedone(); end; set ancient end=last; retain min 999999; count=1;n=1;_n=1; path=catx('|',_start,_end); pa.add(); do while(hi_path.next()=0); if n ne 1 then pa.remove(key:_n);_n=n; _path=path; _start=scan(path,-1,'|'); rc=ha.find(); if rc ne 0 then do; found=1; n_word=countw(path,'|');want.add(); if n_word lt min then min=n_word; end; if not found then do; do while(rc=0); if not findw(path,strip(_end),'|') then do; if length(path)+length(_end)+1 gt lengthc(path) then do; putlog 'ERROR: The length of path and _path are set too short'; stop; end; count+1;n=count; path=catx('|',path,_end); pa.add(); path=_path; end; rc=ha.find_next(); end; end; end; pa.clear(); if last then do; n_word=min; rc=want.find(); do while(rc=0); output; rc=want.find_next(); end; end; run;

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Ksharp

09-12-2016 02:42 AM

Hi Ksharp,

tested you program on much larger datasets however it does not work as expected.

the following program produce 1 milliton records dummary data

%let NObs = 1000000;

**data** network1(keep=from i);

call streaminit(**123**);

Min = **1**;

Max = **10000**;

do i = **1** to &NObs;

u = rand("Uniform"); /* decimal values in (0,1) */

from = floor( (**1**+Max)*u ); /* integer values in 0..Max */

output;

end;

**run**;

**data** network2(keep=to i);

call streaminit(**811**);

Min = **1**;

Max = **10000**;

do i = **1** to &NObs;

u = rand("Uniform"); /* decimal values in (0,1) */

to = floor( (**1**+Max)*u ); /* integer values in Min..Max */

output;

end;

**run**;

**data** network;

merge network1 network2;

by i;

_start=compress(put(from,**8.**)); _end=compress(put(to,**8.**));

drop i from to;

**run**;

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to gyambqt

09-12-2016 03:02 AM

When you say that does not work, Can you post where is wrong? Point me which the shortest path is not right ? Let's Take a little small table as an example 1000. Which shortest path is not right ? %let NObs = 1000; data network1(keep=from i); call streaminit(123); Min = 1; Max = 10000; do i = 1 to &NObs; u = rand("Uniform"); /* decimal values in (0,1) */ from = floor( (1+Max)*u ); /* integer values in 0..Max */ output; end; run; data network2(keep=to i); call streaminit(811); Min = 1; Max = 10000; do i = 1 to &NObs; u = rand("Uniform"); /* decimal values in (0,1) */ to = floor( (1+Max)*u ); /* integer values in Min..Max */ output; end; run; data have; merge network1 network2; by i; _start=compress(put(from,8.)); _end=compress(put(to,8.)); drop i from to; run;

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to gyambqt

09-12-2016 03:07 AM

If you check LOG you will see: NOTE: There were 0 observations read from the data set WORK.ANCIENT. That means you don't have any START value , you have loop . Therefore, you can't get any paths .

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Ksharp

09-12-2016 07:28 PM

my mistake, yes it works if the data not in a infinite loop. just wondering is there any way to identify the shortest path for particular starting and end node as input data?

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Ksharp

09-12-2016 07:50 PM

I modified my question a bit further, If I want to check the shortest path between node A to B or mutiple node pairs, Is it possible to do in hash table?