BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Ovaty
Fluorite | Level 6

Hi everybody!

Good afternoon.

 

Today I've a issue to solve in my job. I've a database that have 2 columns to ordinate them into a sequence.

The colum t0 (is the origin) and t1 (is the receptor).

I need to do a program to make a sequence linked t0 to t1.

 

Let's show you my example:

 

 t0t1
Line 1HG
Line 2BA
Line 3PO
Line 4CB
Line 5QP
Line 6IH
Line 7DC
Line 8RQ
Line 9JI

 

I need to create 3 sequences (in order origin-receptor) in this example.

Like this:

 

seq1ABCD
seq2GHIJ
seq3OPQR

 

 

Thaks a lot to everyone!!

Gustavo.

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User

OK. Here is . I change the code a little bit just for case of LOOP ,if you have it.

 

proc import datafile='c:\temp\test_base.txt' out=have(rename=(start=_start end=_end)) replace dbms=tab;
run;


proc sql;
create table x as
select distinct _start,_end
 from have
  where _start not in (select distinct _end from have);
quit;



 

data want(keep=path);
if _n_ eq 1 then do;
length path _path  $ 800 ;
if 0 then set have;
declare hash ha(hashexp:20,dataset:'have(where=(_start is not missing and _end is not missing))',multidata:'y');
ha.definekey('_start');
ha.definedata('_end');
ha.definedone();

declare hash pa(ordered:'y');
declare hiter hi_path('pa');
pa.definekey('n');
pa.definedata('n','path');
pa.definedone();

end;


set x;
count=1;n=1;_n=1;
path=catx('|',_start,_end);
*putlog 'WARNING:Found  ' _end;
   
pa.add();
do while(hi_path.next()=0);
 if n ne 1 then pa.remove(key:_n);_n=n;
 _path=path;   
 _start=scan(path,-1,'|');
 rc=ha.find();   if rc ne 0 then output;
 do while(rc=0);
  if not findw(path,strip(_end),'|') then do;
   if length(path)+length(_end)+1 gt lengthc(path) then do;
    putlog 'ERROR: The length of path and _path are set too short';
    stop;
   end;
   
   *putlog 'WARNING:Found  ' _end;
   count+1;n=count;
   path=catx('|',path,_end);
   pa.add(); 
   path=_path;
 end;  else output; /* <--Changed. It is a circle.*/
  rc=ha.find_next();
end;
end;
pa.clear();

run;

proc sql noprint;
select max(countw(path,'|')) into : n separated by ' '
 from want;
quit;

data final_want;
 set want;
 array col{*} $ 100 col1-col&n;
 do i=1 to countw(path,'|');
  col{i}=scan(path,i,'|');
 end;
 drop i;
run;

View solution in original post

7 REPLIES 7
Tom
Super User Tom
Super User

Looks like you are trying to traverse a graph.  Your output looks backwards.  You are moving from T1 to T0 .

 

Is it possible that your input data has any cycles?

Ovaty
Fluorite | Level 6
Hi Tom!

I'm not trying to traverse any graph.
The data is a trail, looks like a production sequence. To make a car, I've to do many steps and after all steps I've the production sequence.

Did you understood my explanation? Is it clear?
Tom
Super User Tom
Super User

The type of data you have is called a graph.  They are used in computer science and other fields.

The nodes of the graph are the values of your two variables (A, B, etc) and each observations represents an edge in the graph (the line between the nodes).  You appear to have directed lines that go from the node in T1 to the node in T0.

 

There are many examples of code on this site for finding connected components (or subgraphs).

If you have SAS/OR licensed there are procs that can help you.

Ksharp
Super User

data have(keep=t0 t1 rename=(t0=_end t1=_start));
infile cards expandtabs;
input x $ y	t0 $ t1 $;
cards;
Line 1	H	G
Line 2	B	A
Line 3	P	O
Line 4	C	B
Line 5	Q	P
Line 6	I	H
Line 7	D	C
Line 8	R	Q
Line 9	J	I
;
run;
proc sql;
create table x as
select distinct _start,_end
 from have
  where _start not in (select distinct _end from have);
quit;



 

data want(keep=path);
if _n_ eq 1 then do;
length path _path  $ 700 ;
if 0 then set have;
declare hash ha(hashexp:20,dataset:'have(where=(_start is not missing and _end is not missing))',multidata:'y');
ha.definekey('_start');
ha.definedata('_end');
ha.definedone();

declare hash pa(ordered:'y');
declare hiter hi_path('pa');
pa.definekey('n');
pa.definedata('n','path');
pa.definedone();

end;


set x;
count=1;n=1;_n=1;
path=catx('|',_start,_end);
*putlog 'WARNING:Found  ' _end;
   
pa.add();
do while(hi_path.next()=0);
 if n ne 1 then pa.remove(key:_n);_n=n;
 _path=path;   
 _start=scan(path,-1,'|');
 rc=ha.find();   if rc ne 0 then output;
 do while(rc=0);
  if not findw(path,strip(_end),'|') then do;
   if length(path)+length(_end)+1 gt lengthc(path) then do;
    putlog 'ERROR: The length of path and _path are set too short';
    stop;
   end;
   
   *putlog 'WARNING:Found  ' _end;
   count+1;n=count;
   path=catx('|',path,_end);
   pa.add(); 
   path=_path;
 end; 
else output; /* <--Changed. It is a circle.*/ rc=ha.find_next(); end; end; pa.clear(); run; proc print;run;
Ovaty
Fluorite | Level 6
Great Ksharp! Tks a lot.

One last question...
How can I put each values like A|B|C|D in each column?

Gustavo.
Ovaty
Fluorite | Level 6

Hi Ksharp, I've send a example file.

Ksharp
Super User

OK. Here is . I change the code a little bit just for case of LOOP ,if you have it.

 

proc import datafile='c:\temp\test_base.txt' out=have(rename=(start=_start end=_end)) replace dbms=tab;
run;


proc sql;
create table x as
select distinct _start,_end
 from have
  where _start not in (select distinct _end from have);
quit;



 

data want(keep=path);
if _n_ eq 1 then do;
length path _path  $ 800 ;
if 0 then set have;
declare hash ha(hashexp:20,dataset:'have(where=(_start is not missing and _end is not missing))',multidata:'y');
ha.definekey('_start');
ha.definedata('_end');
ha.definedone();

declare hash pa(ordered:'y');
declare hiter hi_path('pa');
pa.definekey('n');
pa.definedata('n','path');
pa.definedone();

end;


set x;
count=1;n=1;_n=1;
path=catx('|',_start,_end);
*putlog 'WARNING:Found  ' _end;
   
pa.add();
do while(hi_path.next()=0);
 if n ne 1 then pa.remove(key:_n);_n=n;
 _path=path;   
 _start=scan(path,-1,'|');
 rc=ha.find();   if rc ne 0 then output;
 do while(rc=0);
  if not findw(path,strip(_end),'|') then do;
   if length(path)+length(_end)+1 gt lengthc(path) then do;
    putlog 'ERROR: The length of path and _path are set too short';
    stop;
   end;
   
   *putlog 'WARNING:Found  ' _end;
   count+1;n=count;
   path=catx('|',path,_end);
   pa.add(); 
   path=_path;
 end;  else output; /* <--Changed. It is a circle.*/
  rc=ha.find_next();
end;
end;
pa.clear();

run;

proc sql noprint;
select max(countw(path,'|')) into : n separated by ' '
 from want;
quit;

data final_want;
 set want;
 array col{*} $ 100 col1-col&n;
 do i=1 to countw(path,'|');
  col{i}=scan(path,i,'|');
 end;
 drop i;
run;

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 7 replies
  • 1979 views
  • 3 likes
  • 3 in conversation