Hi everybody!
Good afternoon.
Today I've a issue to solve in my job. I've a database that have 2 columns to ordinate them into a sequence.
The colum t0 (is the origin) and t1 (is the receptor).
I need to do a program to make a sequence linked t0 to t1.
Let's show you my example:
t0 | t1 | |
Line 1 | H | G |
Line 2 | B | A |
Line 3 | P | O |
Line 4 | C | B |
Line 5 | Q | P |
Line 6 | I | H |
Line 7 | D | C |
Line 8 | R | Q |
Line 9 | J | I |
I need to create 3 sequences (in order origin-receptor) in this example.
Like this:
seq1 | A | B | C | D |
seq2 | G | H | I | J |
seq3 | O | P | Q | R |
Thaks a lot to everyone!!
Gustavo.
OK. Here is . I change the code a little bit just for case of LOOP ,if you have it.
proc import datafile='c:\temp\test_base.txt' out=have(rename=(start=_start end=_end)) replace dbms=tab;
run;
proc sql;
create table x as
select distinct _start,_end
from have
where _start not in (select distinct _end from have);
quit;
data want(keep=path);
if _n_ eq 1 then do;
length path _path $ 800 ;
if 0 then set have;
declare hash ha(hashexp:20,dataset:'have(where=(_start is not missing and _end is not missing))',multidata:'y');
ha.definekey('_start');
ha.definedata('_end');
ha.definedone();
declare hash pa(ordered:'y');
declare hiter hi_path('pa');
pa.definekey('n');
pa.definedata('n','path');
pa.definedone();
end;
set x;
count=1;n=1;_n=1;
path=catx('|',_start,_end);
*putlog 'WARNING:Found ' _end;
pa.add();
do while(hi_path.next()=0);
if n ne 1 then pa.remove(key:_n);_n=n;
_path=path;
_start=scan(path,-1,'|');
rc=ha.find(); if rc ne 0 then output;
do while(rc=0);
if not findw(path,strip(_end),'|') then do;
if length(path)+length(_end)+1 gt lengthc(path) then do;
putlog 'ERROR: The length of path and _path are set too short';
stop;
end;
*putlog 'WARNING:Found ' _end;
count+1;n=count;
path=catx('|',path,_end);
pa.add();
path=_path;
end; else output; /* <--Changed. It is a circle.*/
rc=ha.find_next();
end;
end;
pa.clear();
run;
proc sql noprint;
select max(countw(path,'|')) into : n separated by ' '
from want;
quit;
data final_want;
set want;
array col{*} $ 100 col1-col&n;
do i=1 to countw(path,'|');
col{i}=scan(path,i,'|');
end;
drop i;
run;
Looks like you are trying to traverse a graph. Your output looks backwards. You are moving from T1 to T0 .
Is it possible that your input data has any cycles?
The type of data you have is called a graph. They are used in computer science and other fields.
The nodes of the graph are the values of your two variables (A, B, etc) and each observations represents an edge in the graph (the line between the nodes). You appear to have directed lines that go from the node in T1 to the node in T0.
There are many examples of code on this site for finding connected components (or subgraphs).
If you have SAS/OR licensed there are procs that can help you.
data have(keep=t0 t1 rename=(t0=_end t1=_start));
infile cards expandtabs;
input x $ y t0 $ t1 $;
cards;
Line 1 H G
Line 2 B A
Line 3 P O
Line 4 C B
Line 5 Q P
Line 6 I H
Line 7 D C
Line 8 R Q
Line 9 J I
;
run;
proc sql;
create table x as
select distinct _start,_end
from have
where _start not in (select distinct _end from have);
quit;
data want(keep=path);
if _n_ eq 1 then do;
length path _path $ 700 ;
if 0 then set have;
declare hash ha(hashexp:20,dataset:'have(where=(_start is not missing and _end is not missing))',multidata:'y');
ha.definekey('_start');
ha.definedata('_end');
ha.definedone();
declare hash pa(ordered:'y');
declare hiter hi_path('pa');
pa.definekey('n');
pa.definedata('n','path');
pa.definedone();
end;
set x;
count=1;n=1;_n=1;
path=catx('|',_start,_end);
*putlog 'WARNING:Found ' _end;
pa.add();
do while(hi_path.next()=0);
if n ne 1 then pa.remove(key:_n);_n=n;
_path=path;
_start=scan(path,-1,'|');
rc=ha.find(); if rc ne 0 then output;
do while(rc=0);
if not findw(path,strip(_end),'|') then do;
if length(path)+length(_end)+1 gt lengthc(path) then do;
putlog 'ERROR: The length of path and _path are set too short';
stop;
end;
*putlog 'WARNING:Found ' _end;
count+1;n=count;
path=catx('|',path,_end);
pa.add();
path=_path;
end;
else output; /* <--Changed. It is a circle.*/
rc=ha.find_next();
end;
end;
pa.clear();
run;
proc print;run;
Hi Ksharp, I've send a example file.
OK. Here is . I change the code a little bit just for case of LOOP ,if you have it.
proc import datafile='c:\temp\test_base.txt' out=have(rename=(start=_start end=_end)) replace dbms=tab;
run;
proc sql;
create table x as
select distinct _start,_end
from have
where _start not in (select distinct _end from have);
quit;
data want(keep=path);
if _n_ eq 1 then do;
length path _path $ 800 ;
if 0 then set have;
declare hash ha(hashexp:20,dataset:'have(where=(_start is not missing and _end is not missing))',multidata:'y');
ha.definekey('_start');
ha.definedata('_end');
ha.definedone();
declare hash pa(ordered:'y');
declare hiter hi_path('pa');
pa.definekey('n');
pa.definedata('n','path');
pa.definedone();
end;
set x;
count=1;n=1;_n=1;
path=catx('|',_start,_end);
*putlog 'WARNING:Found ' _end;
pa.add();
do while(hi_path.next()=0);
if n ne 1 then pa.remove(key:_n);_n=n;
_path=path;
_start=scan(path,-1,'|');
rc=ha.find(); if rc ne 0 then output;
do while(rc=0);
if not findw(path,strip(_end),'|') then do;
if length(path)+length(_end)+1 gt lengthc(path) then do;
putlog 'ERROR: The length of path and _path are set too short';
stop;
end;
*putlog 'WARNING:Found ' _end;
count+1;n=count;
path=catx('|',path,_end);
pa.add();
path=_path;
end; else output; /* <--Changed. It is a circle.*/
rc=ha.find_next();
end;
end;
pa.clear();
run;
proc sql noprint;
select max(countw(path,'|')) into : n separated by ' '
from want;
quit;
data final_want;
set want;
array col{*} $ 100 col1-col&n;
do i=1 to countw(path,'|');
col{i}=scan(path,i,'|');
end;
drop i;
run;
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.