Hi Experts,
I am looking for a SAS program to solve the following issue (better to be hash table but any other program is good)
I have a SAS table below:
v1 and v2 are two variables and v1 is depending on v2 (e.g. v2 must run before v1 run)
v1 v2
JOB_A JOB_B
JOB_C JOB_D
JOB_E JOB_A
JOB_F JOB_C
JOB_G JOB_A
........
.......
for example, if JOB_E is depending on JOB_A to run and JOB_A is depending on JOB_B to run, then I want to show their relationship in one row as following table:
for example: if A-->B--->C--->D--->E then I want to create 5 variables and to show their relationship in one row.
The number of variables created in depending on number of JOB that has direct or indirect relations.
v1 v2 v3
JOB_A JOB_B
JOB_C JOB_D
JOB_E JOB_A JOB_B
JOB_F JOB_C JOB_D
JOB_G JOB_A JOB_B
...
...
how do I do that with hash table?
data have;
input (_end _start) ($) @@;
cards;
JOB_A JOB_B
JOB_C JOB_D
JOB_E JOB_A
JOB_F JOB_C
JOB_G JOB_A
;
run;
data ancestor;
if _n_ eq 1 then do;
if 0 then set have;
declare hash h(dataset:'have');
h.definekey('_end');
h.definedone();
end;
set have;
if h.check(key:_start) ne 0;
run;
data want;
if _n_ eq 1 then do;
length path _path $ 400 ;
if 0 then set have;
declare hash ha(hashexp:20,dataset:'have(where=(_start is not missing and _end is not missing))',multidata:'Y');
ha.definekey('_start');
ha.definedata('_end');
ha.definedone();
declare hash pa(ordered:'Y');
declare hiter hi_path('pa');
pa.definekey('count');
pa.definedata('path');
pa.definedone();
end;
set ancestor;
count=1;
path=catx(' ',_start,_end);
pa.add(); output;
do while(hi_path.next()=0);
_path=path;
_start=scan(path,-1,' ');
rc=ha.find();
do while(rc=0);
if not find(path,strip(_end)) then do;
count+1;
path=catx(' ',path,_end);
pa.add(); output;
path=_path;
end;
rc=ha.find_next();
end;
end;
pa.clear();
keep count path;
run;
HI Patrick,
I went through that post but I guess my problem is slightly more difficult as I need to show all their relationship in one single row (not only the first and last), so you need to create another variable for every dependency it has and you never how many JOBS are dependent on each other either directly or indirectly.
for example: if A-->B--->C--->D--->E then 5 variables should be created and to show their relationship in one row.
You explicitly asked for a hash-solution, but if you have access to the OR module, proc shortpath (or proc bom?) might save a lot of effort:
Data LinkSet;
Input From $ To $;
Datalines;
JOB_A JOB_B
JOB_C JOB_D
JOB_E JOB_A
JOB_F JOB_C
JOB_G JOB_A
;
Run;
Data NodeSet;
Set LinkSet;
Node=From; Source=1; Sink=0;
Output;
Node=To; Source=0; Sink=1;
Output;
Drop From To;
Run;
Proc Sort Data=NodeSet Nodupkey;
By Node;
Run;
Proc Optnet
Data_Nodes_Sub=NodeSet
Data_Links=LinkSet
Direction=Directed;
Shortpath Out_Paths=Result;
Run;
Proc Transpose Data=Result Out=Result_Trans (Rename=(Source=V0) Keep=Source V:) Prefix=V;
By source sink;
Var To;
ID order;
Run;
.. but hash is possible too ..
Data LinkSet;
Input V1 $ V2 $;
Datalines;
JOB_A JOB_B
JOB_C JOB_D
JOB_E JOB_A
JOB_F JOB_C
JOB_G JOB_A
JOB_Y JOB_Z
JOB_B JOB_C
JOB_D JOB_Y
;
Run;
%Let i=0;
%Let Left_overs=1;
%Macro HLoop;
Data Want;
Set Linkset;
Run;
%Do %While ((&Left_overs.^=0) & (&i.<=10));
%Let i=%Eval(&i.+1);
%Let Left_overs=0;
Data Want (Rename=(Link=V%Eval(&i.+1) To=V%Eval(&i.+2)));
If _N_ eq 1 Then Do;
Declare Hash H (Dataset:'LinkSet (Rename=(V1=Link V2=To))');
H.Definekey('Link');
H.Definedata('To');
H.Definedone();
If 0 Then Set LinkSet (Rename=(V1=Link V2=To));
End;
Set Want (Rename=(V%Eval(&i.+1)=Link));
rc=H.Find();
If rc & not Missing (Link) Then Call Symputx('Left_overs',rc);
If rc Then Call Missing (To);
Run;
%Put **&Left_overs.**&i.**;
%End;
Data Want;
Retain V1-V%Eval(&i.);
Set Want (Keep=V1-V%Eval(&i.));
Run;
%Mend;
%HLoop
data have;
input (_end _start) ($) @@;
cards;
JOB_A JOB_B
JOB_C JOB_D
JOB_E JOB_A
JOB_F JOB_C
JOB_G JOB_A
;
run;
data ancestor;
if _n_ eq 1 then do;
if 0 then set have;
declare hash h(dataset:'have');
h.definekey('_end');
h.definedone();
end;
set have;
if h.check(key:_start) ne 0;
run;
data want;
if _n_ eq 1 then do;
length path _path $ 400 ;
if 0 then set have;
declare hash ha(hashexp:20,dataset:'have(where=(_start is not missing and _end is not missing))',multidata:'Y');
ha.definekey('_start');
ha.definedata('_end');
ha.definedone();
declare hash pa(ordered:'Y');
declare hiter hi_path('pa');
pa.definekey('count');
pa.definedata('path');
pa.definedone();
end;
set ancestor;
count=1;
path=catx(' ',_start,_end);
pa.add(); output;
do while(hi_path.next()=0);
_path=path;
_start=scan(path,-1,' ');
rc=ha.find();
do while(rc=0);
if not find(path,strip(_end)) then do;
count+1;
path=catx(' ',path,_end);
pa.add(); output;
path=_path;
end;
rc=ha.find_next();
end;
end;
pa.clear();
keep count path;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.