BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
ErikLund_Jensen
Rhodochrosite | Level 12

Hi Community Members.

I have a network analysis problem.

The background is as follows: We have a lot of complicated DI Studio jobs creating multiple output tables. They are a mess of crossing arrows and very difficult to maintain, because the full canvas cannot fit into one screen without scrolling both vertical and horisontal. So we want to split them into smaller jobs if they contain independent structures. A structure is the set of transformations (steps) that contribute to one output table, and all structures that shares at least one transformation will go into one new job.

I have extracted metadata and created the "have" table as input to the analysis. (theoretical example, the real data set has more that 200.000 obs). So far there are no problems.

 

data have;
input JobID$ Structure TransID$;
datalines;
A 1 T11
A 1 T6
A 1 T1
A 2 T16
A 2 T12
A 2 T8
A 2 T7 
A 2 T2
A 2 T1
A 3 T13
A 3 T9
A 3 T4
A 3 T3
A 4 T17
A 4 T14
A 4 T9
A 4 T4
A 4 T3
A 5 T18
A 5 T15
A 5 T10
A 5 T5
;

 

The next step is to identify independent structures in this list. To simplify the problem, the transformations can be left out, because they can be merged back after analysis:

 

proc sql;
  create table w2 as
    select distinct a.JobID, a.Structure, b.Structure as ShareStructure
    from have as a
    left join have as b
    on a.JobID = b.JobID and a.TransID = b.TransID;
quit;

matrixanalysis.gif

 

 

 

 

 

 

 

The structure 5 occurs only once, so it could be an independent job, but how to group structures 1-2 and 3-4 as two independent jobs? - there can be up to 50 structures in a job. I am stuck there, though it seems a trivial problem, and I hope somebody can help with a solution that doesn't require products not found in Foundation.

Tre result should be this:

 

data want;
input JobID$ Structure NewJobID;
datalines;
A 1 1
A 2 1
A 3 2
A 4 2
A 5 3
;

Thanks!

 

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User
data temp;
input JobID$ Structure TransID$;
datalines;
A 1 T11
A 1 T6
A 1 T1
A 2 T16
A 2 T12
A 2 T8
A 2 T7 
A 2 T2
A 2 T1
A 3 T13
A 3 T9
A 3 T4
A 3 T3
A 4 T17
A 4 T14
A 4 T9
A 4 T4
A 4 T3
A 5 T18
A 5 T15
A 5 T10
A 5 T5
;
proc sql;
  create table w2 as
    select distinct a.JobID, a.Structure, b.Structure as ShareStructure
    from temp as a
    left join temp as b
    on a.JobID = b.JobID and a.TransID = b.TransID;
quit;
















data have;
 set w2(rename=(Structure=from ShareStructure=to));
 run;

data full;
  set have end=last;
  if _n_ eq 1 then do;
   declare hash h();
    h.definekey('node');
     h.definedata('node');
     h.definedone();
  end;
  output;
  node=from; h.replace();
  from=to; to=node;
  output;
  node=from; h.replace();
  if last then h.output(dataset:'node');
  drop node;
run;


data want(keep=node household);
declare hash ha(ordered:'a');
declare hiter hi('ha');
ha.definekey('count');
ha.definedata('last');
ha.definedone();
declare hash _ha(hashexp: 20);
_ha.definekey('key');
_ha.definedone();

if 0 then set full;
declare hash from_to(dataset:'full(where=(from is not missing and to is not missing))',hashexp:20,multidata:'y');
 from_to.definekey('from');
 from_to.definedata('to');
 from_to.definedone();

if 0 then set node;
declare hash no(dataset:'node');
declare hiter hi_no('no');
 no.definekey('node');
 no.definedata('node');
 no.definedone();
 

do while(hi_no.next()=0);
 household+1; output;
 count=1;
 key=node;_ha.add();
 last=node;ha.add();
 rc=hi.first();
 do while(rc=0);
   from=last;rx=from_to.find();
   do while(rx=0);
     key=to;ry=_ha.check();
      if ry ne 0 then do;
       node=to;output;rr=no.remove(key:node);
       key=to;_ha.add();
       count+1;
       last=to;ha.add();
      end;
      rx=from_to.find_next();
   end;
   rc=hi.next();
end;
ha.clear();_ha.clear();
end;
stop;
run;

View solution in original post

7 REPLIES 7
PeterClemmensen
Tourmaline | Level 20

Just to be sure, you don't have an OR license right?

ErikLund_Jensen
Rhodochrosite | Level 12

@PeterClemmensen 

 

Right. Only Base, Graph and Stat + some acess modules.

 

I think it can be done with a two-dimensional array to hold Structures in one dimension and all shared structures in the other dimension within a job and a lot of array transversing, but that is complicated to build. So I hope somebody smarter than me can suggest a simple and elegant solution.  

Ksharp
Super User
data temp;
input JobID$ Structure TransID$;
datalines;
A 1 T11
A 1 T6
A 1 T1
A 2 T16
A 2 T12
A 2 T8
A 2 T7 
A 2 T2
A 2 T1
A 3 T13
A 3 T9
A 3 T4
A 3 T3
A 4 T17
A 4 T14
A 4 T9
A 4 T4
A 4 T3
A 5 T18
A 5 T15
A 5 T10
A 5 T5
;
proc sql;
  create table w2 as
    select distinct a.JobID, a.Structure, b.Structure as ShareStructure
    from temp as a
    left join temp as b
    on a.JobID = b.JobID and a.TransID = b.TransID;
quit;
















data have;
 set w2(rename=(Structure=from ShareStructure=to));
 run;

data full;
  set have end=last;
  if _n_ eq 1 then do;
   declare hash h();
    h.definekey('node');
     h.definedata('node');
     h.definedone();
  end;
  output;
  node=from; h.replace();
  from=to; to=node;
  output;
  node=from; h.replace();
  if last then h.output(dataset:'node');
  drop node;
run;


data want(keep=node household);
declare hash ha(ordered:'a');
declare hiter hi('ha');
ha.definekey('count');
ha.definedata('last');
ha.definedone();
declare hash _ha(hashexp: 20);
_ha.definekey('key');
_ha.definedone();

if 0 then set full;
declare hash from_to(dataset:'full(where=(from is not missing and to is not missing))',hashexp:20,multidata:'y');
 from_to.definekey('from');
 from_to.definedata('to');
 from_to.definedone();

if 0 then set node;
declare hash no(dataset:'node');
declare hiter hi_no('no');
 no.definekey('node');
 no.definedata('node');
 no.definedone();
 

do while(hi_no.next()=0);
 household+1; output;
 count=1;
 key=node;_ha.add();
 last=node;ha.add();
 rc=hi.first();
 do while(rc=0);
   from=last;rx=from_to.find();
   do while(rx=0);
     key=to;ry=_ha.check();
      if ry ne 0 then do;
       node=to;output;rr=no.remove(key:node);
       key=to;_ha.add();
       count+1;
       last=to;ha.add();
      end;
      rx=from_to.find_next();
   end;
   rc=hi.next();
end;
ha.clear();_ha.clear();
end;
stop;
run;
ErikLund_Jensen
Rhodochrosite | Level 12

Hi @Ksharp 

 

Thank you very much.

 

It works perfect for one Job. Due to my poor understanding of the hash object i couldn't figure out how to obtain a separate analysis per JobID and get the JobID through the code to output. Bot I found a workaround by using concatenated keys, so my problem is solved.

Ksharp
Super User

"Bot I found a workaround by using concatenated keys, so my problem is solved."

Yes. you are right , concate JobID and Structure , as your wish :

 

data have;
 set w2;
 length from to $ 80;
 from=catx('|',JobID,Structure);
 to=catx('|',JobID,ShareStructure);
keep from to;
 run;

Ksharp_0-1662890877396.png

 

ErikLund_Jensen
Rhodochrosite | Level 12

@Ksharp 

Thanks again!

Up and running now with a weekly rapport. it seems the time has come to learn the use of hash objects , as I have a bad feeling about using code I don't understand. I also never got round to DS2 or Proc Fedsql, Maybe they too were worth looking into...

Ksharp
Super User
" as I have a bad feeling about using code I don't understand. "
Me too. I would test it far more enough before using it.

" I also never got round to DS2 or Proc Fedsql, Maybe they too were worth looking into..."
Me too. I never learned DS2 or Fedsql either, and don't know how to judge them .But if you are in SAS Viya environment, maybe you need them.

BTW, Hash Table is a must-have tool for sas programmer .

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 565 views
  • 1 like
  • 3 in conversation