🔒 This topic is solved and locked.
Swapnil_21
Obsidian | Level 7

Hi,

I have a table with Flow Name, Job Name, and Dependency as columns. I need to find out whether a job name also appears in any of the Dependency values.

Below is a sample:
Flow Name Job Name Dependency
CE_B3_STACK 90-IDM-STK_FMTS-00001  
CE_B3_STACK 91-IDM-STK_FW-00001  
CE_B3_STACK 92-IDM-STK_PRTY-00001 91-IDM-STK_FW-00001
CE_B3_STACK 93-IDM-STK_PRXY-00001 91-IDM-STK_FW-00001|92-IDM-STK_PRTY-00001
CE_B3_STACK 94-IDM-STK_B3_ASSET_CLASS-00001 90-IDM-STK_FMTS-00001|92-IDM-STK_PRTY-00001|93-IDM-STK_PRXY-00001
CE_BSE_CUST_SGB 20-BSE-CUST_SGB-00001  
CE_BSE_CUST_SGB 20-BSE-CUST_SGB-00002  
CE_BSE_CUST_SGB 20-BSE-CUST_SGB-00003  
CE_BSE_CUST_SGB 20-BSE-CUST_SGB-00004  
CE_BSE_CUST_SGB 20-BSE-CUST_SGB-00005  
CE_BSE_CUST_SGB 20-BSE-CUST_SGB-00006  
CE_BSE_CUST_SGB 20-BSE-CUST_SGB-00007  
CE_BSE_CUST_SGB 20-BSE-CUST_SGB-00008  
CE_BSE_CUST_SGB 20-BSE-CUST_SGB-00009  
CE_BSE_CUST_SGB 20-BSE-CUST_SGB-00010  
CE_BSE_CUST_SGB 21-BSE-CUST_SGB-00001 20-BSE-CUST_SGB-00001|20-BSE-CUST_SGB-00002|20-BSE-CUST_SGB-00003|20-BSE-CUST_SGB-00004|20-BSE-CUST_SGB-00005|20-BSE-CUST_SGB-00006|20-BSE-CUST_SGB-00007|20-BSE-CUST_SGB-00008
Flow CE_B3_STACK is perfectly fine. However, CE_BSE_CUST_SGB has two issues: the job name 21-BSE-CUST_SGB-00001 appears twice, and that job is also shown as dependent on itself (21-BSE-CUST_SGB-00001).

 

So I need to identify and flag such entries, where a job depends on itself within a flow. Something like the output below is expected:

Flow Name Job Name Dependency Flag
CE_B3_STACK 90-IDM-STK_FMTS-00001   ok 
CE_B3_STACK 91-IDM-STK_FW-00001  ok 
CE_B3_STACK 92-IDM-STK_PRTY-00001 91-IDM-STK_FW-00001 ok 
CE_B3_STACK 93-IDM-STK_PRXY-00001 91-IDM-STK_FW-00001|92-IDM-STK_PRTY-00001 ok 
CE_B3_STACK 94-IDM-STK_B3_ASSET_CLASS-00001 90-IDM-STK_FMTS-00001|92-IDM-STK_PRTY-00001|93-IDM-STK_PRXY-00001 ok 
CE_BSE_CUST_SGB 20-BSE-CUST_SGB-00001   Not ok 
CE_BSE_CUST_SGB 20-BSE-CUST_SGB-00002   Not ok 
CE_BSE_CUST_SGB 20-BSE-CUST_SGB-00003   Not ok 
CE_BSE_CUST_SGB 20-BSE-CUST_SGB-00004   Not ok 
CE_BSE_CUST_SGB 20-BSE-CUST_SGB-00005   Not ok 
CE_BSE_CUST_SGB 20-BSE-CUST_SGB-00006   Not ok 
CE_BSE_CUST_SGB 20-BSE-CUST_SGB-00007   Not ok 
CE_BSE_CUST_SGB 20-BSE-CUST_SGB-00008   Not ok 
CE_BSE_CUST_SGB 20-BSE-CUST_SGB-00009   Not ok 
CE_BSE_CUST_SGB 20-BSE-CUST_SGB-00010   Not ok 
CE_BSE_CUST_SGB 21-BSE-CUST_SGB-00001 20-BSE-CUST_SGB-00001|20-BSE-CUST_SGB-00002|20-BSE-CUST_SGB-00003|20-BSE-CUST_SGB-00004|20-BSE-CUST_SGB-00005|20-BSE-CUST_SGB-00006|20-BSE-CUST_SGB-00007|20-BSE-CUST_SGB-00008 Not ok 

Any help is really appreciated.

1 ACCEPTED SOLUTION
Kurt_Bremser
Super User

See this code:

data have;
infile datalines dlm="," dsd truncover; /* dsd + truncover keep empty Dependency fields from flowing over to the next line */
input Flow_Name: $256. Job_Name: $256. Dependency: $1024.;
cards;
CE_B3_STACK,90-IDM-STK_FMTS-00001,
CE_B3_STACK,91-IDM-STK_FW-00001,
CE_B3_STACK,92-IDM-STK_PRTY-00001,91-IDM-STK_FW-00001
CE_B3_STACK,93-IDM-STK_PRXY-00001,91-IDM-STK_FW-00001|92-IDM-STK_PRTY-00001
CE_B3_STACK,94-IDM-STK_B3_ASSET_CLASS-00001,90-IDM-STK_FMTS-00001|92-IDM-STK_PRTY-00001|93-IDM-STK_PRXY-00001
CE_BSE_CUST_SGB,20-BSE-CUST_SGB-00001,
CE_BSE_CUST_SGB,20-BSE-CUST_SGB-00002,
CE_BSE_CUST_SGB,20-BSE-CUST_SGB-00003,
CE_BSE_CUST_SGB,20-BSE-CUST_SGB-00004,
CE_BSE_CUST_SGB,20-BSE-CUST_SGB-00005,
CE_BSE_CUST_SGB,20-BSE-CUST_SGB-00006,
CE_BSE_CUST_SGB,20-BSE-CUST_SGB-00007,
CE_BSE_CUST_SGB,20-BSE-CUST_SGB-00008,
CE_BSE_CUST_SGB,20-BSE-CUST_SGB-00009,
CE_BSE_CUST_SGB,20-BSE-CUST_SGB-00010,
CE_BSE_CUST_SGB,20-BSE-CUST_SGB-00001,20-BSE-CUST_SGB-00001|20-BSE-CUST_SGB-00002|20-BSE-CUST_SGB-00003|20-BSE-CUST_SGB-00004|20-BSE-CUST_SGB-00005|20-BSE-CUST_SGB-00006|20-BSE-CUST_SGB-00007|20-BSE-CUST_SGB-00008
CE_BSE_CUST_SGB,21-BSE-CUST_SGB-00001,20-BSE-CUST_SGB-00001|20-BSE-CUST_SGB-00002|20-BSE-CUST_SGB-00003|20-BSE-CUST_SGB-00004|20-BSE-CUST_SGB-00005|20-BSE-CUST_SGB-00006|20-BSE-CUST_SGB-00007|20-BSE-CUST_SGB-00008|XXX
;

/* find duplicate job_names */
proc sort
  data=have
  out=dedup /* so we do not destroy "have" */
  dupout=dups (keep=flow_name job_name)
  nodupkey
;
by flow_name job_name;
run;

/* find missing job_names and recursive definitions */
data recursive_miss (keep=flow_name flag);
set have;
by flow_name;
if _n_ = 1
then do;
  declare hash f ();
  f.definekey("job_name");
  f.definedone();
end;
if first.flow_name then rc = f.clear();
rc = f.add();
if findw(dependency,job_name,"|","t")
then do;
  flag = "recursive";
  output;
end;
else do i = 1 to countw(dependency,"|","t");
  if f.find(key:scan(dependency,i,"|")) ne 0
  then do;
    flag = "missing";
    output;
    leave;
  end;
end;
run;

data n_ok (keep=flow_name);
set
  dups
  recursive_miss
;
by flow_name;
if first.flow_name;
run;

data want;
merge
  have
  n_ok (in=n)
;
by flow_name;
flag = ifc(n,"not ok","ok");
run;

I added an extra observation to your example data so that a duplicate and a recursive job_name are actually present, and appended a nonexistent job (XXX) to the last dependency to create a "missing" case.


3 REPLIES
Kurt_Bremser
Super User

Please supply example data in usable form (a correctly working data step with datalines) so we have something to easily play around with.

Without that, I can only give suggestions:

First, check for doubles in Job Name by sorting (BY "Flow Name" "Job Name") with the NODUPKEY and DUPOUT= options.

Next, expand Dependency into separate observations, so you can join against the Job Name column and see whether there is a match for every entry.
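The expansion step could be sketched like this (a rough sketch, assuming the columns are named Flow_Name, Job_Name, and Dependency and the dataset is called "have", as in the data step posted below):

```sas
/* Sketch: split the pipe-delimited Dependency into one observation
   per dependency, then look for dependencies with no matching
   Job_Name within the same flow. Dataset and column names are
   assumptions based on the sample posted in this thread. */
data deps;
  set have;
  length dep $256;
  do i = 1 to countw(dependency, "|");
    dep = scan(dependency, i, "|");
    output;
  end;
  keep flow_name job_name dep;
run;

proc sql;
  create table unmatched as
    select d.*
    from deps d
    where not exists (
      select 1
      from have h
      where h.flow_name = d.flow_name
        and h.job_name  = d.dep
    );
quit;
```

A similar query with a `where d.job_name = d.dep` condition would flag jobs that depend on themselves.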

Swapnil_21
Obsidian | Level 7
Here is the sample dataset:

data have;
infile datalines dlm="," dsd truncover; /* dsd + truncover keep empty Dependency fields from flowing over to the next line */
input Flow_Name: $256. Job_Name: $256. Dependency: $1024.;
cards;
CE_B3_STACK,90-IDM-STK_FMTS-00001,
CE_B3_STACK,91-IDM-STK_FW-00001,
CE_B3_STACK,92-IDM-STK_PRTY-00001,91-IDM-STK_FW-00001
CE_B3_STACK,93-IDM-STK_PRXY-00001,91-IDM-STK_FW-00001|92-IDM-STK_PRTY-00001
CE_B3_STACK,94-IDM-STK_B3_ASSET_CLASS-00001,90-IDM-STK_FMTS-00001|92-IDM-STK_PRTY-00001|93-IDM-STK_PRXY-00001
CE_BSE_CUST_SGB,20-BSE-CUST_SGB-00001,
CE_BSE_CUST_SGB,20-BSE-CUST_SGB-00002,
CE_BSE_CUST_SGB,20-BSE-CUST_SGB-00003,
CE_BSE_CUST_SGB,20-BSE-CUST_SGB-00004,
CE_BSE_CUST_SGB,20-BSE-CUST_SGB-00005,
CE_BSE_CUST_SGB,20-BSE-CUST_SGB-00006,
CE_BSE_CUST_SGB,20-BSE-CUST_SGB-00007,
CE_BSE_CUST_SGB,20-BSE-CUST_SGB-00008,
CE_BSE_CUST_SGB,20-BSE-CUST_SGB-00009,
CE_BSE_CUST_SGB,20-BSE-CUST_SGB-00010,
CE_BSE_CUST_SGB,21-BSE-CUST_SGB-00001,20-BSE-CUST_SGB-00001|20-BSE-CUST_SGB-00002|20-BSE-CUST_SGB-00003|20-BSE-CUST_SGB-00004|20-BSE-CUST_SGB-00005|20-BSE-CUST_SGB-00006|20-BSE-CUST_SGB-00007|20-BSE-CUST_SGB-00008
;
run;
