Hi ,
I have table which has flow name , job name and dependency as columns, I need to find if same job is present in any of the dependency column.
Below is the Sample
| Flow Name | Job Name | Dependency | 
| CE_B3_STACK | 90-IDM-STK_FMTS-00001 | |
| CE_B3_STACK | 91-IDM-STK_FW-00001 | |
| CE_B3_STACK | 92-IDM-STK_PRTY-00001 | 91-IDM-STK_FW-00001 | 
| CE_B3_STACK | 93-IDM-STK_PRXY-00001 | 91-IDM-STK_FW-00001|92-IDM-STK_PRTY-00001 | 
| CE_B3_STACK | 94-IDM-STK_B3_ASSET_CLASS-00001 | 90-IDM-STK_FMTS-00001|92-IDM-STK_PRTY-00001|93-IDM-STK_PRXY-00001 | 
| CE_BSE_CUST_SGB | 20-BSE-CUST_SGB-00001 | |
| CE_BSE_CUST_SGB | 20-BSE-CUST_SGB-00002 | |
| CE_BSE_CUST_SGB | 20-BSE-CUST_SGB-00003 | |
| CE_BSE_CUST_SGB | 20-BSE-CUST_SGB-00004 | |
| CE_BSE_CUST_SGB | 20-BSE-CUST_SGB-00005 | |
| CE_BSE_CUST_SGB | 20-BSE-CUST_SGB-00006 | |
| CE_BSE_CUST_SGB | 20-BSE-CUST_SGB-00007 | |
| CE_BSE_CUST_SGB | 20-BSE-CUST_SGB-00008 | |
| CE_BSE_CUST_SGB | 20-BSE-CUST_SGB-00009 | |
| CE_BSE_CUST_SGB | 20-BSE-CUST_SGB-00010 | |
| CE_BSE_CUST_SGB | 21-BSE-CUST_SGB-00001 | 20-BSE-CUST_SGB-00001|20-BSE-CUST_SGB-00002|20-BSE-CUST_SGB-00003|20-BSE-CUST_SGB-00004|20-BSE-CUST_SGB-00005|20-BSE-CUST_SGB-00006|20-BSE-CUST_SGB-00007|20-BSE-CUST_SGB-00008 | 
Flow CE_B3_STACK is perfectly alright. However, CE_BSE_CUST_SGB has 2 issues. It has job name
21-BSE-CUST_SGB-00001 entry twice and also 21-BSE-CUST_SGB-00001 job shows it is dependent on the same job 21-BSE-CUST_SGB-00001.
So I need to know and flag out such entries where job is dependent on same job within a flow . Something like below output is expected ..
| Flow Name | Job Name | Dependency | Flag | 
| CE_B3_STACK | 90-IDM-STK_FMTS-00001 | ok | |
| CE_B3_STACK | 91-IDM-STK_FW-00001 | ||
| CE_B3_STACK | 92-IDM-STK_PRTY-00001 | 91-IDM-STK_FW-00001 | ok | 
| CE_B3_STACK | 93-IDM-STK_PRXY-00001 | 91-IDM-STK_FW-00001|92-IDM-STK_PRTY-00001 | ok | 
| CE_B3_STACK | 94-IDM-STK_B3_ASSET_CLASS-00001 | 90-IDM-STK_FMTS-00001|92-IDM-STK_PRTY-00001|93-IDM-STK_PRXY-00001 | ok | 
| CE_BSE_CUST_SGB | 20-BSE-CUST_SGB-00001 | Not ok | |
| CE_BSE_CUST_SGB | 20-BSE-CUST_SGB-00002 | Not ok | |
| CE_BSE_CUST_SGB | 20-BSE-CUST_SGB-00003 | Not ok | |
| CE_BSE_CUST_SGB | 20-BSE-CUST_SGB-00004 | Not ok | |
| CE_BSE_CUST_SGB | 20-BSE-CUST_SGB-00005 | Not ok | |
| CE_BSE_CUST_SGB | 20-BSE-CUST_SGB-00006 | Not ok | |
| CE_BSE_CUST_SGB | 20-BSE-CUST_SGB-00007 | Not ok | |
| CE_BSE_CUST_SGB | 20-BSE-CUST_SGB-00008 | Not ok | |
| CE_BSE_CUST_SGB | 20-BSE-CUST_SGB-00009 | Not ok | |
| CE_BSE_CUST_SGB | 20-BSE-CUST_SGB-00010 | Not ok | |
| CE_BSE_CUST_SGB | 21-BSE-CUST_SGB-00001 | 20-BSE-CUST_SGB-00001|20-BSE-CUST_SGB-00002|20-BSE-CUST_SGB-00003|20-BSE-CUST_SGB-00004|20-BSE-CUST_SGB-00005|20-BSE-CUST_SGB-00006|20-BSE-CUST_SGB-00007|20-BSE-CUST_SGB-00008 | Not ok | 
Any help is really appreciated
See this code:
data have;
infile datalines delimiter=",";
input Flow_Name: $256. Job_Name: $256. Dependency: $1024.;
cards;
CE_B3_STACK,90-IDM-STK_FMTS-00001,
CE_B3_STACK,91-IDM-STK_FW-00001,
CE_B3_STACK,92-IDM-STK_PRTY-00001,91-IDM-STK_FW-00001
CE_B3_STACK,93-IDM-STK_PRXY-00001,91-IDM-STK_FW-00001|92-IDM-STK_PRTY-00001
CE_B3_STACK,94-IDM-STK_B3_ASSET_CLASS-00001,90-IDM-STK_FMTS-00001|92-IDM-STK_PRTY-00001|93-IDM-STK_PRXY-00001
CE_BSE_CUST_SGB,20-BSE-CUST_SGB-00001,
CE_BSE_CUST_SGB,20-BSE-CUST_SGB-00002,
CE_BSE_CUST_SGB,20-BSE-CUST_SGB-00003,
CE_BSE_CUST_SGB,20-BSE-CUST_SGB-00004,
CE_BSE_CUST_SGB,20-BSE-CUST_SGB-00005,
CE_BSE_CUST_SGB,20-BSE-CUST_SGB-00006,
CE_BSE_CUST_SGB,20-BSE-CUST_SGB-00007,
CE_BSE_CUST_SGB,20-BSE-CUST_SGB-00008,
CE_BSE_CUST_SGB,20-BSE-CUST_SGB-00009,
CE_BSE_CUST_SGB,20-BSE-CUST_SGB-00010,
CE_BSE_CUST_SGB,20-BSE-CUST_SGB-00001,20-BSE-CUST_SGB-00001|20-BSE-CUST_SGB-00002|20-BSE-CUST_SGB-00003|20-BSE-CUST_SGB-00004|20-BSE-CUST_SGB-00005|20-BSE-CUST_SGB-00006|20-BSE-CUST_SGB-00007|20-BSE-CUST_SGB-00008
CE_BSE_CUST_SGB,21-BSE-CUST_SGB-00001,20-BSE-CUST_SGB-00001|20-BSE-CUST_SGB-00002|20-BSE-CUST_SGB-00003|20-BSE-CUST_SGB-00004|20-BSE-CUST_SGB-00005|20-BSE-CUST_SGB-00006|20-BSE-CUST_SGB-00007|20-BSE-CUST_SGB-00008|XXX
;
/* find duplicate job_names */
proc sort
  data=have
  out=dedup /* so we do not destroy "have" */
  dupout=dups (keep=flow_name job_name)
  nodupkey
;
by flow_name job_name;
run;
/* find missing job_names and recursive definitions */
data recursive_miss (keep=flow_name flag);
set have;
by flow_name;
if _n_ = 1
then do;
  declare hash f ();
  f.definekey("job_name");
  f.definedone();
end;
if first.flow_name then rc = f.clear();
rc = f.add();
if findw(dependency,job_name,"|","t")
then do;
  flag = "recursive";
  output;
end;
else do i = 1 to countw(dependency,"|","t");
  if f.find(key:scan(dependency,i,"|")) ne 0
  then do;
    flag = "missing";
    output;
    leave;
  end;
end;
run;
data n_ok (keep=flow_name);
set
  dups
  recursive_miss
;
by flow_name;
if first.flow_name;
run;
data want;
merge
  have
  n_ok (in=n)
;
by flow_name;
flag = ifc(n,"not ok","ok");
run;I corrected the last dependency from your example data so that a duplicate and recursive job_name is actually created, and then added an additional observation to create a "missing" job.
Please supply example data in usable form (correctly working data step with datalines) so we have something to easily play around with.
Without that, I can only give suggestions:
First, check for doubles in Job Name by sorting (BY "Flow Name" "Job Name") and using the NODUPKEY and DUPOUT= option.
Next, expand the Dependency into separate observations, so you can do a join with the Job Name column to see if there is a match for all entries.
See this code:
data have;
infile datalines delimiter=",";
input Flow_Name: $256. Job_Name: $256. Dependency: $1024.;
cards;
CE_B3_STACK,90-IDM-STK_FMTS-00001,
CE_B3_STACK,91-IDM-STK_FW-00001,
CE_B3_STACK,92-IDM-STK_PRTY-00001,91-IDM-STK_FW-00001
CE_B3_STACK,93-IDM-STK_PRXY-00001,91-IDM-STK_FW-00001|92-IDM-STK_PRTY-00001
CE_B3_STACK,94-IDM-STK_B3_ASSET_CLASS-00001,90-IDM-STK_FMTS-00001|92-IDM-STK_PRTY-00001|93-IDM-STK_PRXY-00001
CE_BSE_CUST_SGB,20-BSE-CUST_SGB-00001,
CE_BSE_CUST_SGB,20-BSE-CUST_SGB-00002,
CE_BSE_CUST_SGB,20-BSE-CUST_SGB-00003,
CE_BSE_CUST_SGB,20-BSE-CUST_SGB-00004,
CE_BSE_CUST_SGB,20-BSE-CUST_SGB-00005,
CE_BSE_CUST_SGB,20-BSE-CUST_SGB-00006,
CE_BSE_CUST_SGB,20-BSE-CUST_SGB-00007,
CE_BSE_CUST_SGB,20-BSE-CUST_SGB-00008,
CE_BSE_CUST_SGB,20-BSE-CUST_SGB-00009,
CE_BSE_CUST_SGB,20-BSE-CUST_SGB-00010,
CE_BSE_CUST_SGB,20-BSE-CUST_SGB-00001,20-BSE-CUST_SGB-00001|20-BSE-CUST_SGB-00002|20-BSE-CUST_SGB-00003|20-BSE-CUST_SGB-00004|20-BSE-CUST_SGB-00005|20-BSE-CUST_SGB-00006|20-BSE-CUST_SGB-00007|20-BSE-CUST_SGB-00008
CE_BSE_CUST_SGB,21-BSE-CUST_SGB-00001,20-BSE-CUST_SGB-00001|20-BSE-CUST_SGB-00002|20-BSE-CUST_SGB-00003|20-BSE-CUST_SGB-00004|20-BSE-CUST_SGB-00005|20-BSE-CUST_SGB-00006|20-BSE-CUST_SGB-00007|20-BSE-CUST_SGB-00008|XXX
;
/* find duplicate job_names */
proc sort
  data=have
  out=dedup /* so we do not destroy "have" */
  dupout=dups (keep=flow_name job_name)
  nodupkey
;
by flow_name job_name;
run;
/* find missing job_names and recursive definitions */
data recursive_miss (keep=flow_name flag);
set have;
by flow_name;
if _n_ = 1
then do;
  declare hash f ();
  f.definekey("job_name");
  f.definedone();
end;
if first.flow_name then rc = f.clear();
rc = f.add();
if findw(dependency,job_name,"|","t")
then do;
  flag = "recursive";
  output;
end;
else do i = 1 to countw(dependency,"|","t");
  if f.find(key:scan(dependency,i,"|")) ne 0
  then do;
    flag = "missing";
    output;
    leave;
  end;
end;
run;
data n_ok (keep=flow_name);
set
  dups
  recursive_miss
;
by flow_name;
if first.flow_name;
run;
data want;
merge
  have
  n_ok (in=n)
;
by flow_name;
flag = ifc(n,"not ok","ok");
run;I corrected the last dependency from your example data so that a duplicate and recursive job_name is actually created, and then added an additional observation to create a "missing" job.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.
