🔒 This topic is solved.
Swapnil_21
Obsidian | Level 7

Hi,

I have a table with flow name, job name, and dependency as columns. I need to find whether the same job is present in any of the dependency entries.

Below is the sample:

Flow Name Job Name Dependency
CE_B3_STACK 90-IDM-STK_FMTS-00001  
CE_B3_STACK 91-IDM-STK_FW-00001  
CE_B3_STACK 92-IDM-STK_PRTY-00001 91-IDM-STK_FW-00001
CE_B3_STACK 93-IDM-STK_PRXY-00001 91-IDM-STK_FW-00001|92-IDM-STK_PRTY-00001
CE_B3_STACK 94-IDM-STK_B3_ASSET_CLASS-00001 90-IDM-STK_FMTS-00001|92-IDM-STK_PRTY-00001|93-IDM-STK_PRXY-00001
CE_BSE_CUST_SGB 20-BSE-CUST_SGB-00001  
CE_BSE_CUST_SGB 20-BSE-CUST_SGB-00002  
CE_BSE_CUST_SGB 20-BSE-CUST_SGB-00003  
CE_BSE_CUST_SGB 20-BSE-CUST_SGB-00004  
CE_BSE_CUST_SGB 20-BSE-CUST_SGB-00005  
CE_BSE_CUST_SGB 20-BSE-CUST_SGB-00006  
CE_BSE_CUST_SGB 20-BSE-CUST_SGB-00007  
CE_BSE_CUST_SGB 20-BSE-CUST_SGB-00008  
CE_BSE_CUST_SGB 20-BSE-CUST_SGB-00009  
CE_BSE_CUST_SGB 20-BSE-CUST_SGB-00010  
CE_BSE_CUST_SGB 21-BSE-CUST_SGB-00001 20-BSE-CUST_SGB-00001|20-BSE-CUST_SGB-00002|20-BSE-CUST_SGB-00003|20-BSE-CUST_SGB-00004|20-BSE-CUST_SGB-00005|20-BSE-CUST_SGB-00006|20-BSE-CUST_SGB-00007|20-BSE-CUST_SGB-00008

 

 

Flow CE_B3_STACK is perfectly alright. However, CE_BSE_CUST_SGB has two issues: the job name 21-BSE-CUST_SGB-00001 is entered twice, and that job is also shown as dependent on itself (21-BSE-CUST_SGB-00001).

So I need to flag such entries, where a job is dependent on itself within a flow. Something like the output below is expected:

Flow Name Job Name Dependency Flag
CE_B3_STACK 90-IDM-STK_FMTS-00001   ok 
CE_B3_STACK 91-IDM-STK_FW-00001   ok 
CE_B3_STACK 92-IDM-STK_PRTY-00001 91-IDM-STK_FW-00001 ok 
CE_B3_STACK 93-IDM-STK_PRXY-00001 91-IDM-STK_FW-00001|92-IDM-STK_PRTY-00001 ok 
CE_B3_STACK 94-IDM-STK_B3_ASSET_CLASS-00001 90-IDM-STK_FMTS-00001|92-IDM-STK_PRTY-00001|93-IDM-STK_PRXY-00001 ok 
CE_BSE_CUST_SGB 20-BSE-CUST_SGB-00001   Not ok 
CE_BSE_CUST_SGB 20-BSE-CUST_SGB-00002   Not ok 
CE_BSE_CUST_SGB 20-BSE-CUST_SGB-00003   Not ok 
CE_BSE_CUST_SGB 20-BSE-CUST_SGB-00004   Not ok 
CE_BSE_CUST_SGB 20-BSE-CUST_SGB-00005   Not ok 
CE_BSE_CUST_SGB 20-BSE-CUST_SGB-00006   Not ok 
CE_BSE_CUST_SGB 20-BSE-CUST_SGB-00007   Not ok 
CE_BSE_CUST_SGB 20-BSE-CUST_SGB-00008   Not ok 
CE_BSE_CUST_SGB 20-BSE-CUST_SGB-00009   Not ok 
CE_BSE_CUST_SGB 20-BSE-CUST_SGB-00010   Not ok 
CE_BSE_CUST_SGB 21-BSE-CUST_SGB-00001 20-BSE-CUST_SGB-00001|20-BSE-CUST_SGB-00002|20-BSE-CUST_SGB-00003|20-BSE-CUST_SGB-00004|20-BSE-CUST_SGB-00005|20-BSE-CUST_SGB-00006|20-BSE-CUST_SGB-00007|20-BSE-CUST_SGB-00008 Not ok 

 

 

Any help is really appreciated.

1 ACCEPTED SOLUTION
Kurt_Bremser
Super User

See this code:

data have;
infile datalines delimiter=",";
input Flow_Name: $256. Job_Name: $256. Dependency: $1024.;
cards;
CE_B3_STACK,90-IDM-STK_FMTS-00001,
CE_B3_STACK,91-IDM-STK_FW-00001,
CE_B3_STACK,92-IDM-STK_PRTY-00001,91-IDM-STK_FW-00001
CE_B3_STACK,93-IDM-STK_PRXY-00001,91-IDM-STK_FW-00001|92-IDM-STK_PRTY-00001
CE_B3_STACK,94-IDM-STK_B3_ASSET_CLASS-00001,90-IDM-STK_FMTS-00001|92-IDM-STK_PRTY-00001|93-IDM-STK_PRXY-00001
CE_BSE_CUST_SGB,20-BSE-CUST_SGB-00001,
CE_BSE_CUST_SGB,20-BSE-CUST_SGB-00002,
CE_BSE_CUST_SGB,20-BSE-CUST_SGB-00003,
CE_BSE_CUST_SGB,20-BSE-CUST_SGB-00004,
CE_BSE_CUST_SGB,20-BSE-CUST_SGB-00005,
CE_BSE_CUST_SGB,20-BSE-CUST_SGB-00006,
CE_BSE_CUST_SGB,20-BSE-CUST_SGB-00007,
CE_BSE_CUST_SGB,20-BSE-CUST_SGB-00008,
CE_BSE_CUST_SGB,20-BSE-CUST_SGB-00009,
CE_BSE_CUST_SGB,20-BSE-CUST_SGB-00010,
CE_BSE_CUST_SGB,20-BSE-CUST_SGB-00001,20-BSE-CUST_SGB-00001|20-BSE-CUST_SGB-00002|20-BSE-CUST_SGB-00003|20-BSE-CUST_SGB-00004|20-BSE-CUST_SGB-00005|20-BSE-CUST_SGB-00006|20-BSE-CUST_SGB-00007|20-BSE-CUST_SGB-00008
CE_BSE_CUST_SGB,21-BSE-CUST_SGB-00001,20-BSE-CUST_SGB-00001|20-BSE-CUST_SGB-00002|20-BSE-CUST_SGB-00003|20-BSE-CUST_SGB-00004|20-BSE-CUST_SGB-00005|20-BSE-CUST_SGB-00006|20-BSE-CUST_SGB-00007|20-BSE-CUST_SGB-00008|XXX
;

/* find duplicate job_names */
proc sort
  data=have
  out=dedup /* so we do not destroy "have" */
  dupout=dups (keep=flow_name job_name)
  nodupkey
;
by flow_name job_name;
run;

/* find missing job_names and recursive definitions */
data recursive_miss (keep=flow_name flag);
set have;
by flow_name;
if _n_ = 1
then do;
  declare hash f ();
  f.definekey("job_name");
  f.definedone();
end;
if first.flow_name then rc = f.clear();
rc = f.add();
if findw(dependency,job_name,"|","t")
then do;
  flag = "recursive";
  output;
end;
else do i = 1 to countw(dependency,"|","t");
  if f.find(key:scan(dependency,i,"|")) ne 0
  then do;
    flag = "missing";
    output;
    leave;
  end;
end;
run;

data n_ok (keep=flow_name);
set
  dups
  recursive_miss
;
by flow_name;
if first.flow_name;
run;

data want;
merge
  have
  n_ok (in=n)
;
by flow_name;
flag = ifc(n,"not ok","ok");
run;

I corrected the last dependency in your example data so that a duplicate and a recursive job_name are actually created, and then added an extra observation to produce a "missing" job.
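A side note on the self-dependency test in the code above: FINDW with "|" in the character list and the "t" (trim) modifier searches for job_name as a complete pipe-delimited word and returns its starting position, or 0 if it is absent. A quick illustration with made-up values:

```sas
data _null_;
  dep = "A-1|B-2|C-3";
  pos = findw(dep, "B-2", "|", "t"); /* 5: "B-2" is a whole word starting at character 5 */
  put pos=;
  pos = findw(dep, "B", "|", "t");   /* 0: "B" alone is not a complete pipe-delimited word */
  put pos=;
run;
```

This whole-word matching is what prevents a job name that happens to be a prefix of another (e.g. ...-00001 vs. ...-000010) from causing a false hit.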


3 REPLIES
Kurt_Bremser
Super User

Please supply example data in usable form (correctly working data step with datalines) so we have something to easily play around with.

 

Without that, I can only give suggestions:

First, check for duplicates in Job Name by sorting (BY Flow_Name Job_Name) with the NODUPKEY and DUPOUT= options.

Next, expand Dependency into separate observations, so you can join against the Job Name column and see whether every entry has a match.
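A rough sketch of those two steps (assuming columns named Flow_Name, Job_Name, and a pipe-delimited Dependency; the dataset and variable names _dedup, _dups, _deps, _missing, and Dep_Job are illustrative, not from the original post):

```sas
/* Step 1: duplicate Job_Names within a flow land in _dups */
proc sort data=have out=_dedup nodupkey dupout=_dups;
  by Flow_Name Job_Name;
run;

/* Step 2: expand the pipe-delimited Dependency into one row per entry */
data _deps;
  set have;
  length Dep_Job $256;
  do _i = 1 to countw(Dependency, "|");  /* 0 iterations when Dependency is blank */
    Dep_Job = scan(Dependency, _i, "|");
    output;
  end;
  keep Flow_Name Job_Name Dep_Job;
run;

/* Join back to the job list: dependencies with no matching job in the flow */
proc sql;
  create table _missing as
  select d.*
  from _deps d
  left join have h
    on d.Flow_Name = h.Flow_Name and d.Dep_Job = h.Job_Name
  where h.Job_Name is null;
quit;
```

Any flow that appears in _dups or _missing can then be flagged "not ok".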

Swapnil_21
Obsidian | Level 7
Here is the sample dataset

data have;
infile datalines delimiter=",";
input Flow_Name: $256. Job_Name: $256. Dependency: $1024.;
cards;
CE_B3_STACK,90-IDM-STK_FMTS-00001,
CE_B3_STACK,91-IDM-STK_FW-00001,
CE_B3_STACK,92-IDM-STK_PRTY-00001,91-IDM-STK_FW-00001
CE_B3_STACK,93-IDM-STK_PRXY-00001,91-IDM-STK_FW-00001|92-IDM-STK_PRTY-00001
CE_B3_STACK,94-IDM-STK_B3_ASSET_CLASS-00001,90-IDM-STK_FMTS-00001|92-IDM-STK_PRTY-00001|93-IDM-STK_PRXY-00001
CE_BSE_CUST_SGB,20-BSE-CUST_SGB-00001,
CE_BSE_CUST_SGB,20-BSE-CUST_SGB-00002,
CE_BSE_CUST_SGB,20-BSE-CUST_SGB-00003,
CE_BSE_CUST_SGB,20-BSE-CUST_SGB-00004,
CE_BSE_CUST_SGB,20-BSE-CUST_SGB-00005,
CE_BSE_CUST_SGB,20-BSE-CUST_SGB-00006,
CE_BSE_CUST_SGB,20-BSE-CUST_SGB-00007,
CE_BSE_CUST_SGB,20-BSE-CUST_SGB-00008,
CE_BSE_CUST_SGB,20-BSE-CUST_SGB-00009,
CE_BSE_CUST_SGB,20-BSE-CUST_SGB-00010,
CE_BSE_CUST_SGB,21-BSE-CUST_SGB-00001,20-BSE-CUST_SGB-00001|20-BSE-CUST_SGB-00002|20-BSE-CUST_SGB-00003|20-BSE-CUST_SGB-00004|20-BSE-CUST_SGB-00005|20-BSE-CUST_SGB-00006|20-BSE-CUST_SGB-00007|20-BSE-CUST_SGB-00008
;
run;

