I have two input files. The first one, "failed_jobs", is a list of failed jobs. These will be anywhere from 1-64 characters and may have a period "." in the jobs name at different places. The second is "exceptions", these can also be from 1-64 characters with the possibility of a period in the name. I need to exclude any job that is on the exceptions list. The exceptions may be the entire jobs name, or it may anywhere from the first 2 charters to 64. Because of the variable length matching, I am having a had time figuring out how to make this happen. Any thought would be welcome.
In this example jobs PESPCD01, any PCAIM* jobs and PDUKCM01.WEB02 should be the ones excluded from the list.
data failed_jobs;
input error_code $ 1 - 8 jobs $ 9 - 73;
datalines;
ESPPS06 PESPCD01
ESPPS06 PESPCDAD
ESPPS06 POWBCMAA
ESPPS06 PCAICA00
ESPPS06 PCAICM02
ESPPS06 PCAICM03
ESPPS06 PCAICM04
ESPPS06 PCAICM05
ESPPS06 PDUKCM01
ESPPS06 PDUKCM01.WEB02
ESPPS06 PESPCD02
;
run;
data exceptions;
input error_code $ 1 - 8 job_exceptions $ 9 - 73;
datalines;
ESPPS06 PESPCD01
ESPPS06 PCAIM
ESPPS06 POWBCAXX
ESPPS06 PDUKCM01.W
;
run;
* final output of failed jobs;
ESPPS06 PESPCDAD
ESPPS06 POWBCMAA
ESPPS06 PCAICA00
ESPPS06 PESPCD02
ESPPS06 PDUKCM01
Hello @a079011,
Try this:
proc sql;
create table want as
select a.* from failed_jobs a
left join exceptions b
on /* a.error_code=b.error_code & */ jobs eqt job_exceptions & length(job_exceptions)<=length(jobs)
where job_exceptions=' ';
quit;
Feel free to use the additional condition on error_code, which I have commented out, if you want to include it.
Please note that your "final output" is not quite consistent with the description: The PCAICMxx jobs seem to have been eliminated by PCAIM.
Hello @a079011,
Try this:
proc sql;
create table want as
select a.* from failed_jobs a
left join exceptions b
on /* a.error_code=b.error_code & */ jobs eqt job_exceptions & length(job_exceptions)<=length(jobs)
where job_exceptions=' ';
quit;
Feel free to use the additional condition on error_code, which I have commented out, if you want to include it.
Please note that your "final output" is not quite consistent with the description: The PCAICMxx jobs seem to have been eliminated by PCAIM.
In addition to the @FreelanceReinh's offering, I'd suggest:
data failed_jobs ;
input error_code $ 1-8 jobs $ 9-73 ;
cards ;
ESPPS06 PESPCD01
ESPPS06 PESPCDAD
ESPPS06 POWBCMAA
ESPPS06 PCAICA00
ESPPS06 PCAICM02
ESPPS06 PCAICM03
ESPPS06 PCAICM04
ESPPS06 PCAICM05
ESPPS06 PDUKCM01
ESPPS06 PDUKCM01.WEB02
ESPPS06 PESPCD02
;
run ;
data exceptions ;
input error_code $ 1-8 job_exceptions $ 9-73 ;
cards ;
ESPPS06 PESPCD01
ESPPS06 PCAICM
ESPPS06 POWBCAXX
ESPPS06 PDUKCM01.W
;
run ;
option msglevel=i ;
proc sql _method ;
create table want as
select * from failed_jobs f
where not exists
(select 1 from exceptions x
where find (f.jobs, trim (x.job_exceptions)) = 1
and f.error_code = x.error_code
) ;
quit ;
One advantage of doing it using the EXISTS subquery clause instead of a subset left join (as @FreelanceReinh has suggested) is that the optimizer doesn't choose to create a Cartesian product and then subset it even if the AND condition above is not included in the query. Also, methinks that using the FIND function in the above manner (it basically asks if job_exceptions is part of jobs starting at position 1) is simpler than the extra comparison between the lengths of jobs and job_exceptions.
Kind regards
Paul D.
Thank you! I have tested a little, and this seems to be working as expected.
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.