I have a data set with job titles and job codes that are 1 to 8 digits long. I have another data set with a study population that have job codes from 10 to 12 digits long. The first 1 to 8 digits will match the job codes from the first data set. My plan is to write code to match the first 8 digits. If it matches, done. If not, match the first 7 digits. If it matches, done. If not, match the first 6 digits. And keep going until it's done. My question is how to write this in SAS. Would the "or" operator achieve this way of matching on the highest number digits, stopping if it finds a match, and looking for the next number of digits if not? Should the code look like this? proc sql;
create table job_titles as
select A.*, B.job_title
from study_population as A
left join occupation_data as B
on A.industry = B.industry
and A.seniority = A.seniority
and (upcase(A.job_code) = upcase(substr(B.Occupation_Code,1,8))
or upcase(A.job_code) = upcase(substr(B.Occupation_Code,1,7))
or upcase(A.job_code) = upcase(substr(B.Occupation_Code,1,6))
or upcase(A.job_code) = upcase(substr(B.Occupation_Code,1,5))
or upcase(A.job_code) = upcase(substr(B.Occupation_Code,1,4))
or upcase(A.job_code) = upcase(substr(B.Occupation_Code,1,3))
or upcase(A.job_code) = upcase(substr(B.Occupation_Code,1,2))
or upcase(A.job_code) = upcase(substr(B.Occupation_Code,1,1)) );
quit; You'll notice that I am matching by industry and seniority first because the code will depend on those 2 things as well.
... View more