sys_id company_name company_identifer state
1 ABC Co. 245679762 MI
2 XYZ Co. 876234T87 OH
3 WALDEN 88723O124 MA
4 GOOGLE 77823I457 CA
5 TALI 44321H887 AZ
I have about 10,000 or so obs similar to the ones above. My question is if I need to lookup a company based on the first 6 digits or chars of the company id since I'm not sure about the complete 9 digit, how would I go about implementing that. Also the company_id is passed dynamically rather than hard-coded.
So you have a flat file that has the following ids:
These ids are passed on to a macro var where the program then separates the list and takes one id at a time and do a comparison against the dataset. So if the first 6 digits or chars matches the ID then output the obs.
I know this can be done by implementing something like:
if company_identifier =: &6_digit_id then output;
I was only able to make this work for 1 id. However when I code in a do loop I only get the last ID (which is obvious) and I cant seem to find a way to retain the previous IDs that have been passed in. How do I modify the code so I can have it implemented for multiple ID's such that the incoming id value of 6 digits or chars is compared against the datasets 9 digit ID and if the first 6 matches then output those obs.
%do i = 1 to %count(&id);
%let q = %scan(&id,i);
if company_identifier =: &id then output;
I'm not sure what I'm missing. Your help is much appreciated.
As compared to IF/THEN with the IN operator, you should also consider the performance benefit with using a PROC FORMAT and loading it with the CNTLIN= option, presuming you don't have any string-prefix duplicates. The DATA step code then would have a single line assignment statement with a PUT function reference. Fewer moving parts and no macro variable coding necessary.
And, in addition to all the other suggestions, if all you want is an output file of matches, then depending on the sizes involved, either a DATA Step MERGE or an SQL JOIN would also produce the desired output without macro code involvement.
I don't actually see much need for macro coding as you have described your data and end result.
Yep, agree with Cynthia.
proc sql noprint;
create table OUTDATA as
select a.* from DATA as a, LOOKUP as b
where b.COMPANY_IDENTIFIER like cats(a.COMPANY_IDENTIFIER,'%');
Or you could also use the substr function to do the match.
where a.COMPANY_IDENTIFIER eq substr(b.COMPANY_IDENTIFIER,1,6);
Cheers from Portugal.