DATA Step, Macro, Functions and more

How would I search a variable for a variable?

Reply
Contributor
Posts: 27

How would I search a variable for a variable?

I have two datasets, file1 and file2. File1 contains a free form variable called 'jobname'. File2 contains an eight character variable called 'name'. It is likely that the variable 'name' from file2 is contained somewhere in the free form variable 'jobname'. What I would like to do is search jobname to see if I can find a match with name. What would be the best way to code something like this?
Super Contributor
Super Contributor
Posts: 3,174

Re: How would I search a variable for a variable?

It's unclear if you want to pass the entire file2 looking for all 'name' values in file1?

Investigate using the hash object to load 'name' values into memory and then you can use a DATA step function like INDEX to search for a match. The facility is explained in detail at the SAS support http://support.sas.com/ and use the SEARCH facility.

Scott Barry
SBBWorks, Inc.
Respected Advisor
Posts: 3,893

Re: How would I search a variable for a variable?

As the approach below results in a cartesian product using a hash table will be much better for performance (if the source tables are "big").

For small tables: I find the approach below easier to code.

data file1;
jobname='this is job abc001 and abc002'; output;
jobname='this is job abc003 '; output;
jobname='this is job abc004 '; output;
jobname='this is job abc005 '; output;
run;
data file2;
name='abc001';output;
name='abc002';output;
name='abc003';output;
name='abc004';output;
name='abc004';output;
name='job';output;
run;

proc sql;
select DISTINCT file1.*,ifc(file2.name ne '','1','0') as found
from file1 left join (select distinct name from file2) as file2
on find(file1.jobname,file2.name,'i');
quit;

HTH
Patrick Added a DISTINCT to the outer select to avoid multiplying the result.
In case that there can be duplicate values for "jobname" in file1 which should be kept backmerging of file1 (left join with result set) would be necessary.
If your data is like in the example I'm right now not so sure how this could be solved with a hash table as there is no key. The only way a can think of right now would be to create an artificial and not unique key and then "hash" over this key with only one value using the hash iterator object - but this sounds to me close to a cartesian product.


Message was edited by: Patrick
Respected Advisor
Posts: 3,893

Re: How would I search a variable for a variable?

I gave your question another thought. Here my prefered solution:

data file1;
jobname='this is job abc001 and abc002'; output;
jobname='this is job abc003 '; output;
jobname='this is job abc004 '; output;
jobname='this is job abc005 '; output;
run;
data file2;
name='abc001';output;
name='abc002';output;
name='abc003';output;
name='abc004';output;
name='abc004';output;
run;

proc sql;
create table File2Distinct as
select distinct name from file2;
quit;

data file1;
set file1;
do i=1 to nobs;
set File2Distinct point=i nobs=nobs;
flag= find(jobname,name,'i')>1;
if flag then leave;
end;
run;
Super Contributor
Posts: 359

Re: How would I search a variable for a variable?

Why not just


proc sql;
create table result as select a.*, b.*
from file2 a LEFT JOIN file1 b on
b.jobname like cats('%', a.name , '%');
quit;
Ask a Question
Discussion stats
  • 4 replies
  • 171 views
  • 0 likes
  • 4 in conversation