How would I search a variable for a variable?

supp · Posted 03-12-2010 05:14 PM

I have two datasets, file1 and file2. File1 contains a free form variable called 'jobname'. File2 contains an eight character variable called 'name'. It is likely that the variable 'name' from file2 is contained somewhere in the free form variable 'jobname'. What I would like to do is search jobname to see if I can find a match with name. What would be the best way to code something like this?

sbb · Posted 03-12-2010 05:28 PM

It's unclear if you want to pass the entire file2 looking for all 'name' values in file1?

Investigate using the hash object to load 'name' values into memory and then you can use a DATA step function like INDEX to search for a match. The facility is explained in detail at the SAS support http://support.sas.com/ and use the SEARCH facility.

Scott Barry
SBBWorks, Inc.

Patrick · Posted 03-12-2010 06:25 PM

As the approach below results in a cartesian product using a hash table will be much better for performance (if the source tables are "big").

For small tables: I find the approach below easier to code.

data file1;
jobname='this is job abc001 and abc002'; output;
jobname='this is job abc003 '; output;
jobname='this is job abc004 '; output;
jobname='this is job abc005 '; output;
run;
data file2;
name='abc001';output;
name='abc002';output;
name='abc003';output;
name='abc004';output;
name='abc004';output;
name='job';output;
run;

proc sql;
select DISTINCT file1.*,ifc(file2.name ne '','1','0') as found
from file1 left join (select distinct name from file2) as file2
on find(file1.jobname,file2.name,'i');
quit;

HTH
Patrick Added a DISTINCT to the outer select to avoid multiplying the result.
In case that there can be duplicate values for "jobname" in file1 which should be kept backmerging of file1 (left join with result set) would be necessary.
If your data is like in the example I'm right now not so sure how this could be solved with a hash table as there is no key. The only way a can think of right now would be to create an artificial and not unique key and then "hash" over this key with only one value using the hash iterator object - but this sounds to me close to a cartesian product.

Message was edited by: Patrick

Patrick · Posted 03-16-2010 05:08 AM

I gave your question another thought. Here my prefered solution:

data file1;
jobname='this is job abc001 and abc002'; output;
jobname='this is job abc003 '; output;
jobname='this is job abc004 '; output;
jobname='this is job abc005 '; output;
run;
data file2;
name='abc001';output;
name='abc002';output;
name='abc003';output;
name='abc004';output;
name='abc004';output;
run;

proc sql;
create table File2Distinct as
select distinct name from file2;
quit;

data file1;
set file1;
do i=1 to nobs;
set File2Distinct point=i nobs=nobs;
flag= find(jobname,name,'i')>1;
if flag then leave;
end;
run;

Flip · Posted 03-16-2010 08:06 AM

Why not just

proc sql;
create table result as select a.*, b.*
from file2 a LEFT JOIN file1 b on
b.jobname like cats('%', a.name , '%');
quit;

How would I search a variable for a variable?

Re: How would I search a variable for a variable?

Re: How would I search a variable for a variable?

Re: How would I search a variable for a variable?

Re: How would I search a variable for a variable?

Catch up on SAS Innovate 2026

How would I search a variable for a variable?

Re: How would I search a variable for a variable?

Re: How would I search a variable for a variable?

Re: How would I search a variable for a variable?

Re: How would I search a variable for a variable?

Catch up on SAS Innovate 2026

SAS Training: Just a Click Away