Lets say I have a dataset that produces two columns
ATTY FIRM
Scott Scott and Wagner
Hampshire Mills and Walker
Beasley Jones and Beasley
In this case I want to have a flag that says if any part of the name DOES NOT MATCH (ATTY and FIRM) then put a 'Y' in a flag called Choose, else do not choose it. So in this case only Hampshire Mills and Walker would get chosen. Is there a way to do this?
ATTY FIRM choose
Scott Scott and Wagner
Hampshire Mills and Walker Y
Beasley Jones and Beasley
Use the findw() function:
data have;
infile cards dlm=',';
input atty :$30. firm :$30.;
cards;
Scott,Scott and Wagner
Hampshire,Mills and Walker
Beasley,Jones and Beasley
;
run;
data want;
set have;
choose = ifc(findw(firm,strip(atty)),' ','Y');
run;
Note the function that strips the trailing (and/or leading) blanks from atty.
data have;
infile cards dlm=',';
input atty :$30. firm :$30.;
cards;
Scott,Scott and Wagner
Hampshire,Mills and Walker
Beasley,Jones and Beasley
;
run;
data want;
set have;
if not indexw(firm,strip(atty)) then choose='Y';
run;
Lets say the last entry Beasley,Jones and Beasley was Beasley,Jones and Bond instead.
I would still get a 'Y' for a match.
Is there a way to get this to look at the first 2 or 3 characters. In that case the last entry would not be a match which is what I would want in this case. The code you provided is effective however if the first char is a match in either column we get a 'Y' which is not what I would want
Can you modify the sample input and the output for the sample plz
data have;
infile cards dlm=',';
input atty :$30. firm :$30.;
cards;
Scott,Scott and Wagner
Hampshire,Mills and Walker
Beasley,Jones and Bond
;
run;
data want;
set have;
if not indexw(firm,strip(atty)) then choose='Y';
run;
So in this case because in the last entry the "B" in Beasley happens to match the "B" in Bonds, we would get a 'Y'. However is there a way to get the code to look at the first 2 or 3 char and apply the same logic. In this case since Beasley is not Bond I would want this to not show a 'Y' for a match, currently it does
Hi @Q1983 The Y is case of non matches both my code and Kurt's.
The result that you see below in the 2nd and 3rd record being 'Y' is a result of the condition "not indexw" (note the not or in other words false positives. So basically, Y is a result of non-matches and not matches.
This meets your original question where you wrote ="In this case I want to have a flag that says if any part of the nameDOES NOT MATCH(ATTY and FIRM) then put a 'Y' in a flag called Choose, else do not choose it" Kindly review
atty | firm | choose |
---|---|---|
Scott | Scott and Wagner | |
Hampshire | Mills and Walker | Y |
Beasley | Jones and Bond | Y |
@Q1983 wrote:
So in this case because in the last entry the "B" in Beasley happens to match the "B" in Bonds, we would get a 'Y'. However is there a way to get the code to look at the first 2 or 3 char and apply the same logic. In this case since Beasley is not Bond I would want this to not show a 'Y' for a match, currently it does
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.