Lets say I have a dataset that produces two columns
ATTY FIRM
Scott Scott and Wagner
Hampshire Mills and Walker
Beasley Jones and Beasley
In this case I want to have a flag that says if any part of the name DOES NOT MATCH (ATTY and FIRM) then put a 'Y' in a flag called Choose, else do not choose it. So in this case only Hampshire Mills and Walker would get chosen. Is there a way to do this?
ATTY FIRM choose
Scott Scott and Wagner
Hampshire Mills and Walker Y
Beasley Jones and Beasley
Use the findw() function:
data have;
infile cards dlm=',';
input atty :$30. firm :$30.;
cards;
Scott,Scott and Wagner
Hampshire,Mills and Walker
Beasley,Jones and Beasley
;
run;
data want;
set have;
choose = ifc(findw(firm,strip(atty)),' ','Y');
run;
Note the function that strips the trailing (and/or leading) blanks from atty.
data have;
infile cards dlm=',';
input atty :$30. firm :$30.;
cards;
Scott,Scott and Wagner
Hampshire,Mills and Walker
Beasley,Jones and Beasley
;
run;
data want;
set have;
if not indexw(firm,strip(atty)) then choose='Y';
run;
Lets say the last entry Beasley,Jones and Beasley was Beasley,Jones and Bond instead.
I would still get a 'Y' for a match.
Is there a way to get this to look at the first 2 or 3 characters. In that case the last entry would not be a match which is what I would want in this case. The code you provided is effective however if the first char is a match in either column we get a 'Y' which is not what I would want
Can you modify the sample input and the output for the sample plz
data have;
infile cards dlm=',';
input atty :$30. firm :$30.;
cards;
Scott,Scott and Wagner
Hampshire,Mills and Walker
Beasley,Jones and Bond
;
run;
data want;
set have;
if not indexw(firm,strip(atty)) then choose='Y';
run;
So in this case because in the last entry the "B" in Beasley happens to match the "B" in Bonds, we would get a 'Y'. However is there a way to get the code to look at the first 2 or 3 char and apply the same logic. In this case since Beasley is not Bond I would want this to not show a 'Y' for a match, currently it does
Hi @Q1983 The Y is case of non matches both my code and Kurt's.
The result that you see below in the 2nd and 3rd record being 'Y' is a result of the condition "not indexw" (note the not or in other words false positives. So basically, Y is a result of non-matches and not matches.
This meets your original question where you wrote ="In this case I want to have a flag that says if any part of the nameDOES NOT MATCH(ATTY and FIRM) then put a 'Y' in a flag called Choose, else do not choose it" Kindly review
atty | firm | choose |
---|---|---|
Scott | Scott and Wagner | |
Hampshire | Mills and Walker | Y |
Beasley | Jones and Bond | Y |
@Q1983 wrote:
So in this case because in the last entry the "B" in Beasley happens to match the "B" in Bonds, we would get a 'Y'. However is there a way to get the code to look at the first 2 or 3 char and apply the same logic. In this case since Beasley is not Bond I would want this to not show a 'Y' for a match, currently it does
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.