Hi everyone, thanks in advance for the help.
Essentially, I have an email address variable and a lastname variable in a SAS dataset. What I'm trying to do is count how many times the lastname appears in the email address for that row. As an example:
Lastname: Smith
Email: Smithtestsmith@test.test
Should return a count of two.
I've tried fiddling with a rudimentary array, but I can only get it to count once. I'm sure this is possible, but I can't think how for the life of me. Both these variables are upper case, so case matching won't be an issue.
data Sales_Te;
set Sales;
leng=length(lastname);
do i=1 to leng;
if substr(email,i,leng) = lastname then name=1;
drop leng;
end;
run;
data _null_;
s = "Smith";
m = "Smithtestsmith@test.test";
do i = 1 to length(m) - length(s);
if lowcase(substr(m, i, length(s))) = lowcase(s) then count =sum(count, 1);
end;
put count =;
run;
The code "then count+1" creates an implied Retain statement, which keeps the values across data step iteration boundaries. Using count = sum(count,1); instead adds 1 each time the statement is true but resets for each statement. If the condition is never true for a record the result will be missing. The Sum function only adds, or sums, non-missing values so does not have any oddities that + would have when count is undefined or missing such as the first time the condition is true.
Also:
data _null_;
s = "Smith";
m = "Smithtestsmith@test.test";
count = count(m, s, 'i');
put count =;
run;
See if this approach works for you
data _null_;
s = "Smith";
m = "Smithtestsmith@test.test";
do i = 1 to length(m) - length(s);
if lowcase(substr(m, i, length(s))) = lowcase(s) then count + 1;
end;
put count =;
run;
data _null_;
s = "Smith";
m = "Smithtestsmith@test.test";
do i = 1 to length(m) - length(s);
if lowcase(substr(m, i, length(s))) = lowcase(s) then count =sum(count, 1);
end;
put count =;
run;
The code "then count+1" creates an implied Retain statement, which keeps the values across data step iteration boundaries. Using count = sum(count,1); instead adds 1 each time the statement is true but resets for each statement. If the condition is never true for a record the result will be missing. The Sum function only adds, or sums, non-missing values so does not have any oddities that + would have when count is undefined or missing such as the first time the condition is true.
You could use regular expressions this way:
data test;
name = "Smith";
length dum $200;
do email = "Smithtestsmith@test.test", "test@test.test";
id = prxParse(cats("s/",name,"//i"));
call prxChange(id, -1, email, dum, len, trunc, nb);
call prxFree(id);
output;
end;
drop dum id len trunc;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.