Hi everyone, thanks in advance for the help.
Essentially, I have an email address variable and a lastname variable in a SAS dataset. What I'm trying to do is count how many times the lastname appears in the email address for that row. As an example:
Lastname: Smith
Email: Smithtestsmith@test.test
Should return a count of two.
I've tried fiddling with a rudimentary array, but I can only get it to count once. I'm sure this is possible, but I can't think how for the life of me. Both these variables are upper case, so case matching won't be an issue.
data Sales_Te;
set Sales;
leng=length(lastname);
do i=1 to leng;
if substr(email,i,leng) = lastname then name=1;
drop leng;
end;
run;
data _null_;
s = "Smith";
m = "Smithtestsmith@test.test";
do i = 1 to length(m) - length(s);
if lowcase(substr(m, i, length(s))) = lowcase(s) then count =sum(count, 1);
end;
put count =;
run;
The code "then count+1" creates an implied Retain statement, which keeps the values across data step iteration boundaries. Using count = sum(count,1); instead adds 1 each time the statement is true but resets for each statement. If the condition is never true for a record the result will be missing. The Sum function only adds, or sums, non-missing values so does not have any oddities that + would have when count is undefined or missing such as the first time the condition is true.
Also:
data _null_;
s = "Smith";
m = "Smithtestsmith@test.test";
count = count(m, s, 'i');
put count =;
run;
See if this approach works for you
data _null_;
s = "Smith";
m = "Smithtestsmith@test.test";
do i = 1 to length(m) - length(s);
if lowcase(substr(m, i, length(s))) = lowcase(s) then count + 1;
end;
put count =;
run;
data _null_;
s = "Smith";
m = "Smithtestsmith@test.test";
do i = 1 to length(m) - length(s);
if lowcase(substr(m, i, length(s))) = lowcase(s) then count =sum(count, 1);
end;
put count =;
run;
The code "then count+1" creates an implied Retain statement, which keeps the values across data step iteration boundaries. Using count = sum(count,1); instead adds 1 each time the statement is true but resets for each statement. If the condition is never true for a record the result will be missing. The Sum function only adds, or sums, non-missing values so does not have any oddities that + would have when count is undefined or missing such as the first time the condition is true.
You could use regular expressions this way:
data test;
name = "Smith";
length dum $200;
do email = "Smithtestsmith@test.test", "test@test.test";
id = prxParse(cats("s/",name,"//i"));
call prxChange(id, -1, email, dum, len, trunc, nb);
call prxFree(id);
output;
end;
drop dum id len trunc;
run;
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.