Hello,
I was removing the T, U and A off the front of some product codes using the code below, but then I found cases where the string may actually have no product code, just a word string instead. In which case, my code will just remove the T off TOWELS and leave it as OWELS. Does anyone know of a better way to remove the first letter of the string, but only if the next part of the string contains numbers? Product codes will always have either one letter followed by some numbers (like U0223) or two letters followed by numbers (TA223). If there is no product code, the field will just display the product name (like TOWELS). The whole database is in capitals. Thanks for any suggestions and ideas.
data HAVE; X='A0211 BEDDING LINEN'; output; X='T0223 CROCKERY'; output; X='TOWELS'; output; X='U49 ELECTRICAL'; output; run; data WANT; set HAVE; %let prefixes= 'A', 'T', 'U', 'TA'; X= substr(X,verify(X,&prefixes)); run;
Or
Y = prxchange('s/(A|T|U|TA)(\d+)/$2/', 1, X);
Try this
data want;
set have;
y = prxchange('s/([A T U])(\d+.*)/$2/', -1, x);
run;
Thank you, but what about my product codes where it has two letters at the front, like TA0223? If I use that code, it will give me A0223 after I run it. Is there a way to amend the code to get it to remove both cases of TA0223 and T0223? Thanks.
Like this?
Y = prxchange('s/([ATU]{1,2})(\d+)/$2/', -1, X);
Thank you. Can I ask what the $2 part means in the prxchange? I get that the first part gives it the strings we want to search for, and the second part means digits, but I don't get the $2 part? I've tried reading up on prxchange before, but I've found the examples always just do the same thing (like reversing two words) and I haven't been able to locate examples that explain this sort of thing where prefix letters are removed but only if they meet a certain criteria. Thanks.
Y = prxchange('s/(A|T|U|TA)(\d+)/$2/', 1, X);
$2 means the second matched group, (\d+) in this case. It could also be written \2
You could use the function anydigit to check if the chars must be remove:
%let prefixes= 'A', 'T', 'U', 'TA';
data WANT;
set HAVE;
if anydigit(x) then do;
code = substr(x, verify(x, &prefixes));
end;
run;
Thank you. Your version would return the products such as TOWELS as blank though. I'd have to put an 'else do' in there to get it to return the string if if has no digits. It works fine with the else do added in:
%let prefixes= 'A', 'T', 'U', 'TA'; data WANT; set HAVE; if anydigit(x) then do; code=substr(x, verify(x, &prefixes)); end; else do; code=X; end; run;
@Buzzy_Bee wrote:
Thank you. Your version would return the products such as TOWELS as blank though. I'd have to put an 'else do' in there to get it to return the string if if has no digits. It works fine with the else do added in:
%let prefixes= 'A', 'T', 'U', 'TA'; data WANT; set HAVE; if anydigit(x) then do; code=substr(x, verify(x, &prefixes)); end; else do; code=X; end; run;
My fault, when i started writing the solution i renamed "x" to "code" and forgot to fully undo that change before posting, so just replace "code" with "x" in the code i posted.
Thanks very much. I see what you mean now. That corrects it in one step if I just use x as the variable:
if anydigit(x) then x=substr(x, verify(x, &prefixes));
Or I can just write an "if then else" without the do to shorten it:
data WANT; set HAVE; if anydigit(x) then code=substr(x, verify(x, &prefixes)); else code=X; run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.