- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hello,
I was removing the T, U and A off the front of some product codes using the code below, but then I found cases where the string may actually have no product code, just a word string instead. In which case, my code will just remove the T off TOWELS and leave it as OWELS. Does anyone know of a better way to remove the first letter of the string, but only if the next part of the string contains numbers? Product codes will always have either one letter followed by some numbers (like U0223) or two letters followed by numbers (TA223). If there is no product code, the field will just display the product name (like TOWELS). The whole database is in capitals. Thanks for any suggestions and ideas.
data HAVE; X='A0211 BEDDING LINEN'; output; X='T0223 CROCKERY'; output; X='TOWELS'; output; X='U49 ELECTRICAL'; output; run; data WANT; set HAVE; %let prefixes= 'A', 'T', 'U', 'TA'; X= substr(X,verify(X,&prefixes)); run;
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Or
Y = prxchange('s/(A|T|U|TA)(\d+)/$2/', 1, X);
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Try this
data want;
set have;
y = prxchange('s/([A T U])(\d+.*)/$2/', -1, x);
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thank you, but what about my product codes where it has two letters at the front, like TA0223? If I use that code, it will give me A0223 after I run it. Is there a way to amend the code to get it to remove both cases of TA0223 and T0223? Thanks.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Like this?
Y = prxchange('s/([ATU]{1,2})(\d+)/$2/', -1, X);
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thank you. Can I ask what the $2 part means in the prxchange? I get that the first part gives it the strings we want to search for, and the second part means digits, but I don't get the $2 part? I've tried reading up on prxchange before, but I've found the examples always just do the same thing (like reversing two words) and I haven't been able to locate examples that explain this sort of thing where prefix letters are removed but only if they meet a certain criteria. Thanks.
Y = prxchange('s/(A|T|U|TA)(\d+)/$2/', 1, X);
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
$2 means the second matched group, (\d+) in this case. It could also be written \2
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
You could use the function anydigit to check if the chars must be remove:
%let prefixes= 'A', 'T', 'U', 'TA';
data WANT;
set HAVE;
if anydigit(x) then do;
code = substr(x, verify(x, &prefixes));
end;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thank you. Your version would return the products such as TOWELS as blank though. I'd have to put an 'else do' in there to get it to return the string if if has no digits. It works fine with the else do added in:
%let prefixes= 'A', 'T', 'U', 'TA'; data WANT; set HAVE; if anydigit(x) then do; code=substr(x, verify(x, &prefixes)); end; else do; code=X; end; run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@Buzzy_Bee wrote:
Thank you. Your version would return the products such as TOWELS as blank though. I'd have to put an 'else do' in there to get it to return the string if if has no digits. It works fine with the else do added in:
%let prefixes= 'A', 'T', 'U', 'TA'; data WANT; set HAVE; if anydigit(x) then do; code=substr(x, verify(x, &prefixes)); end; else do; code=X; end; run;
My fault, when i started writing the solution i renamed "x" to "code" and forgot to fully undo that change before posting, so just replace "code" with "x" in the code i posted.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thanks very much. I see what you mean now. That corrects it in one step if I just use x as the variable:
if anydigit(x) then x=substr(x, verify(x, &prefixes));
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Or I can just write an "if then else" without the do to shorten it:
data WANT; set HAVE; if anydigit(x) then code=substr(x, verify(x, &prefixes)); else code=X; run;