BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
chinaski
Obsidian | Level 7

I want to Find the Length of a numeric expression in a string which has also alphabetic character.

For example i want to extract product code from product definition and i need the length of it. My product definition is like this:

"BRANDNAME YHR4084L NEW LINE SERIES TEA MACHINE"

from this i want the length of YHR4084L, so i need a function like anyalpha, or anydigit and i also need it to stop extracting rest of it.

substr(product_definition,ANYDIGIT(PRODUCT_NAME)) as kod2,

when i use this code it extracts "4084L NEW LINE SERIES TEA MACHINE" but i need YHR4084L. And this product definitions vary so i need something that detects the end of mixed character (numeric expression in a string which has also alphabetic character).

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

@chinaski wrote:

I want to Find the Length of a numeric expression in a string which has also alphabetic character.

For example i want to extract product code from product definition and i need the length of it. My product definition is like this:

"BRANDNAME YHR4084L NEW LINE SERIES TEA MACHINE"

from this i want the length of YHR4084L, so i need a function like anyalpha, or anydigit and i also need it to stop extracting rest of it.

substr(product_definition,ANYDIGIT(PRODUCT_NAME)) as kod2,

when i use this code it extracts "4084L NEW LINE SERIES TEA MACHINE" but i need YHR4084L. And this product definitions vary so i need something that detects the end of mixed character (numeric expression in a string which has also alphabetic character).


From the documentation:

ANYDIGIT Function

Searches a character string for a digit, and returns the first position at which the digit is found.

 

Which indicates that since your desired "word" does not start with a digit that you want a different approach.

If there is going to be only one of these words in a given entry then extract word by word and report the first with any digits. If there are multiples then you need to provide better examples and how we know which output you want.

 

One possible:

data example;
   productname="BRANDNAME YHR4084L NEW LINE SERIES TEA MACHINE";
   do i=1 to countw(productname);
      if anydigit(scan(productname,i))>0 then do;
         kod2= scan(productname,i);
         leave;
      end;
   end;
   drop i;
run;

Make sure to define the length of your KOD2 variable before use.

This finds and returns the first word with any digit in it. The LEAVE instruction says to quit the do loop when found (how the first if multiple words with digits exist).

View solution in original post

5 REPLIES 5
Shmuel
Garnet | Level 18

You can check each substring in the input string does it contains digits, something like:

do i=1 to countw(<input string>);
  word = scan(<input string>,i);
  if compress(word,,'kd') ne ' ' then leave;
end;
putlog word=;
       
ballardw
Super User

@chinaski wrote:

I want to Find the Length of a numeric expression in a string which has also alphabetic character.

For example i want to extract product code from product definition and i need the length of it. My product definition is like this:

"BRANDNAME YHR4084L NEW LINE SERIES TEA MACHINE"

from this i want the length of YHR4084L, so i need a function like anyalpha, or anydigit and i also need it to stop extracting rest of it.

substr(product_definition,ANYDIGIT(PRODUCT_NAME)) as kod2,

when i use this code it extracts "4084L NEW LINE SERIES TEA MACHINE" but i need YHR4084L. And this product definitions vary so i need something that detects the end of mixed character (numeric expression in a string which has also alphabetic character).


From the documentation:

ANYDIGIT Function

Searches a character string for a digit, and returns the first position at which the digit is found.

 

Which indicates that since your desired "word" does not start with a digit that you want a different approach.

If there is going to be only one of these words in a given entry then extract word by word and report the first with any digits. If there are multiples then you need to provide better examples and how we know which output you want.

 

One possible:

data example;
   productname="BRANDNAME YHR4084L NEW LINE SERIES TEA MACHINE";
   do i=1 to countw(productname);
      if anydigit(scan(productname,i))>0 then do;
         kod2= scan(productname,i);
         leave;
      end;
   end;
   drop i;
run;

Make sure to define the length of your KOD2 variable before use.

This finds and returns the first word with any digit in it. The LEAVE instruction says to quit the do loop when found (how the first if multiple words with digits exist).

andreas_lds
Jade | Level 19

If your want the length only, and not the product code, try:

data want_b;
   set have;
   
   d = anydigit(product_definition);
   s = anyspace(product_definition, -1*d);
   e = anyspace(product_definition, d);
   code_length = e - s - 1;
run;

If you want the product code, too, just add substr using s as second parameter and code_length as third.

Kurt_Bremser
Super User

I see an issue here that will come to bite you in your behind some time in the future.

"4084L NEW LINE SERIES TEA MACHINE" should be turned into YHR4084L.

But what if the manufacturer adds a code YMR4084L (or similar)? Or already has it?

Patrick
Opal | Level 21

@chinaski 

"BRANDNAME YHR4084L NEW LINE SERIES TEA MACHINE"

In your real data can you define any pattern that would let us identify the product code in a string. Like: Is it always the 2nd word or does it always have 3 characters followed by some digits, ...or any other pattern.

 

Just looking for digits might not work given that there are alphanumeric brand names like Toys4Kids, 3M, 7 Up, etc.

 

 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 449 views
  • 3 likes
  • 6 in conversation