Desktop productivity for business analysts and programmers

SCAN(SUBSTRING)

Accepted Solution Solved
Reply
Contributor
Posts: 59
Accepted Solution

SCAN(SUBSTRING)

How do I write a function that will search for a character within a string then pull the 6 digits after the character including the character?

Ex:

0000M123450000

11M12345898989

222222m1234500

0M12345547

Each data line is different in length, but the output (work) should be

M12345

M12345

M12345

M12345

Keep in mind the character can be lowercase or uppercase.

In this situation the value is "M" and I'm looking for 6 digits including the M.

Thanks in advance.....



Accepted Solutions
Solution
‎04-15-2014 01:38 PM
Super User
Posts: 22,875

Re: SCAN(SUBSTRING)

Posted in reply to Dsrountree

Use a combination of find and substring.

You may need to play with the -1 and or 6 to get what you want, but this should be close (untested).

There's also Perl Expressions, but I avoid those Smiley Happy

WANT = substr(word,findc(word, "M", 'TI')-1, 6);

View solution in original post


All Replies
Valued Guide
Posts: 2,190

Re: SCAN(SUBSTRING)

Posted in reply to Dsrountree

Substr( inputstring, verify( inputstring, '012345679''), 6)

That VERIFY() function returns the position of the first character of the first parameter that is not in the second parameter. So in the example above it points to any non-numeric which means that M.

it is just about the most efficient of these string hunting functiins

(I am told)

Valued Guide
Posts: 2,190

Re: SCAN(SUBSTRING)

It shouldn't need conditional logic like IF or CASE when you are using FINDc or VERIFY.

Just extend the strings like

substr( inputstring !! "      ", verify( inputstring !! "M", '1234567890'), 6 )

Or,

Substr( t1.'PACKAGE ID'n !! '          ', findc(  t1.'PACKAGE ID'n !! 'M', 'ti' ), 9 )

by appending enough blanks and an M to the input string you can guarantee therr enough characters to extract from   and an M will always be found in the dearch    even if it is after the original 'package ID'

Nor should you need that -0 .

Was it 9 or 6 characters you wanted to extract?

Solution
‎04-15-2014 01:38 PM
Super User
Posts: 22,875

Re: SCAN(SUBSTRING)

Posted in reply to Dsrountree

Use a combination of find and substring.

You may need to play with the -1 and or 6 to get what you want, but this should be close (untested).

There's also Perl Expressions, but I avoid those Smiley Happy

WANT = substr(word,findc(word, "M", 'TI')-1, 6);

Contributor
Posts: 59

Re: SCAN(SUBSTRING)


Adjusting the value to -0 will account for the first position.

substr(t1.'PACKAGE ID'n,findc(t1.'PACKAGE ID'n, "M", 'TI')-0, 9)

Thanks Everyone - very Helpful Smiley Wink

Super User
Posts: 10,626

Re: SCAN(SUBSTRING)

But I like Perl Regular Expression.

For every pure programmer like Java , C++ ,they all use it to process string type variables . I recommend you to learn it, you will get a powerful tool .

data have;
input x : $40.;
cards;
0000M123450000
11M12345898989
222222m1234500
0M12345547
;
run;
data want;
 set have;
 xx=prxchange('s/.*([Mm]\d{5}).*/$1/',-1,x);
run;

Xia Keshan

Contributor
Posts: 59

Re: SCAN(SUBSTRING)

Thanks again Gentlemen - i love learning new methods to parse data...

Keshan - I'll take a look at it, but the last time I touched Java was back in my college days.

Keep in mind - I'm new to this SAS thing, but love to learn..............

Super User
Posts: 10,626

Re: SCAN(SUBSTRING)

Posted in reply to Dsrountree

You are going to love Perl Regular Expression.

Super Contributor
Posts: 578

Re: SCAN(SUBSTRING)

Posted in reply to Dsrountree

newcol=upper(substring(oldcol,find(oldcol,'m',i),6));

Super User
Super User
Posts: 7,860

Re: SCAN(SUBSTRING)

Posted in reply to Dsrountree

You might need to use IF/THEN logic to handle cases that do not meet the pattern.

For this simple case of only one letter you could use the INDEXC() function to handle matching case.

loc = indexc(STRING,'mM');

if 0 < loc <= length(string)-5 then NEW = substr(STRING,loc,6) ;

Super Contributor
Posts: 275

Re: SCAN(SUBSTRING)

Posted in reply to Dsrountree

It seems that 'M/m' is only alpha character in string, so:

substr(string,anyalpha(string),6)

Super User
Posts: 6,544

Re: SCAN(SUBSTRING)

I hate to add to answers that are already plentiful and working, but this approach seemed interesting:

length newvar $ 6;

newvar = 'M' || scan(word, 2, 'Mm');

It will, however, return "M" with no digits following if the incoming word doesn't contain any M/M.

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 11 replies
  • 560 views
  • 6 likes
  • 8 in conversation