Hi! I have a string of digits that is 18 characters long. I need to count how many times the first digit repeats consecutively, starting from the beginning of the string. That is, given substr(myString,1,1), how many consecutive copies of that character follow it in the rest of the string, substr(myString,2,17)?
Example:
'00000' should output 4 'coz there are 4 0s following the one at the beginning of the string
'100000' should output 0 'coz there are no following 1s after the first one
'1100000222' should output 1 'coz there is one following 1 after the first one
'02200000' should output 0 'coz there are no 0s following the one at the beginning of the string
'002200000' should output 1 'coz there is one 0 following the one at the beginning of the string
Can anyone help me with this? I'm using SAS EG 7.12.
Useful functions: CHAR() and COUNTC().
This may help you get started:
data have;
input string $18.;
cards;
00000
100000
1100000222
02200000
002200000
;
run;
data want;
set have;
count = 0;
do while (substr(string,count+2,1) = substr(string,1,1));
count + 1;
end;
run;
It can get tricky if you have 18 identical digits. Here's a way to work with that:
data want;
set have;
length firstchar $ 1;
firstchar = myString;
count=0;
do i=2 to 18;
if substr(myString, i, 1) = firstchar then count+1;
else i=20; /* could also try leave instead */
end;
drop firstchar i;
run;
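The comment in the code above mentions LEAVE as an alternative to pushing i past the loop bound. A minimal sketch of that variant follows. The small test data set is hypothetical and only there to make the example runnable; the variable name myString and the fixed length of 18 follow the original question.

```sas
/* hypothetical test data; myString is the name used in the question */
data have;
   input myString $18.;
   cards;
00000
1100000222
111111111111111111
;
run;

data want;
   set have;
   length firstchar $ 1;
   firstchar = myString;     /* only the first character is stored */
   count = 0;
   do i = 2 to 18;
      /* stop at the first character that differs from the leading one */
      if substr(myString, i, 1) ne firstchar then leave;
      count + 1;
   end;
   drop firstchar i;
run;
```

Because the loop index never exceeds 18, substr() is never asked for a position beyond the variable's length, so 18 identical digits should give a count of 17 without an invalid-argument note.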
Since both the substr() and the char() functions simply return a blank when the index lies outside the size of the string variable, a complete sequence of identical digits poses no problem for my code; run this for a test:
data have;
input string $18.;
cards;
00000
100000
1100000222
02200000
002200000
111111111111111111
;
run;
data want;
set have;
count = 0;
do while (char(string,count+2) = char(string,1));
count + 1;
end;
run;
The result of the 18-digit string is 17, as (IMO) intended by the OP.
Kurt,
I didn't test CHAR, but SUBSTR can have problems.
4 data test1;
5 test='12345';
6 test2 = substr(test, 6, 1);
7 put _all_;
8 run;
NOTE: Invalid second argument to function SUBSTR at line 6 column 9.
test=12345 test2= _ERROR_=1 _N_=1
test=12345 test2= _ERROR_=1 _N_=1
NOTE: The data set WORK.TEST1 has 1 observations and 2 variables.
@Astounding you're right, char() is more robust than substr() when the index runs past the end of the string.
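For comparison with the SUBSTR log above, here is a small sketch of the same out-of-range test using CHAR. It relies on the documented behaviour that CHAR returns a blank when the position is greater than the length of the string; the variable names simply mirror the SUBSTR test.

```sas
data _null_;
   test = '12345';
   /* position 6 is past the end of the 5-character variable */
   test2 = char(test, 6);
   put test= test2=;
run;
```

If CHAR behaves as documented, test2 comes back blank and the log should not show the "Invalid second argument" note.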
Thank you so much for your help! This worked fine. No performance issue.
A purely function-oriented (i.e. non-looping) version:
data have;
input string $18.;
cards;
00000
100000
1100000222
02200000
002200000
111111111111111111
;
run;
data want;
set have;
num_before=lengthn(scan(string,1,char(string,1),"k"));
num_after=lengthn(string)-num_before;
run;
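It simply passes the first character to scan() with the "k" modifier, so that every other character is treated as a delimiter, and takes the length of the first word returned. Note that num_before includes the leading character itself, so the count asked for is num_before-1.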
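As a sketch of how the scan() approach maps onto the output the original question asks for, the run length can be reduced by one to exclude the leading character. The names run_length and count are illustrative, not part of the post above, and the test data is a hypothetical subset of the thread's examples.

```sas
/* hypothetical test data, same layout as in the thread */
data have;
   input string $18.;
   cards;
00000
02200000
111111111111111111
;
run;

data want;
   set have;
   /* "k" makes every character other than the first one a delimiter,
      so the first word is the leading run of identical characters */
   run_length = lengthn(scan(string, 1, char(string, 1), "k"));
   count = run_length - 1;   /* exclude the leading character itself */
   drop run_length;
run;
```

For these three rows the counts should be 4, 0 and 17, matching the loop-based answers.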
data have;
input string $18.;
cards;
00000
100000
1100000222
02200000
002200000
;
run;
data want;
set have;
pid=prxparse('/^(\d)\1*/');
call prxsubstr(pid,string,p,l);
want=l-1;
drop pid p l;
run;
This one works fine as well. Thanks!