OS2Rules
Obsidian | Level 7
Hi All:

I'm having a huge brain cramp today.

I want to remove any repeated characters in a character string. I've gone through all the character functions in the doc and I can't find anything to do what I want.

For example:

change: YYYMMDD
to: YMD

Sounds simple, but I just can't seem to wrap my head around this one.

Thanks in advance.
sbb
Lapis Lazuli | Level 10
This macro should work for you:

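%macro strnodup(strvar);
  /* Collapse runs of adjacent repeated characters in &strvar, in place. */
  drop __i;
  __i = 1;
  /* DO WHILE skips the loop entirely for one-character values. */
  do while(__i < length(&strvar));
    if substr(&strvar, __i, 1) = substr(&strvar, __i + 1, 1) then
      /* Drop the character at __i+1; SUBSTRN tolerates a start past the end. */
      &strvar = substr(&strvar, 1, __i) || substrn(&strvar, __i + 2);
    else __i + 1;  /* advance only when no duplicate was removed */
  end;
%mend strnodup;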

data _null_;
retain str "YYYMMDD";
str_old = str;
%strnodup(str);
putlog _all_;
run;


Scott Barry
SBBWorks, Inc.
deleted_user
Not applicable
Repeated characters:
change: "YMYDD"
to: ?
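
The thread does not settle this case. If the goal were to drop every later occurrence of a character, adjacent or not, one possible sketch follows. The variable names `result`, `c`, and `i` are made up for illustration, the comparison is case-sensitive, and blanks are dropped as a side effect of CATS:

```sas
data _null_;
  length str result $ 20 c $ 1;
  str = 'YMYDD';
  result = '';
  do i = 1 to length(str);
    c = substr(str, i, 1);
    /* FIND returns 0 when c has not been kept yet, so only first occurrences are appended */
    if find(result, c) = 0 then result = cats(result, c);
  end;
  put str= result=;
run;
```

Under that reading, YMYDD would become YMD. The adjacent-only approaches in the replies would leave the second Y in place and give YMYD.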
Patrick
Opal | Level 21
HTH
Patrick

data _null_;
RId = prxparse('s/(.)\1+/$1/io');
text = 'YYYMMDD__YDDDM';
call prxchange(RId, -1, text);
put text;
run;
Peter_C
Rhodochrosite | Level 12
Just another way, using substr() in two ways (as source and as destination):

data ;
  set whatever ;
  do p = 2 to length( string ) while( p le length( string ) ) ;
    if substr( string, p, 1 ) = substr( string, p-1, 1 )
      then substr( string, p ) = substr( string, p+1 ) ;
  end ;
  drop p ;
run ;
PeterC

Sorry, on testing I found that this failed to handle triples. So here are some test data and a working approach:

data whatever ;
  /* longer than the longest test value, so nothing is truncated */
  length string $ 12 ;
  input string ;
  cards ;
qweeertyui
opassassaa
dfgggfggh
jklsxcvb
nm,.1234
5678tttt
ttfghj56
;
run ;

data ;
  set whatever ;
  do p = 2 to length( string ) while( p le length( string ) ) ;
    if substr( string, p, 1 ) = substr( string, p-1, 1 ) then
    do ;
      /* shift the tail left by one, overwriting the duplicate */
      substr( string, p ) = substr( string, p+1 ) ;
      /* step back so that position p is checked again on the next pass */
      p = p-1 ;
    end ;
  end ;
  drop p ;
run ;

17:04 11Oct08 BST


Message was edited by: Peter.C
az_rahman
Calcite | Level 5

Hi Patrick,

I managed to use your prxchange and prxparse within my proc sql query and it works like a dream. Just wanted to say thank you for this.

Would you mind explaining the arguments for prxparse('s/(.)\1+/$1/io')?
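
For readers with the same question, here is a reading of the pattern based on standard Perl regular-expression syntax, not on anything Patrick states in the thread. `s/.../.../` is a substitution. `(.)` captures any single character. `\1+` matches one or more further copies of that captured character. `$1` replaces the whole run with a single copy. The trailing `i` makes matching case-insensitive, and `o` asks SAS to compile the pattern only once. A minimal sketch using the function form of PRXCHANGE, with the `i` and `o` options left out and a hypothetical variable `collapsed`:

```sas
data _null_;
  length text collapsed $ 20;
  text = 'YYYMMDD__YDDDM';
  /* -1 means: replace every match in the string, not just the first one */
  collapsed = prxchange('s/(.)\1+/$1/', -1, trim(text));
  put text= collapsed=;
run;
```

Each run of identical adjacent characters should collapse to one character, so the sample value would become YMD_YDM. With the `i` option kept, a mixed-case pair such as Yy would presumably also be treated as a repeat, which may or may not be wanted.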

LeonidBatkhan
Lapis Lazuli | Level 10

DISCLOSURE: I realize that this is not a timely reply, but since this post comes up in Google searches for "remove duplicate characters in a SAS string", I have decided to let readers who find it that way know about a new development:

I recently published a blog post, "Removing repeated characters in SAS strings", in which I create a new user-defined function, UNDUPC, that removes duplicate characters from SAS strings, effectively extending the functionality of the COMPBL function to all other characters.
