Removing occurring

librasonali · Posted 12-13-2020 08:09 AM

Hello Everyone,

Please find below the input and the desired output data, where repeated/duplicate letters are removed. For a few records this can be done via tranwrd, but how can it be done for a huge data set? Thanks in advance!

	input data
Field_Id	Name
1	Josse
2	Aangeyy
3	Suusa
4	Henryy
5	Johnn
6	Joose

	Output data
Field_Id	Name
1	Jose
2	Angey
3	Susa
4	Henry
5	John
6	Jose

PaigeMiller · Posted 12-13-2020 09:06 AM

What to do when there are legitimate repeated letters in a name, such as the name Matthew?

--
Paige Miller

andreas_lds · Posted 12-13-2020 12:41 PM

The whole task looks like homework, because in a real-world scenario there are names with legitimately repeated chars, like Otto or Lynn.

A regular expression in prxchange can solve the problem:

data want;
	set have;
	Name = prxchange('s/(.)\1+/$1/', -1, Name);
run;

Maybe the expression needs to be explained 😉

(.) matches any single char; the parentheses create a capture group.

\1+ matches one or more further occurrences of whatever the first capture group matched.

$1 returns the first capture group, i.e. one of the duplicated chars.

-1 repeats the replacement until no more matches are found.
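To illustrate the role of the -1 argument, here is a minimal sketch that runs the same expression once with a count of 1 and once with -1. The value 'Suusaa' and the variable names once and all are made up for this illustration and do not come from the original data.

```sas
data _null_;
	length once all $20;
	/* hypothetical test value 'Suusaa': two runs of repeated chars (uu, aa) */
	/* times = 1: only the first run (uu) is collapsed */
	once = prxchange('s/(.)\1+/$1/', 1, 'Suusaa');
	/* times = -1: every run is collapsed */
	all = prxchange('s/(.)\1+/$1/', -1, 'Suusaa');
	put once= all=;
run;
```

With a count of 1 the trailing aa should stay in place (Susaa), while -1 should collapse both runs (Susa).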

hhinohar · Posted 12-14-2020 12:16 AM

Hi @andreas_lds

It doesn't give the correct output. Didn't you mean something like this?

data want;
	set have;
	*lowercase the string, collapse the repeated characters, and finally convert the result to proper case;
	Name = propcase(prxchange('s/(.)\1+/$1/', -1, lowcase(Name)));
run;

andreas_lds · Posted 12-14-2020 01:27 AM

@hhinohar: Seems that I forgot to add the "i" option to the regular expression, which makes the match case-insensitive:

data want;
   set have;
   Name = prxchange('s/(.)\1+/$1/i', -1, Name);
run;
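For anyone who wants to try this, here is a sketch that rebuilds the sample data from the question and compares the expression with and without the "i" option. The data set layout (comma-delimited datalines, $20 length) and the variable names Name_cs and Name_ci are assumptions made for this illustration only.

```sas
/* sample data from the original question */
data have;
	infile datalines dlm=',';
	input Field_Id Name :$20.;
	datalines;
1,Josse
2,Aangeyy
3,Suusa
4,Henryy
5,Johnn
6,Joose
;
run;

data want;
	set have;
	length Name_cs Name_ci $20;
	/* case-sensitive: "A" and "a" are treated as different chars */
	Name_cs = prxchange('s/(.)\1+/$1/', -1, strip(Name));
	/* case-insensitive: "Aa" counts as a repeat, and the first char is kept */
	Name_ci = prxchange('s/(.)\1+/$1/i', -1, strip(Name));
run;

proc print data=want noobs;
run;
```

The two columns should differ only for Aangeyy: the case-sensitive version leaves Aangey, the case-insensitive version gives Angey.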

PeterClemmensen · Posted 12-13-2020 03:05 PM

Read "Removing repeated characters in SAS strings" by @LeonidBatkhan.

