BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
a_ramos
Obsidian | Level 7

I would like to know how I can count distinct characters in a string

example:

 

NAME   PHONE   CHAR_DISTINCT

Ana       21212      2

Pablo    234234    3

Hana    1111111    1

1 ACCEPTED SOLUTION

Accepted Solutions
PeterClemmensen
Tourmaline | Level 20

Or using PRXCHANGE Function

 

data want;
   set have;
   array t(100) $  _temporary_;
   call missing(of t(*));
   do _n_=1 to length(PHONE);
      t(_n_)=char(PHONE,_n_);
   end;
   call sortc(of t(*));
   CHAR_DISTINCT=length(prxchange('s/([0-9])\1+/$1/', -1, cats(of t(*))));
run;

View solution in original post

7 REPLIES 7
PeterClemmensen
Tourmaline | Level 20
data have;
input NAME $ PHONE $;
datalines;
Ana 21212
Pablo 234234
Hana 1111111
;

data want(keep=NAME PHONE CHAR_DISTINCT);
   set have;
   newphone=substr(PHONE, 1, 1);
   do i=2 to length(PHONE);
      found=find(newphone, substr(PHONE, i, 1), 'it');
      if found=0 then newphone=cats(newphone, substr(PHONE, i, 1));
   end;
   CHAR_DISTINCT=length(newphone);
run;
PeterClemmensen
Tourmaline | Level 20

Or using PRXCHANGE Function

 

data want;
   set have;
   array t(100) $  _temporary_;
   call missing(of t(*));
   do _n_=1 to length(PHONE);
      t(_n_)=char(PHONE,_n_);
   end;
   call sortc(of t(*));
   CHAR_DISTINCT=length(prxchange('s/([0-9])\1+/$1/', -1, cats(of t(*))));
run;
a_ramos
Obsidian | Level 7

Thank you! 🙂
Kurt_Bremser
Super User

Just for completeness, a solution with SQL.

data have;
input NAME $ PHONE $;
datalines;
Ana 21212
Pablo 234234
Hana 1111111
;
run;

data trans;
set have;
length digit $1;
do i = 1 to length(phone);
  digit = substr(phone,i,1);
  output;
end;
drop i;
run;

proc sql;
create table want as
select
  name,
  phone,
  count(distinct digit) as char_distinct
from trans
group by name, phone;
quit;

proc print data=want noobs;
run;

Result:

                      char_
NAME      PHONE     distinct

Ana      21212          2   
Hana     1111111        1   
Pablo    234234         3   

@PeterClemmensen's first solution is the best, as it solves everything in one sequential pass through have.

Ksharp
Super User

Just for completeness, a solution with Hash Table.

 

data have;
input NAME $ PHONE $;
datalines;
Ana 21212
Pablo 234234
Hana 1111111
;
run;
data want;
 if _n_=1 then do;
  length k $ 1;
  declare hash h();
  h.definekey('k');
  h.definedone();
 end;
set have;
 do i = 1 to length(phone);
  k = char(phone,i);
  h.ref(); 
end;
count=h.num_items;
h.clear();
drop k i;
run;
KachiM
Rhodochrosite | Level 12

Here is another version. Look at the first character of PHONE. Count it. Compress the first character from the Phone. Repeat the above step until the phone has left with one or no character.

data have;
input NAME $ PHONE $;
datalines;
Ana 21212
Pablo 234234
Hana 1111111
;
run;

data want;
   set have;
   ph = phone;
   count = 0;
   do until(len <= 1);
      d = char(ph,1);
      count + 1;
      ph = compress(ph,d);
      len = length(ph);
   end;
drop ph len d;
run;

Obs 	NAME 	PHONE 	count
1 	Ana 	21212 	2
2 	Pablo 	234234 	3
3 	Hana 	1111111 	1
FreelanceReinh
Jade | Level 19

Note that among the previous solutions only @Kurt_Bremser's produces a count of zero in the case of an empty input string. Of course, you could adapt the other solutions correspondingly if needed. For example, using COMPRESS (like in @KachiM's solution):

data want(drop=_t);
set have;
_t=phone;
do CHAR_DISTINCT=0 by 1 while(_t ne ' ');
  _t=compress(_t,first(_t));
end;
run;

To obtain CHAR_DISTINCT=1 for missing PHONE replace the DO statement with:

do CHAR_DISTINCT=1 by 1 until(_t=' ');

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 7 replies
  • 4769 views
  • 13 likes
  • 6 in conversation