Cecillia_Mao
Obsidian | Level 7

Hi everybody. I'm confused by the n and d modifiers in the COUNTW function. Could someone explain them a little? Any thoughts would be appreciated!
For the COUNTW function:
1. d or D adds digits to the list of characters.
2. n or N adds digits, an underscore, and English letters (that is, the characters that can appear after the first character in a SAS variable name using VALIDVARNAME=V7) to the list of characters.
Thus, for the code below, I think the result should be:

obs    d    n
  1    3    4
  2    1    5

data test;
    input string $char60.;
    datalines;
'_'2+2=4
\\"Windows"  "\Path\Names\Use\Backslashes
;
run;
data test1;
    set test;
    d=countw(string,,'d');
    n=countw(string,,'n');
run;
proc print data=test1;
run;

However, the result is different, as shown in the picture below.

1 ACCEPTED SOLUTION

Accepted Solutions
Quentin
Super User

Hi,

 

Remember that character variables are padded with trailing blanks. When you specify the delimiters and do not include a blank as a delimiter, the trailing blanks count as a word. You can use the TRIMN function to remove trailing blanks.

The log below shows the result you expect:

data test1;
    set test;
    d=countw(trimn(string),,'d');
    n=countw(trimn(string),,'n');

    put (string d n)(=);
run;

string='_'2+2=4 d=3 n=4
string=\\"Windows"  "\Path\Names\Use\Backslashes d=1 n=5
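
For a side-by-side view of the effect of trailing blanks, here is a minimal sketch. The variable names d_padded, d_trimmed, and d_tmod are made up for illustration. The T modifier, which the COUNTW documentation describes as trimming trailing blanks from the string and chars arguments, is shown as a possible alternative to TRIMN; it is not used elsewhere in this thread.

```sas
data _null_;
    length string $60;                          /* same length as in the question */
    string = "'_'2+2=4";                        /* 8 characters, then blank padding */

    d_padded  = countw(string, , 'd');          /* blanks are not delimiters here */
    d_trimmed = countw(trimn(string), , 'd');   /* trailing blanks removed first */
    d_tmod    = countw(string, , 'dt');         /* T modifier: trim inside COUNTW */

    put d_padded= d_trimmed= d_tmod=;
run;
```

With digits as the only delimiters, the padded call should report one word more than the trimmed call (4 versus 3 for this value), because the run of blanks after the final 4 forms a word of its own. The T-modifier call is expected to agree with the TRIMN version.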
The Boston Area SAS Users Group is hosting free webinars!
Next webinar will be in January 2025. Until then, check out our archives: https://www.basug.org/videos. And be sure to subscribe to our email list.

Cecillia_Mao
Obsidian | Level 7
Dear Quentin, that totally makes sense! Thanks so much! But I do not quite understand why the value with the shortest length (in this case string 1) is also padded with trailing blanks.
Quentin
Super User

When you read the data into SAS, you specified that the variable STRING was 60 characters long. So if the value assigned to string is less than 60 characters long, it will be padded with blanks.

 

If you have a variable that is one character long, and you put one character in it, it is not padded with blanks.

The example below shows String1 with a length of one character and String2 with a length of two characters.

 

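data test;
    input string1 : $1. string2 : $2.;
    datalines;
A 1
1 11
;
run;
data test1;
    set test;
    /* digits are the only delimiters in both calls */
    d1=countw(string1,,'d');
    /* in row 1, string2 holds '1' plus one pad blank, and that blank counts as a word */
    d2=countw(string2,,'d');
    put string1= d1= string2= d2=;
run;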
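
To see the padding itself rather than its effect on COUNTW, a small sketch along these lines may help. It assumes the same $60 variable as in the original question; the names stored and visible are only illustrative.

```sas
data _null_;
    length string $60;
    string = "'_'2+2=4";

    stored  = lengthc(string);   /* declared length, trailing blanks included */
    visible = lengthn(string);   /* length without trailing blanks */

    put stored= visible=;
run;
```

LENGTHC should report 60 and LENGTHN 8 here. The difference is the blank padding that COUNTW sees as part of the string when blank is not a delimiter.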
Cecillia_Mao
Obsidian | Level 7
Totally understood! Thanks so much, Quentin, I really appreciate it!
Astounding
PROC Star

I think the questions you have are actually basic ones:  What is a word?  What is a delimiter?  The answers affect how COUNTW counts, and the number of words that it finds.

 

For example, in the data lines that you illustrated, should "+" be a delimiter, or should it be a character within a word?  By default, it is a character within a word.

 

Assuming you are working on an ASCII-based system, these characters are delimiters rather than part of a word:

 

blank ! $ % & ( ) * + , - . / ; < ^ |
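
As a rough illustration of this default list, the sketch below calls COUNTW with a single argument on the two strings from the question, assuming an ASCII system. The variable name w_default is made up for the example.

```sas
data _null_;
    input string $char60.;
    w_default = countw(string);     /* one argument only: default delimiters apply */
    put string= w_default=;
    datalines;
'_'2+2=4
\\"Windows"  "\Path\Names\Use\Backslashes
;
run;
```

Blank and + separate words here, while characters such as = and \ do not, so each of the two strings should come out as two words. The trailing blanks add nothing, because blank is itself a delimiter in this call.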

 

You have control over that, however.  Using the "n" or "d" modifiers (and there are additional choices as well) switches the meaning of some characters.  They now become delimiters, rather than being characters within a word.

Cecillia_Mao
Obsidian | Level 7
Dear Astounding, thanks so much for your reply! As I understand it, '+' is one of the default delimiters on an ASCII-based system, but if I specify a modifier (say d or D, which adds digits to the list of characters), it overrides the default, and '+' becomes part of a word. Please feel free to correct me if I have misunderstood.
Astounding
PROC Star

You're getting closer.

 

Adding "n" or "d" does not change the default set of delimiters.  It modifies the default set by adding characters that also function as delimiters.  So "+" remains a delimiter when you add "n" or "d".  If you want to replace the default set of delimiters, you can do that.  You would need to add the delimiters that you want as the second parameter, between what is now the two consecutive commas.

Cecillia_Mao
Obsidian | Level 7
Dear Astounding, thanks so much! I think FreelanceReinh added some thoughts regarding delimiters below. In my case, the default blank was overridden and counted as a word.
FreelanceReinh
Jade | Level 19

Hi @Cecillia_Mao,

 

I think the default delimiters are not applied if more than one argument is used (see documentation). This includes the case of an empty second argument as in your examples (or in countw(string,,) where the third argument is also empty, or in countw(string,) where the third argument is not used). That is, with two or three arguments (empty or not) the list of delimiters is constructed from the second argument and, if applicable, modified by the third argument.

In particular, an empty second argument means an empty list of delimiters (i.e. no character serves as a delimiter) until the modification by the third argument, if any, is applied. So, in your examples the digits ('d') or "name characters" ('n'), respectively, are added to the empty list (not to the list of default delimiters), i.e., they are eventually the only delimiters. To add them to the list of default delimiters you would need to specify the default delimiters explicitly in the second argument, e.g., d=countw(string,' !$%&()*+,-./;<^|','d') in an ASCII environment (resulting in different values for variable d than with the empty second argument).
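
A minimal sketch of that comparison, assuming an ASCII environment, could look like this. The variable names d_only and d_plus are hypothetical, and TRIMN is applied in both calls so that only the delimiter list differs.

```sas
data _null_;
    input string $char60.;

    /* empty second argument: digits are the only delimiters */
    d_only = countw(trimn(string), , 'd');

    /* default ASCII delimiters listed explicitly, then digits added by 'd' */
    d_plus = countw(trimn(string), ' !$%&()*+,-./;<^|', 'd');

    put string= d_only= d_plus=;
    datalines;
'_'2+2=4
\\"Windows"  "\Path\Names\Use\Backslashes
;
run;
```

For the first string, the first call should give 3 (the words '_', + and =) and the second call 2 ('_' and =), since + acts as a delimiter only when it is listed explicitly. For the second string the counts should be 1 and 2, because only the second call treats the embedded blanks as delimiters.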

Cecillia_Mao
Obsidian | Level 7
Got it! Thanks a ton!
