Cecillia_Mao
Obsidian | Level 7

Hi everybody. I'm confused by the n and d modifiers in the COUNTW function. Would someone explain a little bit? Any thoughts would be appreciated!
For the COUNTW function:
1. d or D adds digits to the list of characters.
2. n or N adds digits, an underscore, and English letters (that is, the characters that can appear after the first character in a SAS variable name using VALIDVARNAME=V7) to the list of characters.
Thus, for the code below, I think the result should be:

obs    d    n
  1    3    4
  2    1    5

data test;
    input string $char60.;
    datalines;
'_'2+2=4
\\"Windows"  "\Path\Names\Use\Backslashes
;
run;
data test1;
    set test;
    d=countw(string,,'d');
    n=countw(string,,'n');
run;
proc print data=test1;
run;

However, the result is different, as shown in the picture attached to the original post.

1 ACCEPTED SOLUTION

Quentin
Super User

Hi,

 

Remember that character variables are padded with trailing blanks. When you specify delimiters and do not include a blank among them, the run of trailing blanks counts as a word. You can use the TRIMN function to remove trailing blanks.

The step below writes the result you expect to the log:

data test1;
    set test;
    d=countw(trimn(string),,'d');
    n=countw(trimn(string),,'n');

    put (string d n)(=) ;
run;

string='_'2+2=4 d=3 n=4
string=\\"Windows"  "\Path\Names\Use\Backslashes d=1 n=5
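
To see where the extra word comes from, the padded and the trimmed value can be counted side by side. This is only a minimal sketch that assumes the TEST data set from the question; the data set name COMPARE and the variables len, d_raw, d_trim, n_raw, and n_trim are made up for illustration. The padding adds a word only when the last non-blank character is itself a delimiter, because otherwise the trailing blanks simply extend the final word.

```sas
data compare;
    set test;
    len    = lengthn(string);              /* length without trailing blanks */
    d_raw  = countw(string,,'d');          /* padded value: trailing blanks may form a word */
    d_trim = countw(trimn(string),,'d');   /* trimmed value, as in the step above */
    n_raw  = countw(string,,'n');
    n_trim = countw(trimn(string),,'n');
    put (len d_raw d_trim n_raw n_trim)(=);
run;
```

For the second record the two 'd' counts should agree, since that value contains no digits and the blanks just become part of its single word.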
BASUG is hosting free webinars ! Check out recordings of our past webinars: https://www.basug.org/videos. Save the date for our in person SAS Blowout on Oct 18 in Cambridge, MA. Registration opens in September.


10 REPLIES
Cecillia_Mao
Obsidian | Level 7
Dear Quentin, that totally makes sense! Thanks so much! But I do not quite understand why the value with the shortest length (in this case, string 1) is also padded with trailing blanks?
Quentin
Super User

When you read the data into SAS, you specified that the variable STRING was 60 characters long. So if the value assigned to string is less than 60 characters long, it will be padded with blanks.

 

If you have a variable that is one character long, and you put one character in it, it is not padded with blanks.

 

The example below shows this: String1 has a length of one character and String2 has a length of two characters, so a one-character value in String2 gets one trailing blank.

 

data test;
    input string1 : $1. string2 : $2.;
    datalines;
A 1
1 11
;
run;
data test1;
    set test;
    d1=countw(string1,,'d');
    d2=countw(string2,,'d');
    put string1= d1= string2= d2=;
run;
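
One way to make the padding visible is to compare the declared length of each variable with the length of the value it holds. The step below is a rough sketch that assumes the TEST data set created just above; the variable names alloc1, alloc2, used1, and used2 are invented for this illustration.

```sas
data _null_;
    set test;
    alloc1 = vlength(string1);   /* declared storage length of string1 */
    alloc2 = vlength(string2);   /* declared storage length of string2 */
    used1  = lengthn(string1);   /* length of the value without trailing blanks */
    used2  = lengthn(string2);
    put string1= alloc1= used1= string2= alloc2= used2=;
run;
```

Whenever the used length is smaller than the declared length, the difference is filled with blanks, and those blanks are what COUNTW sees.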
Cecillia_Mao
Obsidian | Level 7
Totally understood! Thanks so much, Quentin, I really appreciate it!
Astounding
PROC Star

I think the questions you have are actually basic ones:  What is a word?  What is a delimiter?  The answers affect how COUNTW counts, and the number of words that it finds.

 

For example, in the data lines that you illustrated, should "+" be a delimiter, or should it be a character within a word?  In the default delimiter list, it is a delimiter.

 

Assuming you are working on an ASCII-based system, these characters are delimiters rather than part of a word:

 

blank ! $ % & ( ) * + , - . / ; < ^ |

 

You have control over that, however.  Using the "n" or "d" modifiers (and there are additional choices as well) switches the meaning of some characters.  They now become delimiters, rather than being characters within a word.

Cecillia_Mao
Obsidian | Level 7
Dear Astounding, thanks so much for your reply! Based on my understanding, '+' is one of the default delimiters on an ASCII-based system, but if I specify a modifier (say d or D, which adds digits to the list of characters in this case), then it overrides the default, and '+' becomes part of a word. Please feel free to correct me if I have misunderstood.
Astounding
PROC Star

You're getting closer.

 

Adding "n" or "d" does not change the default set of delimiters.  It modifies the default set by adding characters that also function as delimiters.  So "+" remains a delimiter when you add "n" or "d".  If you want to replace the default set of delimiters, you can do that.  You would need to add the delimiters that you want as the second parameter, between what are now the two consecutive commas.

Cecillia_Mao
Obsidian | Level 7
Dear Astounding, Thanks so much! I think FreelanceReinhard added some thoughts regarding delimiters below. In my case, the default blank was overridden and counted as a word.
FreelanceReinh
Jade | Level 19

Hi @Cecillia_Mao,

 

I think the default delimiters are not applied if more than one argument is used (see documentation). This includes the case of an empty second argument as in your examples (or in countw(string,,), where the third argument is also empty, or in countw(string,), where the third argument is not used).

That is, with two or three arguments (empty or not) the list of delimiters is constructed from the second argument and, if applicable, modified by the third argument. In particular, an empty second argument means an empty list of delimiters (i.e., no character serves as a delimiter) until the modification by the third argument, if any, is applied. So, in your examples the digits ('d') or "name characters" ('n'), respectively, are added to the empty list (not to the list of default delimiters), i.e., they are eventually the only delimiters.

To add them to the list of default delimiters you would need to specify the default delimiters explicitly in the second argument, e.g., d=countw(string,' !$%&()*+,-./;<^|','d') in an ASCII environment (resulting in different values for variable d than with the empty second argument).
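
The three ways of calling COUNTW described here can be compared on the first value from the question. This is a sketch only, assuming an ASCII environment; the variable names s, w_default, w_d_only, and w_both are hypothetical, and s is declared with exactly the length of its value so that no trailing blanks are involved.

```sas
data _null_;
    length s $8;                                     /* same length as the value: no padding */
    s = "'_'2+2=4";
    w_default = countw(s);                           /* one argument: default delimiters */
    w_d_only  = countw(s,,'d');                      /* empty list plus 'd': digits only */
    w_both    = countw(s,' !$%&()*+,-./;<^|','d');   /* explicit ASCII defaults plus digits */
    put w_default= w_d_only= w_both=;
run;
```

Here w_d_only corresponds to the d=3 reported for this value in the accepted solution, while w_both treats "+" as a delimiter in addition to the digits.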

Cecillia_Mao
Obsidian | Level 7
Got it! Thanks a ton!
