AlexMoreton
Obsidian | Level 7

Hi, I am trying to write code that counts every character in a string, but I keep getting 0 for my total count.

The input for a is read perfectly fine, but the total goes to 0 when I use countc to find the total number of characters in the string (it should be 8 in this case). Where am I going wrong?

 

Libname Alex;
Data count;
input a;
datalines;
12324564
;
run;
Data CountIt;
set count;
total = countc(a , "*");
proc print data=CountIt ;
run;

 

Also, I would like to be able to make a string that contains letters, numbers, and special characters like ><.

However, when I do this, the proc print doesn't even show values for a. Does anyone have any idea why?

Thank you

 

Libname Alex;
Data count;
input a;
datalines;
4091254ik3jlqf/.
;
run;

 

 

 


12 REPLIES
RW9
Diamond | Level 26

I will start with point 2, this:

Data count;
input a;

 

This tells SAS to read the variable a as a number, so anything that is not numeric will not be stored (the value becomes missing). Add a $ to read it as character:

Data count;
input a $; 

 

On your other point, I am not sure what you want out. If it is just the number of characters in the string, then:

data want;
  set count;
  total=lengthn(a);
run;

Do note how I use the code window (it's the {i} above the post), keep consistent formatting and indentation, and finish all code blocks with the appropriate statement (run; in this case).

AlexMoreton
Obsidian | Level 7

RW9, thank you again for your help!

The lengthn did the trick! However, the input a $; did not work, as the output basically returned . for "a". The lengthn function still worked but only counted the numbers, as it returned 12. I've made some adjustments based on the suggestions in this thread and it now looks like this. If possible, could you have a look and see where it is going wrong?

Thank you. (PS: I am trying to do an assignment that reads the total number of characters in a string from a data set that I call, so this would be a starting point for me. I just thought I'd try making a data set with a variable containing random numbers and characters and then count its total length.)

 

Libname Alex;
Data count;
Infile datalines truncover ;
input a ;
datalines;
12324564lkahrli3q2hi
;
run;
Data CountIt;
set count;
total = lengthn(a);
proc print data=CountIt ;
run;

Tom
Super User

I have no idea what you are trying to count, but your INPUT statements are trying to read in numbers, not strings.

If you want to read in the whole line into a single character variable then you want something more like this.

data count;
  infile datalines truncover ;
  input string $100.;
datalines;
12324564
;

So that will create a variable named STRING that is character with a length of 100 and read each line into a single observation.

If you want to know how long the string is you can use the LENGTH() or LENGTHN() function.  Both will ignore trailing spaces.

If you want to know how long the input line of text was, including any trailing spaces then use the LENGTH= option on the infile statement.  (Note that when reading from DATALINES (aka CARDS) the line length will always be a multiple of 80.)

data count;
  infile datalines truncover length=len1;
  input string $char100.;
  line_length=len1 ;
  string_length=length(string);
  string_length0 = lengthn(string);
datalines;
12324564

abcd 1234 xyz c
;
                           line_    string_    string_
Obs    string             length     length    length0

 1     12324564             80          8          8
 2                          80          1          0
 3     abcd 1234 xyz c      80         15         15

AlexMoreton
Obsidian | Level 7

Thank you Tom,

Is there a way of not setting a limit for the string? Like input string $max;

or something like that?

Lengthn is perfect! Thank you!

However, when I read the string with input a $, total=lengthn(a) only counted the numbers and left out the letters. When I try adding the $ sign inside the parentheses, an error shows up.

 

Libname Alex;
Data count;
input a $;
datalines;
12324564lkahrli3q2hi.
;
run;
Data CountIt;
set count;
total = lengthn(a);
proc print data=CountIt ;
run;

RW9
Diamond | Level 26

(Accepted solution)

Please update your code so that it reads correctly.  You are also missing a length statement, so SAS will default to a length of 8 characters:

data count;
  length a $100; 
  input a $;
datalines;
12324564lkahrli3q2hi.
;
run;

data countit;
  set count;
  total = lengthn(a);
run;
AlexMoreton
Obsidian | Level 7

Thank you RW9! it now works perfectly!

quick question, is it possible to not set a limit for length? 

like from length a $100; 

to length a $(max);

Something like that?

RW9
Diamond | Level 26

The question should be why.  Do you not know what the maximum length of the data will be?  What is the specification for the data? (You have one, right? That is, a document which defines the metadata or structure of the data: whether it is character or numeric, lengths, formats, etc.)

Reeza
Super User

@AlexMoreton wrote:

Thank you RW9! it now works perfectly!

quick question, is it possible to not set a limit for length? 

like from length a $100; 

to length a $(max);

Something like that?


That's a bad idea. You should have a max length; otherwise you're using a lot of storage space to save a variable that may only be 8 characters long.

 

For example, modifying a data set to trim the length of 3 character variables from $200 to $50 dropped my file size by 300MB. It's a big space and efficiency consideration. 
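
To illustrate the trade-off, here is a minimal sketch (an illustration only, not something posted in this thread). SAS has no "unlimited" character length; the largest length a character variable can have is 32767. The sketch assumes you accept that cap and uses the COMPRESS=CHAR data set option to reduce the space taken by the trailing blanks. The data set and variable names simply reuse the ones from the question.

```sas
/* Sketch: the widest possible character variable, stored in a compressed data set. */
/* Assumption: COMPRESS=CHAR is acceptable for your site and workload.              */
data count(compress=char);
  infile datalines truncover;
  length a $32767;         /* 32767 is the maximum length of a character variable */
  input a $char32767.;     /* read the whole line, keeping embedded blanks        */
  total = lengthn(a);      /* length without trailing blanks                      */
datalines;
12324564lkahrli3q2hi.
;
run;

proc print data=count;
  var total;               /* printing a itself would produce a very wide column */
run;
```

Whether compression actually saves space depends on the data, so check the note SAS writes to the log after the step runs. A realistic maximum length taken from the data specification is still the better choice.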

art297
Opal | Level 21

Not sure if this is what you are looking for, but you can always use an extra data step to identify the maximum length of your variable. The following code would capture the string (and its length) even if the string included a space or multiple spaces:

 

filename FT15F001 temp;
data _null_;
   infile FT15F001 end=eof;
   retain maxl;
   input;
   maxl=max(maxl,lengthn(_infile_));
   if eof then call symput('maxl',maxl);
   parmcards;
12324564lkahrl i3q2hi.
12324564lkahrli3q2hi.ab    cdefghijklmnop
1234
12324564lkahrli 3q2hi.abc    defghijklmnopqrstuvwxyz
abcd    efg
;;;;

data count;
  length a $&maxl.;
  input;
  total = lengthn(_infile_);
  a=_infile_;
  datalines;
12324564lkahrl i3q2hi.
12324564lkahrli3q2hi.ab    cdefghijklmnop
1234
12324564lkahrli 3q2hi.abc    defghijklmnopqrstuvwxyz
abcd    efg
;

Art, CEO, AnalystFinder.com

 

jmhorstman
Obsidian | Level 7

 

The problem is that you've told COUNTC to count the number of asterisks in the string, which is 0.  The purpose of the COUNTC function is to count the number of occurrences of a specific character (or characters).  If all you care about is the total number of characters, try the LENGTHN function instead:

 

     total = lengthn(a);

 

Also, I should mention that your first data step creates a numeric variable, not a character variable.  If you actually want a character variable, it would be preferable to explicitly create one:

     input a $;

 

Of course, the COUNTC function will do an automatic type conversion when you give it a numeric variable instead of a character variable.  Not a recommended practice, but it works.
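
As a small sketch of the difference described above (the variable names stars, fours and total are made up for illustration), the following step compares COUNTC with a specific character list against LENGTHN on the string from the question:

```sas
/* Sketch: COUNTC counts occurrences of the listed characters; */
/* LENGTHN returns the length of the string.                    */
data _null_;
  a = "12324564";
  stars = countc(a, "*");   /* no asterisks in the string -> 0   */
  fours = countc(a, "4");   /* the digit 4 appears twice  -> 2   */
  total = lengthn(a);       /* eight characters           -> 8   */
  put stars= fours= total=;
run;
```

The PUT statement writes the three values to the log, so you can see why the original code returned 0.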

AlexMoreton
Obsidian | Level 7

Thank you jmhorstman,

The lengthn trick works! But unfortunately the $ sign didn't seem to work... It instead made the observation of the string a " . "

Not sure what the problem is, because lengthn can still read the numbers fine... just not the letters... Also, the lengthn result came back as 12 when there are only 10. Is that because the alphabetic characters in between them act as a space?

Thank you

 

Libname Alex;
Data count;
Infile datalines truncover ;
input a $;
datalines;
12324564lkahrli3q2hi
;
run;
Data CountIt;
set count;
total = lengthn(a);
proc print data=CountIt ;
run;
