BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
tjsari
Fluorite | Level 6

I have a character field with length 3 in a data set. Some sample values are: 001, 234, 546,1, 25, 025.

I am using tabulate to identify valid and invalid values in that field. The problem is that this code is not differentiating between the value 001 and 1 or between 25 and 025.

 

proc format; 

   value $code (notsorted)

           " " = "blank"

           "000"-"999" = "3 digits";

run;

 

proc tabulate data=dataset1 missing;

   format code $code.;

   class code / preloadformat;

   table code, n / printmiss;

run; 

 

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

@tjsari wrote:

The variable "code" is character length 3


Length is a property, formats work on values.

 

Unfortunately when doing character comparisons '1' is greater than '000' because in the character by character comparison the 1 as the first character is greater than the 0 of 000.

 

You would have to list all of the values to apply the format two and not use a range: '000', '001', '002' ... '999'. Not impossible

 

data work.cntl;
   fmtname='Code';
   type='C';
   length label $ 10 start $ 3;
   start=''; label='blank';output;
   do i= 0 to 9;
      label='1 digit';
      start= put(i,z1.);
      output;
   end;
   do i= 0 to 99;
      label='2 digits';
      start= put(i,z2.);
      output;
   end;
   do i= 0 to 999;
      label='3 digits';
      start= put(i,z3.);
      output;
   end;
run;

proc format library=work cntlin=work.cntl;
run;

data junk;
   length x $ 3;
   x=''; put x $code.;
   x='1'; put x $code.;
   x='01';put x $code.;
   x='101';put x $code.;
   x='001';put x $code.;
run;

If you have ' 1' values, with an actual leading space then the above code will need something to create values with leading spaces.

 

View solution in original post

8 REPLIES 8
Tom
Super User Tom
Super User

Sounds like you have accidentally loaded the values into a numeric variable instead of a character variable.

Can you re-read the source data?

tjsari
Fluorite | Level 6

The variable "code" is character length 3

ballardw
Super User

@tjsari wrote:

The variable "code" is character length 3


Length is a property, formats work on values.

 

Unfortunately when doing character comparisons '1' is greater than '000' because in the character by character comparison the 1 as the first character is greater than the 0 of 000.

 

You would have to list all of the values to apply the format two and not use a range: '000', '001', '002' ... '999'. Not impossible

 

data work.cntl;
   fmtname='Code';
   type='C';
   length label $ 10 start $ 3;
   start=''; label='blank';output;
   do i= 0 to 9;
      label='1 digit';
      start= put(i,z1.);
      output;
   end;
   do i= 0 to 99;
      label='2 digits';
      start= put(i,z2.);
      output;
   end;
   do i= 0 to 999;
      label='3 digits';
      start= put(i,z3.);
      output;
   end;
run;

proc format library=work cntlin=work.cntl;
run;

data junk;
   length x $ 3;
   x=''; put x $code.;
   x='1'; put x $code.;
   x='01';put x $code.;
   x='101';put x $code.;
   x='001';put x $code.;
run;

If you have ' 1' values, with an actual leading space then the above code will need something to create values with leading spaces.

 

Astounding
PROC Star

Before you go too far down this path, you should be aware of a pitfall.  As character strings, the range "000" - "999" includes many values that are not three digits.  All of these values are included in that range:

 

5 4

3AD

68

1234 (perhaps not applicable, if your variable is defined as character length 3)

3.1

 

You may want to rethink the plan here.

 

tjsari
Fluorite | Level 6

Yes, that is the problem. Is there another way to identify the observations that have 3 digits and those who do not have 3 digits?

Astounding
PROC Star

In a DATA step:

 

if length(var) = 3 and (100 <= input(var, ??3.) <= 999);

 

A WHERE statement becomes tricky since it does not permit the "??" to suppress messages about invalid data.  It generates a warning when you remove the "??" but WHERE is legal in a PROC step if you can live with the warning.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 8 replies
  • 1726 views
  • 2 likes
  • 4 in conversation