BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
tjsari
Fluorite | Level 6

I have a character field with length 3 in a data set. Some sample values are: 001, 234, 546,1, 25, 025.

I am using tabulate to identify valid and invalid values in that field. The problem is that this code is not differentiating between the value 001 and 1 or between 25 and 025.

 

proc format; 

   value $code (notsorted)

           " " = "blank"

           "000"-"999" = "3 digits";

run;

 

proc tabulate data=dataset1 missing;

   format code $code.;

   class code / preloadformat;

   table code, n / printmiss;

run; 

 

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

@tjsari wrote:

The variable "code" is character length 3


Length is a property, formats work on values.

 

Unfortunately when doing character comparisons '1' is greater than '000' because in the character by character comparison the 1 as the first character is greater than the 0 of 000.

 

You would have to list all of the values to apply the format two and not use a range: '000', '001', '002' ... '999'. Not impossible

 

data work.cntl;
   fmtname='Code';
   type='C';
   length label $ 10 start $ 3;
   start=''; label='blank';output;
   do i= 0 to 9;
      label='1 digit';
      start= put(i,z1.);
      output;
   end;
   do i= 0 to 99;
      label='2 digits';
      start= put(i,z2.);
      output;
   end;
   do i= 0 to 999;
      label='3 digits';
      start= put(i,z3.);
      output;
   end;
run;

proc format library=work cntlin=work.cntl;
run;

data junk;
   length x $ 3;
   x=''; put x $code.;
   x='1'; put x $code.;
   x='01';put x $code.;
   x='101';put x $code.;
   x='001';put x $code.;
run;

If you have ' 1' values, with an actual leading space then the above code will need something to create values with leading spaces.

 

View solution in original post

8 REPLIES 8
Tom
Super User Tom
Super User

Sounds like you have accidentally loaded the values into a numeric variable instead of a character variable.

Can you re-read the source data?

tjsari
Fluorite | Level 6

The variable "code" is character length 3

ballardw
Super User

@tjsari wrote:

The variable "code" is character length 3


Length is a property, formats work on values.

 

Unfortunately when doing character comparisons '1' is greater than '000' because in the character by character comparison the 1 as the first character is greater than the 0 of 000.

 

You would have to list all of the values to apply the format two and not use a range: '000', '001', '002' ... '999'. Not impossible

 

data work.cntl;
   fmtname='Code';
   type='C';
   length label $ 10 start $ 3;
   start=''; label='blank';output;
   do i= 0 to 9;
      label='1 digit';
      start= put(i,z1.);
      output;
   end;
   do i= 0 to 99;
      label='2 digits';
      start= put(i,z2.);
      output;
   end;
   do i= 0 to 999;
      label='3 digits';
      start= put(i,z3.);
      output;
   end;
run;

proc format library=work cntlin=work.cntl;
run;

data junk;
   length x $ 3;
   x=''; put x $code.;
   x='1'; put x $code.;
   x='01';put x $code.;
   x='101';put x $code.;
   x='001';put x $code.;
run;

If you have ' 1' values, with an actual leading space then the above code will need something to create values with leading spaces.

 

Astounding
PROC Star

Before you go too far down this path, you should be aware of a pitfall.  As character strings, the range "000" - "999" includes many values that are not three digits.  All of these values are included in that range:

 

5 4

3AD

68

1234 (perhaps not applicable, if your variable is defined as character length 3)

3.1

 

You may want to rethink the plan here.

 

tjsari
Fluorite | Level 6

Yes, that is the problem. Is there another way to identify the observations that have 3 digits and those who do not have 3 digits?

Astounding
PROC Star

In a DATA step:

 

if length(var) = 3 and (100 <= input(var, ??3.) <= 999);

 

A WHERE statement becomes tricky since it does not permit the "??" to suppress messages about invalid data.  It generates a warning when you remove the "??" but WHERE is legal in a PROC step if you can live with the warning.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 1138 views
  • 2 likes
  • 4 in conversation