Mirisage
Obsidian | Level 7

Hi Colleagues,


      I have a continuous variable called pcntge.
      I want to turn it into discrete categories as shown below.

    data a;
      input pcntge;
      datalines;
       0.0001    /*any value GT zero should be categorized as 14*/
        .    /*missing values should come as missing (which means a dot)*/
        0   /*all zeros should be categorized as 1*/
      -9   /* LT 0 to GT -10 should be categorized as 2*/
      -19  /* LT -10 to GT -20 should be categorized as 3*/
      -29  /* LT -20 to GT -30 should be categorized as 4*/
      -39  /* LT -30 to GT -40 should be categorized as 5*/
      -49  /* LT -40 to GT -50 should be categorized as 6*/
      -59  /* LT -50 to GT -60 should be categorized as 7*/
      -69  /* LT -60 to GT -70 should be categorized as 8*/
      -79  /* LT -70 to GT -80 should be categorized as 9*/
      -89  /* LT -80 to GT -90 should be categorized as 10*/
      -99  /* LT -90 to GT -100 should be categorized as 11*/
      -100 /*all -100 should be categorized as 12*/
      -100.001 /*any value LT -100 should be categorized as 13*/
      ;
      Run;

This is the code I attempted, but it didn't work. Any help would be really appreciated.


      Data b;
             set a  ;

                   if (. <pcntge <= 0) then category=1;
              else if (0 <PCNTGE<= -10) then category=2;
              else if (-10 < PCNTGE<=  -20)  then category=3;
              else if (-20 < PCNTGE<= -30  ) then category=4;
              else if (-30 < PCNTGE<= -40  ) then category=5;
              else if (-40 < PCNTGE<= -50  ) then category=6;
              else if (-50 < PCNTGE<= -60  ) then category=7;
              else if (-60 < PCNTGE<=  -70)  then category=8;
              else if (-70 < PCNTGE<= -80  ) then category=9;
              else if (-80 < PCNTGE<= -90  ) then category=10;
              else if (-90 < PCNTGE<= -100  ) then category=11;

              else if (PCNTGE=-100) then category=12;
              else if (-100 < PCNTGE  ) then category=13;
      run;

Thanks

Mirisage

1 ACCEPTED SOLUTION

Accepted Solutions
art297
Opal | Level 21

I may not have correctly captured your rules, but I think that the following is closer to what you want:

data a;
  input pcntge;
  cards;
.2
.0001
.
-.0001
-1
-11
-70
;

Data b;
  set a  ;
  if missing(pcntge) then do;
    call missing(category);
  end;
  else if pcntge > 0 then category=14;
  else if pcntge = 0 then category=1;
  else if pcntge > -10 then category=2;
  else if pcntge > -20 then category=3;
  else if pcntge > -30 then category=4;
  else if pcntge > -40 then category=5;
  else if pcntge > -50 then category=6;
  else if pcntge > -60 then category=7;
  else if pcntge > -70 then category=8;
  else if pcntge > -80 then category=9;
  else if pcntge > -90 then category=10;
  else if pcntge > -100 then category=11;
  else if pcntge > -100 then category=12;
run;


Mirisage
Obsidian | Level 7

Hi Art,

Thank you very much for this code. It works correctly once the last two statements are revised; the revised version is below.

data b;
    SET a;
    if missing(pcntge) then do;
       call missing(category);
     end;
     else if pcntge > 0 then category=14;
     else if pcntge = 0 then category=1;
     else if pcntge > -10 then category=2;
     else if pcntge > -20 then category=3;
     else if pcntge > -30 then category=4;
     else if pcntge > -40 then category=5;
     else if pcntge > -50 then category=6;
     else if pcntge > -60 then category=7;
     else if pcntge > -70 then category=8;
     else if pcntge > -80 then category=9;
     else if pcntge > -90 then category=10;
     else if pcntge > -100 then category=11;
     else if pcntge = -100 then category=12;
     else if pcntge < -100 then category=13;
   run;

Thanks

Mirisage

Patrick
Opal | Level 21

Your conditions are never true. Instead of:

else if (-10 < PCNTGE<= -20) then category=3;

it should be:

else if (-10 > PCNTGE>= -20) then category=3;
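
To see why, note that SAS evaluates a chained comparison such as -10 < PCNTGE <= -20 as (-10 < PCNTGE) and (PCNTGE <= -20), and no number is both greater than -10 and at most -20. Here is a minimal sketch that compares the two forms; the dataset name check and the flag variables original and reversed are made up for illustration:

```sas
data check;
  do pcntge = -25, -19, -15, -5;
    /* original form: needs pcntge > -10 AND pcntge <= -20, which cannot both hold */
    original = (-10 < pcntge <= -20);
    /* reversed form: pcntge < -10 AND pcntge >= -20 */
    reversed = (-10 > pcntge >= -20);
    output;
  end;
run;

proc print data=check noobs;
run;
```

The original flag should be 0 on every row, while reversed should be 1 only for -19 and -15.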

Besides IF statements, you could also use a format like the one below:

proc format;
  value _recode (min=17)
    <0 - high   = 14
     0          = 1
   -10 -<  0    = 2
   -20 -< -10   = 3
   -30 -< -20   = 4
   -40 -< -30   = 5
   -50 -< -40   = 6
   -60 -< -50   = 7
   -70 -< -60   = 8
   -80 -< -70   = 9
   -90 -< -80   = 10
   -100<-< -90  = 11
   -100         = 12
   low -< -100  = 13
;
run;

data a;
  input pcntge;
  format pcntge category best32.;
  category=input(put(pcntge,_recode.),best32.);
  datalines;
0.0001 /*any value GT zero should be categorized as 14*/
. /*missing values should come as missing (which means a dot)*/
0 /*all zeros should be categorized as 1*/
-9 /* LT 0 to GT -10 should be categorized as 2*/
-19 /* LT -10 to GT -20 should be categorized as 3*/
-29 /* LT -20 to GT -30 should be categorized as 4*/
-39 /* LT -30 to GT -40 should be categorized as 5*/
-49 /* LT -40 to GT -50 should be categorized as 6*/
-59 /* LT -50 to GT -60 should be categorized as 7*/
-69 /* LT -60 to GT -70 should be categorized as 8*/
-79 /* LT -70 to GT -80 should be categorized as 9*/
-89 /* LT -80 to GT -90 should be categorized as 10*/
-99 /* LT -90 to GT -100 should be categorized as 11*/
-100 /*all -100 should be categorized as 12*/
-100.001 /*any value LT -100 should be categorized as 13*/
;
Run;

Mirisage
Obsidian | Level 7

Hi Patrick,

This is great!

Thank you very much.

This works well only when I revise the first line of the category definition under "proc format" as follows.

Instead of "  <0 - high   = 14", as you suggested, I had to change it to " 0 - high   = 14". Only then does the code work. However, logically it should be

>0 - high   = 14, shouldn't it? But when I use >0 - high   = 14, the code doesn't work.

If you have time, could you please shed some light on why 0 - high   = 14 works, while >0 - high   = 14, which is what it should be logically, does not?

Thanks again

Mirisage

Tom
Super User

The syntax you used to exclude the lower bound from the range was wrong.  Also, when a value is both the upper bound of one range and the lower bound of another, SAS automatically assigns it to the lower range.

See the manual pages for PROC FORMAT.

http://support.sas.com/documentation/cdl/en/proc/61895/HTML/default/viewer.htm#a002473474.htm

You can use the less than (<) symbol to exclude values from ranges. If you are excluding the first value in a range, then put the < after the value. If you are excluding the last value in a range, then put the < before the value. For example, the following range does not include 0:

   0<-100

Likewise, the following range does not include 100:

   0-<100

If a value at the high end of one range also appears at the low end of another range, and you do not use the < noninclusion notation, then PROC FORMAT assigns the value to the first range. For example, in the following ranges, the value AJ is part of the first range:

   'AA'-'AJ'=1 'AJ'-'AZ'=2

In this example, to include the value AJ in the second range, use the noninclusive notation on the first range:

   'AA'-<'AJ'=1 'AJ'-'AZ'=2

Patrick
Opal | Level 21

Tom is of course right. It should be:  0 <- high = 14
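
As a rough sketch, the format from earlier in this thread with only the first range changed might look like the code below. The format name pcntcat, the quoted labels, and the other line are additions for illustration and were not in the original code. The test values are a subset of those from the question.

```sas
proc format;
  value pcntcat
     0 <- high   = '14'   /* strictly greater than 0 */
     0           = '1'
   -10 -<  0     = '2'
   -20 -< -10    = '3'
   -30 -< -20    = '4'
   -40 -< -30    = '5'
   -50 -< -40    = '6'
   -60 -< -50    = '7'
   -70 -< -60    = '8'
   -80 -< -70    = '9'
   -90 -< -80    = '10'
   -100<-< -90   = '11'
   -100          = '12'
   low -< -100   = '13'
   other         = '.'    /* assumption: missing values map to a dot */
  ;
run;

data b;
  input pcntge;
  /* put() returns the label as text; input() turns it back into a number */
  category = input(put(pcntge, pcntcat.), 2.);
  datalines;
0.0001
.
0
-9
-100
-100.001
;
run;
```

Note that with these ranges the boundary values -10, -20, and so on fall into the range that starts at them (for example, -10 goes to category 2), which differs from the IF/ELSE version above where -10 goes to category 3. The question does not say which is wanted, so adjust the < placement as needed.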

Mirisage
Obsidian | Level 7

Hi Tom and Patrick,

Wish you a happy 2012!

Tom, I clearly understand the logic now, thanks to your nice explanation.

Thank you very much.

Mirisage
