The SAS Output Delivery System and reporting techniques

Variable Buckets creation in sas

Accepted Solution Solved
Reply
Contributor
Posts: 52
Accepted Solution

Variable Buckets creation in sas

Hi Everyone,

Please can someone let me know the ways in which a continuous variable can be converted into bins (buckets)?

Suggest alternative coding ways either using proc sql, data step or sas macro.

Thank you!


Accepted Solutions
Solution
‎10-08-2015 10:23 AM
Super User
Posts: 10,466

Re: Variable Buckets creation in sas

Big advantages of using the format approach for this:

1) you don't have to add any variables, so no additional passes through the entire data

2) you can assign the format as needed for specific analysis

3) one format can be assigned to multiple variables that use the same rule

4) multiple formats to regroup data as needed, just change the format for the proc at run time.

5) as long as values of only a single variable are considered it can be much easier to construct then if/then/else logic and checks to ensure that one start value is treated the same all the time (exception: multilabel but that's a bit more advanced and only useable for some procs)

6) used with special missing assignments .A - .Z you can get descriptions for what that missing means such as not asked, original value out of bounds (and differentiate between high and low), instrument failure, skip pattern and a plethora of conditions. And the missing values are still excluded from calculations such as means.

View solution in original post


All Replies
Super User
Super User
Posts: 7,392

Re: Variable Buckets creation in sas

And your example data in the form of a datastep, plus required output is?  At a guess:

data have;

  do i=1 to 1000;

    output;

  end;

run;

data want;

  set have;

  bucket=floor(_n_ / 50)+1;

run;

Contributor EH
Contributor
Posts: 32

Re: Variable Buckets creation in sas

My preferred way is format.

proc format;

   value age low-17 = '< 18 yrs'

                   18-64 = '18-65 yrs'

                   65-high = '> 65 yrs';

run;

data want;

   set have;

   length agegrp $ 15;

   agegrp = put(age, age.);

run;

Hth,

Eric

Contributor
Posts: 43

Re: Variable Buckets creation in sas

I like Proc Format for continuous numeric data.

You can use the LT symbol to get up to a boundary

Proc format ;

     value agegrp

     low - <18 = '0-18'

     18-<25 = '18-25'

     25-<35 = '25-35'

...

     85-high = '85 and over' ;

run ;

Ha!  Eric beat me to it!  Smiley Happy

Solution
‎10-08-2015 10:23 AM
Super User
Posts: 10,466

Re: Variable Buckets creation in sas

Big advantages of using the format approach for this:

1) you don't have to add any variables, so no additional passes through the entire data

2) you can assign the format as needed for specific analysis

3) one format can be assigned to multiple variables that use the same rule

4) multiple formats to regroup data as needed, just change the format for the proc at run time.

5) as long as values of only a single variable are considered it can be much easier to construct then if/then/else logic and checks to ensure that one start value is treated the same all the time (exception: multilabel but that's a bit more advanced and only useable for some procs)

6) used with special missing assignments .A - .Z you can get descriptions for what that missing means such as not asked, original value out of bounds (and differentiate between high and low), instrument failure, skip pattern and a plethora of conditions. And the missing values are still excluded from calculations such as means.

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 4 replies
  • 1305 views
  • 1 like
  • 5 in conversation