Categorizing so that SAS assigns an equal number of data to each group

Occasional Contributor
Posts: 16

Hi,

How do I code this so that SAS assigns an equal number of observations to each of 4 categories? I am working with an AGE variable (continuous, ratio scale), and I want to split it into 4 categories with the same sample size in each group. Sometimes I'd like an equal number of observations in 5, 6, or 7 categories as well. People have told me that I could use PROC RANK, but I don't know how to use it even after looking it up. What code do I need to achieve this? I'd like output like this:

Age group              # of people in each group
-------------------    -------------------------
0 to ?? years old      2000 people
?? to ?? years old     2000 people
?? to ?? years old     2000 people
?? to 100 years old    2000 people

Thank you!

GF

Respected Advisor
Posts: 3,777

GROUPS = 5 or 6 or 7

VAR AGE;

RANKS AGEGroup;

Occasional Contributor
Posts: 16

Does this code go inside a DATA step? Do I add the RUN statement after RANKS AGEGroup? Does it work for 4 groups too? I just ran this:

GROUPS = 4

VAR AGE;

RANKS AGEGroup;

run;

My log gave me the following error:

1415  GROUPS = 4

      ------

      180

ERROR 180-322: Statement is not valid or it is used out of proper order.

1416

1417  VAR AGE;

1418  RANKS AGEGroup;

      -----

      180

ERROR 180-322: Statement is not valid or it is used out of proper order.

1419  run;

I'd appreciate any additional comments regarding the error. Thank you.

GF

Respected Advisor
Posts: 3,777

Occasional Contributor
Posts: 16

Yeah, I already checked it before posting my question, but it was of no help. I need to see how the code is written.

Super User
Posts: 17,776

You need to look in the Examples section of the documentation to see how the code is written. Although this link looks the same as Data _null_'s, it's not.

proc rank data=have out=want ties=low groups=4;

rank age;

ranks age_rank;

run;

Base SAS(R) 9.2 Procedures Guide

Occasional Contributor
Posts: 16

I tried

proc rank data=xxx.xxxxx out=WANT ties=low groups=4;

rank AGE;

ranks AGE_rank;

Run;

I am getting log errors for the following:

1. SAS doesn't like out=WANT. Instead of WANT, what do I need to insert here?

2. Also, AGE_rank is not defined, because this is the first place I mention it in SAS. How am I supposed to define AGE_rank?

3. The RANK in my second line is in red. I don't know why.

I have been trying to figure out how to do this since Saturday. Please help.....

Thanks

GF

Super User
Posts: 17,776

You should post your log when you have an error.

1) out=want isn't the issue, 2) I don't know what you mean, 3) that means you did something wrong.

The following works for me.

proc rank data=sashelp.class out=want groups=3 ties=low;

    var age;

    ranks age_rank;

run;

proc print; run;

Occasional Contributor
Posts: 16

I ran the following:

proc rank data=xxx.xxx out=want groups=4 ties=low;

var AGE;

ranks AGE_rank;

run;

proc freq data=xxx.xxx;

table AGE_rank;

run;

Then I see the following error in my log:

527

528  proc freq data=bmi.new;

529  table AGE_rank;

ERROR: Variable AGE_RANK not found.

530  run;

NOTE: The SAS System stopped processing this step because of errors.

NOTE: PROCEDURE FREQ used (Total process time):

      real time           0.00 seconds

      cpu time            0.00 seconds

So, SAS is asking me to define AGE_RANK. What do I need to code to resolve this error?

GF

Respected Advisor
Posts: 3,887

That's because variable AGE_RANK is created by Proc Rank. So you need to use the output data set in Proc Freq (so "want" instead of "xxx.xxx").
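To make that concrete, here is a minimal sketch of the whole sequence. It assumes sashelp.class as a stand-in for your own data set (swap in your own library and table), and the PROC MEANS step is an addition, not something mentioned earlier in the thread, that reports the lowest and highest age in each group. With tied ages the groups may not come out exactly equal in size.

```sas
/* Assumption: sashelp.class stands in for the real input data set. */
proc rank data=sashelp.class out=want groups=4 ties=low;
    var age;            /* variable to split into groups           */
    ranks age_rank;     /* new variable, created here, values 0..3 */
run;

/* Count per group: read the OUTPUT data set (want), not the input. */
proc freq data=want;
    tables age_rank;
run;

/* Lowest and highest age in each group, to label the age ranges. */
proc means data=want n min max;
    class age_rank;
    var age;
run;
```

For 5, 6, or 7 groups, only the GROUPS= value changes.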

If you've got Proc Surveyselect licensed then you could also use Proc Rank to create a strata variable which you then use as part of Proc Surveyselect.

There is also a data step approach under this link: http://support.sas.com/resources/papers/proceedings09/058-2009.pdf
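I have not reproduced the method from that paper here. As a rough illustration of what a DATA step approach can look like, the sketch below sorts by age and assigns groups by row position. The data set names sorted and want_ds and the variable age_group are hypothetical, and sashelp.class again stands in for real data.

```sas
/* Assumption: sashelp.class stands in for the real input data set. */
proc sort data=sashelp.class out=sorted;
    by age;
run;

data want_ds;
    /* NOBS= stores the total number of rows of SORTED in n_total */
    set sorted nobs=n_total;
    /* Row position decides the group: values 1..4, in age order */
    age_group = ceil(_n_ * 4 / n_total);
run;
```

Group sizes here differ by at most one row, but unlike PROC RANK with TIES=, people of the same age can land in different groups.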
